Performance Analysis and Optimization
=====================================

This document outlines the performance characteristics of the Inhabit model, identifies current bottlenecks, and provides a roadmap for future optimizations, based on an analysis of the codebase.

Overview of Performance Characteristics
---------------------------------------

The Inhabit model processes large-scale longitudinal survey data (SOEP) and projects it over multiple decades. The computational complexity is primarily driven by:

1.  **High-Dimensionality**: The inhabit matrix cross-tabulates multiple household and dwelling dimensions, leading to a large number of unique combinations.
2.  **Iterative Processes**: The allocation algorithm runs multiple iterations per year to match searching households to available dwellings.
3.  **Recursive Calibration**: The Iterative Proportional Fitting (IPF) algorithm requires multiple passes to converge on census targets.
4.  **Temporal Depth**: Simulations often span 20-40 years, magnifying any per-year inefficiencies.
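
To illustrate why IPF calibration (item 3) needs repeated passes, the following is a minimal two-dimensional sketch in NumPy. It is a generic textbook version, not the model's calibration code; the function name ``ipf_2d``, its parameters, and the sample numbers are assumptions made for illustration, and the seed table is assumed to have strictly positive row and column sums.

```python
import numpy as np

def ipf_2d(seed, row_targets, col_targets, tol=1e-8, max_iter=100):
    """Generic 2-D IPF sketch (hypothetical helper, not the model's API)."""
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for iteration in range(1, max_iter + 1):
        # Pass 1: scale each row so the row sums match the row targets.
        table *= (row_targets / table.sum(axis=1))[:, None]
        # Pass 2: scale each column so the column sums match the column targets.
        table *= (col_targets / table.sum(axis=0))[None, :]
        # The column step disturbs the row sums again, hence the iteration.
        if np.abs(table.sum(axis=1) - row_targets).max() < tol:
            return table, iteration
    return table, max_iter

# Illustrative numbers only; the targets must share the same grand total.
fitted, n_iter = ipf_2d([[1, 2], [3, 4]], [40, 60], [30, 70])
```

Each pass touches every cell of the table, so the cost per calibration grows with both the table size and the number of passes needed to converge.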

Identified Bottlenecks
----------------------

Based on profiling and code review, the following areas have been identified as the primary sources of overhead:

Iterative Row Processing
~~~~~~~~~~~~~~~~~~~~~~~~
Several core functions use ``pandas.DataFrame.iterrows()`` or similar row-wise iteration patterns. This is significantly slower than vectorized operations because each row is materialized as a new ``Series`` object and the loop body runs in the Python interpreter.

*   **Impact**: High. Most visible in the allocation loop and the custom disaggregation functions.
*   **Recommendation**: Replace row-wise logic with vectorized NumPy operations or grouped aggregations where possible.
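
As a rough illustration of this recommendation, the sketch below computes the same quantity once with ``iterrows()`` and once with a column-level operation. The table and its column names (``persons``, ``floor_area``) are hypothetical and do not come from the model.

```python
import pandas as pd

# Hypothetical household table; column names are illustrative only.
households = pd.DataFrame({
    "persons": [1, 2, 4, 3],
    "floor_area": [45.0, 70.0, 120.0, 60.0],
})

def area_per_person_rowwise(df):
    """Row-wise baseline: builds one Series per row inside a Python loop."""
    values = []
    for _, row in df.iterrows():
        values.append(row["floor_area"] / row["persons"])
    return pd.Series(values, index=df.index)

def area_per_person_vectorized(df):
    """Vectorized equivalent: a single division over whole columns."""
    return df["floor_area"] / df["persons"]

rowwise = area_per_person_rowwise(households)
vectorized = area_per_person_vectorized(households)
```

Both functions return the same values; only the vectorized version avoids the per-row overhead, which matters once the table has many thousands of rows.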

Repeated Mask Creation
~~~~~~~~~~~~~~~~~~~~~~
The model frequently builds boolean masks (e.g., ``df[df['col'] == value]``) inside nested loops to filter data for specific combinations.

*   **Impact**: Medium-High. Every mask scans the full column, so the cost is repeated for each combination visited in the loop.
*   **Recommendation**: Use multi-indexing or pre-filter DataFrames into dictionaries of sub-groups.
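
One way to pre-filter into sub-groups is a single ``groupby`` pass that fills a dictionary keyed by combination. The sketch below assumes a hypothetical dwelling table; the column names and the helper ``total_for`` are illustrative and not part of the model.

```python
import pandas as pd

# Hypothetical dwelling table; names and values are illustrative only.
dwellings = pd.DataFrame({
    "region": ["urban", "rural", "urban", "rural", "urban"],
    "building_type": ["MFH", "SFH", "SFH", "SFH", "MFH"],
    "count": [10, 4, 3, 6, 7],
})

# One pass over the data: split it into sub-frames keyed by combination.
groups = {
    key: sub
    for key, sub in dwellings.groupby(["region", "building_type"])
}

def total_for(region, building_type):
    """Dictionary lookup instead of rebuilding a boolean mask per call."""
    sub = groups.get((region, building_type))
    return 0 if sub is None else int(sub["count"].sum())

urban_mfh = total_for("urban", "MFH")
```

Inside the nested loops, each combination then costs a dictionary lookup rather than a full scan of the DataFrame.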

Redundant I/O Operations
~~~~~~~~~~~~~~~~~~~~~~~~
Reading and writing CSV/Excel files within the simulation loop adds significant latency due to disk I/O overhead.

*   **Impact**: Medium. Particularly noticeable when saving intermediate results for every year.
*   **Recommendation**: Implement batch I/O or only save results at the end of the simulation.
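
A minimal sketch of the batch approach: collect the yearly results in memory and write them once after the loop. ``simulate_year`` and ``run_simulation`` are hypothetical stand-ins for the model's own functions, and an in-memory buffer replaces the output file so the example stays self-contained.

```python
import io
import pandas as pd

def simulate_year(year):
    """Hypothetical stand-in for one simulation step."""
    return pd.DataFrame({
        "year": [year, year],
        "region": ["urban", "rural"],
        "households": [100, 50],
    })

def run_simulation(start, end, target):
    """Accumulate yearly frames in memory and write a single file at the end."""
    yearly = [simulate_year(year) for year in range(start, end + 1)]
    results = pd.concat(yearly, ignore_index=True)
    results.to_csv(target, index=False)  # one write instead of one per year
    return results

buffer = io.StringIO()  # a file path would be used in practice
results = run_simulation(2020, 2022, buffer)
```

If intermediate results are needed for crash recovery, writing every few years rather than every year is a possible middle ground.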

Frequent DataFrame Copying
~~~~~~~~~~~~~~~~~~~~~~~~~~
Functions often use ``df.copy()`` to avoid side effects. While safe, excessive copying of large DataFrames consumes memory and adds processing time.

*   **Impact**: Low-Medium.
*   **Recommendation**: Copy only where a function would otherwise mutate data that the caller still needs, and pass references elsewhere. Note that ``inplace=True`` often still creates a copy internally in pandas, so it should not be relied on as a memory optimization.

Optimization Roadmap
--------------------

A staged approach is recommended to improve model performance without compromising the integrity of the results.

Stage 1: Vectorization (Short Term)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*   Refactor ``inhabit_matrix.py`` disaggregation functions to use vectorized mapping instead of row-wise processing.
*   Update ``allocation.py`` handlers to operate on entire columns of attributes rather than individual rows.

Stage 2: Algorithmic Efficiency (Medium Term)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*   **IPF Optimization**: The census calibration already implements "Factor Reuse," which provides a ~100x speedup for multi-year runs. Ensure this pattern is utilized across all calibration steps.
*   **Caching**: Implement memoization for preference matrices that don't change between simulation years.
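
The caching idea can be sketched with ``functools.lru_cache``. The function ``preference_matrix``, its arguments, and the random placeholder values are assumptions made for illustration; the model's actual preference matrices are built differently. Arguments must be hashable, hence the tuples.

```python
from functools import lru_cache

import numpy as np

@lru_cache(maxsize=None)
def preference_matrix(household_types, dwelling_types):
    """Hypothetical builder: computed once per distinct argument pair."""
    rng = np.random.default_rng(0)  # placeholder values, not real preferences
    matrix = rng.random((len(household_types), len(dwelling_types)))
    matrix /= matrix.sum(axis=1, keepdims=True)  # each row sums to 1
    matrix.setflags(write=False)  # cached object is shared, so keep it read-only
    return matrix

household_types = ("single", "couple", "family")
dwelling_types = ("SFH", "MFH")

for year in range(2020, 2023):
    prefs = preference_matrix(household_types, dwelling_types)

info = preference_matrix.cache_info()  # one miss, then hits for later years
```

Marking the cached array read-only guards against one simulation year silently modifying the matrix that later years will reuse.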

Stage 3: Parallelization (Long Term)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*   **Multiprocessing**: Utilize Python's ``concurrent.futures`` or ``multiprocessing`` to run independent regional simulations (e.g., Rural vs. Urban) or independent scenarios in parallel.
*   **Dask/Polars**: For extremely large datasets, consider migrating core data processing from Pandas to Polars or Dask to take advantage of multi-core hardware and lazy evaluation.
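
A minimal sketch of the multiprocessing idea using ``concurrent.futures``. ``run_region`` is a hypothetical stand-in for a full regional simulation, and ``run_all`` is an illustrative helper; the approach only works if the regional runs share no mutable state.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def run_region(region, years):
    """Hypothetical stand-in for an independent regional simulation."""
    stock = 1000.0
    for _ in range(years):
        stock *= 1.01  # placeholder growth, not a model parameter
    return round(stock, 2)

def run_all(regions, years, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Submit one task per region and collect the results by region name.

    ThreadPoolExecutor can be passed as executor_cls for debugging,
    although it will not speed up CPU-bound work.
    """
    with executor_cls(max_workers=max_workers) as pool:
        futures = {
            region: pool.submit(run_region, region, years)
            for region in regions
        }
        return {region: future.result() for region, future in futures.items()}
```

When using ``ProcessPoolExecutor``, the call (for example ``run_all(["rural", "urban"], years=30)``) should sit under an ``if __name__ == "__main__":`` guard, and the worker function must be defined at module level so it can be pickled.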

Memory Management
-----------------

The model can be memory-intensive when running long-term projections. To minimize the memory footprint:

1.  **Downcasting**: Cast numeric columns to the smallest type that still preserves the required precision (e.g., ``float32`` instead of ``float64``).
2.  **Categorical Data**: Convert string-based dimensions (like building type or region) to the Pandas ``category`` dtype.
3.  **Garbage Collection**: Explicitly delete large temporary DataFrames within the simulation loop to free up RAM.
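
The first two measures can be checked with ``DataFrame.memory_usage(deep=True)``. The sketch below uses a synthetic table with hypothetical column names; the actual savings depend on the data.

```python
import numpy as np
import pandas as pd

n = 10_000
# Synthetic data; column names and values are illustrative only.
df = pd.DataFrame({
    "region": np.tile(["urban", "rural"], n // 2),
    "building_type": np.tile(["SFH", "MFH", "MFH", "SFH"], n // 4),
    "floor_area": np.linspace(30.0, 200.0, n),
})
before = df.memory_usage(deep=True).sum()

# Build a slimmer frame: float32 for the numeric column, category for strings.
slim = pd.DataFrame({
    "region": df["region"].astype("category"),
    "building_type": df["building_type"].astype("category"),
    "floor_area": df["floor_area"].astype("float32"),
})
after = slim.memory_usage(deep=True).sum()

del df  # drop the large original once it is no longer needed
```

Whether ``float32`` is acceptable should be verified against the model's results, since weights and totals accumulated over many years can be sensitive to reduced precision.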

Benchmarking
------------

Developers are encouraged to use the ``@misc.timer_func`` decorator on core functions to track execution time during development. Any significant refactor should be accompanied by a benchmark comparison against the baseline performance.
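
For readers unfamiliar with timing decorators, the following is a generic sketch of the pattern. It is not the implementation of ``misc.timer_func``, whose behavior may differ; the names ``timed`` and ``allocate`` are hypothetical.

```python
import functools
import time

def timed(func):
    """Generic timing decorator (illustrative, not misc.timer_func)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        wrapper.timings.append(elapsed)  # keep a history for comparisons
        print(f"{func.__name__} took {elapsed:.4f} s")
        return result
    wrapper.timings = []
    return wrapper

@timed
def allocate(n):
    """Hypothetical stand-in for a core model function."""
    return sum(i * i for i in range(n))

total = allocate(1000)
```

Recording the timings on the wrapper makes it possible to compare a refactored function against the baseline over several runs rather than relying on a single measurement.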