Performance Analysis and Optimization

This document outlines the performance characteristics of the Inhabit model, identifies current bottlenecks, and lays out a roadmap for future optimizations based on profiling and a review of the codebase.

Overview of Performance Characteristics

The Inhabit model processes large-scale longitudinal survey data (SOEP) and projects it over multiple decades. The computational complexity is primarily driven by:

High-Dimensionality: The inhabit matrix cross-tabulates multiple household and dwelling dimensions, leading to a large number of unique combinations.
Iterative Processes: The allocation algorithm runs multiple iterations per year to match searching households to available dwellings.
Recursive Calibration: The Iterative Proportional Fitting (IPF) algorithm requires multiple passes to converge on census targets.
Temporal Depth: Simulations often span 20-40 years, magnifying any per-year inefficiencies.
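
To illustrate why calibration takes several passes, here is a minimal two-dimensional IPF sketch. It is not the model's implementation: the function name `ipf`, the tolerance, and the toy seed and targets are assumptions, and it presumes a strictly positive seed with row and column targets that share the same total.

```python
import numpy as np

def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=100):
    """Scale rows, then columns, until the row sums also match their targets."""
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for iteration in range(1, max_iter + 1):
        # Fit the row margins.
        table *= (row_targets / table.sum(axis=1))[:, None]
        # Fit the column margins; this usually disturbs the row sums again.
        table *= (col_targets / table.sum(axis=0))[None, :]
        if np.abs(table.sum(axis=1) - row_targets).max() < tol:
            return table, iteration
    return table, max_iter

# Toy data (illustrative only): both target vectors sum to 100.
fitted, passes = ipf([[1, 2], [3, 4]], [40, 60], [50, 50])
```

Each column adjustment pulls the row sums slightly off target, which is why a single pass is rarely enough.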

Identified Bottlenecks

Based on profiling and code review, the following areas have been identified as the primary sources of overhead:

Iterative Row Processing

Several core functions use pandas.DataFrame.iterrows() or similar row-wise iteration patterns. This is significantly slower than vectorized operations because each row is converted to a Series object and then processed in the Python interpreter.

* Impact: High. Most visible in the allocation loop and the custom disaggregation functions.
* Recommendation: Replace row-wise logic with vectorized NumPy operations or grouped aggregations where possible.
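
The difference between the two patterns can be sketched as follows. The table and its column names (`households`, `share`) are hypothetical stand-ins, not taken from the model code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a model table; the column names are illustrative.
df = pd.DataFrame({
    "households": [120.0, 80.0, 200.0],
    "share": [0.25, 0.5, 0.1],
})

def allocate_rowwise(frame: pd.DataFrame) -> list:
    """Slow pattern: iterrows() builds one Series per row."""
    out = []
    for _, row in frame.iterrows():
        out.append(row["households"] * row["share"])
    return out

def allocate_vectorized(frame: pd.DataFrame) -> np.ndarray:
    """Same arithmetic applied to whole columns at once."""
    return frame["households"].to_numpy() * frame["share"].to_numpy()
```

Both functions return the same values, so the vectorized version can be checked against the row-wise one before the latter is removed.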

Repeated Mask Creation

The model frequently filters data with freshly built boolean masks (e.g., df[df['col'] == value]) inside nested loops to select specific combinations.

* Impact: Medium-High. Each such filter scans the full column, so the cost grows with both the size of the DataFrame and the number of combinations.
* Recommendation: Use multi-indexing or pre-filter DataFrames into dictionaries of sub-groups.
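
One way to pre-filter into sub-groups is a single groupby pass that fills a dictionary keyed by combination. The columns below (`region`, `building_type`, `dwellings`) are assumed for illustration only.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "region": ["urban", "rural", "urban", "rural"],
    "building_type": ["flat", "house", "house", "house"],
    "dwellings": [10, 4, 6, 5],
})

# One pass over the data instead of one full scan per combination.
# Grouping by a list of columns yields tuple keys such as ("urban", "flat").
groups = {key: sub for key, sub in df.groupby(["region", "building_type"])}

# Inside the loops, each lookup is now a dictionary access.
totals = {key: int(sub["dwellings"].sum()) for key, sub in groups.items()}
```

The dictionary only needs to be rebuilt when the underlying DataFrame changes, not on every loop iteration.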

Redundant I/O Operations

Reading and writing CSV/Excel files within the simulation loop adds significant latency due to disk I/O overhead.

* Impact: Medium. Particularly noticeable when saving intermediate results for every year.
* Recommendation: Implement batch I/O or only save results at the end of the simulation.
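
A sketch of the batch approach: collect yearly results in memory and write once after the loop. `simulate_year` is a hypothetical placeholder for one simulation step, and an in-memory buffer stands in for the output file.

```python
import io
import pandas as pd

def simulate_year(year: int) -> pd.DataFrame:
    """Hypothetical placeholder for one simulation step."""
    return pd.DataFrame({"year": [year], "households": [1000 + year]})

yearly_results = []
for year in range(2020, 2023):
    # Keep each year's result in memory; no disk access inside the loop.
    yearly_results.append(simulate_year(year))

results = pd.concat(yearly_results, ignore_index=True)

# Single write at the end; a StringIO buffer stands in for a file path here.
buffer = io.StringIO()
results.to_csv(buffer, index=False)
```

If intermediate results are needed for crash recovery, writing every N years is a possible middle ground.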

Frequent DataFrame Copying

Functions often use df.copy() to avoid side effects. While safe, excessive copying of large DataFrames consumes memory and adds processing time.

* Impact: Low-Medium.
* Recommendation: Drop defensive copies where the caller does not reuse the input, and pass references more strategically. Note that inplace=True is not a reliable fix, since many pandas operations still build a copy internally.

Optimization Roadmap

A staged approach is recommended to improve model performance without compromising the integrity of the results.

Stage 1: Vectorization (Short Term)

Refactor inhabit_matrix.py disaggregation functions to use vectorized mapping instead of row-wise processing.
Update allocation.py handlers to operate on entire columns of attributes rather than individual rows.
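
A vectorized mapping of the kind described above might look like the following sketch. The share table and column names are hypothetical and do not come from inhabit_matrix.py.

```python
import pandas as pd

# Hypothetical disaggregation shares per building type (not from the model).
share_by_type = {"flat": 0.6, "house": 0.4}

stock = pd.DataFrame({
    "building_type": ["flat", "house", "flat"],
    "dwellings": [100, 50, 20],
})

# Look up the share for every row in one call, then multiply column-wise.
# Keys missing from the dictionary would produce NaN rather than an error.
stock["share"] = stock["building_type"].map(share_by_type)
stock["allocated"] = stock["dwellings"] * stock["share"]
```

Checking for NaN in the mapped column after the lookup is a cheap way to catch categories missing from the share table.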

Stage 2: Algorithmic Efficiency (Medium Term)

IPF Optimization: The census calibration already implements “Factor Reuse,” which provides a ~100x speedup for multi-year runs. Ensure this pattern is utilized across all calibration steps.
Caching: Implement memoization for preference matrices that don’t change between simulation years.
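
For the caching item, a minimal memoization sketch using the standard library's functools.lru_cache is shown below. The function `preference_matrix`, its arguments, and the placeholder weights are assumptions; the real builder would need hashable arguments for this to work.

```python
from functools import lru_cache

call_count = 0  # only here to show how often the body actually runs

@lru_cache(maxsize=None)
def preference_matrix(household_type: str, region: str) -> tuple:
    """Hypothetical builder; returns a tuple so the cached value is immutable."""
    global call_count
    call_count += 1
    # Placeholder weights; the expensive computation would go here.
    return (0.5, 0.3, 0.2)

# Forty simulated years request the same matrix; it is computed only once.
for year in range(2020, 2060):
    weights = preference_matrix("family", "urban")
```

Returning an immutable object matters here: a cached DataFrame or array that is modified by one caller would silently change the result for all later years.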

Stage 3: Parallelization (Long Term)

Multiprocessing: Utilize Python’s concurrent.futures or multiprocessing to run independent regional simulations (e.g., Rural vs. Urban) or independent scenarios in parallel.
Dask/Polars: For extremely large datasets, consider migrating core data processing from Pandas to Polars or Dask to take advantage of multi-core hardware and lazy evaluation.
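
The multiprocessing item could be sketched with concurrent.futures as follows. `run_region` is a hypothetical stand-in for one independent regional simulation, and the approach assumes that regions share no mutable state.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def run_region(region: str) -> tuple:
    """Hypothetical stand-in for one independent regional simulation."""
    total = sum(range(1000))  # placeholder workload
    return region, total

def run_all(regions, executor_cls=ProcessPoolExecutor):
    """Run each region in its own worker and collect results by region name."""
    with executor_cls() as pool:
        return dict(pool.map(run_region, regions))

# Usage (keep the call under an `if __name__ == "__main__":` guard, since
# some platforms start worker processes by re-importing the module):
#     results = run_all(["rural", "urban"])
# Passing executor_cls=ThreadPoolExecutor runs the same code in threads,
# which is convenient for debugging.
```

Worker functions and their arguments must be picklable when processes are used, so large DataFrames are better loaded inside the worker than passed to it.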

Memory Management

The model can be memory-intensive when running long-term projections. To minimize the memory footprint:

1. Downcasting: Cast numeric columns to the smallest type that preserves the required precision (e.g., float32 instead of float64).
2. Categorical Data: Convert string-based dimensions (like building type or region) to the Pandas category dtype.
3. Garbage Collection: Explicitly delete large temporary DataFrames within the simulation loop so their memory can be reclaimed.
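
The three measures can be sketched on a toy table. The column names (`region`, `floor_area`) are illustrative, and whether float32 is precise enough depends on the quantity being stored.

```python
import numpy as np
import pandas as pd

# Toy table with one string dimension and one numeric column (names assumed).
df = pd.DataFrame({
    "region": ["urban", "rural"] * 500,
    "floor_area": np.linspace(30.0, 150.0, 1000),
})

before = df.memory_usage(deep=True).sum()

# Categorical data: store each distinct label once plus small integer codes.
df["region"] = df["region"].astype("category")
# Downcasting: halves the storage, at the cost of roughly 7 significant digits.
df["floor_area"] = df["floor_area"].astype("float32")

after = df.memory_usage(deep=True).sum()

# Garbage collection: drop the last reference to a large temporary object.
tmp = df.copy()
del tmp
```

Comparing `before` and `after` with memory_usage(deep=True) gives a quick check that a conversion actually pays off for a given table.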

Benchmarking

Developers are encouraged to use the @misc.timer_func decorator on core functions to track execution time during development. Any significant refactor should be accompanied by a benchmark comparison against the baseline performance.
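
The implementation of misc.timer_func is not shown here, so the following is a hypothetical equivalent (`timer_sketch`) that illustrates the kind of baseline comparison meant above. The two timed functions are toy examples.

```python
import functools
import time

def timer_sketch(func):
    """Hypothetical timing decorator; not the project's misc.timer_func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f} s")
        return result
    return wrapper

@timer_sketch
def baseline(n: int) -> int:
    """Toy baseline: sum of squares 0..n-1 with a Python-level loop."""
    return sum(i * i for i in range(n))

@timer_sketch
def refactored(n: int) -> int:
    """Toy refactor: closed form for the same sum."""
    return (n - 1) * n * (2 * n - 1) // 6
```

A benchmark comparison should confirm that both versions return identical results before their timings are compared.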