Five coordinated changes that together cut peak RSS during demand flex by ~18 GB for large utilities (ConEd ~15k buildings):

1. Eliminate the per-building loop + `pd.concat` in `process_residential_hourly_demand_response_shift`: replace with `groupby.transform` for Q_orig and a dict lookup for load_shift, avoiding a full merge that doubled memory for tens-of-millions-of-row slices. Return `(shifted_net, hourly_shift, tracker)` numpy arrays instead of DataFrames.
2. Eliminate `tou_df` (a full TOU cohort copy) in `apply_runtime_tou_demand_response`: each season slice is now extracted just-in-time from the output DataFrame, and shifts are written back in-place rather than collected for a final concat.
3. Add an `inplace=True` mode to `apply_runtime_tou_demand_response`: callers that already hold a copy can skip the internal copy entirely.
4. In `apply_demand_flex`, make one copy upfront, precompute the per-TOU-key original weighted system loads (a tiny 8760-row Series) before copying, then `del raw_load_elec` so the original is released before the shift begins. Phase 2.5 uses the precomputed Series instead of the full original DataFrame.
5. In `run_scenario.py`, `del raw_load_elec` after the flex branch so the caller's reference is also freed before `bs.simulate()` runs.

All changes validated numerically: CenHud runs 13-16 produce bit-identical outputs (zero max abs/rel diff) vs the pre-optimization gold baseline across all 8 artifacts (BAT, bills, elasticity tracker, metadata, tariff config).

Made-with: Cursor
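The loop-elimination in item 1 can be sketched roughly as follows. Column names and the shift semantics here are illustrative, not CAIRO's actual schema:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the large per-building hourly slice.
df = pd.DataFrame({
    "building_id": [1, 1, 2, 2],
    "hour": [0, 1, 0, 1],
    "net_load": [2.0, 3.0, 1.5, 4.0],
})
load_shift = {1: 0.5, 2: -0.25}  # hypothetical per-building shift fractions

# Old pattern: loop over buildings, build per-building frames, pd.concat,
# then merge back -- each step holds an extra full-size copy in memory.
# New pattern: one groupby.transform for the per-building original load,
# a dict lookup via .map for the shift, and numpy array outputs.
q_orig = df.groupby("building_id")["net_load"].transform("sum").to_numpy()
hourly_shift = df["building_id"].map(load_shift).to_numpy()
shifted_net = df["net_load"].to_numpy() * (1.0 + hourly_shift)
```

No intermediate merge or concat is created; the only new allocations are the three output arrays.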
Phase 2.5 computes per-TOU-subclass MC deltas for revenue-requirement splitting between HP and non-HP customer classes. This work is only needed when a run has multiple tariff subclasses (e.g. ConEd runs 13-16 with hp/nonhp). Single-tariff runs (e.g. NiMo, CenHud) can skip it entirely, saving both memory and the compute of the full `effective_load_elec` scan. Pass `run_includes_subclasses` from `ScenarioSettings` through to `apply_demand_flex` so it can guard the Phase 2.5 block. Made-with: Cursor
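The guard pattern looks roughly like this. The function signature and the `compute_mc_delta` stand-in are hypothetical simplifications of the real code:

```python
import pandas as pd

def compute_mc_delta(df: pd.DataFrame, tou_key: str) -> float:
    # Stand-in for the real per-subclass MC-delta scan, which reads the
    # full effective load frame once per TOU key.
    return float(df.loc[df["tou_key"] == tou_key, "load"].sum())

def apply_demand_flex(load: pd.DataFrame, tou_keys, run_includes_subclasses: bool):
    out = load.copy()
    # ... demand-response shifting would happen here ...
    if run_includes_subclasses:
        # Phase 2.5: only needed when revenue requirements are split
        # across tariff subclasses (e.g. hp / nonhp).
        return out, {k: compute_mc_delta(out, k) for k in tou_keys}
    return out, None  # single-tariff run: skip the full-frame scan
```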
Add a pre-run cross-check to `validate_config.py`: for each run in the scenario YAML, compare the explicit `run_includes_subclasses` flag against whether `path_tariffs_electric` has more than one key (the canonical source of truth). Print a warning to stderr if they disagree so config mistakes are caught before CAIRO starts. Made-with: Cursor
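The cross-check amounts to a few lines per run; the helper name below is hypothetical:

```python
import sys

def warn_on_subclass_mismatch(run_name: str, run_cfg: dict) -> bool:
    """Hypothetical helper: True if the flag agrees with the tariff keys."""
    flag = bool(run_cfg.get("run_includes_subclasses", False))
    n_tariffs = len(run_cfg.get("path_tariffs_electric", {}))
    expected = n_tariffs > 1  # canonical source of truth
    if flag != expected:
        print(
            f"WARNING [{run_name}]: run_includes_subclasses={flag} but "
            f"path_tariffs_electric has {n_tariffs} key(s)",
            file=sys.stderr,
        )
    return flag == expected
```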
The TOU schedule had the peak period (period 1/3) starting at hour index 16 (4pm); the correct start is hour index 15 (3pm). Shift the on-peak block back one hour in both weekday and weekend schedules for all seasons in the flex and flex_calibrated tariffs. Made-with: Cursor
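As a toy illustration with a 24-entry hourly period vector (a 4-hour peak block is assumed here; the real schedules live in the tariff JSON):

```python
# One period code per hour of day: 0 = off-peak, 1 = on-peak.
# Wrong: the 4-hour peak block starts at hour index 16 (4pm).
wrong = [0] * 16 + [1] * 4 + [0] * 4
# Fixed: the whole on-peak block shifted back one hour, starting at 15 (3pm).
fixed = [0] * 15 + [1] * 4 + [0] * 5
```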
Set `python.analysis.diagnosticMode` to `openFilesOnly` so Pyright does not index the entire workspace. On a shared EC2 instance this prevents the language server from consuming 4+ GB of RAM that competes with CAIRO runs. Made-with: Cursor
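The setting lands in the workspace settings file; a minimal JSONC fragment (VS Code's settings.json allows comments):

```jsonc
{
  // Only analyze files that are open, instead of indexing the workspace.
  "python.analysis.diagnosticMode": "openFilesOnly"
}
```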
New `utils/post/compare_cairo_runs.py` compares two S3 CAIRO run directories file-by-file (BAT values, bills, elasticity tracker, metadata, tariff config) using configurable rtol/atol tolerances. It exits non-zero if any diff exceeds tolerance so it can be used in CI or as a manual regression check. Used during this branch to validate that the memory optimizations produced bit-identical outputs against the CenHud gold baseline (zero max diff across all 8 artifacts for runs 13-16). Made-with: Cursor
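The core numeric check can be sketched like this; the real tool's CLI flags, S3 access, and artifact parsing are not shown, and the function names are illustrative:

```python
import sys
import numpy as np

def max_diff_ok(gold, challenger, rtol=0.0, atol=0.0):
    """Compare one artifact's numeric values; return (ok, max_abs_diff)."""
    g = np.asarray(gold, dtype=float)
    c = np.asarray(challenger, dtype=float)
    ok = bool(np.allclose(c, g, rtol=rtol, atol=atol))
    return ok, float(np.abs(c - g).max())

def compare_artifacts(pairs, rtol=0.0, atol=0.0):
    """Exit non-zero if any artifact diff exceeds tolerance (CI-friendly)."""
    failed = False
    for name, gold, chal in pairs:
        ok, max_abs = max_diff_ok(gold, chal, rtol=rtol, atol=atol)
        print(f"{name}: max abs diff = {max_abs:.3e} ({'OK' if ok else 'FAIL'})")
        failed |= not ok
    if failed:
        sys.exit(1)
```

With the default `rtol=0.0, atol=0.0`, only bit-identical values pass, which is the bar used for the gold-baseline validation.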
- `cairo.py`: cast the `get_level_values` result to `DatetimeIndex` before accessing `.month`; ty stubs don't expose `.month` on the generic `Index` return type, so suppress with `type: ignore[attr-defined]`.
- `compare_cairo_runs.py`: guard `df_chal.height` behind an explicit is-not-None check; suppress the overly-wide Polars `.max()` return type on the `float()` conversion with `type: ignore[arg-type]`.

Made-with: Cursor
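The `cairo.py` fix follows the usual pattern of narrowing the index type for the checker; a minimal sketch, not the actual call site:

```python
from typing import cast

import pandas as pd

idx = pd.date_range("2024-01-01", periods=3, freq="D")
df = pd.DataFrame({"v": [1, 2, 3]}, index=idx)

# get_level_values is typed as returning a generic Index, which lacks
# .month in the stubs; cast to DatetimeIndex so the attribute type-checks.
months = cast(pd.DatetimeIndex, df.index.get_level_values(0)).month
```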
Closes #378
This PR fixes OOM kills on ConEd and NiMo runs 15-16 (and intermittently 13-14) by reducing peak RSS during the demand-flex pipeline and adding correctness/config guardrails.
What's in this PR
Memory optimizations (committed earlier in 1348c4d, included here):

- `process_residential_hourly_demand_response_shift`: replace the per-building groupby loop + `pd.concat` with `groupby.transform` + a dict lookup, returning numpy arrays instead of DataFrames. This was the primary memory bottleneck for large utilities (~15k buildings).
- Eliminate `tou_df` (the full TOU cohort copy) and `shifted_chunks` in `apply_runtime_tou_demand_response`: extract each season slice just-in-time and write back in-place.
- Add an `inplace=True` mode so `apply_demand_flex` can skip a redundant DataFrame copy.
- Precompute the per-TOU-key original weighted system loads from `raw_load_elec`, then `del raw_load_elec` to free the original before shifting begins.
- `del raw_load_elec` in `run_scenario.py` after the flex branch so the caller's reference is also freed before `bs.simulate()`.

Validated numerically: CenHud runs 13-16 produce zero diff vs the pre-optimization gold baseline across all 8 output artifacts.

Phase 2.5 bypass (1e3b26b): Skip the per-TOU-subclass MC delta computation when `run_includes_subclasses=False`. Phase 2.5 scans the full effective load DataFrame for each TOU key, which is unnecessary for single-tariff runs (NiMo, CenHud, etc.) that don't split revenue requirements by subclass.

Config validation (8daef7d): `validate_config.py` now warns if `run_includes_subclasses` disagrees with the number of keys in `path_tariffs_electric`, catching YAML inconsistencies before a run starts.

ConEd TOU schedule fix (5779d9a): Correct the HP seasonal TOU peak window in `coned_hp_seasonalTOU_flex.json`: the peak period started at hour 16 (4pm) but should start at hour 15 (3pm).

IDE memory (eb1ad7a): Set `python.analysis.diagnosticMode: openFilesOnly` in `.vscode/settings.json` to prevent Pyright from consuming 4+ GB on shared EC2 instances.

Validation tool (ad02c9c): `utils/post/compare_cairo_runs.py`, a CLI to compare two CAIRO run directories on S3 numerically. Used throughout this branch to confirm outputs are unchanged after each optimization step.

Reviewer focus

- `process_residential_hourly_demand_response_shift` (dict lookup + `groupby.transform`) is the highest-impact change; worth a close read to confirm the zero-sum writeback is correct.
- The Phase 2.5 bypass is guarded by the `run_includes_subclasses` flag that `run_scenario.py` already uses to decide whether to split revenue requirements, so the logic is consistent.

Made with Cursor