
Commit 20eb3ea

enarjord and claude committed
Fix base scenario coin leak and aggregate method handling
Two bugs fixed:

1. apply_scenario fell back to master_coins (the union of all scenarios' coins) for scenarios without explicit coins. It now falls back to base_coins (the original approved_coins from the config) via new optional base_coins/base_ignored parameters threaded through suite_runner, optimize_suite, and run_backtest_scenario.

2. calc_fitness always used _mean stats for scoring objectives, ignoring the backtest.aggregate config (e.g. "max" for high_exposure_hours_max_long). SuiteEvaluator now overrides flat_stats with correctly aggregated values before calling calc_fitness. The pareto_store.py script reads the aggregate config from entries, uses it in the _suite_metrics_to_stats fallback paths, and applies a ratio-based correction to stored objectives.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent a20c643 · commit 20eb3ea

File tree

6 files changed: +545, -6 lines

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -13,6 +13,8 @@ All notable user-facing changes will be documented in this file.
 - **Combined OHLCV normalization source selection** - Volume normalization in combined backtests now uses each coin's OHLCV source exchange (`ohlcv_source`) instead of the market-settings exchange when `backtest.market_settings_sources` differs from OHLCV routing.
 - **Config template/format preservation** - Added `live.enable_archive_candle_fetch` to the template defaults and ensured `backtest.market_settings_sources` is preserved during config formatting.
 - **Live no-fill minute EMA continuity** - When finalized 1m candles are missing because no trades occurred, live runtime now materializes synthetic zero-candles in memory (not on disk), preventing avoidable `MissingEma` loop errors on illiquid symbols. If real candles arrive later, they overwrite synthetic runtime candles and invalidate EMA cache automatically.
+- **Suite base scenario inherited all scenario coins** - Scenarios without explicit `coins` (e.g. the `"base"` scenario) fell back to `master_coins` — the union of every scenario's coin list — instead of the original `approved_coins` from the config. Now `apply_scenario` falls back to `base_coins` (the config's `approved_coins`) when a scenario omits its own coin list.
+- **Aggregate methods ignored in optimizer scoring and Pareto analysis** - `calc_fitness` always looked up the `_mean` stat for every scoring metric, ignoring the `backtest.aggregate` config (e.g. `"high_exposure_hours_max_long": "max"`). The optimizer now overrides `flat_stats` with correctly aggregated values before computing objectives. The standalone `pareto_store.py` script also reads the aggregate config from each entry and corrects stored objectives via a ratio adjustment, and uses the config when resolving limit filters.
 
 ### Fixed
 - **Backtest HLCV cache reuse across configs** - Configs that differ only in trading parameters (EMA spans, warmup ratio) now share the same HLCV cache slot. Previously, different EMA spans produced different `warmup_minutes`, which was included in the cache hash, causing unnecessary re-downloads. The cache now uses a ratchet-up strategy: warmup sufficiency is checked at load time, and the cache is overwritten only when a larger warmup is needed.
```

src/optimize.py

Lines changed: 5 additions & 0 deletions
```diff
@@ -1218,6 +1218,11 @@ def evaluate(self, individual, overrides_list):
         aggregate_stats = aggregate_summary.get("stats", {})
 
         flat_stats = flatten_metric_stats(aggregate_stats)
+        # Override _mean with correctly aggregated values so calc_fitness
+        # respects the aggregate config (e.g. "max" instead of "mean").
+        aggregated_values = aggregate_summary.get("aggregated", {})
+        for metric, agg_value in aggregated_values.items():
+            flat_stats[f"{metric}_mean"] = agg_value
         objectives, total_penalty = self.base.calc_fitness(flat_stats)
         objectives_map = {f"w_{i}": val for i, val in enumerate(objectives)}
 
```
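The `flat_stats` override above can be exercised in isolation. The sketch below uses a minimal stand-in for `flatten_metric_stats` and invented sample numbers; only the override loop mirrors the diff:

```python
# Minimal stand-in for flatten_metric_stats: {"m": {"mean": x}} -> {"m_mean": x}.
def flatten_metric_stats(stats):
    return {f"{m}_{s}": v for m, ms in stats.items() for s, v in ms.items()}

# Invented example: the aggregate config says "max" for this metric, so the
# "aggregated" value is the max (9.0), not the mean (4.0).
aggregate_summary = {
    "stats": {"high_exposure_hours_max_long": {"mean": 4.0, "max": 9.0}},
    "aggregated": {"high_exposure_hours_max_long": 9.0},
}

flat_stats = flatten_metric_stats(aggregate_summary["stats"])
# Same override as the diff: calc_fitness only reads "<metric>_mean",
# so replace that slot with the correctly aggregated value.
for metric, agg_value in aggregate_summary.get("aggregated", {}).items():
    flat_stats[f"{metric}_mean"] = agg_value

print(flat_stats["high_exposure_hours_max_long_mean"])  # 9.0
```

This leaves `calc_fitness` untouched: it still reads `_mean` keys, but the slot now holds the value aggregated per config.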

src/optimize_suite.py

Lines changed: 2 additions & 0 deletions
```diff
@@ -248,6 +248,8 @@ def _build_lazy_mss_slice(
             available_exchanges=dataset_available_exchanges,
             available_coins=available_coins,
             base_coin_sources=suite_coin_sources,
+            base_coins=base_coins_list,
+            base_ignored=base_ignored_list,
         )
     except ValueError as exc:
         logging.warning("Skipping scenario %s: %s", scenario.label, exc)
```

src/pareto_store.py

Lines changed: 49 additions & 4 deletions
```diff
@@ -72,7 +72,23 @@ def _resolve_limit_value(
     return stats_flat.get(key)
 
 
-def _suite_metrics_to_stats(entry: Dict[str, Any]) -> Tuple[Dict[str, float], Dict[str, float]]:
+def _resolve_aggregate_mode(
+    metric: str, aggregate_cfg: Optional[Dict[str, str]]
+) -> str:
+    """Return the aggregate mode for *metric* given an aggregate config dict."""
+    if not aggregate_cfg:
+        return "mean"
+    mode = aggregate_cfg.get(metric)
+    if mode is None and "_" in metric:
+        base = metric.rsplit("_", 1)[0]
+        mode = aggregate_cfg.get(base)
+    return str(mode or aggregate_cfg.get("default", "mean")).lower()
+
+
+def _suite_metrics_to_stats(
+    entry: Dict[str, Any],
+    aggregate_cfg: Optional[Dict[str, str]] = None,
+) -> Tuple[Dict[str, float], Dict[str, float]]:
     aggregated_values: Dict[str, float] = {}
     stats_flat: Dict[str, float] = {}
     suite_metrics = entry.get("suite_metrics") or {}
@@ -83,13 +99,20 @@ def _suite_metrics_to_stats(entry: Dict[str, Any]) -> Tuple[Dict[str, float], Dict[str, float]]:
             stats_flat.update(flatten_metric_stats({metric: stats}))
             agg = payload.get("aggregated")
             if agg is None and stats:
-                agg = stats.get("mean")
+                mode = _resolve_aggregate_mode(metric, aggregate_cfg)
+                agg = stats.get(mode, stats.get("mean"))
             if agg is not None:
                 aggregated_values[metric] = agg
     elif "aggregate" in suite_metrics:
         aggregate = suite_metrics.get("aggregate") or {}
         agg_stats = aggregate.get("stats") or {}
         aggregated_values = aggregate.get("aggregated") or {}
+        if not aggregated_values and agg_stats and aggregate_cfg:
+            for metric, metric_stats in agg_stats.items():
+                mode = _resolve_aggregate_mode(metric, aggregate_cfg)
+                val = metric_stats.get(mode, metric_stats.get("mean"))
+                if val is not None:
+                    aggregated_values[metric] = val
         stats_flat = flatten_metric_stats(agg_stats)
     return stats_flat, aggregated_values
 
```

```diff
@@ -574,15 +597,37 @@ def parse_limit_expr(expr: str) -> LimitSpec:
     metric_names = entry.get("optimize", {}).get("scoring", [])
     metric_name_map = {f"w_{i}": name for i, name in enumerate(metric_names)}
     metrics_block = entry.get("metrics", {}) or {}
-    objectives = metrics_block.get("objectives", metrics_block)
+    objectives = dict(metrics_block.get("objectives", metrics_block))
+    aggregate_cfg = entry.get("backtest", {}).get("aggregate")
     stats_flat: Dict[str, float] = {}
     aggregated_values: Dict[str, float] = {}
     if "stats" in metrics_block:
         stats_flat = flatten_metric_stats(metrics_block["stats"])
     if "suite_metrics" in entry:
-        stats_flat_suite, aggregated_values_suite = _suite_metrics_to_stats(entry)
+        stats_flat_suite, aggregated_values_suite = _suite_metrics_to_stats(
+            entry, aggregate_cfg=aggregate_cfg,
+        )
         stats_flat.update(stats_flat_suite)
         aggregated_values.update(aggregated_values_suite)
+    # Correct objectives for scoring metrics whose aggregate method
+    # is not "mean". The stored w_i was computed as metric_mean * weight;
+    # the correct value is metric_agg * weight. We apply a ratio
+    # correction (agg / mean) so we don't need the scoring weights.
+    constraint_violation = metrics_block.get("constraint_violation", 0.0)
+    if aggregate_cfg and not constraint_violation:
+        scoring_keys = entry.get("optimize", {}).get("scoring", [])
+        for idx, sk in enumerate(scoring_keys):
+            mode = _resolve_aggregate_mode(sk, aggregate_cfg)
+            if mode == "mean":
+                continue
+            w_key = f"w_{idx}"
+            stored = objectives.get(w_key)
+            if stored is None:
+                continue
+            agg_val = aggregated_values.get(sk)
+            mean_val = stats_flat.get(f"{sk}_mean")
+            if agg_val is not None and mean_val and mean_val != 0.0:
+                objectives[w_key] = stored * (agg_val / mean_val)
     if not w_keys:
         all_w_keys = sorted(k for k in objectives if k.startswith("w_"))
 
```
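The ratio correction works because the stored objective was computed as `metric_mean * weight`: multiplying by `agg / mean` yields `metric_agg * weight` without ever needing the weight itself. A worked instance with invented numbers:

```python
weight = 2.0    # scoring weight, not available to pareto_store.py at correction time
mean_val = 4.0  # the metric's stored _mean stat
agg_val = 9.0   # the correctly aggregated value (e.g. under "max")

stored = mean_val * weight                 # objective as the optimizer originally wrote it: 8.0
corrected = stored * (agg_val / mean_val)  # ratio-based correction

# Algebraically identical to recomputing agg_val * weight directly:
assert corrected == agg_val * weight
print(corrected)  # 18.0
```

This is also why the diff skips the correction when `mean_val` is zero or missing: the ratio would be undefined.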

src/suite_runner.py

Lines changed: 14 additions & 2 deletions
```diff
@@ -618,6 +618,8 @@ def apply_scenario(
     available_coins: set[str],
     base_coin_sources: Optional[Dict[str, str]] = None,
     *,
+    base_coins: Optional[List[str]] = None,
+    base_ignored: Optional[List[str]] = None,
     quiet: bool = False,
 ) -> Tuple[Dict[str, Any], List[str]]:
     cfg = deepcopy(base_config)
@@ -635,9 +637,11 @@
     tracker.update(["backtest", "end_date"], backtest_section.get("end_date"), new_end)
     backtest_section["end_date"] = new_end
 
-    scenario_coins = list(scenario.coins) if scenario.coins is not None else list(master_coins)
+    default_coins = base_coins if base_coins is not None else master_coins
+    default_ignored = base_ignored if base_ignored is not None else master_ignored
+    scenario_coins = list(scenario.coins) if scenario.coins is not None else list(default_coins)
     scenario_ignored = (
-        list(scenario.ignored_coins) if scenario.ignored_coins is not None else list(master_ignored)
+        list(scenario.ignored_coins) if scenario.ignored_coins is not None else list(default_ignored)
     )
 
     filtered_coins = [coin for coin in scenario_coins if coin in available_coins]
@@ -834,6 +838,8 @@ async def run_backtest_scenario(
     results_root: Optional[Path],
     disable_plotting: bool,
     base_coin_sources: Optional[Dict[str, str]] = None,
+    base_coins: Optional[List[str]] = None,
+    base_ignored: Optional[List[str]] = None,
 ) -> ScenarioResult:
     from backtest import (
         build_backtest_payload,
@@ -849,6 +855,8 @@ async def run_backtest_scenario(
         available_exchanges=available_exchanges,
         available_coins=available_coins,
         base_coin_sources=base_coin_sources,
+        base_coins=base_coins,
+        base_ignored=base_ignored,
     )
     scenario_config["disable_plotting"] = disable_plotting
 
@@ -1474,6 +1482,8 @@ async def run_backtest_suite_async(
         available_exchanges=dataset_available_exchanges,
         available_coins=available_coins,
         base_coin_sources=suite_coin_sources,
+        base_coins=base_coins,
+        base_ignored=base_ignored,
         quiet=True,
     )
     coin_exchange = _compute_effective_coin_exchange(
@@ -1516,6 +1526,8 @@ async def run_backtest_suite_async(
         suite_dir,
         disable_plotting=disable_plotting,
         base_coin_sources=suite_coin_sources,
+        base_coins=base_coins,
+        base_ignored=base_ignored,
     )
     results.append(result)
     logging.info(
```
