Fix prediction/backtest inconsistencies in ensemble scripts through improved data alignment and normalization logic, and update multilingual documentation.
docs/en/02_BRUTE_FORCE_GUIDE.md
+8 lines changed: 8 additions & 0 deletions
@@ -51,6 +51,14 @@ python quantpits/scripts/brute_force_ensemble.py --use-groups --group-config con
 - **Weight Optimization**: Comparative trials on Top 10 single models simulating Max Sharpe / Risk Parity optimization mappings.
 - **Comprehensive Reporting**: Generates autonomous summaries of MVP models and superior fusions.
+> [!NOTE]
+> **Understanding Metric Discrepancies: Single Models vs. Ensemble Backtests**
+>
+> The fusion and brute-force scripts apply strict **Z-Score Normalization** and **Data Alignment** before scoring. Because positions are also bounded by TopK selection, a single model's backtest results here may differ slightly, and legitimately, from the raw metrics computed right after training (e.g. via `run_analysis.py`). Three mechanisms are involved:
+> 1. **Isolated Normalization**: Each model's daily cross-sectional Z-scores are computed only over its *own* non-null predictions. Every model is therefore scaled the same way, and gaps in other models' data coverage cannot skew its signal scale before scoring.
+> 2. **Delayed Intersection**: Strict intersection dropping (`dropna(how='any')`) runs only at the combo scoring phase, and only across the models in the combo being scored. Models outside that combo therefore cannot shrink the universe it is evaluated on.
+> 3. **Benchmarking Alignment**: The sub-model evaluation leaderboard slices each model's historical records to the date range of the current ensemble matrix index. Single models and ensembles are then compared over the same period ("apples-to-apples"), with no distortion from mismatched timeframes.
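
The first two mechanisms in the note (per-model normalization and per-combo intersection) can be sketched in a few lines of pandas. This is only an illustration, not the code in `brute_force_ensemble.py`: the function names `zscore_per_model` and `combo_score`, the `(datetime, instrument)` MultiIndex layout, the equal-weight mean, and the toy data are all assumptions made for the example.

```python
from typing import Sequence

import numpy as np
import pandas as pd


def zscore_per_model(preds: pd.DataFrame) -> pd.DataFrame:
    """Z-score every model column per date, using only that column's non-null rows.

    Assumes `preds` has a (datetime, instrument) MultiIndex and one column per model.
    """
    by_date = preds.groupby(level="datetime")
    # pandas skips NaN in mean/std, so each column is scaled on its own coverage
    # and is unaffected by gaps in the other columns.
    return (preds - by_date.transform("mean")) / by_date.transform("std")


def combo_score(norm: pd.DataFrame, combo: Sequence[str]) -> pd.Series:
    """Equal-weight score for one combo (hypothetical weighting scheme)."""
    # The intersection is taken here, at scoring time, over the combo members only.
    sub = norm[list(combo)].dropna(how="any")
    return sub.mean(axis=1)


# Toy data: model m3 has coverage gaps, m1 and m2 do not.
idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2024-01-02", "2024-01-03"]), ["A", "B", "C", "D"]],
    names=["datetime", "instrument"],
)
preds = pd.DataFrame(
    {
        "m1": [1.0, 2.0, 3.0, 4.0, 2.0, 4.0, 6.0, 8.0],
        "m2": [0.1, 0.2, 0.3, 0.4, 0.4, 0.3, 0.2, 0.1],
        "m3": [np.nan, 5.0, 7.0, np.nan, 1.0, np.nan, 3.0, 5.0],
    },
    index=idx,
)

norm = zscore_per_model(preds)
scores_12 = combo_score(norm, ["m1", "m2"])  # m3's gaps do not shrink this universe
scores_13 = combo_score(norm, ["m1", "m3"])  # only rows where both m1 and m3 exist
```

In this toy case the `m1`+`m2` combo keeps all eight rows, while the `m1`+`m3` combo keeps only the five rows where `m3` has a prediction. Dropping rows across all three models up front would have cut every combo down to those five rows.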
docs/en/03_ENSEMBLE_FUSION_GUIDE.md
+8 lines changed: 8 additions & 0 deletions
@@ -193,6 +193,14 @@ output/
 > [!TIP]
 > The `default` combo additionally writes an unnamed `ensemble_{date}.csv` artifact, so downstream utilities such as `order_gen.py` keep working without any modification.
+> [!NOTE]
+> **Understanding Metric Discrepancies: Single Models vs. Ensemble Backtests**
+>
+> The fusion and brute-force scripts apply strict **Z-Score Normalization** and **Data Alignment** before scoring. Because positions are also bounded by TopK selection, a single model's backtest results here may differ slightly, and legitimately, from the raw metrics computed right after training (e.g. via `run_analysis.py`). Three mechanisms are involved:
+> 1. **Isolated Normalization**: Each model's daily cross-sectional Z-scores are computed only over its *own* non-null predictions. Every model is therefore scaled the same way, and gaps in other models' data coverage cannot skew its signal scale before scoring.
+> 2. **Delayed Intersection**: Strict intersection dropping (`dropna(how='any')`) runs only at the combo scoring phase, and only across the models in the combo being scored. Models outside that combo therefore cannot shrink the universe it is evaluated on.
+> 3. **Benchmarking Alignment**: The sub-model evaluation leaderboard slices each model's historical records to the date range of the current ensemble matrix index. Single models and ensembles are then compared over the same period ("apples-to-apples"), with no distortion from mismatched timeframes.
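
The third mechanism, restricting each sub-model's history to the ensemble's date range, might look like the sketch below. It is a guess at the general shape rather than the repository's implementation: the helper name `align_to_ensemble_window`, the use of a plain `DatetimeIndex`, and the toy daily series are assumptions for illustration.

```python
import pandas as pd


def align_to_ensemble_window(
    model_returns: pd.Series, ensemble_dates: pd.DatetimeIndex
) -> pd.Series:
    """Keep only the part of a model's history inside the ensemble's date range.

    `model_returns` is assumed to be a daily series indexed by date (hypothetical layout).
    """
    start, end = ensemble_dates.min(), ensemble_dates.max()
    in_window = (model_returns.index >= start) & (model_returns.index <= end)
    return model_returns.loc[in_window]


# Toy data: the single model has ten days of history, the ensemble covers five of them.
dates = pd.date_range("2024-01-01", periods=10, freq="D")
single = pd.Series(range(10), index=dates, dtype=float)
ensemble_dates = dates[3:8]

aligned = align_to_ensemble_window(single, ensemble_dates)
```

Metrics computed on `aligned` then cover exactly the ensemble's start and end dates, so a leaderboard row for the single model is directly comparable with the ensemble's row.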