Description
Summary
The backtesting framework reports drawdown metrics that are inconsistent with the total_value series in the portfolio_snapshots it produces. Independently computing max drawdown from the snapshot total_value field yields significantly different results.
Expected values (from framework)

    {
      "max_drawdown": 0.1384753616852372,
      "max_drawdown_absolute": 406.69151474500177,
      "max_daily_drawdown": 0.1384753616852372,
      "max_drawdown_duration": 240
    }

Actual values (computed from portfolio_snapshots[].total_value)
| Metric | Framework reports | Computed from snapshots | Delta |
|---|---|---|---|
| max_drawdown | 13.85% | 9.19% | +4.66pp |
| max_drawdown_absolute | 406.69 | 230.50 | +176.19 |
| max_daily_drawdown | 13.85% | 7.84% | +6.01pp |
| max_drawdown_duration | 240 days | 241 days | -1 day |
Analysis
1. Framework uses a different equity curve than total_value
The framework's absolute drawdown (406.69) divided by its fractional drawdown (0.1385) implies a peak equity of ~2,937. However, the actual peak total_value in the snapshots is ~2,508. This means the framework is computing drawdown from a different equity series than the one stored in total_value.
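The implied-peak arithmetic can be checked directly. The following is a minimal sketch that uses only the figures quoted in this issue; the variable names are illustrative, and it assumes that each pair of absolute and fractional figures refers to the same drawdown episode.

```python
# Figures reported by the framework (from the JSON above).
fw_max_dd = 0.1384753616852372
fw_max_dd_abs = 406.69151474500177

# Figures computed from portfolio_snapshots[].total_value (rounded, from the table).
snap_max_dd = 0.0919
snap_max_dd_abs = 230.50

# Assumption: both figures in a pair come from the same drawdown episode,
# so peak = absolute drawdown / fractional drawdown.
fw_implied_peak = fw_max_dd_abs / fw_max_dd
snap_implied_peak = snap_max_dd_abs / snap_max_dd

print(f"framework implied peak: {fw_implied_peak:.0f}")
print(f"snapshot implied peak:  {snap_implied_peak:.0f}")
```

The two implied peaks differ by roughly 430, which is far larger than anything rounding in the table could explain.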
Possible causes:
- The framework may be summing fields differently (e.g. unallocated + pending_value + unrealized instead of using the pre-computed total_value).
- The framework may be revaluing positions at current market prices independently of the snapshot, producing a different equity curve.
- There may be a mismatch between the equity curve used internally for metrics and the one serialized to portfolio_snapshots.
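One way to test the first hypothesis is to recompute equity from component fields and compare it with total_value for each snapshot. This is a sketch only: the helper find_equity_mismatches is hypothetical, and the field names unallocated, pending_value and unrealized are taken from the example above and may not match the actual snapshot schema.

```python
def find_equity_mismatches(snapshots, component_fields, tolerance=1e-6):
    """Return (index, total_value, recomposed) for every snapshot whose
    component fields do not sum to total_value within the tolerance."""
    mismatches = []
    for i, snap in enumerate(snapshots):
        # Missing fields are treated as zero (an assumption of this sketch).
        recomposed = sum(snap.get(field, 0.0) for field in component_fields)
        if abs(recomposed - snap["total_value"]) > tolerance:
            mismatches.append((i, snap["total_value"], recomposed))
    return mismatches


# Toy data for illustration, not taken from backtest_run_three.json.
toy = [
    {"total_value": 100.0, "unallocated": 40.0, "pending_value": 10.0, "unrealized": 50.0},
    {"total_value": 110.0, "unallocated": 40.0, "pending_value": 10.0, "unrealized": 75.0},
]
fields = ("unallocated", "pending_value", "unrealized")
print(find_equity_mismatches(toy, fields))  # [(1, 110.0, 125.0)]
```

If a particular field combination reproduces the framework's implied peak of ~2,937, that would point to the series the metrics are actually computed from.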
2. max_daily_drawdown equals max_drawdown — likely a bug
The framework reports max_daily_drawdown = 0.1384753616852372, which is identical to max_drawdown. This is almost certainly wrong:
- max_daily_drawdown should represent the largest single-period (day-to-day) decline, which is typically much smaller than the peak-to-trough drawdown.
- From the snapshot data, the largest single-day drop is 7.84%, not 13.85%.
- If max_daily_drawdown truly equals max_drawdown, it would mean the entire 13.85% drawdown happened in a single snapshot interval, contradicting the reported max_drawdown_duration of 240 days.
Likely cause: max_daily_drawdown is being assigned the same value as max_drawdown instead of being computed independently as the worst single-period return.
3. max_drawdown_duration is close but off by 1 day
The durations differ by exactly one day (240 reported vs 241 computed). This is most likely an off-by-one in how the framework counts the start and end day of the drawdown (inclusive vs exclusive) rather than a difference in the underlying data. This is minor.
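The effect of the counting convention can be illustrated with a short sketch. It assumes that duration means the longest stretch from a running peak to the last date on which the value is still below that peak; the framework's actual definition is not known, and the function name and inclusive flag are hypothetical.

```python
from datetime import date


def max_drawdown_duration(points, inclusive=False):
    """Longest underwater stretch in days.

    points: list of (date, total_value) tuples sorted by date.
    A stretch runs from the date of the running peak to the last date
    on which the value is still below that peak.
    """
    peak_value = float("-inf")
    peak_date = None
    longest = 0
    for day, value in points:
        if value >= peak_value:
            # New (or equal) high: the drawdown, if any, has recovered.
            peak_value, peak_date = value, day
        else:
            # Inclusive counting adds the peak day itself to the span.
            days = (day - peak_date).days + (1 if inclusive else 0)
            longest = max(longest, days)
    return longest


# Toy series for illustration only.
toy = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 1, 2), 95.0),
    (date(2024, 1, 3), 90.0),
    (date(2024, 1, 4), 101.0),
]
print(max_drawdown_duration(toy))                  # 2
print(max_drawdown_duration(toy, inclusive=True))  # 3
```

The same series yields two durations that differ by one day, which matches the size of the discrepancy seen here.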
How to reproduce
Using backtest_run_three.json:

    import json

    with open("backtest_run_three.json") as f:
        data = json.load(f)

    # Sort chronologically (assumes created_at values sort correctly as-is,
    # e.g. ISO 8601 strings).
    snaps = sorted(data["portfolio_snapshots"], key=lambda s: s["created_at"])

    # Max drawdown from total_value: track the running peak and the largest
    # fractional decline from it.
    peak = 0
    max_dd = 0
    max_dd_abs = 0
    for s in snaps:
        tv = s["total_value"]
        if tv > peak:
            peak = tv
        if peak > 0:
            dd = (peak - tv) / peak
            if dd > max_dd:
                max_dd = dd
                max_dd_abs = peak - tv

    print(f"max_drawdown: {max_dd}")  # 0.0919, not 0.1385
    print(f"max_drawdown_absolute: {max_dd_abs}")  # 230.50, not 406.69

Suggested fix
- Verify that the equity curve used for drawdown calculation matches the total_value written to portfolio_snapshots. If they diverge, either fix the metric calculation or fix the snapshot serialization.
- Fix max_daily_drawdown to compute the worst single-period return independently:

      max_daily_dd = max(
          (
              (snaps[i - 1]["total_value"] - snaps[i]["total_value"])
              / snaps[i - 1]["total_value"]
              for i in range(1, len(snaps))
              if snaps[i - 1]["total_value"] > 0
          ),
          default=0.0,  # avoid ValueError when there are fewer than two usable snapshots
      )

- Review the off-by-one in max_drawdown_duration (inclusive vs exclusive day counting).