| 1 | +# Router IBKR Speed Investigation (Futures + Crypto) — 2026-01-27 |
| 2 | + |
| 3 | +Goal: make **IBKR through the router** (production routing JSON) **≥20× faster first**, then **50–100×** (warm-cache), without sacrificing correctness. |
| 4 | + |
| 5 | +Primary symptom: router IBKR futures backtests were taking **hours for ~1 week** because the router path was submitting downloader requests for `ibkr/iserver/marketdata/history` in a hot loop (often ~1 request per simulated bar). |
| 6 | + |
| 7 | +This doc is a **speed ledger** + **methodology**. Every perf change must: |
| 8 | +- record benchmark results here (before/after), |
| 9 | +- include YAPPI evidence, |
| 10 | +- and add/adjust tests so the improvement sticks. |
| 11 | + |
| 12 | +## 0) Alignment / invariants |
| 13 | + |
| 14 | +**Production routing JSON (canonical)** |
| 15 | +```json |
| 16 | +{"default":"thetadata","crypto":"ibkr","future":"ibkr","cont_future":"ibkr"} |
| 17 | +``` |
| 18 | + |
| 19 | +Notes: |
| 20 | +- Router aliases `"futures"` → `"future"` but does **not** imply `"cont_future"`. |
| 21 | +- Success metric is not “feels faster”: we require **history submits ~O(1)** (single digits) for warm-cache runs. |
| 22 | + |
| 23 | +**Hard perf targets (warm-cache)** |
| 24 | +- 1 day: ≤ 10s end-to-end |
| 25 | +- 1 week: ≤ 60s end-to-end |
| 26 | +- `ibkr/iserver/marketdata/history` submits: **single digits per run** (per symbol/timeframe), not proportional to bars |
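The targets above can be checked mechanically. The following is a minimal sketch, not part of the repository: `RunResult` and `target_violations` are hypothetical names, "single digits" is read as at most 9, and the run is assumed to cover a single symbol/timeframe.

```python
from dataclasses import dataclass
from typing import List

# Warm-cache budgets copied from the targets above (seconds, end-to-end).
MAX_ELAPSED_S = {"1d": 10.0, "1w": 60.0}
# Assumption: "single digits per run" is interpreted as <= 9.
MAX_HISTORY_SUBMITS = 9


@dataclass(frozen=True)
class RunResult:
    """Hypothetical summary of one benchmark run (not a LumiBot type)."""
    window: str           # "1d" or "1w"
    elapsed_s: float      # end-to-end wall time
    history_submits: int  # submits to ibkr/iserver/marketdata/history


def target_violations(run: RunResult) -> List[str]:
    """Return human-readable violations; an empty list means the run passes."""
    problems = []
    budget = MAX_ELAPSED_S[run.window]
    if run.elapsed_s > budget:
        problems.append(
            f"elapsed {run.elapsed_s:.1f}s exceeds {budget:.0f}s budget for {run.window}"
        )
    if run.history_submits > MAX_HISTORY_SUBMITS:
        problems.append(f"history_submits={run.history_submits} is not single digits")
    return problems
```

Returning a list of violations rather than a boolean keeps the failure message usable as the `change` note in a ledger row.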
| 27 | + |
| 28 | +## 1) Standard benchmark suite |
| 29 | + |
| 30 | +We iterate on **1-day windows** (fast feedback) and validate milestones on **1-week windows**. |
| 31 | + |
| 32 | +Benchmarks: |
| 33 | +1) GC client strategy |
| 34 | +2) NQ client strategy |
| 35 | + |
| 36 | +Profiling: |
| 37 | +- Always run an unprofiled baseline first, then a YAPPI run. |
| 38 | +- YAPPI time ≠ wall time (profiler overhead inflates it); use it only for hotspot ranking. |
| 39 | + |
| 40 | +## 2) Standard commands (prod-like runner) |
| 41 | + |
| 42 | +We use `scripts/run_backtest_prodlike.py` for “production-like” runs (downloader + S3 caching). |
| 43 | + |
| 44 | +Recommended investigation flags: |
| 45 | +- use the production routing JSON |
| 46 | +- set a dedicated cache folder under `~/Documents/Development/` |
| 47 | +- use S3 cache **read-only** during investigations to avoid mutating shared caches: |
| 48 | + - `env LUMIBOT_CACHE_MODE=readonly ...` |
| 49 | + |
| 50 | +Example: |
| 51 | +```bash |
| 52 | +/Users/robertgrzesik/bin/safe-timeout 900s env LUMIBOT_CACHE_MODE=readonly \ |
| 53 | + python3 scripts/run_backtest_prodlike.py \ |
| 54 | + --main "/Users/robertgrzesik/Documents/Development/backtest_strategies/nq_double_ema_test/main.py" \ |
| 55 | + --start 2026-01-20 --end 2026-01-27 \ |
| 56 | + --data-source '{"default":"thetadata","crypto":"ibkr","future":"ibkr","cont_future":"ibkr"}' \ |
| 57 | + --use-dotenv-s3-keys \ |
| 58 | + --cache-folder "/Users/robertgrzesik/Documents/Development/backtest_cache/router_speed" \ |
| 59 | + --profile yappi \ |
| 60 | + --label nq_router_week1_yappi |
| 61 | +``` |
| 62 | + |
| 63 | +YAPPI analysis helper: |
| 64 | +- `scripts/analyze_yappi_csv.py` |
| 65 | + |
| 66 | +## 3) Speed ledger |
| 67 | + |
| 68 | +### Columns |
| 69 | +- `ts` (local wall clock) |
| 70 | +- `git` (short SHA) |
| 71 | +- `bench` (gc/nq) |
| 72 | +- `mode` (router-json/router-default) |
| 73 | +- `window` (1d/1w) |
| 74 | +- `elapsed_s` |
| 75 | +- `queue_submits` |
| 76 | +- `history_submits` (subset) |
| 77 | +- `top_paths` (top 3–5) |
| 78 | +- `yappi_csv` |
| 79 | +- `change` |
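To keep ledger rows consistent with the columns above, a row can be modeled as a small record and rendered to a Markdown table line. This is an illustrative sketch only: `LedgerRow` is a hypothetical helper, not something in `scripts/`, and `elapsed_s` is kept as a string so that entries such as `timeout@1800s` can be recorded.

```python
from dataclasses import astuple, dataclass


@dataclass
class LedgerRow:
    """One speed-ledger entry; field order matches the column list."""
    ts: str               # local wall clock
    git: str              # short SHA
    bench: str            # gc / nq
    mode: str             # router-json / router-default
    window: str           # 1d / 1w
    elapsed_s: str        # string so "timeout@1800s" is representable
    queue_submits: int
    history_submits: int  # subset of queue_submits
    top_paths: str
    yappi_csv: str
    change: str

    def to_markdown(self) -> str:
        """Render the row as a single Markdown table line."""
        cells = (str(value) for value in astuple(self))
        return "| " + " | ".join(cells) + " |"
```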
| 80 | + |
| 81 | +### Baseline runs (pre-fix evidence; Jan 26, 2026) |
| 82 | + |
| 83 | +These runs are preserved to show the “before” state: downloader-in-hot-loop behavior. |
| 84 | + |
| 85 | +| ts | git | bench | mode | window | elapsed_s | queue_submits | history_submits | top_paths | yappi_csv | change | |
| 86 | +|---|---|---|---|---:|---:|---:|---:|---|---|---| |
| 87 | +| 2026-01-26 | (unknown) | gc | router-default | 1d | 1129 | 378 | 233 | `ibkr/iserver/marketdata/history` dominant | `.../20260126_180122_gc_ema_day1_yappi/..._profile_yappi.csv` | baseline (slow; queue wait dominates) | |
| 88 | +| 2026-01-26 | (unknown) | nq | router-default + S3 keys | 1d | timeout@1800s | 378 | 378 | all history | `.../20260126_201209_nq_2el_day1_s3warm_yappi/..._profile_yappi.csv` | baseline (timed out; ~1 history/minute) | |
| 89 | + |
| 90 | +### Phase 1 results (router IBKR prefetch enabled; local changes on top of `version/4.4.39`) |
| 91 | + |
| 92 | +These runs use: |
| 93 | +- routing: `{"default":"thetadata","crypto":"ibkr","future":"ibkr","cont_future":"ibkr"}` |
| 94 | +- local cache: `/Users/robertgrzesik/Documents/Development/backtest_cache/router_speed` |
| 95 | +- S3 cache: dev bucket/prefix, **read-only** (`LUMIBOT_CACHE_MODE=readonly`) during measurement |
| 96 | + |
| 97 | +| ts | git | bench | mode | window | elapsed_s | queue_submits | history_submits | top_paths | yappi_csv | change | |
| 98 | +|---|---|---|---|---:|---:|---:|---:|---|---|---| |
| 99 | +| 2026-01-27 | a8f17429+local | nq | router-json | 1d (2026-01-20→21) | 26.6 | 1 | 0 | `ibkr/iserver/secdef/search` | (none) | warm-cache: effectively queue-free | |
| 100 | +| 2026-01-27 | a8f17429+local | nq | router-json | 1w (2026-01-20→27) | 51.0 | 2 | 2 | `ibkr/iserver/marketdata/history` | (none) | bounded history fetches only (no per-bar thrash) | |
| 101 | +| 2026-01-27 | a8f17429+local | nq | router-json | 1w (2026-01-20→27) | 25.5 | 0 | 0 | (none) | `/Users/robertgrzesik/Documents/Development/backtest_runs/20260127_001202_nq_router_20260120_week1_yappi/logs/NQDoubleEMATestStrategy_2026-01-27_00-12_VFcBmM_profile_yappi.csv` | YAPPI: ~0 network IO; dominated by pandas/numpy | |
| 102 | +| 2026-01-27 | a8f17429+local | gc | router-json | 1d (2026-01-20→21) | 14.7 | 1 | 0 | `ibkr/iserver/secdef/search` | (none) | warm-cache: bounded | |
| 103 | +| 2026-01-27 | a8f17429+local | gc | router-json | 1w (2026-01-20→27) | 163.0 | 5 | 5 | `ibkr/iserver/marketdata/history` | (none) | cold-ish: initial history fetches dominate | |
| 104 | +| 2026-01-27 | a8f17429+local | gc | router-json | 1w (2026-01-20→27) | 12.6 | 0 | 0 | (none) | `/Users/robertgrzesik/Documents/Development/backtest_runs/20260127_001638_gc_router_20260120_week1_yappi/logs/GoldFuturesEMACrossover_2026-01-27_00-16_o66T9X_profile_yappi.csv` | warm-cache: dominated by pandas/numpy | |
| 105 | + |
| 106 | +## 4) Root cause + fix summary |
| 107 | + |
| 108 | +**Root cause (router path, before fix):** |
| 109 | +- `_IbkrRoutingAdapter` fetched IBKR history per-window (often per simulated bar), instead of prefetching the full backtest window once. |
| 110 | + |
| 111 | +**Fix (Phase 1):** |
| 112 | +- Router IBKR adapter now prefetches `(start - warmup) → backtest_end` once per series key for: |
| 113 | + - futures / cont_future (minute/hour/day) |
| 114 | + - crypto (minute/hour/day special cases) |
| 115 | +- Subsequent calls slice from the in-memory DataFrame. |
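The prefetch-then-slice pattern described above can be sketched as follows. This is not the code in `routed_backtesting.py`: `PrefetchingHistory`, the `fetch` callable, and the `(timestamp, close)` bar tuple are hypothetical stand-ins, and a sorted list with `bisect` replaces the in-memory DataFrame so the example needs only the standard library.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Tuple

Bar = Tuple[datetime, float]   # simplified bar: (timestamp, close)
SeriesKey = Tuple[str, str]    # (symbol, timestep), e.g. ("NQ", "minute")
Fetcher = Callable[[str, str, datetime, datetime], List[Bar]]


class PrefetchingHistory:
    """Fetch the full backtest window once per series key, then slice in memory."""

    def __init__(self, fetch: Fetcher, start: datetime, end: datetime,
                 warmup: timedelta) -> None:
        self._fetch = fetch
        # Prefetch range: (start - warmup) -> backtest_end.
        self._window = (start - warmup, end)
        self._cache: Dict[SeriesKey, Tuple[List[datetime], List[Bar]]] = {}

    def get_bars(self, symbol: str, timestep: str,
                 since: datetime, until: datetime) -> List[Bar]:
        """Return bars with since <= timestamp <= until, without refetching."""
        key = (symbol, timestep)
        if key not in self._cache:
            # The only remote call for this key during the whole backtest.
            bars = sorted(self._fetch(symbol, timestep, *self._window))
            self._cache[key] = ([ts for ts, _ in bars], bars)
        stamps, bars = self._cache[key]
        lo = bisect_left(stamps, since)
        hi = bisect_right(stamps, until)
        return bars[lo:hi]
```

Passing a counting fake as `fetch` and asserting that it is called once per series key, however many bars are simulated, is one way to turn this behaviour into a deterministic regression check.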
| 116 | + |
| 117 | +See implementation: `lumibot/backtesting/routed_backtesting.py` (router IBKR adapter). |
| 118 | + |
| 119 | +## 5) Tests / regression gates |
| 120 | + |
| 121 | +Deterministic unit tests prevent regression back to “fetch in the hot loop”: |
| 122 | +- `tests/backtest/test_routed_backtesting_ibkr_prefetch.py` |
| 123 | + - futures/cont_future minute: prefetch once + slice |
| 124 | + - crypto minute: prefetch once + slice |
| 125 | + |