Merge pull request #953 from Lumiwealth/version/4.4.39

grzesir · web-flow · commit 18d57e699315 · 2026-01-27T02:32:53.000-05:00
Version 4.4.39
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,12 +1,14 @@
 # Changelog
 
-## 4.4.39 - Unreleased
+## 4.4.39 - 2026-01-27
 
 ### Added
 
 ### Changed
 
 ### Fixed
+- Backtesting router (IBKR futures/cont_future/crypto): prefetch full backtest window once per series and slice from memory to avoid per-iteration history fetches (major warm-cache speedup).
+- Indicators: prevent `plot_indicators()` hovertext generation from crashing when `detail_text` is missing/NaN/NA (e.g., mixed indicator points with and without `detail_text`).
 
 ## 4.4.38 - 2026-01-26
 
diff --git a/docs/ACCEPTANCE_BACKTESTS.md b/docs/ACCEPTANCE_BACKTESTS.md
@@ -14,6 +14,11 @@ This document is the **canonical manual acceptance suite** for LumiBot backtesti
 
 **Speed:** acceptance warm-cache runs complete in bounded wall time and are **queue-free** (no downloader submits), proving the cache and data semantics are stable.
 
+**Resilience:** acceptance runs must also prove the “end of backtest” pipeline is stable:
+- stats summary must not crash (CAGR/datetime edge cases, NaN handling, etc.),
+- tearsheet/plot generation should either succeed or fail in a controlled way (a post-processing failure must not mask a successful simulation behind a generic “failed” run),
+- and the run must still emit actionable artifacts (`trades.csv`, `stats.csv`, `logs.csv`) even when optional post-processing fails.
+
 ## IBKR acceptance backtests (Crypto + Futures)
 
 This repo’s acceptance harness (`tests/backtest/test_acceptance_backtests_ci.py`) includes deterministic, cache-backed:
diff --git a/docs/BACKTESTING_ARCHITECTURE.md b/docs/BACKTESTING_ARCHITECTURE.md
@@ -16,6 +16,8 @@ LumiBot is a trading and backtesting framework. This document focuses on the **b
 
 **Speed:** warm-cache runs are queue-free and complete in bounded wall time, with evidence (request counts, cache hit rate, iterations/sec, and wall-time split: data wait vs compute vs artifacts).
 
+**Resilience:** simulation completion must not be masked by post-processing failures (stats/tearsheets/plots). When post-processing fails, the run should still produce as many artifacts as possible and classify the failure (simulation vs postprocess vs upload), so operators can trust the trade stream even when reporting breaks.
+
 If the backtest execution model (data semantics, fill model, order handling, fees, pricing) diverges meaningfully from how real brokers behave, the backtest is not trustworthy.
 
 We optimize for:
@@ -27,7 +29,7 @@ We optimize for:
 - Handoffs: `docs/handoffs/`
 - Investigations: `docs/investigations/`
 - Performance + parity + startup: `docs/BACKTESTING_PERFORMANCE.md`
-- Latest session handoff: `docs/handoffs/2025-12-26_THETADATA_SESSION_HANDOFF.md`
+- Latest session handoff (IBKR speed + resilience): `docs/handoffs/2026-01-26_IBKR_SPEED_RESILIENCE_MASTER_HANDOFF.md`
 
 ## Directory Structure
 
diff --git a/docs/BACKTESTING_PERFORMANCE.md b/docs/BACKTESTING_PERFORMANCE.md
@@ -2,7 +2,7 @@
 
 > A practical, evidence-driven guide to **measuring**, **debugging**, and **improving** backtesting performance end‑to‑end (strategy → data → cache → artifacts → UI), while preserving broker‑like correctness.
 
-**Last Updated:** 2026-01-07  
+**Last Updated:** 2026-01-26  
 **Status:** Active  
 **Audience:** Developers, AI Agents (engineering docs)  
 
@@ -20,6 +20,11 @@
 
 **Speed:** warm-cache runs are queue-free and complete in bounded wall time, with evidence (request counts, cache hit rate, iterations/sec, and wall-time split: data wait vs compute vs artifacts).
 
+**Resilience:** backtests should not “fail” solely because post-processing (stats/tearsheets/plots) crashed. When post-processing fails, the run should still:
+- preserve the trade stream (`trades.csv`) and portfolio stats (`stats.csv`) when available,
+- classify the failure (simulation vs postprocess vs upload),
+- and emit actionable diagnostics rather than silently omitting artifacts.
+
 ## Overview
 
 Backtesting performance problems in LumiBot rarely have a single cause. “Slow backtests” usually come from one (or more) of:
diff --git a/docs/IBKR_FUTURES_BACKTESTING.md b/docs/IBKR_FUTURES_BACKTESTING.md
@@ -74,6 +74,11 @@ contracts (or for `cont_future` stitching over expired months), LumiBot relies o
 
 See: `docs/investigations/2026-01-18_IBKR_EXPIRED_FUTURES_CONID_BACKFILL.md`.
 
+Operational note:
+- If `ibkr/conids.json` in the active S3 cache namespace is only a few hundred bytes (or missing keys like
+  `future|GC|USD|COMEX|20250226`), `cont_future` backtests will fail for “past year” windows once contracts expire.
+  Seed/promote the registry in S3 using the runbook in the investigation doc above.
+
 ### Multiplier + minTick (mandatory for correct PnL and tick rounding)
 
 For realistic futures accounting:
diff --git a/docs/README.md b/docs/README.md
@@ -18,6 +18,15 @@ This folder contains **human-authored** documentation for the LumiBot trading an
 
 **Speed:** a backtest is “fast” when warm-cache runs are queue-free and complete in bounded wall time, with evidence (request counts, cache hit rate, iterations/sec, and wall-time split: data wait vs compute vs artifacts).
 
+**Resilience:** a backtest is “resilient” when:
+- simulation completion is not masked by post-processing failures (tearsheets/stats/plots),
+- artifacts are as complete as possible even after failures (e.g., `trades.csv` and `stats.csv` still upload),
+- failure modes are classified (simulation vs postprocess vs upload), and
+- run metadata makes debugging easy (include `lumibot_version` in `settings.json` / `completion.json` whenever possible).
+
+If you’re coordinating IBKR speed + crash hardening work, start with:
+- `docs/handoffs/2026-01-26_IBKR_SPEED_RESILIENCE_MASTER_HANDOFF.md`
+
 ---
 
 ## File Index
diff --git a/docs/handoffs/2026-01-26_RELEASE_4.4.38_AND_START_4.4.39.md b/docs/handoffs/2026-01-26_RELEASE_4.4.38_AND_START_4.4.39.md
@@ -0,0 +1,31 @@
+# Release 4.4.38 + Start 4.4.39
+Release housekeeping after deploying 4.4.38 and creating the next shared version branch.
+
+**Last Updated:** 2026-01-26  
+**Status:** Active  
+**Audience:** Developers + AI Agents  
+
+---
+
+## Overview
+
+4.4.38 has been merged to `dev` and deployed. The next shared collaboration branch `version/4.4.39` is now created off `dev` for ongoing work.
+
+## What Shipped (4.4.38)
+
+- PR: https://github.com/Lumiwealth/lumibot/pull/952
+- Merge commit on `dev`: `f91b5722`
+- Version bump commit (sets `setup.py` to `4.4.38` and finalizes `CHANGELOG.md` entry): `d4b5d50a`
+- Key feature: IBKR futures auto exchange routing + hardened conid registry updates (plus roll-rule fixes for GC/MGC/CL/MCL).
+
+## Post-Merge Housekeeping (Done)
+
+- Local `dev` fast-forwarded to `origin/dev`.
+- New shared branch created and pushed:
+  - Branch: `version/4.4.39`
+  - Create PR (if desired): https://github.com/Lumiwealth/lumibot/pull/new/version/4.4.39
+
+## What’s Next (Known Issues)
+
+- **IBKR futures performance**: still slower than desired; target improvements should preserve accuracy and avoid extra downloader calls when warm-cache is available.
+- **TQQQ bot crashes**: investigate separately (collect logs + repro interval, then patch platform code as needed).
diff --git a/docs/investigations/2026-01-18_IBKR_EXPIRED_FUTURES_CONID_BACKFILL.md b/docs/investigations/2026-01-18_IBKR_EXPIRED_FUTURES_CONID_BACKFILL.md
@@ -129,6 +129,68 @@ This avoids needing TWS except for the initial historical backfill window.
 Note: IBKR’s public Symbol Lookup response includes `conid` + `localSymbol` for *currently listed* futures contracts.
 That can be used as an additional “no-auth” source for forward refresh, but it does not solve expired-contract discovery.
 
+## Operations: seed the S3 conid registry (prod/dev) without rerunning TWS
+
+If backtests are failing with errors like:
+
+- `IBKR did not return a conid for <ROOT> expiring <YYYYMMDD> on <EXCHANGE>`
+
+…and the target expiration is **expired** (no longer returned by `/ibkr/trsrv/futures`), you must ensure the S3-mirrored
+registry contains it. You do **not** need to rerun TWS if a backfill registry already exists (for example the one in
+`data/ibkr_tws_backfill_cache_dev_v2/ibkr/conids.json`).
+
+### Safety checklist
+
+- Always **download a backup** of the current S3 object before overwriting.
+- Always **union-merge** keys (do not replace blindly).
+- Prefer `aws --profile BotManager ...` when running from this machine.
+
+### Targets (current)
+
+- Prod conids: `s3://lumibot-cache-prod/prod/cache/v1/ibkr/conids.json`
+- Dev conids: `s3://lumibot-cache-dev/dev/cache/v1/ibkr/conids.json`
+
+Additional cache namespaces may exist (for example `dev/cache/v44/...`). Seed every namespace that is actively used by
+backtests.
+
+### Example (merge-before-upload)
+
+```bash
+# Backup
+aws --profile BotManager s3 cp \
+  s3://lumibot-cache-prod/prod/cache/v1/ibkr/conids.json \
+  ./prod_conids.before.json
+
+# Merge (union of keys; remote wins on conflict, seed only fills in missing keys)
+python3 - <<'PY'
+import json
+from pathlib import Path
+
+seed = json.loads(Path("data/ibkr_tws_backfill_cache_dev_v2/ibkr/conids.json").read_text())
+remote = json.loads(Path("prod_conids.before.json").read_text())
+
+merged = dict(seed)
+merged.update(remote)  # remote wins on conflict
+
+Path("prod_conids.merged.json").write_text(json.dumps(merged, sort_keys=True, separators=(",", ":")))
+print("merged_keys", len(merged))
+PY
+
+# Upload
+aws --profile BotManager s3 cp \
+  ./prod_conids.merged.json \
+  s3://lumibot-cache-prod/prod/cache/v1/ibkr/conids.json
+```
+
+### When you still need TWS
+
+You still need a one-time TWS backfill when the desired expiration is **older than any conids you’ve captured** (for
+example, if your registry only starts in 2024 and customers want 2015). In that case:
+
+- run `scripts/backfill_ibkr_futures_conids_tws.py` (with `includeExpired=True`)
+- upload the resulting `ibkr/conids.json` to a new cache namespace (e.g. `v2`, `v3`, …)
+- validate, then promote/seed the production namespace using the merge flow above
+
 ## Verification
 
 ### Correctness
diff --git a/docs/investigations/2026-01-27_ROUTER_IBKR_SPEED.md b/docs/investigations/2026-01-27_ROUTER_IBKR_SPEED.md
@@ -0,0 +1,125 @@
+# Router IBKR Speed Investigation (Futures + Crypto) — 2026-01-27
+
+Goal: make **IBKR through the router** (production routing JSON) **≥20× faster first**, then **50–100×** (warm-cache), without sacrificing correctness.
+
+Primary symptom: router IBKR futures backtests were taking **hours for ~1 week** because the router path was calling the downloader `ibkr/iserver/marketdata/history` in a hot loop (often ~1 request per simulated bar).
+
+This doc is a **speed ledger** + **methodology**. Every perf change must:
+- record benchmark results here (before/after),
+- include YAPPI evidence,
+- and add/adjust tests so the improvement sticks.
+
+## 0) Alignment / invariants
+
+**Production routing JSON (canonical)**
+```json
+{"default":"thetadata","crypto":"ibkr","future":"ibkr","cont_future":"ibkr"}
+```
+
+Notes:
+- Router aliases `"futures"` → `"future"` but does **not** imply `"cont_future"`.
+- Success metric is not “feels faster”: we require **history submits ~O(1)** (single digits) for warm-cache runs.
+
+**Hard perf targets (warm-cache)**
+- 1 day: ≤ 10s end-to-end
+- 1 week: ≤ 60s end-to-end
+- `ibkr/iserver/marketdata/history` submits: **single digits per run** (per symbol/timeframe), not proportional to bars
+
+## 1) Standard benchmark suite
+
+We iterate on **1-day windows** (fast feedback) and validate milestones on **1-week windows**.
+
+Benchmarks:
+1) GC client strategy
+2) NQ client strategy
+
+Profiling:
+- Always run a non-profile baseline and then a YAPPI run.
+- YAPPI time ≠ wall time (profiling adds overhead); use it only for hotspot ranking.
+
+## 2) Standard commands (prod-like runner)
+
+We use `scripts/run_backtest_prodlike.py` for “production-like” runs (downloader + S3 caching).
+
+Recommended investigation flags:
+- use the production routing JSON
+- set a dedicated cache folder under `~/Documents/Development/`
+- use S3 cache **read-only** during investigations to avoid mutating shared caches:
+  - `env LUMIBOT_CACHE_MODE=readonly ...`
+
+Example:
+```bash
+/Users/robertgrzesik/bin/safe-timeout 900s env LUMIBOT_CACHE_MODE=readonly \
+  python3 scripts/run_backtest_prodlike.py \
+    --main "/Users/robertgrzesik/Documents/Development/backtest_strategies/nq_double_ema_test/main.py" \
+    --start 2026-01-20 --end 2026-01-27 \
+    --data-source '{"default":"thetadata","crypto":"ibkr","future":"ibkr","cont_future":"ibkr"}' \
+    --use-dotenv-s3-keys \
+    --cache-folder "/Users/robertgrzesik/Documents/Development/backtest_cache/router_speed" \
+    --profile yappi \
+    --label nq_router_week1_yappi
+```
+
+YAPPI analysis helper:
+- `scripts/analyze_yappi_csv.py`
+
+## 3) Speed ledger
+
+### Columns
+- `ts` (local wall clock)
+- `git` (short SHA)
+- `bench` (gc/nq)
+- `mode` (router-json/router-default)
+- `window` (1d/1w)
+- `elapsed_s`
+- `queue_submits`
+- `history_submits` (subset)
+- `top_paths` (top 3–5)
+- `yappi_csv`
+- `change`
+
+### Baseline runs (pre-fix evidence; Jan 26, 2026)
+
+These runs are preserved to show the “before” state: downloader-in-hot-loop behavior.
+
+| ts | git | bench | mode | window | elapsed_s | queue_submits | history_submits | top_paths | yappi_csv | change |
+|---|---|---|---|---:|---:|---:|---:|---|---|---|
+| 2026-01-26 | (unknown) | gc | router-default | 1d | 1129 | 378 | 233 | `ibkr/iserver/marketdata/history` dominant | `.../20260126_180122_gc_ema_day1_yappi/..._profile_yappi.csv` | baseline (slow; queue wait dominates) |
+| 2026-01-26 | (unknown) | nq | router-default + S3 keys | 1d | timeout@1800s | 378 | 378 | all history | `.../20260126_201209_nq_2el_day1_s3warm_yappi/..._profile_yappi.csv` | baseline (timed out; ~1 history/minute) |
+
+### Phase 1 results (router IBKR prefetch enabled; local changes on top of `version/4.4.39`)
+
+These runs use:
+- routing: `{"default":"thetadata","crypto":"ibkr","future":"ibkr","cont_future":"ibkr"}`
+- local cache: `/Users/robertgrzesik/Documents/Development/backtest_cache/router_speed`
+- S3 cache: dev bucket/prefix, **read-only** (`LUMIBOT_CACHE_MODE=readonly`) during measurement
+
+| ts | git | bench | mode | window | elapsed_s | queue_submits | history_submits | top_paths | yappi_csv | change |
+|---|---|---|---|---:|---:|---:|---:|---|---|---|
+| 2026-01-27 | a8f17429+local | nq | router-json | 1d (2026-01-20→21) | 26.6 | 1 | 0 | `ibkr/iserver/secdef/search` | (none) | warm-cache: effectively queue-free |
+| 2026-01-27 | a8f17429+local | nq | router-json | 1w (2026-01-20→27) | 51.0 | 2 | 2 | `ibkr/iserver/marketdata/history` | (none) | bounded history fetches only (no per-bar thrash) |
+| 2026-01-27 | a8f17429+local | nq | router-json | 1w (2026-01-20→27) | 25.5 | 0 | 0 | (none) | `/Users/robertgrzesik/Documents/Development/backtest_runs/20260127_001202_nq_router_20260120_week1_yappi/logs/NQDoubleEMATestStrategy_2026-01-27_00-12_VFcBmM_profile_yappi.csv` | YAPPI: ~0 network IO; dominated by pandas/numpy |
+| 2026-01-27 | a8f17429+local | gc | router-json | 1d (2026-01-20→21) | 14.7 | 1 | 0 | `ibkr/iserver/secdef/search` | (none) | warm-cache: bounded |
+| 2026-01-27 | a8f17429+local | gc | router-json | 1w (2026-01-20→27) | 163.0 | 5 | 5 | `ibkr/iserver/marketdata/history` | (none) | cold-ish: initial history fetches dominate |
+| 2026-01-27 | a8f17429+local | gc | router-json | 1w (2026-01-20→27) | 12.6 | 0 | 0 | (none) | `/Users/robertgrzesik/Documents/Development/backtest_runs/20260127_001638_gc_router_20260120_week1_yappi/logs/GoldFuturesEMACrossover_2026-01-27_00-16_o66T9X_profile_yappi.csv` | warm-cache: dominated by pandas/numpy |
+
+## 4) Root cause + fix summary
+
+**Root cause (router path, before fix):**
+- `_IbkrRoutingAdapter` fetched IBKR history per-window (often per simulated bar), instead of prefetching the full backtest window once.
+
+**Fix (Phase 1):**
+- Router IBKR adapter now prefetches `(start - warmup) → backtest_end` once per series key for:
+  - futures / cont_future (minute/hour/day)
+  - crypto (minute/hour/day special cases)
+- Subsequent calls slice from the in-memory DataFrame.
+
+See implementation: `lumibot/backtesting/routed_backtesting.py` (router IBKR adapter).
+
+## 5) Tests / regression gates
+
+Deterministic unit tests prevent regression back to “fetch in the hot loop”:
+- `tests/backtest/test_routed_backtesting_ibkr_prefetch.py`
+  - futures/cont_future minute: prefetch once + slice
+  - crypto minute: prefetch once + slice
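The prefetch-then-slice behaviour summarized in sections 4 and 5 can be illustrated with a minimal, dependency-free sketch. It is not the code in `lumibot/backtesting/routed_backtesting.py`: the class `PrefetchingSeriesCache`, the `fetch_fn` callback, and the list-of-tuples bar format are hypothetical stand-ins (the real adapter slices a pandas DataFrame).

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


class PrefetchingSeriesCache:
    """Fetch each series once for the whole backtest window, then slice from memory."""

    def __init__(self, fetch_fn, window_start, window_end):
        # fetch_fn is a hypothetical callback: (key, start, end) -> sorted [(timestamp, price), ...]
        self._fetch_fn = fetch_fn
        self._window_start = window_start
        self._window_end = window_end
        self._series = {}  # key -> (timestamps, bars)

    def get_bars(self, key, start, end):
        if key not in self._series:
            # First request for this series: widen it to cover the full backtest window.
            bars = self._fetch_fn(key, min(start, self._window_start), max(end, self._window_end))
            if not bars:
                return []  # do not mark an empty result as loaded
            self._series[key] = ([ts for ts, _ in bars], bars)
        timestamps, bars = self._series[key]
        # Later requests are answered from memory with a binary search (inclusive bounds).
        return bars[bisect_left(timestamps, start):bisect_right(timestamps, end)]


# Demo: a fake downloader that records how often it is called.
calls = []


def fake_fetch(key, start, end):
    calls.append((key, start, end))
    n = int((end - start).total_seconds() // 60) + 1
    return [(start + timedelta(minutes=i), 100.0 + i) for i in range(n)]


t0 = datetime(2026, 1, 20, 0, 0)
cache = PrefetchingSeriesCache(fake_fetch, t0, t0 + timedelta(hours=1))

# Simulate 30 one-minute iterations, each asking for the trailing 5 minutes of bars.
for i in range(1, 31):
    now = t0 + timedelta(minutes=i)
    window = cache.get_bars(("NQ", "minute"), now - timedelta(minutes=5), now)

fetch_count = len(calls)
print("fetches:", fetch_count, "bars in last window:", len(window))
```

The property worth asserting in a regression test is the one the ledger tracks as `history_submits`: the number of fetches stays constant as the number of simulated bars grows.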
+
diff --git a/lumibot/backtesting/routed_backtesting.py b/lumibot/backtesting/routed_backtesting.py
@@ -254,13 +254,75 @@ def _fetch_df(
     ) -> pd.DataFrame | None:
         asset_type = str(getattr(asset, "asset_type", "") or "").lower()
 
+        # PERF: warm-cache minute strategies can call `get_historical_prices()` tens of thousands of
+        # times. In the router data source, IBKR history fetches must be amortized by prefetching
+        # the full backtest window once, then slicing in-memory thereafter (same principle as the
+        # IBKR-only backtesting data source).
+
+        if (
+            asset_type in {"future", "cont_future"}
+            and ts_unit in {"minute", "hour", "day"}
+            and canonical_key not in self._fully_loaded_series
+        ):
+            try:
+                from lumibot.backtesting.interactive_brokers_rest_backtesting import InteractiveBrokersRESTBacktesting
+
+                prev_open = InteractiveBrokersRESTBacktesting._previous_us_futures_session_open(self._router.datetime_start)
+            except Exception:
+                prev_open = None
+
+            try:
+                if prev_open is not None:
+                    prefetch_start = min(start_datetime, prev_open)
+                else:
+                    prefetch_start = min(start_datetime, self._router.datetime_start - timedelta(days=1))
+            except Exception:
+                prefetch_start = start_datetime
+
+            prefetch_end = self._router.datetime_end or end_dt
+
+            df = ibkr_helper.get_price_data(
+                asset=asset,
+                quote=quote_asset,
+                timestep=ts_unit,
+                start_dt=prefetch_start,
+                end_dt=prefetch_end,
+                exchange=None,
+                include_after_hours=True,
+            )
+            if df is None or df.empty:
+                return None
+            self._fully_loaded_series.add(canonical_key)
+            return df
+
+        if asset_type == "crypto" and ts_unit in {"minute", "hour"} and canonical_key not in self._fully_loaded_series:
+            try:
+                prefetch_start = min(start_datetime, self._router.datetime_start)
+            except Exception:
+                prefetch_start = start_datetime
+            prefetch_end = self._router.datetime_end or end_dt
+
+            df = ibkr_helper.get_price_data(
+                asset=asset,
+                quote=quote_asset,
+                timestep=ts_unit,
+                start_dt=prefetch_start,
+                end_dt=prefetch_end,
+                exchange=None,
+                include_after_hours=True,
+            )
+            if df is None or df.empty:
+                return None
+            self._fully_loaded_series.add(canonical_key)
+            return df
+
         if asset_type == "crypto" and ts_unit == "day" and canonical_key not in self._fully_loaded_series:
             try:
                 lookback_days = max(7, int(length) + 5)
             except Exception:
                 lookback_days = 7
             prefetch_start = min(start_datetime, self._router.datetime_start - timedelta(days=lookback_days))
-            prefetch_end = self._router.datetime_end
+            prefetch_end = self._router.datetime_end or end_dt
 
             df = ibkr_helper.get_price_data(
                 asset=asset,
diff --git a/lumibot/tools/indicators.py b/lumibot/tools/indicators.py
diff --git a/setup.py b/setup.py
diff --git a/tests/backtest/backtest_performance_history.csv b/tests/backtest/backtest_performance_history.csv
diff --git a/tests/backtest/test_routed_backtesting_ibkr_prefetch.py b/tests/backtest/test_routed_backtesting_ibkr_prefetch.py
diff --git a/tests/test_indicators_detail_text_edge_cases.py b/tests/test_indicators_detail_text_edge_cases.py