Conversation
📝 Walkthrough
The pull request introduces extensive caching mechanisms throughout the backtesting framework and data sources. Key additions include: trading calendar and market data caching in the broker and helpers; tracked order/position caching with optional asset filtering; portfolio value computation caching; progress bar throttling; daily last-price caching in the Yahoo data source; asset type and minimal dictionary precomputation; mutation tracking via revision counters on SafeList/SafeOrderDict; and dividend cash batching to reduce position updates.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Warning
There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 Pylint (4.0.5) reported issues for `lumibot/trading_builtins/safe_list.py`, `lumibot/data_sources/yahoo_data.py`, and `lumibot/strategies/strategy_executor.py` (output truncated).
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
lumibot/entities/position.py (1)
113-120: ⚠️ Potential issue | 🟠 Major
Keep dust filtering out of `Position.quantity`.
Line 117's hard-coded `1e-6` epsilon is larger than the supported precision of many crypto assets. Returning `0.0` from the accessor can make a real position disappear from sell/valuation paths; if this is only to hide float noise, do it at the presentation boundary instead.
Suggested fix
 @property
 def quantity(self):
-    result = self._quantity_float
-
-    # If result is less than 0.000001, return 0.0 to avoid rounding errors.
-    if abs(result) < 0.000001:
-        return 0.0
-
-    return result
+    return self._quantity_float
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/entities/position.py` around lines 113 - 120, The Position.quantity accessor is removing small balances using a hard-coded epsilon (abs(result) < 1e-6) which can hide real crypto positions; remove the dust-filtering check from Position.quantity so it simply returns self._quantity_float, and if you need to suppress float noise implement a separate presentation helper (e.g., format_quantity or a UI-layer clamp) or make a configurable method like Position.quantity_clamped(epsilon) for display/valuation only.

lumibot/trading_builtins/safe_list.py (1)
13-17: ⚠️ Potential issue | 🟠 Major
Own the backing list before exposing `revision`.
`self.__items = initial` keeps the caller's list alias. Any out-of-band mutation of that list bypasses `revision += 1`, so caches that trust `revision` can go stale immediately. At minimum, copy `initial` here; the same invariant should hold anywhere the raw list is exposed.
Suggested fix
 if initial is None:
     initial = []
 self.__lock = lock
-self.__items = initial
+self.__items = list(initial)
 self.revision = 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/trading_builtins/safe_list.py` around lines 13 - 17, The backing list assigned to self.__items keeps the caller's alias (self.__items = initial), allowing out-of-band mutations to bypass revision updates; change the assignment to take ownership by copying the input (e.g., self.__items = list(initial) or initial.copy()) and audit any places that expose the raw list (methods/properties referencing __items) to return a copy instead so mutations always go through the SafeList APIs that increment revision.

lumibot/backtesting/backtesting_broker.py (1)
390-400: ⚠️ Potential issue | 🟠 Major
Fallback path drops the `asset` filter.
On Line 393, the fallback returns all active orders for the strategy, even when `asset` was passed. That can over-select orders and affect downstream cancellation/fill logic.
💡 Proposed fix
 except Exception:
     # Fallback to the slower path if internal buckets are unavailable.
     orders = self.get_tracked_orders(strategy=strategy)
-    return [o for o in orders if o.is_active()] if orders else []
+    if not orders:
+        return []
+    return [
+        o for o in orders
+        if o.is_active() and (asset is None or getattr(o, "asset", None) == asset)
+    ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/backtesting/backtesting_broker.py` around lines 390 - 400, The fallback return in backtesting_broker.py currently ignores the asset filter and returns all active orders for the strategy; update the fallback so it also filters by asset when asset is provided. After retrieving orders via get_tracked_orders(strategy=strategy), return only orders where o.is_active() is True and (asset is None or getattr(o, "asset", None) == asset) — use the existing get_tracked_orders, is_active, strategy and asset attributes to implement this filter.
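The asset-aware fallback can be exercised in isolation. This is a minimal sketch: `StubOrder` and `active_orders` are stand-ins for the real lumibot `Order` class and broker method, which have richer interfaces.

```python
from dataclasses import dataclass

@dataclass
class StubOrder:
    asset: str
    active: bool

    def is_active(self):
        return self.active

def active_orders(orders, asset=None):
    """Fallback filter that honors the asset argument when provided."""
    if not orders:
        return []
    return [
        o for o in orders
        if o.is_active() and (asset is None or getattr(o, "asset", None) == asset)
    ]

orders = [StubOrder("SPY", True), StubOrder("QQQ", True), StubOrder("SPY", False)]
print(len(active_orders(orders)))          # 2: all active orders
print(len(active_orders(orders, "SPY")))   # 1: active SPY orders only
```

Passing `asset=None` preserves the old behavior, so existing callers are unaffected.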
🧹 Nitpick comments (5)
lumibot/entities/asset.py (1)
274-307: These one-time caches need an immutability invariant.
`_cached_hash`, `_cached_asset_type_key`, and `_cached_minimal_dict` are computed once and then reused by `__eq__`, `__hash__`, and `to_minimal_dict()`, but the corresponding fields are still publicly mutable. Please either freeze those fields after construction or add cache refreshes in setters, and verify that no caller mutates them today. As per coding guidelines, add comments explaining 'why/invariants' for non-obvious logic.
Also applies to: 372-384, 489-489
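One way to enforce the invariant is the "freeze after construction" option: reject mutation of identity fields once the hash has been cached. This is an illustrative sketch, not lumibot's actual `Asset` implementation; the field names mirror the review comment.

```python
class FrozenAfterHashAsset:
    """Illustrative sketch (not lumibot's Asset): identity fields freeze
    once the cached hash exists, so __hash__/__eq__ can never disagree
    with the stored value."""

    _IDENTITY_FIELDS = ("symbol", "asset_type")

    def __init__(self, symbol, asset_type):
        self.symbol = symbol
        self.asset_type = asset_type
        self._cached_hash = None

    def __hash__(self):
        if self._cached_hash is None:
            self._cached_hash = hash((self.symbol, self.asset_type))
        return self._cached_hash

    def __setattr__(self, name, value):
        # Invariant: no identity mutation after the hash is cached,
        # otherwise cached hash/equality results would go stale.
        if name in self._IDENTITY_FIELDS and getattr(self, "_cached_hash", None) is not None:
            raise AttributeError(f"{name} is frozen once the hash is cached")
        super().__setattr__(name, value)

a = FrozenAfterHashAsset("SPY", "stock")
hash(a)  # populates the cache; mutating a.symbol now raises AttributeError
```

The alternative (refreshing caches in setters) keeps mutability but requires every setter to remember to invalidate all three caches.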
lumibot/strategies/strategy.py (1)
1315-1335: Consider adding a brief comment explaining the single-entry cache design.
The `cache.clear()` before storing means only one configuration is cached at a time. While this is a safe and correct approach to prevent stale data, adding a brief inline comment would clarify the intent for future maintainers.
📝 Suggested comment for clarity
 cache = getattr(self, "_positions_cache", None)
 if cache is None:
     cache = {}
     self._positions_cache = cache
+# Single-entry cache: clear to avoid stale results from previous revisions/parameters
 cache.clear()
 cache[cache_key] = tuple(result)
 return list(result)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/strategies/strategy.py` around lines 1315 - 1335, Add a brief inline comment explaining that the cache is intentionally single-entry: the code clears the entire self._positions_cache (cache.clear()) before storing a new entry for cache_key (constructed from filled_positions_revision, include_cash, quote_asset) so only the latest configuration returned by broker.get_tracked_positions(self.name) is kept; place this comment near the cache = getattr(self, "_positions_cache", None) / cache.clear() block inside the method that builds `result` to make the single-entry design explicit for future maintainers.

lumibot/strategies/_strategy.py (1)
835-842: Persist the computed asset-type key to realize the intended cache hit path.
Both hot paths read `_cached_asset_type_key`, but neither writes it after computation, so normalization work is repeated every call.
♻️ Proposed refactor
 def _asset_type_key(asset_obj):
     cached_asset_type_key = getattr(asset_obj, "_cached_asset_type_key", None)
     if cached_asset_type_key is not None:
         return cached_asset_type_key
     raw_asset_type = getattr(asset_obj, "asset_type", "")
     raw_asset_type = getattr(raw_asset_type, "value", raw_asset_type)
-    return str(raw_asset_type).lower()
+    asset_type_key = str(raw_asset_type).lower()
+    try:
+        setattr(asset_obj, "_cached_asset_type_key", asset_type_key)
+    except Exception:
+        pass
+    return asset_type_key
@@
     base_asset_type = getattr(base_asset, "_cached_asset_type_key", None)
     if base_asset_type is None:
         base_asset_type = getattr(base_asset, "asset_type", None)
         base_asset_type = getattr(base_asset_type, "value", base_asset_type)
         base_asset_type = str(base_asset_type).lower()
+        try:
+            setattr(base_asset, "_cached_asset_type_key", base_asset_type)
+        except Exception:
+            pass
Also applies to: 1005-1010
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/strategies/_strategy.py` around lines 835 - 842, The helper _asset_type_key computes a normalized string but never stores it, so the _cached_asset_type_key read path never hits; after computing raw_asset_type and normalizing it to str(...).lower(), assign that value back to asset_obj._cached_asset_type_key before returning to enable caching. Apply the identical change to the other duplicate helper (the second occurrence around the similar logic) so both functions persist the computed key onto the asset_obj attribute.

lumibot/tools/helpers.py (1)
315-320: Use atomic file replacement for disk cache writes.
Currently `to_pickle()` writes directly to the target path, which can leave partial files if interrupted. A temp-write + `os.replace()` avoids corrupted cache artifacts.
💡 Proposed fix
 if disk_cache_path:
     try:
         os.makedirs(os.path.dirname(disk_cache_path), exist_ok=True)
-        days.to_pickle(disk_cache_path)
+        tmp_path = f"{disk_cache_path}.tmp.{os.getpid()}"
+        days.to_pickle(tmp_path)
+        os.replace(tmp_path, disk_cache_path)
     except Exception:
         pass
Also applies to: 332-337
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/tools/helpers.py` around lines 315 - 320, The current disk cache write uses days.to_pickle(disk_cache_path) which can leave partial files if interrupted; change both occurrences (the block around days.to_pickle and the similar block at lines 332-337) to write to a temporary file in the same directory (e.g., using tempfile.NamedTemporaryFile or mkstemp), flush/close (and optionally fsync) the temp file, then atomically rename/replace it to disk_cache_path with os.replace(); still create the parent dir with os.makedirs as shown and catch/log exceptions, but avoid writing directly to disk_cache_path to prevent corrupted cache artifacts.

lumibot/strategies/strategy_executor.py (1)
227-233: Document the `1.9` second throttle invariant.
`1.9` is now a behavior boundary, but it's not obvious why this should be `1.9` instead of `2`, a config value, or a named constant. A short invariant comment or constant name would make this much easier to maintain.
As per coding guidelines, "add comments explaining 'why/invariants' for non-obvious logic."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/strategies/strategy_executor.py` around lines 227 - 233, The 1.9-second throttle is a non-obvious magic number; replace the literal 1.9 with a named constant (e.g., SNAPSHOT_THROTTLE_SECONDS or DEFAULT_SNAPSHOT_INTERVAL) or make it configurable, update the check in the block that computes should_capture_snapshot using data_source, last_logging_time, and _last_logging_time, and add a short comment explaining the invariant/why it is 1.9 (e.g., to avoid double-logging within ~2s for downstream consumers) so future readers understand the behavior.
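A sketch of the named-constant option. The `1.9` value and the `should_capture_snapshot` name are carried over from the review; the rationale comment (staying just under a ~2s consumer polling interval) is an assumption, not a verified requirement.

```python
# PROGRESS_THROTTLE_SECONDS: keep just under the assumed ~2s consumer
# polling interval so each poll sees at most one fresh snapshot.
PROGRESS_THROTTLE_SECONDS = 1.9

class ProgressReporter:
    def __init__(self):
        self._last_logging_time = None

    def should_capture_snapshot(self, now):
        """Return True at most once per PROGRESS_THROTTLE_SECONDS."""
        if (
            self._last_logging_time is not None
            and now - self._last_logging_time < PROGRESS_THROTTLE_SECONDS
        ):
            return False
        self._last_logging_time = now
        return True

r = ProgressReporter()
print(r.should_capture_snapshot(0.0))  # True: first call always captures
print(r.should_capture_snapshot(1.0))  # False: inside the throttle window
print(r.should_capture_snapshot(2.0))  # True: window elapsed
```

In production code `now` would come from `time.monotonic()`; taking it as a parameter here keeps the throttle deterministic and testable.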
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lumibot/backtesting/backtesting_broker.py`:
- Around line 618-627: The branch in get_time_to_close that handles idx >=
len(self._trading_days) returns None but does not mark end-of-data; call
self._mark_end_of_trading_days(now) before setting
self._time_to_close_cache_datetime and self._time_to_close_cache_value = None
and returning None so the end-of-trading-days state is consistent with
get_time_to_open/_get_next_trading_day and the cache reflects the terminal
condition for subsequent calls.
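A minimal sketch of the consistent terminal-state handling described above. `_mark_end_of_trading_days` and the cache field names mirror the prompt; the in-session return value is a placeholder, not the real time-to-close computation.

```python
class BrokerTimeCache:
    """Sketch of consistent end-of-data handling for get_time_to_close."""

    def __init__(self, trading_days):
        self._trading_days = trading_days
        self._end_of_trading_days = False
        self._time_to_close_cache_datetime = None
        self._time_to_close_cache_value = None

    def _mark_end_of_trading_days(self, now):
        self._end_of_trading_days = True

    def get_time_to_close(self, now, idx):
        if idx >= len(self._trading_days):
            # Mark the terminal state BEFORE caching None so later calls
            # agree with get_time_to_open/_get_next_trading_day.
            self._mark_end_of_trading_days(now)
            self._time_to_close_cache_datetime = now
            self._time_to_close_cache_value = None
            return None
        return 3600.0  # placeholder for the real time-to-close computation

cache = BrokerTimeCache(trading_days=[])
print(cache.get_time_to_close(now=0, idx=0))  # None
print(cache._end_of_trading_days)             # True: terminal state recorded
```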
In `@lumibot/data_sources/yahoo_data.py`:
- Around line 287-312: The daily fast-path uses raw self._datetime instead of
the already-normalized current_dt from to_default_timezone(), which can shift
the end_filter across day boundaries; inside _get_filtered_end_index replace dt
= self._datetime.replace(...) with dt = current_dt.replace(...) (i.e., use
current_dt when computing dt and end_filter) so the timezone-normalized datetime
is used for searchsorted calculations.
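The day-boundary pitfall is easy to reproduce. In this sketch `astimezone()` stands in for `to_default_timezone()`; the function and variable names are illustrative, not lumibot's actual code.

```python
from datetime import datetime, timezone, timedelta

def end_filter_for(raw_dt, tz):
    """Snap to end-of-day in the target timezone.

    The replace() must run on the normalized datetime, not the raw one,
    or the filter can land on the wrong calendar day near midnight UTC.
    """
    current_dt = raw_dt.astimezone(tz)  # to_default_timezone() stand-in
    return current_dt.replace(hour=23, minute=59, second=59, microsecond=0)

eastern = timezone(timedelta(hours=-5))
raw = datetime(2024, 1, 3, 2, 30, tzinfo=timezone.utc)  # still Jan 2 in Eastern
print(end_filter_for(raw, eastern).day)  # 2; raw_dt.replace(...) would give 3
```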
In `@lumibot/entities/order.py`:
- Around line 1403-1409: The minimal order payload currently returns raw
self.quantity which can be a Decimal or NumPy scalar and will break json.dumps
in DataSourceBacktesting._update_datetime; update the "qty" value in the result
dict (where result is built in lumibot.entities.order, referencing
self.quantity) to be JSON-serializable by coercing it (e.g., use
float(self.quantity) or int(...) where appropriate) and fall back to 0 when
falsy so the "qty" field is always a JSON-serializable number.
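The coercion can live in a tiny helper; `json_safe_qty` is a hypothetical name, and the NumPy-scalar case is noted only in the comment to keep the sketch dependency-free.

```python
import json
from decimal import Decimal

def json_safe_qty(quantity):
    """Coerce Decimal (or NumPy scalar) quantities to a plain float for
    json.dumps, falling back to 0 when falsy (None, 0, empty)."""
    if not quantity:
        return 0
    return float(quantity)

print(json.dumps({"qty": json_safe_qty(Decimal("1.5"))}))  # {"qty": 1.5}
print(json.dumps({"qty": json_safe_qty(None)}))            # {"qty": 0}
```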
In `@lumibot/strategies/strategy_executor.py`:
- Around line 239-254: Wrap the entire snapshot-building block (calls to
strategy.get_positions, getattr(self.strategy, "_initial_budget"),
broker.get_active_tracked_orders, broker.get_tracked_orders and any
to_minimal_dict conversions) in a broad try/except that prevents exceptions from
escaping safe_sleep()/_process_pandas_daily_data(); on any exception, set
payload["positions"] and payload["orders"] to None (leave
payload["initial_budget"] via getattr if available) so the backtest continues
with a minimal payload; also defensively catch exceptions around each
to_minimal_dict conversion when building lists so a single bad item won't fail
the whole snapshot.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8dbe4f79-a68c-4bbc-b801-92106b021789
📒 Files selected for processing (17)
lumibot/backtesting/backtesting_broker.py
lumibot/brokers/broker.py
lumibot/data_sources/data_source_backtesting.py
lumibot/data_sources/yahoo_data.py
lumibot/entities/asset.py
lumibot/entities/order.py
lumibot/entities/position.py
lumibot/strategies/_strategy.py
lumibot/strategies/strategy.py
lumibot/strategies/strategy_executor.py
lumibot/tools/helpers.py
lumibot/trading_builtins/safe_list.py
tests/test_cash.py
tests/test_helpers.py
tests/test_strategy_dividend_cash_batch.py
tests/test_strategy_executor_progress_payload.py
tests/test_yahoo_daily_last_price_fastpath.py
Addressed the safe CodeRabbit items in 3afcb7a:
I did not apply the suggested Position.quantity dust-filter change in this PR because that one is behavior-sensitive and can affect parity for tiny crypto balances.
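For reference, if the clamp is wanted later at the presentation boundary, an opt-in helper (hypothetical name `clamp_dust`, not part of lumibot today) could look like:

```python
def clamp_dust(quantity, epsilon=1e-6):
    """Presentation-only clamp: returns 0.0 for float-noise balances.

    Unlike filtering inside Position.quantity, callers opt in for display
    only, so real positions are never hidden from sell/valuation paths,
    and epsilon can be tightened for high-precision crypto assets.
    """
    return 0.0 if abs(quantity) < epsilon else quantity

print(clamp_dust(3e-7))                # 0.0: treated as float noise
print(clamp_dust(5e-7, epsilon=1e-8))  # 5e-07: tiny crypto balance preserved
```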
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
lumibot/backtesting/backtesting_broker.py (1)
375-398: ⚠️ Potential issue | 🟠 Major
Override narrows API contract: the `strategy` parameter must accept `None` and Strategy objects like the base class.
The override signature `def get_active_tracked_orders(self, strategy: str, asset=None)` requires `strategy` as a string, while the base class `Broker.get_active_tracked_orders(self, strategy=None, asset=None)` accepts `None` and Strategy objects. This violates the Liskov Substitution Principle and will cause `TypeError` in any polymorphic code calling without arguments or passing a Strategy instance.
The base class uses `_strategy_name_from_input()` (broker.py:834) to handle Strategy objects and None; the override must do the same. Apply the proposed fix to match the base contract:
- Change `strategy: str` to `strategy=None`
- Call `strategy_name = self._strategy_name_from_input(strategy)` to normalize input
- Update the comparison at line 402 to check `if strategy_name is not None and getattr(order, "strategy", None) != strategy_name`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/backtesting/backtesting_broker.py` around lines 375 - 398, The override get_active_tracked_orders currently restricts strategy to str; change its signature to accept strategy=None to match Broker.get_active_tracked_orders, then normalize the input by calling strategy_name = self._strategy_name_from_input(strategy) near the top of the method, and use strategy_name (not the original strategy) when filtering orders (replace the asset/strategy comparison with a check like: if strategy_name is not None and getattr(order, "strategy", None) != strategy_name) while keeping the existing asset check and the fallback to get_tracked_orders.

lumibot/strategies/strategy_executor.py (1)
21-22: ⚠️ Potential issue | 🟡 Minor
Duplicate import of `Asset`.
Line 21 already imports `Asset` from `lumibot.entities`. Line 22 is redundant.
 from lumibot.entities import Asset, Order
-from lumibot.entities import Asset
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/strategies/strategy_executor.py` around lines 21 - 22, Remove the duplicate import of Asset in strategy_executor.py: keep a single import from lumibot.entities that includes both Asset and Order (referencing the existing import lines that use Asset and Order) and delete the redundant "from lumibot.entities import Asset" line so there is only one import statement for these symbols.
🧹 Nitpick comments (5)
lumibot/data_sources/yahoo_data.py (3)
200-205: Hot-path `info` logging will erode the speedup.
This block runs on every bar pull; `info` logging here adds avoidable overhead. Prefer `debug` (or guard by log level) for this path.
Suggested fix
-logger.info(
+logger.debug(
     "Inside _pull_source_symbol_bars for %s: self._datetime = %s, requesting length %s",
     asset.symbol,
     self._datetime,
     length,
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/data_sources/yahoo_data.py` around lines 200 - 205, The hot-path logging in _pull_source_symbol_bars uses logger.info and should be changed to a lower-overhead level or guarded: replace the logger.info call in _pull_source_symbol_bars with logger.debug (or wrap the logger.info call in a log-level check such as logger.isEnabledFor(logging.INFO)) so the per-bar overhead is eliminated; update the logging invocation that currently references asset.symbol, self._datetime, and length accordingly.
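For the guarded variant, a sketch; the logger name and the stand-alone function are illustrative, not the actual method in `yahoo_data.py`.

```python
import logging

logger = logging.getLogger("yahoo_data_example")

def log_bar_pull(symbol, current_dt, length):
    # Guard so the %-formatting arguments are never evaluated when debug
    # output is disabled; this keeps the hot path nearly free of overhead.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug(
            "Inside _pull_source_symbol_bars for %s: self._datetime = %s, requesting length %s",
            symbol, current_dt, length,
        )

logger.setLevel(logging.WARNING)
log_bar_pull("SPY", "2024-01-02", 30)  # emits nothing: debug disabled
```

Note that `logger.debug(...)` already defers %-formatting until the record is emitted; the explicit `isEnabledFor` guard additionally skips evaluating any expensive arguments.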
376-386: Please document the daily-path invariant explicitly in-code.
The daily fast path caches and returns an open-price-derived value for the previous fully-closed session. Add a short invariant comment/docstring in `_get_last_daily_open_price` to prevent future regressions/misuse.
As per coding guidelines: `**/*.{py,ts,tsx,js}`: add comments explaining 'why/invariants' for non-obvious logic.
Also applies to: 424-449
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/data_sources/yahoo_data.py` around lines 376 - 386, Add an explicit invariant comment/docstring to the daily fast-path logic and the helper _get_last_daily_open_price describing that this path returns an open-price-derived value for the previous fully-closed trading session (i.e., it uses the prior session's open as the canonical "last price" and is cached per-day). Update the docstring for function _get_last_daily_open_price and add a short inline comment in the fast-path block (the code that checks isinstance(timestep, str) and "day" in timestep.lower()) clarifying the prior-session/open-derived semantics, cache key behavior, and why the cache is invalidated when current_date changes so future maintainers won't misuse the API; apply the same explanatory comment to the equivalent block referenced around the other occurrence (lines 424-449).
267-271: Use `logger.exception` instead of `traceback.print_exc()` in the fetch loop.
Direct traceback printing bypasses structured logging and is noisier in production. `logger.exception(...)` keeps stack traces in the logger pipeline and is the idiomatic approach for exception logging in Python.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/data_sources/yahoo_data.py` around lines 267 - 271, In _pull_source_symbol_bars replace the explicit traceback.print_exc() and the logger.warning call with a single logger.exception call so the stack trace is captured in the logging pipeline; specifically, inside the except Exception as e block for symbol 'sym' use logger.exception("_pull_source_symbol_bars: Error fetching data for symbol %s", sym) (or include str(e) in the message) instead of calling traceback.print_exc(), ensuring the logger variable is used to record the full exception and stack trace.

tests/test_safe_list.py (1)
6-12: Add regression cases for `revision` mutation semantics.
This test is good, but it doesn't cover the newly introduced `revision` behavior (increment on mutation, no increment on no-op paths) or `SafeOrderDict` revision updates.
As per coding guidelines
tests/**/*.py: Add unit tests for any new functionality.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_safe_list.py` around lines 6 - 12, Add regression unit tests that assert the new revision mutation semantics: for SafeList, verify that mutating operations (e.g., append) increment the object's revision (check SafeList.revision before and after a real change) and that no-op paths (calling a method that results in no change) do not increment revision; likewise add tests for SafeOrderDict to ensure its revision increments on actual mutations (insertion/update/delete) and remains unchanged for no-op operations. Use the existing test pattern (create instance with RLock(), capture revision, perform operation, then assert revision changed or not and that contents match expected) and add descriptive test functions like test_safe_list_revision_on_mutation, test_safe_list_revision_noop, test_safe_order_dict_revision_on_mutation, and test_safe_order_dict_revision_noop to cover both behaviors.

lumibot/strategies/strategy.py (1)
2487-2515: Optional: cache failed sleeptime parses too.
Lines 2497/2501 return `None` without updating the cache, so invalid values re-run regex/parsing on every call. Caching `None` for the current input would make this fully memoized.
♻️ Suggested micro-optimization
- if isinstance(value, str):
+ if isinstance(value, str):
     normalized = value.strip().upper().replace(" ", "")
     if not normalized:
-        return None
+        self._sleeptime_seconds_cache_input = value
+        self._sleeptime_seconds_cache_value = None
+        return None
     match = re.match(r"^(\d+(?:\.\d+)?)([A-Z]*)$", normalized)
     if not match:
-        return None
+        self._sleeptime_seconds_cache_input = value
+        self._sleeptime_seconds_cache_value = None
+        return None
     qty = float(match.group(1))
     suffix = match.group(2) or "M"
@@
     result = qty * multiplier
     self._sleeptime_seconds_cache_input = value
     self._sleeptime_seconds_cache_value = result
     return result
- return None
+ self._sleeptime_seconds_cache_input = value
+ self._sleeptime_seconds_cache_value = None
+ return None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lumibot/strategies/strategy.py` around lines 2487 - 2515, The sleeptime parsing early-returns that yield None don't update the memoization, causing repeated work; before each return None in the method (specifically the branches checking if not normalized and if not match), set self._sleeptime_seconds_cache_input = value and self._sleeptime_seconds_cache_value = None so invalid inputs are cached; also consider adding the same cache assignment for any other code paths that implicitly return None to ensure full memoization of failures.
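The fully memoized variant, including cached failures, can be sketched as a small class. Names follow the review comment; treating bare numbers as minutes (the `"M"` default) is an assumption about lumibot's sleeptime convention, not confirmed by this thread.

```python
import re

_MULTIPLIERS = {"S": 1.0, "M": 60.0, "H": 3600.0, "D": 86400.0}

class SleeptimeParser:
    """Single-entry memo of the last parsed sleeptime. Failures are cached
    as None too, so invalid strings skip the regex on repeat calls."""

    def __init__(self):
        self._cache_input = None
        self._cache_value = None

    def seconds(self, value):
        if value is not None and value == self._cache_input:
            return self._cache_value
        result = None
        if isinstance(value, (int, float)):
            result = float(value) * 60.0  # assumed: bare numbers mean minutes
        elif isinstance(value, str):
            normalized = value.strip().upper().replace(" ", "")
            match = re.match(r"^(\d+(?:\.\d+)?)([A-Z]*)$", normalized) if normalized else None
            if match:
                suffix = match.group(2) or "M"
                if suffix in _MULTIPLIERS:
                    result = float(match.group(1)) * _MULTIPLIERS[suffix]
        # Cache success AND failure for the current input.
        self._cache_input = value
        self._cache_value = result
        return result

p = SleeptimeParser()
print(p.seconds("30S"))    # 30.0
print(p.seconds("2H"))     # 7200.0
print(p.seconds("bogus"))  # None, now memoized for repeat calls
```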
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lumibot/data_sources/yahoo_data.py`:
- Around line 249-251: The cache key for stored symbol data uses only asset,
causing mixed-timestep collisions; update _get_source_symbol_data and any cache
accesses that read/write self._data_store to include interval (and any other
relevant request params like start/end or adjusted params) in the cache key
(e.g., use a tuple or formatted string key like (asset, interval)) so lookups
and stores are consistent for different timesteps; ensure both the early return
that checks self._data_store and the later assignment that writes into
self._data_store use the same composite key so 1d/15m/1m requests do not share
cached data.
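A toy model of the composite-key behavior; `_data_store` and the `fetch` signature are illustrative names from the prompt, not the real Yahoo data source API.

```python
_data_store = {}

def get_cached_bars(asset, interval, fetch):
    """Cache keyed on (asset, interval) so 1d/15m/1m requests for the same
    asset never share cached frames."""
    key = (asset, interval)  # composite key, not just the asset
    if key in _data_store:
        return _data_store[key]
    data = fetch(asset, interval)
    _data_store[key] = data
    return data

calls = []
def fake_fetch(asset, interval):
    calls.append((asset, interval))
    return f"{asset}-{interval}-bars"

get_cached_bars("SPY", "1d", fake_fetch)   # miss: fetches
get_cached_bars("SPY", "15m", fake_fetch)  # miss: different interval
get_cached_bars("SPY", "1d", fake_fetch)   # hit: served from cache
print(len(calls))  # 2
```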
---
Outside diff comments:
In `@lumibot/backtesting/backtesting_broker.py`:
- Around line 375-398: The override get_active_tracked_orders currently
restricts strategy to str; change its signature to accept strategy=None to match
Broker.get_active_tracked_orders, then normalize the input by calling
strategy_name = self._strategy_name_from_input(strategy) near the top of the
method, and use strategy_name (not the original strategy) when filtering orders
(replace the asset/strategy comparison with a check like: if strategy_name is
not None and getattr(order, "strategy", None) != strategy_name) while keeping
the existing asset check and the fallback to get_tracked_orders.
In `@lumibot/strategies/strategy_executor.py`:
- Around line 21-22: Remove the duplicate import of Asset in
strategy_executor.py: keep a single import from lumibot.entities that includes
both Asset and Order (referencing the existing import lines that use Asset and
Order) and delete the redundant "from lumibot.entities import Asset" line so
there is only one import statement for these symbols.
---
Nitpick comments:
In `@lumibot/data_sources/yahoo_data.py`:
- Around line 200-205: The hot-path logging in _pull_source_symbol_bars uses
logger.info and should be changed to a lower-overhead level or guarded: replace
the logger.info call in _pull_source_symbol_bars with logger.debug (or wrap the
logger.info call in a log-level check such as logger.isEnabledFor(logging.INFO))
so the per-bar overhead is eliminated; update the logging invocation that
currently references asset.symbol, self._datetime, and length accordingly.
- Around line 376-386: Add an explicit invariant comment/docstring to the daily
fast-path logic and the helper _get_last_daily_open_price describing that this
path returns an open-price-derived value for the previous fully-closed trading
session (i.e., it uses the prior session's open as the canonical "last price"
and is cached per-day). Update the docstring for function
_get_last_daily_open_price and add a short inline comment in the fast-path block
(the code that checks isinstance(timestep, str) and "day" in timestep.lower())
clarifying the prior-session/open-derived semantics, cache key behavior, and why
the cache is invalidated when current_date changes so future maintainers won't
misuse the API; apply the same explanatory comment to the equivalent block
referenced around the other occurrence (lines 424-449).
- Around line 267-271: In _pull_source_symbol_bars replace the explicit
traceback.print_exc() and the logger.warning call with a single logger.exception
call so the stack trace is captured in the logging pipeline; specifically,
inside the except Exception as e block for symbol 'sym' use
logger.exception("_pull_source_symbol_bars: Error fetching data for symbol %s",
sym) (or include str(e) in the message) instead of calling
traceback.print_exc(), ensuring the logger variable is used to record the full
exception and stack trace.
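A minimal demonstration of the difference; the logger name and the SPX symbol are placeholders:

```python
import io
import logging

logger = logging.getLogger("exc_example")
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.DEBUG)

try:
    raise ValueError("bad symbol")
except Exception:
    # One call records the message at ERROR level with the full traceback
    # attached, through the logging pipeline; traceback.print_exc() bypasses it.
    logger.exception("_pull_source_symbol_bars: Error fetching data for symbol %s", "SPX")

output = stream.getvalue()
print("Traceback" in output and "ValueError: bad symbol" in output)  # True
```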
In `@lumibot/strategies/strategy.py`:
- Around line 2487-2515: The sleeptime parsing early-returns that yield None
don't update the memoization, causing repeated work; before each return None in
the method (specifically the branches checking if not normalized and if not
match), set self._sleeptime_seconds_cache_input = value and
self._sleeptime_seconds_cache_value = None so invalid inputs are cached; also
consider adding the same cache assignment for any other code paths that
implicitly return None to ensure full memoization of failures.
In `@tests/test_safe_list.py`:
- Around line 6-12: Add regression unit tests that assert the new revision
mutation semantics: for SafeList, verify that mutating operations (e.g., append)
increment the object's revision (check SafeList.revision before and after a real
change) and that no-op paths (calling a method that results in no change) do not
increment revision; likewise add tests for SafeOrderDict to ensure its revision
increments on actual mutations (insertion/update/delete) and remains unchanged
for no-op operations. Use the existing test pattern (create instance with
RLock(), capture revision, perform operation, then assert revision changed or
not and that contents match expected) and add descriptive test functions like
test_safe_list_revision_on_mutation, test_safe_list_revision_noop,
test_safe_order_dict_revision_on_mutation, and
test_safe_order_dict_revision_noop to cover both behaviors.
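Tests along those lines could look like this. The SafeList below is a minimal stand-in that mirrors the described revision semantics, not the real lumibot class:

```python
from threading import RLock

class SafeList:
    """Minimal stand-in: bump the revision on real mutations, leave it
    untouched on no-op paths."""

    def __init__(self, lock):
        self._lock = lock
        self._items = []
        self.revision = 0

    def append(self, item):
        with self._lock:
            self._items.append(item)
            self.revision += 1

    def remove(self, item):
        with self._lock:
            if item in self._items:  # no-op path: nothing removed, no bump
                self._items.remove(item)
                self.revision += 1

def test_safe_list_revision_on_mutation():
    safe = SafeList(RLock())
    before = safe.revision
    safe.append("order-1")
    assert safe.revision == before + 1

def test_safe_list_revision_noop():
    safe = SafeList(RLock())
    before = safe.revision
    safe.remove("missing")  # no change, revision must not move
    assert safe.revision == before

test_safe_list_revision_on_mutation()
test_safe_list_revision_noop()
print("ok")
```

The SafeOrderDict tests would follow the same capture-revision / mutate / assert pattern for insertion, update, and delete.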
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cca65e1d-27a3-48f3-8aff-2f6342b75ba0
📒 Files selected for processing (9)
- lumibot/backtesting/backtesting_broker.py
- lumibot/data_sources/yahoo_data.py
- lumibot/strategies/strategy.py
- lumibot/strategies/strategy_executor.py
- lumibot/trading_builtins/safe_list.py
- tests/test_backtesting_broker.py
- tests/test_minimal_serialization.py
- tests/test_safe_list.py
- tests/test_strategy_executor_progress_payload.py
```python
if asset in self._data_store:
    data = self._data_store[asset]
else:
    # Try each symbol format until we get data
    data = None
    successful_symbol = None

    for sym in symbols_to_try:
        logger.info(f"Attempting to fetch data for symbol: {sym}")
        try:
            # Fetch data using the helper without restricting dates here
            data = YahooHelper.get_symbol_data(
                sym,
                interval=interval,
                auto_adjust=self.auto_adjust,
                last_needed_datetime=self.datetime_end,  # Keep this if needed for caching logic
            )
            if data is not None and data.shape[0] > 0:
                logger.info(f"Successfully fetched data for symbol: {sym}")
                successful_symbol = sym
                break
        except Exception as e:
            logger.warning(f"_pull_source_symbol_bars: Error fetching data for symbol {sym}: {str(e)}")
            # Print the traceback for debugging
            import traceback
            traceback.print_exc()

    if data is None or data.shape[0] == 0:
        # Use self.datetime_start and self.datetime_end in the error message for clarity
        message = f"{self.SOURCE} did not return data for symbol {asset.symbol}. Tried: {symbols_to_try}. Make sure this symbol is valid and data exists for the period {self.datetime_start} to {self.datetime_end}."
        logger.error(message)
        return None

    data = self._append_data(asset, data)

return self._data_store[asset]
```
Cache key misses interval, so mixed-timestep requests can return wrong data.
_get_source_symbol_data() computes interval but cache lookup/storage uses only asset. If the same asset is requested with different timesteps (1d, 15m, 1m), cached frames/arrays can be reused incorrectly.
Suggested fix
```diff
-    def _append_data(self, asset, data):
+    def _append_data(self, cache_key, data):
@@
-        self._data_store[asset] = data
-        self._data_index_values[asset] = data.index.values
-        self._data_open_values[asset] = data["open"].to_numpy(copy=False)
+        self._data_store[cache_key] = data
+        self._data_index_values[cache_key] = data.index.values
+        self._data_open_values[cache_key] = data["open"].to_numpy(copy=False)
         return data
@@
-        if asset in self._data_store:
-            return self._data_store[asset]
+        cache_key = (asset, interval)
+        if cache_key in self._data_store:
+            return self._data_store[cache_key]
@@
-        data = self._append_data(asset, data)
+        data = self._append_data(cache_key, data)
```

Also applies to: 282-285
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lumibot/data_sources/yahoo_data.py` around lines 249-251: the cache key for
stored symbol data uses only asset, causing mixed-timestep collisions. Update
_get_source_symbol_data and any cache accesses that read or write
self._data_store to include interval (and any other relevant request
parameters, such as start/end dates or adjustment settings) in the cache key,
e.g., a tuple or formatted string key like (asset, interval), so lookups and
stores are consistent for different timesteps. Ensure both the early return
that checks self._data_store and the later assignment that writes into
self._data_store use the same composite key, so 1d/15m/1m requests do not share
cached data.
Summary
Validation
- `uv run --with-requirements requirements_dev.txt pytest -q tests/test_cash.py tests/test_strategy_get_positions_cache.py tests/test_backtesting_broker_assignment_config_cache.py tests/test_position_quantity_cache.py tests/test_strategy_dividend_cash_batch.py tests/test_strategy_executor_progress_payload.py tests/test_yahoo_daily_last_price_fastpath.py tests/test_backtesting_crypto_cash_unit.py tests/test_backtesting_broker.py tests/test_asset.py tests/test_helpers.py` → 112 passed, 3 warnings, 22 subtests passed
- The `diversified_leverage_with_threshold` stock backtest still returns `total_return = 23.960373107784644`
- Runtime stayed in the 1.50s-1.53s band on the local machine, with a best observed batch mean of about 1.50s
New Features
get_active_tracked_orders()now supports optional asset filtering for targeted order retrieval.Performance