Answered authoritatively on edtechre/pybroker#231 by the maintainer.
Closing the loop: V3 proceeds as option A (training-only parallelism). Leaving this discussion open as a discoverable record for anyone who finds it searching for the same design question.
Context
While scoping V3 parallel-training work (#231 thread), one design question kept surfacing that deserves its own place to breathe: should walkforward windows share a single Portfolio instance (current behavior) or get independent Portfolio instances per window?

This is a design / intent question, not a bug report. I'd like to understand the reasoning behind the current choice before proposing any change.
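
To make the two options concrete, here is a minimal sketch. ToyPortfolio and run_window are hypothetical stand-ins invented for illustration, not PyBroker's Portfolio or its internals; the only point is how state behaves when one instance is threaded through every window versus a fresh instance per window.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPortfolio:
    """Hypothetical stand-in for Portfolio: tracks only cash and open longs."""
    cash: float = 100_000.0
    long_positions: dict = field(default_factory=dict)


def run_window(portfolio: ToyPortfolio, window_pnl: float, still_open: str) -> None:
    """Pretend to trade one test window: book PnL, leave one position open."""
    portfolio.cash += window_pnl
    portfolio.long_positions[still_open] = 10


# (PnL booked in the window, symbol left open at the window boundary)
windows = [(500.0, "AAPL"), (-200.0, "MSFT"), (300.0, "GOOG")]

# Option 1, shared: one instance is threaded through all windows.
shared = ToyPortfolio()
for pnl, sym in windows:
    run_window(shared, pnl, sym)

# Option 2, independent: every window starts from a fresh instance.
independent = []
for pnl, sym in windows:
    p = ToyPortfolio()
    run_window(p, pnl, sym)
    independent.append(p)

print(shared.cash)                    # 100600.0
print(sorted(shared.long_positions))  # ['AAPL', 'GOOG', 'MSFT']
print([p.cash for p in independent])  # [100500.0, 99800.0, 100300.0]
```

In the shared case, cash and open positions from earlier windows are visible to later ones; in the independent case, each window sees only its own activity.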
Today's behavior
_run_walkforward (strategy.py:1360-1445) threads a single Portfolio instance through backtest_executions for every window. The following state accumulates from window N to window N+1:

- cash, equity, market_value, margin, pnl, fees
- orders (deque), trades (deque), bars (equity curve), position_bars
- long_positions / short_positions (open positions cross window boundaries unclosed)
- win_rate / loss_rate / _wins
- _order_id, _entry_id, _trade_id
- _stop_data, _stop_records
- sessions: defaultdict(dict) (lives outside Portfolio but is also window-shared)

Combined with the walkforward_split (strategy.py:672-815) behavior (test periods are contiguous in time, only the train range rolls back), this produces a single continuous trading simulation with periodic model refresh on a rolling training window. This is textbook Pardo-style walkforward optimization.

Why I'm raising it: parallelism / distribution

I'd prefer independent Portfolio instances per window because it makes parallelism and distribution straightforward: joblib / concurrent.futures.Executor / Ray / Dask can spread the windows across cores or nodes without reasoning about shared mutable state.

The shared-Portfolio design blocks all of that without significant refactoring.

Open question 1: is there an intentional reason for the current design?

Before proposing a change I want to understand whether the shared-Portfolio behavior is a deliberate semantics choice or a consequence of the original implementation. Candidates I can see:

- sessions continuity. The sessions dict is designed for users to stash per-symbol state across the execution. If each window got a fresh Portfolio, what happens to sessions? Reset? Preserved separately?
- TestResult exposes a single portfolio, positions, orders, trades, metrics etc. Going independent-per-window would change that shape.

If any of (1)-(4) is load-bearing, the "reuse one Portfolio" choice is correct and independent-per-window should be opt-in, not the default.
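
The sessions question can be made concrete with a small sketch. Everything here is an assumption for illustration: on_bar is a hypothetical user callback and run is a toy loop, not PyBroker code. It only shows how a per-symbol counter stashed in a defaultdict(dict) behaves when the dict is shared across windows versus reset at each window boundary.

```python
from collections import defaultdict


def on_bar(sessions, symbol):
    """Hypothetical user callback: count the bars seen so far for a symbol."""
    state = sessions[symbol]
    state["bars_seen"] = state.get("bars_seen", 0) + 1
    return state["bars_seen"]


def run(windows, reset_per_window):
    """Toy walkforward loop; returns the last counter value seen in each window."""
    sessions = defaultdict(dict)
    counts = []
    for bars in windows:
        if reset_per_window:
            # Independent-per-window variant: user state starts empty again.
            sessions = defaultdict(dict)
        last = 0
        for symbol in bars:
            last = on_bar(sessions, symbol)
        counts.append(last)
    return counts


windows = [["SPY", "SPY"], ["SPY", "SPY", "SPY"]]
shared_counts = run(windows, reset_per_window=False)
reset_counts = run(windows, reset_per_window=True)

print(shared_counts)  # [2, 5]
print(reset_counts)   # [2, 3]
```

User code that relies on the counter continuing across windows would silently change behavior under a reset, which is one reason an opt-in switch looks safer than a new default.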
Open question 2: if independent, how do windows merge back?

Even if we accept independent Portfolios per window, the result eventually needs to collapse into something users can work with. This is an open question for me too:

- TestResult-shaped objects plus an aggregate. Users who want the single continuous view run without parallel_windows=True; users who want per-window statistics opt in. This is the ML cross-validation shape.

I don't have a strong opinion between these three. The "right" choice probably depends on what question users are asking: walkforward as a realism check vs walkforward as an ML evaluation protocol.

Scope of this discussion

Not asking for code changes today. Asking:

- Is the shared-Portfolio design an intentional semantics commitment? Which of the reasons above (or another) drove it?

If the answers converge on "yes, go ahead, opt-in, concatenate" (or similar), I'm happy to draft the API shape as a follow-up PR and we can iterate there. If the shared-Portfolio design is load-bearing and should stay, I'll scope V3 to training-only parallelism (#231 option A) and close this out.

Cross-reference: #231 (performance campaign; V3 discussion).
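
For reference, the "per-window results plus an aggregate" shape could look roughly like the sketch below. WindowResult, run_window and run_independent are hypothetical names, not a proposed PyBroker API, and the aggregate fields are arbitrary examples. A thread pool keeps the example self-contained; CPU-bound backtests would more likely use a process pool, joblib, Ray or Dask.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class WindowResult:
    """Hypothetical per-window result; a real TestResult carries far more."""
    window: int
    equity_curve: list
    trades: int


def run_window(task):
    """Simulate one test window from its own starting cash (no shared state)."""
    window, returns, start_cash = task
    equity, curve = start_cash, []
    for r in returns:
        equity *= 1.0 + r
        curve.append(round(equity, 2))
    return WindowResult(window, curve, trades=len(returns))


def run_independent(window_returns, start_cash=1000.0, max_workers=2):
    """Run every window as an independent task, then build an aggregate view."""
    tasks = [(i, r, start_cash) for i, r in enumerate(window_returns)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in task order, so windows stay sorted.
        results = list(pool.map(run_window, tasks))
    aggregate = {
        "total_trades": sum(r.trades for r in results),
        "mean_final_equity": sum(r.equity_curve[-1] for r in results) / len(results),
    }
    return results, aggregate


results, aggregate = run_independent([[0.1, 0.0], [-0.5], [0.5, 0.0, 0.0]])
print([r.window for r in results])  # [0, 1, 2]
print(aggregate["total_trades"])    # 6
```

Because no window reads another window's state, the tasks can be dispatched in any order; only the final merge needs to know the window sequence.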