
Conversation

@Satvik-Singh192
Contributor

Description

This pull request implements performance optimizations for the core engine by introducing accelerated execution paths using Numba and Cython, along with a full benchmarking suite.

Key Enhancements

  • Profiled the existing backtest pipeline to identify hot paths (core loop & PnL computation).
  • Added numba_opt.py implementing Numba JIT-accelerated versions of critical functions.
  • Created benchmarks/bench_opt.py to benchmark:
    • Vanilla (pure Python)
    • Numba-optimized
    • Cython-optimized
  • Added CI benchmark workflow (manual/canary) that publishes benchmark artifacts for easy inspection.

These improvements significantly enhance performance for large-scale experiments and production-grade scenarios.
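The Numba path described above can be sketched as follows. This is a hypothetical minimal example of the pattern, not the actual contents of numba_opt.py, and it degrades gracefully when Numba is not installed:

```python
# Hypothetical sketch of the numba_opt.py approach (not the PR's actual code):
# JIT-compile the hot PnL loop, with a no-op fallback when Numba is missing.
import numpy as np

try:
    from numba import njit
except ImportError:  # fallback: run as plain Python if Numba is unavailable
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func


@njit
def daily_pnl(weights, returns):
    """Dot each day's weights into that day's asset returns."""
    n_days, n_assets = returns.shape
    out = np.empty(n_days)
    for i in range(n_days):
        acc = 0.0
        for j in range(n_assets):
            acc += weights[i, j] * returns[i, j]
        out[i] = acc
    return out
```

With Numba installed the first call pays a one-time compilation cost; subsequent calls run the compiled loop.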


Semver Changes

  • Patch (bug fix, no new features)
  • Minor (new features, no breaking changes)
  • Major (breaking changes)

Issues

This pull request closes the following issue(s):


Checklist

  • I have read the Contributing Guidelines.
  • I have profiled and validated the performance improvements.
  • I have added and verified Numba and Cython accelerated implementations.
  • I have included benchmark scripts and ensured stable reproducibility.
  • I have added relevant documentation and example usage where necessary.
  • All tests pass and linting tools (ruff/black) have been run successfully.

@netlify

netlify bot commented Nov 15, 2025

Deploy Preview for strong-duckanoo-898b2c ready!

Name Link
🔨 Latest commit 818058b
🔍 Latest deploy log https://app.netlify.com/projects/strong-duckanoo-898b2c/deploys/69188aef7d3bc5000810fe56
😎 Deploy Preview https://deploy-preview-143--strong-duckanoo-898b2c.netlify.app

@Satvik-Singh192
Contributor Author

@ayushkrtiwari please review my pull request, sir.

@ayushkrtiwari ayushkrtiwari added Semver:minor minor version changes Type:Hard senior developers, max points labels Nov 15, 2025
Copilot finished reviewing on behalf of ayushkrtiwari November 15, 2025 14:24
@ayushkrtiwari ayushkrtiwari merged commit a27eeda into OPCODE-Open-Spring-Fest:main Nov 15, 2025
17 of 20 checks passed

Copilot AI left a comment


Pull Request Overview

This PR introduces performance optimizations for the backtest engine by adding Numba JIT-compiled and Cython-compiled alternatives to critical computation functions, along with comprehensive benchmarking and profiling tools.

Key Changes:

  • Added Numba-accelerated implementations of core backtest operations (strategy returns, turnover, portfolio value calculations)
  • Introduced Cython-optimized versions for additional performance comparison
  • Created benchmarking suite to measure performance gains across different dataset sizes
  • Added profiling script to identify performance bottlenecks

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 13 comments.

File summary
  • src/quant_research_starter/backtest/numba_opt.py: Implements JIT-compiled versions of backtest operations with Numba decorators and fallback support
  • src/quant_research_starter/backtest/cython_opt.pyx: Provides Cython-optimized implementations for strategy returns and turnover calculations
  • src/quant_research_starter/benchmarks/bench_opt.py: Benchmarking script comparing vanilla, Numba, and Cython implementations across multiple dataset sizes
  • src/quant_research_starter/backtest/profile_backtest.py: Profiling utility to identify performance hotspots in the backtest pipeline
  • src/quant_research_starter/backtest/setup_cython.py: Build configuration for compiling Cython extensions
  • .github/workflows/benchmark.yml: CI workflow for automated benchmark execution and artifact storage
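A manual/canary benchmark workflow of this kind typically has the following shape. This is an illustrative sketch only; the job name, install command, and artifact path are assumptions, not the PR's actual benchmark.yml:

```yaml
# Illustrative sketch of a manually triggered benchmark workflow that
# uploads results as an artifact. Paths and file names below are assumptions.
name: benchmarks
on:
  workflow_dispatch:
jobs:
  bench:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -e . numba
      - run: python src/quant_research_starter/benchmarks/bench_opt.py
      - uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: benchmark_results.json   # assumed output file name
```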
Comments suppressed due to low confidence (1)

src/quant_research_starter/benchmarks/bench_opt.py:77

  • This assignment to 'current_weights' is unnecessary as it is redefined before this value is used.
    current_weights = np.zeros(n_assets, dtype=np.float64)



for i in prange(n_days):
    ret_sum = 0.0
    for j in prange(n_assets):

Copilot AI Nov 15, 2025


Nested prange usage in Numba may cause performance issues. Using prange for both the outer loop (line 32) and inner loop (line 34) can lead to thread contention and may not provide the expected parallelization benefits. Consider using prange only for the outer loop and regular range for the inner loop, or using a flattened parallelization strategy.

Suggested change
- for j in prange(n_assets):
+ for j in range(n_assets):

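The pattern this review comment recommends can be sketched as follows (assumed names, not the PR's code): parallelize only the outer day loop with prange and keep the inner asset loop sequential, so each thread performs its own reduction without contention. The sketch also runs without Numba via a fallback.

```python
# Sketch: prange on the outer loop only; plain range on the inner loop.
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # fallback so the sketch also runs without Numba
    prange = range

    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func


@njit(parallel=True)
def strategy_returns(weights, returns):
    n_days, n_assets = returns.shape
    out = np.zeros(n_days)
    for i in prange(n_days):       # parallel: days are independent
        ret_sum = 0.0
        for j in range(n_assets):  # sequential: per-thread reduction
            ret_sum += weights[i, j] * returns[i, j]
        out[i] = ret_sum
    return out
```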
Comment on lines +43 to +51
stats.sort_stats("tottime")
stats.print_stats(20)

print("\nTop 20 functions by total time:")
s2 = StringIO()
stats = pstats.Stats(profiler, stream=s2)
stats.sort_stats("tottime")
stats.print_stats(20)
print(s2.getvalue())

Copilot AI Nov 15, 2025


Duplicate profiler statistics generation. Lines 36-41 and lines 47-51 both generate and print statistics sorted by "tottime", but the first set (lines 43-44) regenerates the stats object unnecessarily. The second pstats.Stats creation on line 48 overwrites the previous stats configuration, making lines 43-44 redundant. Consider removing lines 43-44 or restructuring to avoid duplication.

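The deduplicated version of this report can be sketched with the standard library alone; `run_backtest` here is a placeholder workload, not the project's actual API:

```python
# Single-pass profiling report: build the pstats.Stats object once,
# avoiding the duplicated construction the review points out.
import cProfile
import pstats
from io import StringIO


def run_backtest():
    # Stand-in workload; the real script profiles the backtest pipeline.
    return sum(i * i for i in range(10_000))


profiler = cProfile.Profile()
profiler.enable()
run_backtest()
profiler.disable()

stream = StringIO()
stats = pstats.Stats(profiler, stream=stream)  # one Stats object, one config
stats.sort_stats("tottime")
stats.print_stats(20)  # top 20 functions by total time

report = stream.getvalue()
print(report)
```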
return prices, signals


def benchmark_vanilla(prices: pd.DataFrame, signals: pd.DataFrame) -> float:

Copilot AI Nov 15, 2025


The return type hint for benchmark_vanilla is incorrect. The function returns a tuple (elapsed, results) but is annotated to return float. The return type should be tuple[float, dict] or similar.

Suggested change
- def benchmark_vanilla(prices: pd.DataFrame, signals: pd.DataFrame) -> float:
+ def benchmark_vanilla(prices: pd.DataFrame, signals: pd.DataFrame) -> tuple[float, dict]:

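A toy illustration of the annotation being suggested (names are placeholders, not the PR's functions): a timed benchmark that returns (elapsed, results) should say so in its signature.

```python
# A timed benchmark returns a pair, so annotate it as tuple[float, dict]
# rather than float (requires Python 3.9+ for builtin generics).
import time


def benchmark_stub(n: int) -> tuple[float, dict]:
    start = time.perf_counter()
    total = sum(range(n))
    return time.perf_counter() - start, {"total": total}


elapsed, results = benchmark_stub(10)
```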
return rank_based_weights(signals, max_leverage, long_pct, short_pct)


def benchmark_cython(prices: pd.DataFrame, signals: pd.DataFrame) -> float:

Copilot AI Nov 15, 2025


The return type hint for benchmark_cython is incorrect. The function returns a tuple (elapsed, results) or (None, None) but is annotated to return float. The return type should be tuple[float | None, dict | None] or similar.

Suggested change
- def benchmark_cython(prices: pd.DataFrame, signals: pd.DataFrame) -> float:
+ def benchmark_cython(prices: pd.DataFrame, signals: pd.DataFrame) -> tuple[float | None, dict | None]:

Comment on lines +77 to +82
current_weights = np.zeros(n_assets, dtype=np.float64)
for date in returns_df.index:
    signal_row = aligned_signals.loc[date].values.astype(np.float64)
    weights = compute_rank_weights_numba(signal_row, 1.0, 0.9, 0.1)
    current_weights = weights.copy()
    weights_list.append(current_weights)

Copilot AI Nov 15, 2025


Inefficient sequential use of .copy(). Line 81 uses .copy() but the result weights is immediately assigned to current_weights, making the copy unnecessary. The code should either use current_weights = weights without the copy, or remove the intermediate weights variable entirely.

Suggested change
- current_weights = np.zeros(n_assets, dtype=np.float64)
- for date in returns_df.index:
-     signal_row = aligned_signals.loc[date].values.astype(np.float64)
-     weights = compute_rank_weights_numba(signal_row, 1.0, 0.9, 0.1)
-     current_weights = weights.copy()
-     weights_list.append(current_weights)
+ for date in returns_df.index:
+     signal_row = aligned_signals.loc[date].values.astype(np.float64)
+     weights = compute_rank_weights_numba(signal_row, 1.0, 0.9, 0.1)
+     weights_list.append(weights)

n_assets = len(signals)
weights = np.zeros(n_assets)

valid_mask = np.zeros(n_assets, dtype=np.bool_)

Copilot AI Nov 15, 2025


Misattributed deprecation. Line 93 uses dtype=np.bool_, which is not deprecated; the alias that was deprecated in NumPy 1.20 and removed in 1.24 (then reintroduced in 2.0) is np.bool. The built-in bool is still the simpler, version-proof spelling here, and np.bool should be avoided.

Suggested change
- valid_mask = np.zeros(n_assets, dtype=np.bool_)
+ valid_mask = np.zeros(n_assets, dtype=bool)

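For what it's worth, the two spellings name the same dtype, which can be checked directly:

```python
# Sanity check: np.bool_ and the built-in bool resolve to the same NumPy
# dtype, so the suggested change is stylistic rather than behavioral.
import numpy as np

mask_a = np.zeros(4, dtype=np.bool_)
mask_b = np.zeros(4, dtype=bool)
same_dtype = mask_a.dtype == mask_b.dtype == np.dtype(bool)
```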
return elapsed, results


def benchmark_numba(prices: pd.DataFrame, signals: pd.DataFrame) -> float:

Copilot AI Nov 15, 2025


The return type hint for benchmark_numba is incorrect. The function returns a tuple (elapsed, results) or (None, None) but is annotated to return float. The return type should be tuple[float | None, dict | None] or similar.

Suggested change
- def benchmark_numba(prices: pd.DataFrame, signals: pd.DataFrame) -> float:
+ def benchmark_numba(prices: pd.DataFrame, signals: pd.DataFrame) -> tuple[float | None, dict | None]:

Comment on lines +103 to +105
def compute_rank_weights_numba(signals, max_leverage, long_pct, short_pct):
    """Helper to compute rank weights using Numba."""
    return rank_based_weights(signals, max_leverage, long_pct, short_pct)

Copilot AI Nov 15, 2025


The helper function compute_rank_weights_numba is redundant - it's a trivial wrapper that simply calls rank_based_weights with the same parameters. Consider calling rank_based_weights directly on line 80 to reduce unnecessary indirection.

Copilot uses AI. Check for mistakes.
Comment on lines +70 to +182
    returns_df = prices.pct_change().dropna()
    aligned_signals = signals.loc[returns_df.index]

    returns_arr = returns_df.values
    n_days, n_assets = returns_arr.shape

    weights_list = []
    current_weights = np.zeros(n_assets, dtype=np.float64)
    for date in returns_df.index:
        signal_row = aligned_signals.loc[date].values.astype(np.float64)
        weights = compute_rank_weights_numba(signal_row, 1.0, 0.9, 0.1)
        current_weights = weights.copy()
        weights_list.append(current_weights)

    weights = np.array(weights_list, dtype=np.float64)
    weights_prev = np.vstack([np.zeros((1, n_assets), dtype=np.float64), weights[:-1]])

    turnover = compute_turnover(weights, weights_prev)
    strat_ret = compute_strategy_returns(
        weights_prev, returns_arr.astype(np.float64), turnover, 0.001
    )
    portfolio_value = compute_portfolio_value(strat_ret, 1_000_000.0)

    elapsed = time.perf_counter() - start
    results = {
        "portfolio_value": pd.Series(portfolio_value, index=returns_df.index),
        "returns": pd.Series(
            np.diff(portfolio_value) / portfolio_value[:-1], index=returns_df.index[1:]
        ),
    }
    return elapsed, results


def compute_rank_weights_numba(signals, max_leverage, long_pct, short_pct):
    """Helper to compute rank weights using Numba."""
    return rank_based_weights(signals, max_leverage, long_pct, short_pct)


def benchmark_cython(prices: pd.DataFrame, signals: pd.DataFrame) -> float:
    """Benchmark Cython-accelerated implementation."""
    if not CYTHON_AVAILABLE:
        return None, None

    start = time.perf_counter()

    returns_df = prices.pct_change().dropna()
    aligned_signals = signals.loc[returns_df.index]

    returns_arr = returns_df.values.astype(np.float64)
    n_days, n_assets = returns_arr.shape

    weights_list = []
    current_weights = np.zeros(n_assets, dtype=np.float64)
    for date in returns_df.index:
        signal_row = aligned_signals.loc[date].values.astype(np.float64)
        weights = compute_rank_weights_cython(signal_row, 1.0)
        current_weights = weights
        weights_list.append(current_weights)

    weights = np.array(weights_list, dtype=np.float64)
    weights_prev = np.vstack([np.zeros((1, n_assets), dtype=np.float64), weights[:-1]])

    turnover = compute_turnover_cython(weights, weights_prev)
    strat_ret = compute_strategy_returns_cython(
        weights_prev, returns_arr, turnover, 0.001
    )

    portfolio_value = compute_portfolio_value(strat_ret, 1_000_000)

    elapsed = time.perf_counter() - start
    results = {
        "portfolio_value": pd.Series(portfolio_value, index=returns_df.index),
        "returns": pd.Series(
            np.diff(portfolio_value) / portfolio_value[:-1], index=returns_df.index[1:]
        ),
    }
    return elapsed, results


def compute_rank_weights_cython(signals, max_leverage):
    """Helper to compute rank weights (simplified for Cython benchmark)."""
    valid_mask = ~np.isnan(signals)
    valid_signals = signals[valid_mask]
    if len(valid_signals) == 0:
        return np.zeros_like(signals)

    ranks = np.argsort(np.argsort(valid_signals)) + 1
    long_threshold = np.percentile(ranks, 90)
    short_threshold = np.percentile(ranks, 10)

    weights = np.zeros_like(signals)
    valid_idx = 0
    for i in range(len(signals)):
        if valid_mask[i]:
            if ranks[valid_idx] >= long_threshold:
                weights[i] = 1.0
            elif ranks[valid_idx] <= short_threshold:
                weights[i] = -1.0
            valid_idx += 1

    long_count = (weights > 0).sum()
    short_count = (weights < 0).sum()

    if long_count > 0:
        weights[weights > 0] = 1.0 / long_count
    if short_count > 0:
        weights[weights < 0] = -1.0 / short_count

    total_leverage = abs(weights).sum()
    if total_leverage > max_leverage:
        weights *= max_leverage / total_leverage

    return weights

Copilot AI Nov 15, 2025


There is significant code duplication between benchmark_numba (lines 70-91), benchmark_cython (lines 115-137), and partially in compute_rank_weights_cython (lines 151-182). The data preprocessing logic (computing returns, aligning signals, building weights arrays) is repeated. Consider extracting this common logic into a shared helper function to improve maintainability.

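The extraction this review comment suggests can be sketched as a single helper that owns the shared preprocessing, with each benchmark passing its own weight function. `build_weight_matrix` is an illustrative name, not part of the PR:

```python
# Hedged sketch of the deduplication: one helper computes returns, aligns
# signals, and builds the per-day weight matrices for all three benchmarks.
import numpy as np
import pandas as pd


def build_weight_matrix(prices, signals, weight_fn):
    """Shared preprocessing for the vanilla/Numba/Cython benchmark paths."""
    returns_df = prices.pct_change().dropna()
    aligned_signals = signals.loc[returns_df.index]
    n_assets = returns_df.shape[1]

    # Per-day weights from whichever accelerated weight function is in use.
    weights = np.array(
        [
            weight_fn(aligned_signals.loc[date].values.astype(np.float64))
            for date in returns_df.index
        ],
        dtype=np.float64,
    )
    # Previous-day weights: zero exposure before the first trading day.
    weights_prev = np.vstack([np.zeros((1, n_assets)), weights[:-1]])
    return returns_df, weights, weights_prev
```

Each variant would then differ only in the weight function and the turnover/strategy-return kernels it calls afterward.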
n_days, n_assets = returns_arr.shape

weights_list = []
current_weights = np.zeros(n_assets, dtype=np.float64)

Copilot AI Nov 15, 2025


This assignment to 'current_weights' is unnecessary, as the variable is reassigned before this value is ever read. The cleanest fix is to delete the line rather than leave it commented out.

Suggested change
- current_weights = np.zeros(n_assets, dtype=np.float64)
