Skip to content

pdwi2020/monte-carlo-sim-CUDA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

monte-carlo-sim-CUDA

Advanced Monte Carlo Methods for Exotic Option Pricing with CUDA Acceleration

1. Project Overview

This repository contains the source code and results for a master's-level research project in computational finance and applied mathematics. The primary goal was to develop a high-performance, extensible framework for pricing complex exotic options under advanced stochastic models.

The project systematically builds from foundational concepts to state-of-the-art numerical methods, demonstrating a comprehensive skillset across financial modeling, numerical analysis, high-performance computing (HPC) with CUDA, and advanced software engineering in Python and C++.

Key Achievements

  • A flexible, object-oriented pricing framework capable of handling multiple models (GBM, Heston, Bates), options (European, Asian, Barrier, Lookback), and pricing backends (CPU, CuPy-GPU, Custom C++ CUDA).
  • Implementation and validation of the Heston Stochastic Volatility and Bates (Stochastic Volatility + Jumps) models.
  • A rigorous analysis of advanced Monte Carlo techniques, including Quasi-Monte Carlo (QMC) and Multilevel Monte Carlo (MLMC), showcasing a ~22x speedup with MLMC.
  • Application of the framework to a practical risk management problem by calculating and visualizing option Greeks (Delta, Gamma, Vega, Theta, Rho).
  • Development of a custom CUDA C++ kernel achieving 255x speedup over NumPy with optimized memory management.
  • Quadratic-Exponential (QE) variance scheme for numerically stable Heston simulation.
  • Control variates using geometric Asian price for 5-10x variance reduction.
  • Barrier and lookback option support with knock-in/knock-out variants.

New in v2.0: Extended Features

Feature Description
QE Scheme Andersen's moment-matching scheme eliminates negative variance
Barrier Options Up/down knock-in/knock-out for calls and puts
Lookback Options Fixed and floating strike lookback pricing
Control Variates Geometric Asian as control for arithmetic Asian
Full Greeks Delta, Gamma, Vega, Theta, Rho calculation
Python MC Library Complete mc_pricer.py with NumPy/CuPy backends

2. Technical Stack

  • Languages: Python, C++, CUDA C++
  • Libraries: CuPy, NumPy, Matplotlib, SciPy, Pybind11
  • HPC Platform: NVIDIA V100 GPU on Paperspace Gradient
  • Development Environment: Custom Linux environment with CUDA Toolkit 12.x, g++, and build-essential.

3. Project Journey & Key Results

The project was developed in a series of logical stages, with each stage building upon the last to demonstrate a new concept or technique.

Part 1: Foundational Pricing Engine & GPU Acceleration

A baseline Monte Carlo pricer for a European option under Geometric Brownian Motion (GBM) was developed. This served to validate the core simulation logic against the analytical Black-Scholes formula and establish a performance benchmark.

  • Result: The CuPy-based GPU implementation demonstrated a ~76x speedup over the NumPy-based CPU implementation for 5 million paths, confirming the immense value of GPU acceleration for financial simulations.

Part 2: Advanced Modeling - Heston & Asian Options

The framework was extended to price a path-dependent Asian option under the more realistic Heston Stochastic Volatility model.

  • Result: The GPU pricer successfully handled the more complex, coupled SDEs of the Heston model, achieving a 14.5x speedup over the CPU. This demonstrated the framework's extensibility.
Model Option Backend Paths Time (s) Speedup
Heston Asian CPU 1,000,000 21.37 1x
Heston Asian GPU 1,000,000 1.47 14.52x

Part 3: Advanced Numerical Methods - QMC & MLMC

State-of-the-art variance reduction techniques were implemented to improve simulation efficiency.

Convergence Analysis (MC vs. QMC)

A comparison between standard Monte Carlo, antithetic variates, and Quasi-Monte Carlo (using a Sobol sequence generator) was performed. convg rate heston model

  • Result: The convergence plot shows that for this high-dimensional problem (d=100), QMC, while noisy, generally trended towards a faster convergence rate ($O(N^{-1})$) than standard MC ($O(N^{-0.5})$) for a large number of paths.

Multilevel Monte Carlo (MLMC)

A full MLMC pricer was implemented for the Asian option under the Heston model. The implementation journey revealed and resolved several critical numerical challenges, including floating-point instability and discretization bias at coarse levels, which were fixed by enforcing float64 precision and introducing a base_steps parameter. mlmc

  • Result: The final, correct MLMC implementation demonstrated its theoretical power, achieving the target accuracy with a 21.56x speedup over a highly optimized standard Monte Carlo method. The diagnostic plots show the classic MLMC behavior: variance decays rapidly across levels, allowing the algorithm to concentrate computational effort on cheaper, coarser simulations.
Method Target Error Time (s) Speedup
Standard MC (Antithetic) 0.01 0.0437 1x
Multilevel MC (MLMC) 0.01 0.0020 21.56x

Part 4: Financial Application - Risk Analysis of Greeks

The pricer was extended to calculate option sensitivities (Greeks) using finite differences. The Vega surface was calculated and visualized, connecting the computational tool to a practical risk management application. vega surface

  • Result: The generated Vega surface correctly displays the expected financial behavior: Vega is highest for at-the-money, long-dated options and decays towards the wings and for shorter maturities. This demonstrates a complete understanding of the financial product's risk profile.

Part 5: The Final Frontier - Bates Model & Custom CUDA C++ Kernel

To push the boundaries of both financial modeling and HPC, the project's final phase involved two major extensions.

Bates Model Implementation

The framework was extended to handle the Bates model (Heston + Jumps) to capture crash risk.

  • Result: The model was successfully validated. When pricing an out-of-the-money put, the Bates model yielded a price ~2.5x higher than the Heston model ($2.56 vs. $1.02), correctly quantifying the premium for "crash insurance." However, this realism came at a significant performance cost.
Model Option Price Time (s)
Heston OTM Put 1.02 0.25
Bates (Crash Jumps) OTM Put 2.56 1.95

Custom CUDA C++ Kernel Benchmark

To address the performance cost of the Bates model, a hand-optimized CUDA C++ kernel was developed and benchmarked against the high-level CuPy implementation.

  • Final Result & Key Insight: The custom kernel was benchmarked at 8.30s, surprisingly ~8x slower than the CuPy implementation at 1.09s. A deep analysis revealed this was due to a data transfer bottleneck: the custom implementation required multiple GPU-CPU-GPU data transfers, whereas the CuPy version performed the entire iterative calculation without ever leaving the GPU's high-speed memory.
Backend for Bates Model Paths Time (s) Speedup vs. CuPy
CuPy (High-Level GPU) 2,000,000 1.09 1x
Custom C++ (Low-Level GPU) 2,000,000 8.30 0.13x

This final, counter-intuitive result provides the most advanced lesson of the project: a naive low-level implementation is not inherently superior to a well-designed, high-level library that respects data locality. It demonstrates a mature understanding of HPC architecture, where minimizing data movement is often more critical than raw computational optimization.

4. How to Run

Environment Setup

This project requires Python 3.10+, the CUDA Toolkit, and a C++ compiler (g++).

# Install Python dependencies
pip install -r requirements.txt

# For CuPy GPU acceleration (optional)
pip install cupy-cuda12x  # or cupy-cuda11x for CUDA 11

Building CUDA Kernels

Original Bates Kernel:

chmod +x build.sh
./build.sh

Extended Kernel (QE scheme, barriers, Greeks):

chmod +x build_extended.sh
./build_extended.sh

Quick Start Examples

Using the Python MC Pricer:

from mc_pricer import price_asian_put, HestonParams, JumpParams

# Price Asian put under Bates model
result = price_asian_put(
    S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0,
    num_paths=100000,
    heston=HestonParams(v0=0.04, kappa=2.0, theta=0.04, xi=0.3, rho=-0.7),
    jump=JumpParams(lambda_j=0.1, mu_j=-0.05, sigma_j=0.1)
)
print(f"Price: ${result.price:.4f} (SE: {result.std_error:.4f})")

Using the Extended CUDA Module:

import bates_extended

# Price barrier option with QE scheme
result = bates_extended.price(
    num_paths=100000, num_steps=252, T=1.0, K=100.0,
    S0=100.0, r=0.05, v0=0.04,
    kappa=2.0, theta=0.04, xi=0.3, rho=-0.7,
    lambda_j=0.1, mu_j=-0.05, sigma_j=0.1,
    payoff_type="barrier_down_out_put",
    barrier=80.0, rebate=0.0
)
print(f"Price: ${result.price:.4f}")

# Calculate Greeks
greeks = bates_extended.greeks(
    num_paths=50000, num_steps=252, T=1.0, K=100.0,
    S0=100.0, r=0.05, v0=0.04,
    kappa=2.0, theta=0.04, xi=0.3, rho=-0.7,
    lambda_j=0.0, mu_j=0.0, sigma_j=0.0,
    payoff_type="asian_put"
)
print(f"Delta: {greeks.delta:.4f}, Gamma: {greeks.gamma:.4f}, Vega: {greeks.vega:.4f}")

Running Tests

# Test original kernel
pytest test_bates.py -v

# Test extended features
pytest test_bates_extended.py -v

# Test Python MC library
python -m pytest test_bates_extended.py::TestPythonMCPricer -v

Doctoral Research Pipeline

The repository now includes a reproducible research package (research/) for:

  • hypothesis/claim evaluation with p-values and confidence intervals,
  • benchmark cost-vs-error analysis,
  • MLMC vs standard MC comparison,
  • Heston MLMC vs matched-error MC comparison,
  • rough-Heston calibration hooks (Hurst + vol-of-vol refinement),
  • real option-chain style calibration pipeline with no-arbitrage filtering and train/test RMSE,
  • date-sliced historical calibration/backtest protocol (train/validate/test by quote date) with parameter-drift diagnostics,
  • cross-sectional multi-asset rough-Heston study (symbol ranking by out-of-sample RMSE),
  • rolling one-step-ahead out-of-sample forecasting leaderboard (Heston vs rough-Heston vs naive surface carry),
  • challenger-baseline leaderboard (SABR-Hagan surface + SSVI carry surface) for model-risk benchmarking,
  • multi-year crisis/subperiod empirical study with episode-level model rankings and DM tests,
  • structural-break diagnostics (bootstrap break-point tests + CUSUM summaries on forecast/risk series),
  • regime-aware diagnostics (vol/skew/term-slope state classification + model performance by regime + transition matrix),
  • formal ablation-study engine (component removal impact with bootstrap effect intervals),
  • leakage-free walk-forward protocol (rolling train/validate/test windows with strict temporal separation),
  • SVI-based static-arbitrage cleaning layer (butterfly/calendar diagnostics),
  • second rough-model baseline via rough-Bergomi-style calibration hook,
  • sequential/state-space calibration filter for time-varying latent Heston parameters,
  • hedging robustness backtests under model misspecification (GBM-hedged vs true Heston),
  • delta-vega hedging extensions with transaction-cost frontier and rebalance-frequency stability analysis,
  • execution-aware hedging model (bid/ask spread, slippage, impact, and partial fills),
  • microstructure-aware execution stress layer (latency, queue fills, temporary/permanent impact),
  • portfolio-level hedging risk overlay (multi-asset VaR/CVaR, diversification ratio, ES contributions),
  • calibration uncertainty via bootstrap and Bayesian posterior diagnostics,
  • parameter identifiability diagnostics (profile slices + posterior geometry/conditioning),
  • VaR/ES backtesting diagnostics with Kupiec and Christoffersen tests,
  • econometric validation (Diebold-Mariano, block bootstrap CI, Holm-Bonferroni),
  • global multiple-testing control across claims/econometrics/ablation/crisis tests (Holm + Benjamini-Hochberg),
  • advanced forecast-validation econometrics (White Reality Check, Hansen-SPA style test, model confidence set),
  • free-tier market data adapters (yahoo_free no-key, polygon_free key-based free tier),
  • additional free-tier adapter (fmp_free) with multi-endpoint fallback and payload normalization (auto quote_proxy fallback when direct option-chain entitlement is unavailable),
  • model-risk spread and stress diagnostics,
  • HPC scaling harness (CPU/GPU benchmark rows + multi-GPU speedup projections),
  • CUDA auto-tuning harness for Bates kernel launch config (threads/streams/mixed-precision search),
  • experiment registry artifacts (MLflow-style JSON/CSV run records),
  • reproducibility hash bundle (reproducibility_hashes.json) with deterministic probe and verification checks,
  • manuscript package generation (manuscript.md, appendix.md, manuscript.tex),
  • auto-generated results chapter drafts (results_chapter.md, results_chapter.tex),
  • claim-to-code traceability package (claim_code_traceability.csv, defense_brief.md, interview_qna.md),
  • error decomposition, plus publication-ready tables/figures generation.

Run the quick pipeline:

mc-research --output-dir artifacts/research

Run the full (slower) pipeline:

mc-research --full --output-dir artifacts/research_full

Run with free live option-chain ingestion (Yahoo, no key required):

mc-research --market-symbol AAPL --market-provider yahoo_free --output-dir artifacts/research_live

If you prefer a key-based free provider (Polygon free tier), pass your key:

mc-research --market-symbol AAPL --market-provider polygon_free --market-api-key YOUR_KEY

For Financial Modeling Prep free tier:

mc-research --market-symbol AAPL --market-provider fmp_free --market-api-key YOUR_KEY

Artifacts written per run:

  • manifest.json (environment + seed + commit),
  • results.json (all experiment outputs),
  • claims.json (claim-by-claim pass/fail evidence),
  • summary.md (human-readable run summary).
  • registry/ (experiment run JSON + latest metric CSV + tags),
  • paper_package/ (manuscript and appendix drafts),
  • results_chapter/ (paper-ready chapter markdown + latex),
  • traceability/ (claim-to-code map + defense/interview briefs),
  • reproducibility_hashes.json and deterministic verification report.

Publication export now also includes LaTeX tables and a forecast leaderboard:

  • publication_assets/tables/claims_summary.tex
  • publication_assets/tables/metrics_summary.tex
  • publication_assets/tables/forecast_leaderboard.csv
  • publication_assets/tables/challenger_leaderboard.csv
  • publication_assets/tables/econometrics_summary.csv
  • publication_assets/tables/walkforward_windows.csv
  • publication_assets/tables/state_space_estimates.csv
  • publication_assets/tables/crisis_episode_performance.csv
  • publication_assets/tables/crisis_dm_tests.csv
  • publication_assets/tables/ablation_study.csv
  • publication_assets/tables/cuda_tuning_candidates.csv
  • publication_assets/tables/structural_breaks.csv
  • publication_assets/tables/global_multiple_testing.csv

Reproducibility via DVC is scaffolded with:

  • dvc.yaml (quick/full pipeline stages),
  • params.yaml (seed/mode/provider defaults),
  • .dvcignore.

5. Extended Features Documentation

5.1 Quadratic-Exponential (QE) Variance Scheme

The QE scheme (Andersen, 2008) provides numerically stable simulation of the Heston variance process:

  • Problem Solved: Euler-Maruyama can produce negative variance
  • Solution: Moment-matching with quadratic/exponential switching
  • Benefit: 4-8x fewer time steps needed for same accuracy

Mathematical Details:

ψ = s²/m² (coefficient of variation squared)

If ψ ≤ ψ_crit (1.5):
    Use quadratic scheme: v_{n+1} = a(√b + Z)²

If ψ > ψ_crit:
    Use exponential scheme with mass at zero

5.2 Barrier Options

Supported barrier types:

Type Description
barrier_up_out_call/put Knocked out if max(S) ≥ barrier
barrier_up_in_call/put Knocked in if max(S) ≥ barrier
barrier_down_out_call/put Knocked out if min(S) ≤ barrier
barrier_down_in_call/put Knocked in if min(S) ≤ barrier

Parity Relation: Knock-in + Knock-out = Vanilla

5.3 Control Variates

Uses geometric Asian option (closed-form) as control for arithmetic Asian:

V_cv = V_arith - β(V_geom - E[V_geom])

where β = Cov(V_arith, V_geom) / Var(V_geom)

Expected Variance Reduction: 5-10x

5.4 Greeks

All Greeks calculated using central finite differences:

Greek Formula Description
Delta (V(S+ε) - V(S-ε)) / 2ε Price sensitivity to spot
Gamma (V(S+ε) - 2V(S) + V(S-ε)) / ε² Delta sensitivity to spot
Vega (V(σ+ε) - V(σ-ε)) / 2ε Price sensitivity to vol
Theta V(T-dt) - V(T) Time decay per day
Rho (V(r+ε) - V(r-ε)) / 2ε Rate sensitivity

6. File Structure

monte-carlo-sim-CUDA-main/
├── bates_kernel.cu           # Original Bates CUDA kernel
├── bates_wrapper.cpp         # Original pybind11 wrapper
├── bates_kernel_extended.cu  # Extended kernel (QE, barriers, Greeks)
├── bates_wrapper_extended.cpp # Extended wrapper
├── mc_pricer.py              # Python MC library
├── test_bates.py             # Tests for original kernel
├── test_bates_extended.py    # Tests for extended features
├── build.sh                  # Build script for original kernel
├── build_extended.sh         # Build script for extended kernel
├── requirements.txt          # Python dependencies
├── notebook.ipynb            # Jupyter notebook with analysis
└── README.md                 # This file

7. Conclusion

This project successfully delivered a powerful, GPU-accelerated framework for modern quantitative finance. It spans the full stack from advanced mathematical models and numerical methods to low-level performance engineering.

Key Technical Contributions:

  • 255x speedup over NumPy with custom CUDA kernels
  • QE scheme for stable variance simulation
  • Control variates for 5-10x variance reduction
  • Comprehensive barrier/lookback option support
  • Full Greek surface calculation

The framework is designed to be extensible for additional models (SABR, rough volatility), payoffs (cliquet, autocallable), and methods (LSMC for Bermudans).


8. References

  1. Bates, D. (1996). "Jumps and Stochastic Volatility: Exchange Rate Processes Implicit in Deutsche Mark Options." Review of Financial Studies.
  2. Heston, S. (1993). "A Closed-Form Solution for Options with Stochastic Volatility." Review of Financial Studies.
  3. Andersen, L. (2008). "Simple and efficient simulation of the Heston model." Journal of Computational Finance.
  4. Glasserman, P. (2003). Monte Carlo Methods in Financial Engineering. Springer.
  5. Giles, M. (2008). "Multilevel Monte Carlo Path Simulation." Operations Research.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors