mohansree14/Bayesian-Deep-Learning-for-Probabilistic-Financial-Forecasting

Financial Time Series Forecasting with Uncertainty Quantification

Python 3.8+ | License: MIT | Code style: black | PyTorch | Streamlit

A comprehensive machine learning project for stock price prediction using LSTM, MC Dropout, Bayesian Neural Networks, and Transformers with probabilistic uncertainty estimation.

Author: Mohansree Vijayakumar
Email: mohansreesk14@gmail.com



🎯 Project Overview

This project implements deep learning models for financial time series forecasting, with a focus on uncertainty quantification. The system provides not just point predictions but also confidence intervals that widen during volatile market periods, enabling risk-aware trading decisions.

Key Features

  • ✅ Multiple Model Architectures: LSTM, MC Dropout LSTM, Bayesian Neural Networks (Pyro), Transformers
  • ✅ Uncertainty Quantification: Probabilistic forecasting with confidence intervals
  • ✅ 21 Technical Indicators: Comprehensive feature engineering
  • ✅ Hyperparameter Optimization: Optuna-based automated tuning
  • ✅ Backtesting System: Strategy evaluation with uncertainty-aware position sizing
  • ✅ Interactive Dashboard: Streamlit web application
  • ✅ Professional Documentation: Detailed reports and visualizations

📁 Project Structure

ML-Intern/
├── app/                          # Streamlit web application
│   └── streamlit_app.py
├── configs/                      # Model and experiment configurations
│   ├── lstm_baseline.yaml
│   ├── mc_dropout.yaml
│   ├── bnn_vi.yaml
│   └── transformer_baseline.yaml
├── data/                         # Data storage (gitignored)
│   ├── raw/
│   └── processed/
├── reports/                      # Documentation and analysis
│   ├── figures/                  # Generated visualizations
│   ├── tables/                   # Reference tables
│   ├── PROJECT_ANALYSIS_REPORT.md
│   ├── QUICK_REFERENCE.md
│   ├── FIGURES_DOCUMENTATION.md
│   └── HPO_SEARCH_SPACE.md
├── results/                      # Experiment results (gitignored)
├── scripts/                      # Data preparation scripts
│   ├── fetch_data.py
│   └── make_dataset.py
├── src/                          # Source code
│   ├── data/                     # Data processing
│   │   ├── indicators.py
│   │   └── preprocess.py
│   ├── evaluation/               # Model evaluation
│   │   ├── backtesting.py
│   │   └── metrics.py
│   ├── models/                   # Model implementations
│   │   ├── lstm.py
│   │   ├── mc_dropout_lstm.py
│   │   ├── bnn_vi_pyro.py
│   │   └── transformer.py
│   ├── training/                 # Training pipeline
│   │   └── train_loop.py
│   └── utils/                    # Utilities
│       ├── config.py
│       ├── logging_mlflow.py
│       └── visualization.py
├── tests/                        # Unit tests
│   ├── test_metrics.py
│   ├── test_models.py
│   └── test_preprocess.py
├── utils/                        # Utility scripts
│   ├── generate_figures.py       # Generate documentation figures
│   ├── generate_hpo_table.py     # Generate HPO table image
│   └── generate_indicators_table.py  # Generate indicators table
├── train.py                      # Main training script
├── evaluate.py                   # Model evaluation script
├── backtest.py                   # Backtesting script
├── hparam_search.py              # Hyperparameter optimization
├── run_project.py                # Project launcher (tests + app)
├── requirements.txt              # Python dependencies
└── README.md                     # This file

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • pip package manager

Installation

  1. Clone or download this repository
  2. Install dependencies:
pip install -r requirements.txt

Running the Project

Option 1: Run Everything (Tests + Web App)

python run_project.py

Option 2: Run Web Application Only

streamlit run app/streamlit_app.py

or

.\run_streamlit.bat  # Windows batch file
.\run_streamlit.ps1  # PowerShell script

Option 3: Access the Live Streamlit App

Visit our hosted Streamlit app: https://bayesian-financial-forecasting.streamlit.app/

📊 Usage Examples

1. Train a Model

# LSTM Baseline
python train.py --config configs/lstm_baseline.yaml

# MC Dropout LSTM
python train.py --config configs/mc_dropout.yaml

# Bayesian Neural Network
python train_bnn.py --config configs/bnn_vi.yaml

# Transformer
python train.py --config configs/transformer_baseline.yaml

2. Hyperparameter Optimization

python hparam_search.py --config configs/lstm_baseline.yaml --study-name my_study

3. Evaluate Model

python evaluate.py --config configs/lstm_baseline.yaml --checkpoint results/experiment_XXXXX/best_model.pt

4. Run Backtesting

python backtest.py --config configs/mc_dropout.yaml --checkpoint results/experiment_XXXXX/best_model.pt

5. Generate Documentation Figures

# Generate all figures (6, 7, 8, 9, 10)
python utils/generate_figures.py

# Generate HPO table
python utils/generate_hpo_table.py

# Generate technical indicators table
python utils/generate_indicators_table.py

🏗️ Model Architectures

1. LSTM Baseline

  • Standard LSTM for time series forecasting
  • Dropout regularization
  • Configuration: configs/lstm_baseline.yaml

2. MC Dropout LSTM

  • Monte Carlo Dropout for uncertainty estimation
  • Multiple forward passes at inference
  • Provides prediction mean and variance
  • Configuration: configs/mc_dropout.yaml
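
As a rough illustration of the idea, the sketch below keeps dropout active at inference and aggregates several stochastic forward passes into a mean and a variance. `TinyMCDropoutLSTM` and `mc_dropout_predict` are hypothetical names introduced here, and the layer sizes are arbitrary; this is not the code in src/models/mc_dropout_lstm.py.

```python
import torch
import torch.nn as nn


class TinyMCDropoutLSTM(nn.Module):
    """Hypothetical stand-in model: LSTM encoder, dropout, linear head."""

    def __init__(self, n_features: int = 21, hidden_size: int = 32, p: float = 0.2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.dropout = nn.Dropout(p)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)        # (batch, seq_len, hidden)
        last = out[:, -1, :]         # hidden state at the last time step
        return self.head(self.dropout(last)).squeeze(-1)


def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 50):
    """Return the predictive mean and variance over stochastic forward passes."""
    model.eval()
    # Re-enable only the dropout layers so a new mask is sampled on each pass.
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.train()
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])  # (n_samples, batch)
    return samples.mean(dim=0), samples.var(dim=0)


torch.manual_seed(0)
model = TinyMCDropoutLSTM()
x = torch.randn(4, 30, 21)  # 4 windows of 30 days with 21 features (illustrative shapes)
mean, var = mc_dropout_predict(model, x)
print(mean.shape, var.shape)
```

More passes give a smoother variance estimate at a proportional inference cost, so `n_samples` is a trade-off rather than a fixed constant.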

3. Bayesian Neural Network (Pyro)

  • Full Bayesian treatment with Pyro
  • Variational inference
  • Posterior distribution over weights
  • Configuration: configs/bnn_vi.yaml
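
To show how a posterior over weights turns into predictive uncertainty, here is a minimal sketch that samples weights from independent Gaussians and collects the resulting predictions. It uses a plain linear model and made-up posterior parameters for brevity; in the project these would come from Pyro's variational inference, and `posterior_predictive` is a hypothetical helper, not part of the repository.

```python
import random
import statistics
from typing import List, Tuple


def posterior_predictive(
    x: List[float],
    w_mean: List[float],
    w_std: List[float],
    n_samples: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Mean and standard deviation of w . x under independent Gaussian weights."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_samples):
        # Draw one weight vector from the (assumed) approximate posterior.
        w = [rng.gauss(m, s) for m, s in zip(w_mean, w_std)]
        preds.append(sum(wi * xi for wi, xi in zip(w, x)))
    return statistics.fmean(preds), statistics.stdev(preds)


# Illustrative numbers only: two features, two weights.
pred_mean, pred_std = posterior_predictive(
    x=[1.0, 2.0], w_mean=[0.5, -0.25], w_std=[0.1, 0.05]
)
print(pred_mean, pred_std)
```

Wider weight posteriors produce a wider spread of predictions, which is the mechanism behind the uncertainty bands.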

4. Transformer

  • Attention-based architecture
  • Multi-head self-attention
  • Positional encoding for sequences
  • Configuration: configs/transformer_baseline.yaml
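
The README does not say which positional encoding the Transformer uses. Assuming the common fixed sinusoidal scheme, a table of encodings could be built as in the sketch below; `sinusoidal_positional_encoding` is a hypothetical name.

```python
import math
from typing import List


def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> List[List[float]]:
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency: sine on even, cosine on odd.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoidal_positional_encoding(seq_len=30, d_model=8)
print(len(pe), len(pe[0]))
```

Each row is added to the embedding of the corresponding time step so that attention layers can distinguish positions within the window.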

📈 Technical Indicators (21 Total)

The system uses 21 technical indicators across 6 categories:

  • Price Features (5): Open, High, Low, Close, Adj Close
  • Volume (1): Trading volume
  • Returns (2): Daily return, Log return
  • Trend (4): SMA(10), SMA(20), EMA(12), EMA(26)
  • Momentum (6): RSI, MACD, MACD Signal, MACD Histogram, Stochastic %K, %D
  • Volatility (3): Bollinger Bands (Upper, Middle, Lower)

See reports/tables/TECHNICAL_INDICATORS_TABLE.md for detailed formulas.
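
For orientation, a few of the listed indicators can be sketched in plain Python as follows. These are textbook definitions, not the implementation in src/data/indicators.py: the RSI here uses simple averages over the last 14 changes (Wilder's smoothing is a common alternative), and the MACD signal span of 9 is an assumed default that the README does not state.

```python
from typing import List, Tuple


def sma(values: List[float], window: int) -> List[float]:
    """Simple moving average; one output per full window."""
    return [sum(values[i - window:i]) / window for i in range(window, len(values) + 1)]


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average with alpha = 2 / (span + 1), seeded by the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def rsi(values: List[float], period: int = 14) -> float:
    """RSI over the last `period` price changes, using simple averages."""
    changes = [b - a for a, b in zip(values[-period - 1:-1], values[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # convention when there are no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def macd(
    values: List[float], fast: int = 12, slow: int = 26, signal: int = 9
) -> Tuple[List[float], List[float], List[float]]:
    """MACD line (fast EMA minus slow EMA), signal line, and histogram."""
    macd_line = [f - s for f, s in zip(ema(values, fast), ema(values, slow))]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


closes = [100.0 + i for i in range(30)]  # synthetic, steadily rising prices
print(sma(closes, 10)[0], rsi(closes), len(macd(closes)[0]))
```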

🔬 Hyperparameter Search Space

Optuna-based optimization with the following search space:

| Parameter | Type | Range | Distribution |
|---|---|---|---|
| Hidden Size | Integer | 64-256 (step 64) | Uniform |
| Num Layers | Integer | 1-3 | Uniform |
| Dropout | Float | 0.0-0.4 | Uniform |
| Learning Rate | Float | 1e-4 to 5e-3 | Log-Uniform |
| Batch Size | Categorical | [32, 64, 128] | Discrete |

See reports/HPO_SEARCH_SPACE.md for complete details.
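
The search space in the table above could be expressed with Optuna's trial API roughly as follows. The parameter keys and the `suggest_params` helper are hypothetical; hparam_search.py may name or structure things differently.

```python
from typing import Any, Dict


def suggest_params(trial: Any) -> Dict[str, Any]:
    """Sample one configuration; `trial` is expected to behave like optuna.trial.Trial."""
    return {
        # Integer in {64, 128, 192, 256}
        "hidden_size": trial.suggest_int("hidden_size", 64, 256, step=64),
        # Integer in {1, 2, 3}
        "num_layers": trial.suggest_int("num_layers", 1, 3),
        # Uniform float in [0.0, 0.4]
        "dropout": trial.suggest_float("dropout", 0.0, 0.4),
        # Log-uniform float in [1e-4, 5e-3]
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 5e-3, log=True),
        # One of the listed batch sizes
        "batch_size": trial.suggest_categorical("batch_size", [32, 64, 128]),
    }
```

In an Optuna objective function, the returned dictionary would be used to build and train a model and the validation loss returned; the study itself would be created with `optuna.create_study(direction="minimize")` and run with `study.optimize(objective, n_trials=...)`.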

📊 Key Results

Performance Highlights

  • Validation Loss: ~0.009 MSE (LSTM Baseline)
  • Uncertainty Calibration: Bands widen 1.48x during COVID-19 crash
  • Risk-Adjusted Returns (uncertainty-aware strategy):
    • 22.7% lower volatility
    • 22.4% smaller maximum drawdown
    • Sharpe ratio of 0.88 vs 0.78
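
The README does not state how these risk metrics were computed. A minimal sketch under common conventions (daily returns, 252 trading days per year, sample standard deviation, zero risk-free rate by default) might look like this; the function names are hypothetical and may not match src/evaluation/metrics.py.

```python
import math
from typing import List


def annualized_volatility(returns: List[float], periods_per_year: int = 252) -> float:
    """Sample standard deviation of per-period returns, scaled to one year."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(var) * math.sqrt(periods_per_year)


def sharpe_ratio(
    returns: List[float], risk_free_rate: float = 0.0, periods_per_year: int = 252
) -> float:
    """Annualized mean excess return divided by annualized volatility."""
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    return mean * periods_per_year / annualized_volatility(excess, periods_per_year)


def max_drawdown(returns: List[float]) -> float:
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


daily = [0.01, -0.01, 0.02, 0.0]  # made-up returns for illustration
print(annualized_volatility(daily), sharpe_ratio(daily), max_drawdown(daily))
```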

Visualizations

All figures available in reports/figures/:

  • Figure 6: AAPL Price History (2015-2024)
  • Figure 7: Feature Correlation Heatmap
  • Figure 8: Training/Validation Loss Curves
  • Figure 9: Uncertainty Bands (COVID-19 volatility demonstration)
  • Figure 10: Cumulative Returns Comparison

📚 Documentation

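Comprehensive Reports

  • reports/PROJECT_ANALYSIS_REPORT.md
  • reports/FIGURES_DOCUMENTATION.md

Quick References

  • reports/QUICK_REFERENCE.md
  • reports/HPO_SEARCH_SPACE.md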
🧪 Testing

Run unit tests:

# All tests
python -m pytest tests/

# Specific test file
python -m pytest tests/test_models.py

# With coverage
python -m pytest tests/ --cov=src

Or use the project launcher:

python run_project.py  # Runs tests first, then launches app

🛠️ Configuration

All experiments use YAML configuration files in configs/:

# Example: configs/lstm_baseline.yaml
data:
  tickers: ["AAPL"]
  data_dir: "data/processed"
  train_ratio: 0.7
  valid_ratio: 0.15

model:
  type: "lstm"
  hidden_size: 128
  num_layers: 2
  dropout: 0.1

training:
  epochs: 50
  batch_size: 64
  learning_rate: 0.001
  early_stopping_patience: 10

seed: 42
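
As an illustration of how `train_ratio` and `valid_ratio` might be applied, the sketch below derives train, validation, and test slices from the row count. It assumes a chronological split with the remainder used as the test set, which is a common choice for time series but is not confirmed by the README; `split_indices` is a hypothetical helper.

```python
from typing import Tuple


def split_indices(
    n_rows: int, train_ratio: float = 0.7, valid_ratio: float = 0.15
) -> Tuple[slice, slice, slice]:
    """Chronological train/validation/test slices; the remainder is the test set."""
    if not (train_ratio > 0 and valid_ratio >= 0 and train_ratio + valid_ratio < 1):
        raise ValueError("ratios must be non-negative and leave room for a test set")
    train_end = int(round(n_rows * train_ratio))
    valid_end = train_end + int(round(n_rows * valid_ratio))
    return slice(0, train_end), slice(train_end, valid_end), slice(valid_end, n_rows)


rows = list(range(1000))  # stand-in for 1000 time-ordered samples
train, valid, test = split_indices(len(rows))
print(len(rows[train]), len(rows[valid]), len(rows[test]))
```

Keeping the splits in time order avoids training on data that comes after the validation or test period.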

📦 Dependencies

Core libraries:

  • PyTorch: Deep learning framework
  • Pandas/NumPy: Data manipulation
  • yfinance: Financial data fetching
  • Streamlit: Web dashboard
  • Optuna: Hyperparameter optimization
  • Pyro: Bayesian deep learning
  • Matplotlib/Seaborn: Visualization
  • MLflow: Experiment tracking (optional)

See requirements.txt for complete list.

🎯 Key Innovations

  1. Uncertainty Quantification: Not just predictions, but confidence intervals
  2. Volatility-Aware Trading: Dynamic position sizing based on uncertainty
  3. Comprehensive Technical Analysis: 21 engineered features
  4. Multiple Architectures: Compare LSTM, Bayesian, Transformer approaches
  5. Professional Documentation: Publication-ready reports and figures
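
One simple way to size positions from predictive uncertainty is inverse-volatility scaling: trade in the direction of the forecast and shrink the position as the predicted standard deviation grows. The rule below is a hypothetical sketch with an assumed `target_risk` parameter, not the logic in src/evaluation/backtesting.py.

```python
def position_size(
    pred_return: float,
    pred_std: float,
    target_risk: float = 0.02,
    max_position: float = 1.0,
) -> float:
    """Signed fraction of capital to allocate, capped at max_position."""
    if pred_std <= 0:
        raise ValueError("pred_std must be positive")
    # Direction comes from the sign of the forecast; a zero forecast means no trade.
    direction = 1.0 if pred_return > 0 else -1.0 if pred_return < 0 else 0.0
    # Size is inversely proportional to predicted uncertainty.
    size = min(max_position, target_risk / pred_std)
    return direction * size


for std in (0.02, 0.04, 0.08):
    print(std, position_size(pred_return=0.01, pred_std=std))
```

Doubling the predicted standard deviation halves the position, so exposure falls automatically when the uncertainty bands widen.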

📝 Citation

If you use this project in your research or work, please cite:

Financial Time Series Forecasting with Uncertainty Quantification
ML Internship Project, 2025
GitHub: [Your Repository URL]

🀝 Contributing

This is an educational project. Suggestions and improvements are welcome!

📄 License

This project is for educational and research purposes.

🙏 Acknowledgments

  • Data Source: Yahoo Finance (yfinance library)
  • Frameworks: PyTorch, Pyro, Streamlit, Optuna
  • Inspiration: Modern deep learning research in finance and uncertainty quantification

📞 Contact

For questions or collaboration opportunities, please reach out via GitHub issues.


Last Updated: October 14, 2025
Version: 1.0
Status: Production-Ready ✅
