Transformer-VAE-Stock-Predictor

A deep generative model that combines a Transformer-based Variational Autoencoder (VAE) with normalizing flows to generate and forecast financial time series, conditioned on market volatility indicators.


Overview

Financial markets exhibit non-linear patterns, heavy-tailed distributions, and volatility clustering. Classical models such as ARIMA and GARCH capture these properties only partially, which limits how well they model the full distribution of returns.

This project proposes a Transformer-VAE architecture with RealNVP normalizing flows, conditioned on the VIX volatility index, to generate synthetic financial data and learn latent financial dynamics.
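
To make the RealNVP component concrete, here is a minimal NumPy sketch of a single affine coupling layer conditioned on a context vector (for example, a VIX value). It is an illustration only and is not taken from this repository: the class name `AffineCoupling`, the tiny one-hidden-layer network, and the random, untrained weights are all assumptions.

```python
import numpy as np

class AffineCoupling:
    """One RealNVP-style affine coupling layer with a conditioning input.

    Hypothetical illustration: weights are random and untrained.
    """

    def __init__(self, dim, cond_dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.d = dim // 2                      # size of the pass-through half
        in_dim = self.d + cond_dim
        out_dim = dim - self.d
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w_s = rng.normal(0.0, 0.1, (hidden, out_dim))
        self.w_t = rng.normal(0.0, 0.1, (hidden, out_dim))

    def _scale_shift(self, z_a, cond):
        # Scale and shift depend only on the untouched half and the condition.
        h = np.tanh(np.concatenate([z_a, cond], axis=-1) @ self.w1 + self.b1)
        log_s = np.tanh(h @ self.w_s)          # bounded log-scale for stability
        t = h @ self.w_t
        return log_s, t

    def forward(self, z, cond):
        z_a, z_b = z[..., :self.d], z[..., self.d:]
        log_s, t = self._scale_shift(z_a, cond)
        y_b = z_b * np.exp(log_s) + t
        log_det = log_s.sum(axis=-1)           # log|det Jacobian| of the layer
        return np.concatenate([z_a, y_b], axis=-1), log_det

    def inverse(self, y, cond):
        y_a, y_b = y[..., :self.d], y[..., self.d:]
        log_s, t = self._scale_shift(y_a, cond)
        z_b = (y_b - t) * np.exp(-log_s)
        return np.concatenate([y_a, z_b], axis=-1)
```

Because the first half of the vector passes through unchanged, the Jacobian is triangular and its log-determinant is just the sum of the log-scales. A full flow would stack several such layers and alternate which half is transformed.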


Installation

Clone this repository and set up a Python environment:

git clone https://github.com/muralikarteek7/Transformer-VAE-Stock-Predictor.git
cd Transformer-VAE-Stock-Predictor
conda create -n transvae python=3.8
conda activate transvae

Dependencies

Install the following dependencies via pip or conda:

  • torch
  • transformers
  • numpy
  • pandas
  • matplotlib
  • yfinance
  • scikit-learn
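
As a quick sanity check after installing, the following sketch reports which of the listed packages cannot be found in the active environment. The helper name `missing_packages` is hypothetical; the only non-obvious detail is that `scikit-learn` is imported as `sklearn`.

```python
from importlib.util import find_spec

# pip/conda package name -> module name used in `import` statements
REQUIRED = {
    "torch": "torch",
    "transformers": "transformers",
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "yfinance": "yfinance",
    "scikit-learn": "sklearn",
}

def missing_packages(required=REQUIRED):
    """Return the package names whose module cannot be located."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]
```

Calling `missing_packages()` inside the `transvae` environment should return an empty list once everything is installed.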

Dataset

Sources:

  • Historical stock data from Yahoo Finance via yfinance
  • Volatility Index (VIX) data

We used:

  • 3 stocks: e.g., AAPL, MSFT, GOOGL
  • Features: [Close, Volume]
  • Time range: Multiple years (daily frequency)

Format:

Each sample is a sequence of shape (1000, 6):

  • 1000 time steps
  • 6 dimensions (3 stocks × 2 features each)
  • VIX is supplied as a separate conditional input (it is not one of the 6 dimensions)
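
The format above can be sketched as follows. This is a hedged example rather than the repository's actual data pipeline: the function names, the column order (three Close columns followed by three Volume columns), the `stride` parameter, and the use of the Yahoo Finance symbol `^VIX` are assumptions.

```python
import numpy as np

STOCKS = ["AAPL", "MSFT", "GOOGL"]  # example tickers from the list above
VIX = "^VIX"                        # Yahoo Finance symbol for the VIX index

def download_features(start, end):
    """Fetch daily Close/Volume for the stocks plus VIX closes (needs network)."""
    import yfinance as yf  # imported lazily so the windowing code works offline
    raw = yf.download(STOCKS + [VIX], start=start, end=end)
    close, volume = raw["Close"], raw["Volume"]
    keep = close.notna().all(axis=1)  # dates where every series has a close
    features = np.hstack([
        close.loc[keep, STOCKS].to_numpy(),   # 3 Close columns
        volume.loc[keep, STOCKS].to_numpy(),  # 3 Volume columns
    ])
    return features, close.loc[keep, VIX].to_numpy()

def make_windows(features, vix, seq_len=1000, stride=1):
    """Slice aligned series into samples of shape (seq_len, n_features)."""
    features = np.asarray(features, dtype=np.float32)
    vix = np.asarray(vix, dtype=np.float32)
    if features.shape[0] != vix.shape[0]:
        raise ValueError("features and vix must cover the same dates")
    starts = range(0, features.shape[0] - seq_len + 1, stride)
    if len(starts) == 0:
        raise ValueError("series is shorter than seq_len")
    x = np.stack([features[s:s + seq_len] for s in starts])  # (N, seq_len, 6)
    c = np.stack([vix[s:s + seq_len] for s in starts])       # (N, seq_len)
    return x, c
```

With `seq_len=1000`, each element of `x` has the `(1000, 6)` shape described above, and the matching row of `c` holds the VIX values for the same dates.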

Model Architecture


  • Loss: ELBO (Reconstruction + KL) + Flow log-det Jacobian
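
One common way to write such a loss, shown here as a sketch and not necessarily the exact objective used in this repository, is the negative ELBO of a VAE whose approximate posterior is transformed by a flow. The symbols are assumptions: x is the input sequence, c the VIX condition, q_0 the encoder's Gaussian posterior, f_1 ... f_K the flow layers, and p(z_K) the latent prior.

```latex
\mathcal{L}(x, c) =
  \mathbb{E}_{z_0 \sim q_0(z_0 \mid x, c)} \Big[
      -\log p(x \mid z_K, c)
      + \log q_0(z_0 \mid x, c)
      - \log p(z_K)
      - \sum_{k=1}^{K} \log \left| \det \frac{\partial f_k}{\partial z_{k-1}} \right|
  \Big],
\qquad z_K = f_K \circ \cdots \circ f_1(z_0)
```

The first term is the reconstruction loss, the next two form a sample-based estimate of the KL term, and the final sum is the flow's log-det Jacobian contribution. Whether the flow acts on the posterior (as assumed here) or on the prior, and where exactly the VIX condition enters, is not specified above.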

Evaluation Metrics


  • Cosine Similarity: Between real and generated log returns
  • Volatility clustering check
  • Time-series overlay of real vs. synthetic returns
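
The metrics above, together with the Wasserstein distance reported in the comparison table, could be computed along these lines. This is a minimal NumPy sketch under assumptions: the function names are hypothetical, the volatility clustering check is taken to be the autocorrelation of squared returns, and the Wasserstein distance is the 1-D empirical version for two samples of equal size.

```python
import numpy as np

def log_returns(prices):
    """Log returns along the time axis (axis 0)."""
    return np.diff(np.log(np.asarray(prices, dtype=float)), axis=0)

def cosine_similarity(a, b):
    """Cosine similarity between two return series, flattened to vectors."""
    a = np.ravel(np.asarray(a, dtype=float))
    b = np.ravel(np.asarray(b, dtype=float))
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two equal-size samples."""
    a = np.sort(np.ravel(np.asarray(a, dtype=float)))
    b = np.sort(np.ravel(np.asarray(b, dtype=float)))
    if a.size != b.size:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(a - b)))

def squared_return_autocorr(returns, lag=1):
    """Autocorrelation of squared returns; positive values at small lags
    suggest volatility clustering. Requires lag >= 1."""
    sq = np.ravel(np.asarray(returns, dtype=float)) ** 2
    sq = sq - sq.mean()
    return float(np.dot(sq[:-lag], sq[lag:]) / np.dot(sq, sq))
```

A generated series that reproduces volatility clustering should show squared-return autocorrelations similar to those of the real series across several lags.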

📊 Baseline Comparison

Model                  Wasserstein Distance   Cosine Similarity
Transformer-VAE-Flow   0.0629                 0.9072
TimeGrad               0.3246                 0.9905
QuantGAN               0.0159                 0.9148
DiM                    0.2018
GARCH                  0.1499                 0.9919

Use Cases

  • Synthetic data generation for backtesting
  • Risk modeling and stress testing
  • Portfolio simulation
  • Improving data diversity for trading models

References

  1. QuantGAN: Deep Generation of Financial Time Series
  2. Bollerslev, T. “Generalized Autoregressive Conditional Heteroskedasticity.” Journal of Econometrics, 1986.
  3. TimeGrad: Diffusion-based Forecasting
