QuantLab Project Structure

Directory Layout

quantlab/
├── README.md                           # Main project documentation
├── .gitignore                          # Git ignore configuration
├── PROJECT_STRUCTURE.md                # This file
│
├── system/                             # System-level configuration
│   └── system_profile.yaml             # Qlib system settings
│
├── configs/                            # Workflow configurations ⭐
│   ├── lightgbm_external_data.yaml     # Full universe (14,310 stocks)
│   ├── lightgbm_fixed_dates.yaml       # 2024 only (date filtered)
│   └── lightgbm_liquid_universe.yaml   # Filtered universe (13,187 stocks)
│
├── data/                               # Local data storage
│   ├── parquet/                        # Raw parquet data files
│   └── metadata/                       # Metadata and cache
│
├── docs/                               # Documentation 📚
│   ├── BACKTEST_SUMMARY.md             # Comprehensive backtest analysis
│   ├── ALPHA158_SUMMARY.md             # Alpha158 features overview
│   ├── ALPHA158_CORRECTED.md           # Alpha158 corrections guide
│   ├── USE_QLIB_ALPHA158.md            # How to use Alpha158
│   └── QUANTMINI_README.md             # QuantMini data setup
│
├── notebooks/                          # Jupyter notebooks 📓
│   └── workflow_by_code.ipynb          # Qlib workflow examples
│
├── results/                            # Experiment outputs 📊
│   ├── mlruns/                         # MLflow experiment tracking
│   │   └── 489214785307856385/         # Experiment ID
│   │       ├── 2b374fe2956c4161a1bd2dcef7299bd2/  # Liquid universe run
│   │       ├── 44c320f998cf4c97a8be68ed15857f66/  # Fixed dates run
│   │       └── dc38cbc355104d67a1917a3f358ceb1f/  # Original run
│   └── visualizations/                 # Charts and plots
│       └── backtest_visualization.png  # Latest backtest chart
│
├── scripts/                            # Utility scripts 🔧
│   ├── analysis/                       # Analysis tools
│   │   └── visualize_results.py        # Backtest visualization script
│   ├── data/                           # Data processing
│   │   ├── convert_to_qlib.py          # Convert data to qlib format
│   │   ├── quantmini_setup.py          # Setup QuantMini data
│   │   └── refresh_today_data.py       # Update latest data
│   └── tests/                          # Test scripts
│       ├── test_qlib_alpha158.py       # Test Alpha158 features
│       ├── test_stocks_minute_fix.py   # Test data fixes
│       └── enable_alpha158.py          # Enable Alpha158 handler
│
├── qlib_repo/                          # Microsoft Qlib source (828MB)
│   └── (Full qlib repository clone)
│
└── .venv/                              # Virtual environment (uv)
    └── (Python packages)

Key Directories

📝 configs/

Workflow configuration files for different backtesting scenarios.

Usage:

cd qlib_repo/examples
uv run qrun ../../configs/lightgbm_liquid_universe.yaml

📊 results/

All experiment outputs including MLflow tracking and visualizations.

MLflow runs:

  • Each run has a unique ID
  • Contains: model, predictions, portfolio analysis, metrics

Visualizations:

  • Generated by scripts/analysis/visualize_results.py
  • Charts showing IC, returns, drawdown
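The run layout under results/mlruns/ can be inspected without MLflow installed. The sketch below assumes MLflow's default local file store, where each run folder sits under its experiment folder and contains a meta.yaml file. The helper name list_runs is hypothetical and not part of this project.

```python
from pathlib import Path


def list_runs(mlruns_dir):
    """Map each experiment ID to the sorted run IDs found beneath it."""
    runs = {}
    for exp_dir in sorted(Path(mlruns_dir).iterdir()):
        # Skip stray files and hidden folders such as .trash
        if not exp_dir.is_dir() or exp_dir.name.startswith("."):
            continue
        runs[exp_dir.name] = sorted(
            run_dir.name
            for run_dir in exp_dir.iterdir()
            # Assumption: a run folder is recognised by its meta.yaml file
            if run_dir.is_dir() and (run_dir / "meta.yaml").is_file()
        )
    return runs


if __name__ == "__main__":
    # Run from the project root; prints nothing if no experiments exist yet
    if Path("results/mlruns").is_dir():
        for experiment_id, run_ids in list_runs("results/mlruns").items():
            print(experiment_id, run_ids)
```

Reading the folders directly keeps the helper dependency-free; the MLflow client is the better choice once metrics or parameters are needed.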

🔧 scripts/

Utility scripts, grouped by function.

  • data/ - Data processing and setup
  • analysis/ - Result visualization and analysis
  • tests/ - Testing and validation scripts

📚 docs/

Comprehensive documentation of findings and guides.

Key documents:

  • BACKTEST_SUMMARY.md - Full backtest analysis (the most important document; start here)
  • ALPHA158_SUMMARY.md - Feature documentation
  • USE_QLIB_ALPHA158.md - Implementation guide

External Data Location

/Volumes/sandisk/quantmini-data/data/qlib/stocks_daily/
├── calendars/
│   └── day.txt                         # 442 trading days (2024-2025)
├── instruments/
│   ├── all.txt                         # 14,310 total instruments
│   └── liquid_stocks.txt               # 13,187 filtered instruments
└── features/                           # OHLCV price data
    └── [stock symbols]/

File Sizes

Component         Size     Notes
qlib_repo/        828MB    Full Microsoft Qlib clone
.venv/            ~500MB   Python virtual environment
results/mlruns/   ~150MB   All experiment runs
External data     ~2GB     On /Volumes/sandisk

Ignored Files (.gitignore)

  • qlib_repo/ - Large clone, can be re-downloaded
  • .venv/ - Virtual environment
  • data/parquet/ - Raw data files
  • __pycache__/, *.pyc - Python cache
  • *.log - Log files

Quick Commands

Run a backtest

cd qlib_repo/examples
uv run qrun ../../configs/lightgbm_liquid_universe.yaml

Visualize results

# Update experiment ID in scripts/analysis/visualize_results.py
cd /path/to/quantlab
uv run python scripts/analysis/visualize_results.py
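Copying an ID into the script by hand is easy to get wrong. As a rough sketch, the newest run under an experiment folder can be found from file timestamps. It assumes MLflow's local file store, where every run folder holds a meta.yaml; latest_run is a hypothetical helper, not something visualize_results.py is known to provide.

```python
from pathlib import Path


def latest_run(experiment_dir):
    """Return the ID of the most recently modified run, or None if none exist."""
    candidates = [
        meta for meta in Path(experiment_dir).glob("*/meta.yaml") if meta.is_file()
    ]
    if not candidates:
        return None
    # Assumption: meta.yaml is last written when the run finishes,
    # so its modification time orders the runs
    newest = max(candidates, key=lambda meta: meta.stat().st_mtime)
    return newest.parent.name
```

For example, latest_run("results/mlruns/489214785307856385") returns the name of one of the run folders listed in the directory layout.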

Test Alpha158

uv run python scripts/tests/test_qlib_alpha158.py

Refresh data

uv run python scripts/data/refresh_today_data.py

Configuration Files

configs/lightgbm_liquid_universe.yaml

qlib_init:
    provider_uri: "/Volumes/sandisk/quantmini-data/data/qlib/stocks_daily"
    region: cn  # check: the SPY benchmark implies US equities, for which Qlib's region code is "us"

market: &market liquid_stocks  # Uses filtered universe
benchmark: &benchmark SPY

task:
    model:
        class: LGBModel
        # ... LightGBM parameters
    dataset:
        class: DatasetH
        # ... dataset configuration

Notes

  • Run the Python scripts from the project root; their paths are relative to it
  • Exception: qrun is run from qlib_repo/examples, so config paths passed to it start with ../../
  • External data must be mounted at /Volumes/sandisk
  • MLflow tracks all experiments automatically
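Because the external drive may not be mounted, a quick preflight check before a backtest can save a confusing failure later. The sketch below only tests for the three subfolders shown under External Data Location; the name missing_parts and the choice to return a list are assumptions made for illustration.

```python
from pathlib import Path

# Subfolders the Qlib data directory is expected to contain
REQUIRED_SUBDIRS = ("calendars", "instruments", "features")


def missing_parts(provider_uri):
    """Return the parts of the data folder that are absent (empty list if all present)."""
    root = Path(provider_uri)
    if not root.is_dir():
        # Drive not mounted, or the path is wrong: report the root itself
        return [str(root)]
    return [name for name in REQUIRED_SUBDIRS if not (root / name).is_dir()]
```

An empty result from missing_parts("/Volumes/sandisk/quantmini-data/data/qlib/stocks_daily") means the folder layout is in place; it says nothing about whether the data inside is current.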

Recent Changes

✅ Reorganized from flat structure to organized directories
✅ Moved all docs to docs/
✅ Separated scripts by function
✅ Centralized configs in configs/
✅ Moved MLflow results to results/
✅ Added comprehensive README.md
✅ Created .gitignore for large files