Skip to content

Latest commit

 

History

History
304 lines (259 loc) · 9.42 KB

File metadata and controls

304 lines (259 loc) · 9.42 KB

Project Organization Summary

This document provides an overview of the professional organization applied to the Financial Time Series Forecasting project.

Organization Principles

This project follows industry best practices for professional Python projects:

  1. Clear Structure: Logical organization with separation of concerns
  2. Comprehensive Documentation: README, CONTRIBUTING, CHANGELOG, LICENSE
  3. Professional Tooling: Setup files, Makefile, EditorConfig, pyproject.toml
  4. Quality Assurance: Tests, linting configuration, code formatting standards
  5. Easy Deployment: Docker, Streamlit scripts, package distribution files

Root Directory Files

Documentation Files

File Purpose
README.md Main project documentation with badges, overview, and quick start
QUICKSTART.md Quick reference guide for common tasks
PROJECT_STRUCTURE.md Detailed directory structure and navigation guide
CONTRIBUTING.md Contribution guidelines and development workflow
CHANGELOG.md Version history and release notes
CODE_OF_CONDUCT.md Community standards and behavior guidelines
SECURITY.md Security policy and vulnerability reporting
LICENSE MIT License with third-party acknowledgments

Configuration Files

File Purpose
pyproject.toml Modern Python project configuration (PEP 518)
setup.py Package distribution setup
MANIFEST.in Package inclusion rules
requirements.txt Python dependencies
.gitignore Git ignore rules
.gitattributes Git attributes for line endings and linguist
.editorconfig Editor configuration for consistent code style
Makefile Common task automation (Unix/Linux/Mac)
Dockerfile Docker container configuration

Execution Scripts

File Purpose
run_project.py Main project launcher (tests + Streamlit app)
run_streamlit.bat Windows batch launcher
run_streamlit.ps1 PowerShell launcher
train.py Single model training
train_all_tickers.py Multi-ticker training
train_bnn.py Bayesian model training
evaluate.py Model evaluation
backtest.py Strategy backtesting
hparam_search.py Hyperparameter optimization

Directory Structure

src/ - Source Code

src/
├── data/          # Data processing and feature engineering
├── models/        # Model implementations (LSTM, BNN, MC Dropout, Transformer)
├── training/      # Training pipeline
├── evaluation/    # Metrics and backtesting
└── utils/         # Configuration, logging, visualization

app/ - Web Application

app/
└── streamlit_app.py  # Interactive dashboard

configs/ - Configuration Files

configs/
├── baseline_lstm.yaml
├── mc_dropout.yaml
├── bnn_pyro.yaml
├── transformer.yaml
├── model_config.yaml
└── train_config.yaml

reports/ - Documentation & Reports

reports/
├── figures/                      # Generated visualizations (PNG + PDF)
│   ├── figure_6_*.png/pdf       # AAPL price
│   ├── figure_7_*.png/pdf       # Correlation heatmap
│   ├── figure_8_*.png/pdf       # Training/validation loss
│   ├── figure_9_*.png/pdf       # Uncertainty bands
│   ├── figure_10_*.png/pdf      # Cumulative returns
│   ├── hpo_table_*.png/pdf      # HPO search space
│   └── technical_indicators_*.png/pdf
│
├── tables/                       # Reference tables
│   ├── HPO_SEARCH_SPACE_TABLE.md
│   ├── HPO_TABLE_SINGLE.md
│   ├── TECHNICAL_INDICATORS_TABLE.md
│   └── README.md
│
├── PROJECT_ANALYSIS_REPORT.md    # Comprehensive project analysis
├── QUICK_REFERENCE.md            # Quick reference guide
├── FIGURES_DOCUMENTATION.md      # Detailed figure descriptions
├── HPO_SEARCH_SPACE.md           # HPO documentation
├── HPO_QUICK_REFERENCE.md        # Quick HPO guide
└── README.md                     # Reports index

utils/ - Utility Scripts

utils/
├── generate_figures.py           # Generate all documentation figures
├── generate_hpo_table.py         # Generate HPO table image
└── generate_indicators_table.py  # Generate indicators table image

scripts/ - Data Scripts

scripts/
├── fetch_data.py     # Download financial data
└── make_dataset.py   # Prepare datasets

tests/ - Unit Tests

tests/
├── test_metrics.py      # Test evaluation metrics
├── test_models.py       # Test model implementations
└── test_preprocess.py   # Test data preprocessing

data/ - Data Storage

data/
└── raw/    # Raw downloaded data (gitignored)

results/ - Training Results

results/    # Experiment results and checkpoints (gitignored)

experiments/ - Experiment Tracking

experiments/    # MLflow tracking and logs

aws/ - Cloud Deployment

aws/
└── README.md    # AWS deployment guide

Professional Features

1. Package Distribution

  • setup.py: Traditional setuptools configuration
  • pyproject.toml: Modern PEP 518 configuration
  • MANIFEST.in: File inclusion rules
  • Console scripts: financial-ts-train, financial-ts-app, etc.

2. Code Quality

  • EditorConfig: Consistent code style across editors
  • Black: Code formatting (100 char line length)
  • Flake8: Linting
  • Mypy: Type checking
  • isort: Import sorting

3. Testing

  • pytest: Unit testing framework
  • pytest-cov: Coverage reports
  • Test structure: Mirrors source code structure

4. Documentation

  • README: Comprehensive with badges and table of contents
  • QUICKSTART: Fast onboarding guide
  • PROJECT_STRUCTURE: Detailed navigation
  • CONTRIBUTING: Development guidelines
  • CHANGELOG: Version history
  • CODE_OF_CONDUCT: Community standards
  • SECURITY: Vulnerability reporting

5. Automation

  • Makefile: Common tasks (install, test, lint, format, run)
  • Docker: Containerized deployment
  • Scripts: Batch and PowerShell launchers

6. Version Control

  • .gitignore: Excludes build artifacts, data, results
  • .gitattributes: Line ending normalization, binary files

7. Deployment

  • Docker: Multi-stage build for optimization
  • Streamlit: One-command app launch
  • AWS: Deployment documentation

File Organization Best Practices

Root Level

  • Keep root clean with only essential files
  • Move utility scripts to utils/ directory
  • Documentation files use ALL_CAPS naming
  • Execution scripts use snake_case

Source Code

  • Modular design with clear separation
  • Each module has a specific responsibility
  • Shared utilities in src/utils/
  • Models in separate files

Documentation

  • All documentation in reports/
  • Figures separated by type
  • Tables have both MD and image formats
  • README files in each major directory

Configuration

  • All YAML configs in configs/
  • Environment-specific configs separate
  • Clear naming conventions

Navigation Quick Reference

Task Location
Getting started README.md, QUICKSTART.md
Project structure PROJECT_STRUCTURE.md
Contribute CONTRIBUTING.md
Version history CHANGELOG.md
Run the app run_project.py or run_streamlit.bat
Train models train.py, train_all_tickers.py
Configuration configs/
Documentation reports/README.md
Source code src/
Tests tests/
Utilities utils/

Professional Standards Met

  • Comprehensive README with badges
  • Quick start guide
  • Detailed project structure documentation
  • Contributing guidelines
  • Change log
  • Code of conduct
  • Security policy
  • License file
  • Package distribution setup
  • Modern Python configuration (pyproject.toml)
  • Editor configuration
  • Git attributes
  • Makefile for automation
  • Docker support
  • Unit tests
  • Clean directory structure
  • Professional documentation
  • Consistent naming conventions
  • Separation of concerns

Comparison: Before vs After

Before

  • Mixed utility scripts in root directory
  • Temporary files (fix_emojis.py, generate_figure_8.py)
  • Missing professional documentation
  • No package distribution setup
  • No code quality configuration
  • No automation tools

After

  • Clean, organized root directory
  • All utilities in utils/ directory
  • Comprehensive professional documentation
  • Full package distribution setup (setup.py, pyproject.toml)
  • Code quality tools configured (.editorconfig, Black, Flake8)
  • Makefile for common tasks
  • Professional README with badges
  • Quick start guide
  • Project structure documentation
  • Contributing guidelines
  • Security policy
  • Code of conduct

Maintenance

To keep the project organized:

  1. Add new scripts to appropriate directories (utils/, scripts/)
  2. Update CHANGELOG.md when making changes
  3. Update documentation when adding features
  4. Keep root clean - only essential files
  5. Follow naming conventions - documented in CONTRIBUTING.md
  6. Run tests before committing - pytest tests/
  7. Format code - black src/ --line-length=100
  8. Update README for major changes

Author: Mohansree Vijayakumar
Email: mohansreesk14@gmail.com
Version: 1.0.0
Last Updated: 2025-10-14
Status: Production Ready ✅