End-to-End MLB Moneyline Betting System
- Overview
- Project Structure
- Data Sources
- ETL Pipeline
- Database Schema
- Data Validation & Quality Checks
- Modeling Pipeline
- How to Run
- Environment Setup
- Future Work
- Acknowledgments
- License
- What does the project do? This project aims to build a production-level (or close to production-level) system to aid in MLB moneyline betting. The goal is a fully self-contained pipeline covering data storage, ETL/ELT, modeling, backtesting, monitoring, and deployment. The initial version will be completed within a span of roughly two months, so while I don't expect the modeling to be optimal, I hope to get something "good enough" while standing up the deployment side. Additional model refinement can come later.
While this is a personal project, which I plan to test out with "skin in the game", I suspect that the insights from this work will be useful for teaching (I teach "Data Science for Sports" at the GW School of Business).
Another aim of this project is to gain additional experience in Docker, ETL/ELT, and deployment to supplement my current role as a data scientist.
end-to-end-mlb-betting/
├── data/
│   ├── raw_games.csv
│   ├── raw_team_stats.csv
│   └── raw_player_stats.csv
├── dags/
│   └── airflow_dag.py
├── db/
│   ├── schema.sql
│   └── data_validation_checks.sql
├── docker/
├── etl/
│   ├── utils.py
│   ├── extract_games.py
│   ├── extract_team_stats.py
│   ├── extract_player_stats.py
│   ├── extract_odds.py
│   ├── load_to_db.py
│   ├── transform_games_clean.py
│   └── update_all_data.py
├── models/
│   ├── evaluate.py
│   └── train_model.py
├── serving/
│   ├── api.py
│   └── Dockerfile
├── tests/
│   └── test_features.py
├── tracking/
│   └── mlflow_config
└── validate/
    └── run_data_checks.py
- MLB Stats API
- See documentation here: ...
- Break into steps: extract, transform, load
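As a sketch of the transform step, the snippet below flattens one raw game record (shaped like a response from the MLB Stats API schedule endpoint) into a tidy row. The function name `transform_game` and the exact field layout are illustrative assumptions, not the project's actual code; check the real API response before relying on them.

```python
# Hypothetical transform step: flatten a nested schedule-API game record
# into one flat row suitable for loading into the games table.
def transform_game(raw: dict) -> dict:
    """Flatten a nested game record into a flat dict (one row)."""
    home = raw["teams"]["home"]
    away = raw["teams"]["away"]
    return {
        "game_id": raw["gamePk"],
        "game_date": raw["officialDate"],
        "home_team": home["team"]["name"],
        "away_team": away["team"]["name"],
        "home_score": home.get("score"),
        "away_score": away.get("score"),
        "home_win": home.get("isWinner"),
    }

# Example input shaped like a schedule-endpoint game record (illustrative).
raw_game = {
    "gamePk": 745804,
    "officialDate": "2024-06-01",
    "teams": {
        "home": {"team": {"name": "Washington Nationals"}, "score": 5, "isWinner": True},
        "away": {"team": {"name": "Cleveland Guardians"}, "score": 4, "isWinner": False},
    },
}
row = transform_game(raw_game)
```

Keeping the transform a pure function of one record makes it easy to unit test in `tests/` without touching the network or the database.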
- Note that to get updated data, run
python -m src.etl.update_all_data
- Briefly describe each stage, CLI args, file outputs.
Diagrams and/or descriptions of the core tables: games, team stats, player stats, etc.
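The authoritative schema lives in `db/schema.sql`; the sketch below only illustrates the general shape a `games` table might take, using the standard-library `sqlite3` module so it runs anywhere. Every column name here is an assumption for illustration, not the project's actual DDL.

```python
import sqlite3

# Illustrative DDL for a games table; column names are assumptions,
# not the contents of db/schema.sql.
GAMES_DDL = """
CREATE TABLE IF NOT EXISTS games (
    game_id     INTEGER PRIMARY KEY,
    game_date   TEXT NOT NULL,
    home_team   TEXT NOT NULL,
    away_team   TEXT NOT NULL,
    home_score  INTEGER,
    away_score  INTEGER,
    home_win    INTEGER  -- 1 if the home team won, 0 otherwise
);
"""

# Create the table in an in-memory database and round-trip one row.
conn = sqlite3.connect(":memory:")
conn.execute(GAMES_DDL)
conn.execute(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?, ?, ?)",
    (745804, "2024-06-01", "WSH", "CLE", 5, 4, 1),
)
row = conn.execute("SELECT home_team, home_win FROM games").fetchone()
```

A `game_id` primary key also gives the loader a natural upsert/dedup key when re-running extracts.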
Describe the validation logic, examples of checks, and how to run them
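As one hedged example of what `validate/run_data_checks.py` might do, the sketch below runs a few row-level quality checks in plain Python. The specific rules (unique `game_id`, non-negative scores, distinct teams) are assumptions chosen for illustration.

```python
# Hypothetical data-quality checks over extracted game rows.
def check_games(rows: list[dict]) -> list[str]:
    """Return a list of human-readable data-quality failures (empty = clean)."""
    failures = []
    ids = [r["game_id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("duplicate game_id values")
    for r in rows:
        if r["home_score"] is not None and r["home_score"] < 0:
            failures.append(f"negative score in game {r['game_id']}")
        if r["home_team"] == r["away_team"]:
            failures.append(f"team plays itself in game {r['game_id']}")
    return failures

good = [{"game_id": 1, "home_team": "WSH", "away_team": "CLE", "home_score": 5}]
bad = good + [{"game_id": 1, "home_team": "NYY", "away_team": "NYY", "home_score": -2}]
```

Returning failure messages rather than raising immediately lets a scheduled run report every problem in one pass.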
Overview of modeling workflow: features, target variable, cross-validation approach
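Because bets are placed on future games, cross-validation should be chronological: each test window must come strictly after its training window, so no future information leaks into the features. The helper below is a minimal sketch of such a split; the fold counts and sizing are illustrative assumptions, not the project's actual settings.

```python
# Sketch of time-ordered cross-validation: assuming games are sorted by
# date, each test fold comes strictly after its training fold.
def chronological_folds(n_games: int, n_folds: int = 3):
    """Yield (train_idx, test_idx) pairs with expanding training windows."""
    fold_size = n_games // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train = list(range(0, k * fold_size))
        test = list(range(k * fold_size, (k + 1) * fold_size))
        yield train, test

folds = list(chronological_folds(100, n_folds=3))
```

The same idea is available off the shelf as scikit-learn's `TimeSeriesSplit` if that dependency is already in the stack.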
Commands to run:
- historical extract
- load to DB
- validation
- modeling
Include CLI examples and notes.
Required packages, Python version, virtualenv/conda, Docker (if used)
Ideas for extending the pipeline or improving the model (e.g., lineup changes, player absences)
Credits to data providers, libraries, or collaborators
If public, state license type.