2fst4u/f1predictor

F1 Prediction

A Formula 1 race prediction tool that uses historical data and machine learning to forecast qualifying and race results.

Note: This project was built with significant assistance from AI (GitHub Copilot / Claude). The codebase, documentation, and overall architecture were developed collaboratively with AI tools.

What It Does

This tool predicts finishing positions for F1 sessions:

  • Qualifying – Grid positions
  • Race – Final standings
  • Sprint Qualifying – Sprint grid
  • Sprint – Sprint race results

It pulls data from public APIs, builds features from historical performance, and trains its models from scratch on every run. No saved weights are stored or loaded, so each prediction is calibrated against the latest available results.

Quick Start

# Set up environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Predict the next race
python main.py --round next

# Predict a specific past event
python main.py --season 2024 --round 5 --sessions qualifying race

# Generate HTML report
python main.py --round next --html --open-browser

Data Sources

The app fetches data from free, public sources:

  • Jolpica F1 – Schedules, results, standings (Ergast-compatible)
  • Open-Meteo – Weather forecasts and historical weather
  • OpenF1 – Session timing data (historical only)
  • FastF1 – Detailed timing and telemetry fallback

How It Works

  1. Roster inference – For future races, the entry list comes from the most recent completed event
  2. Feature engineering – Driver form, team performance, weather conditions, teammate comparisons
  3. Model training – Gradient boosting (LightGBM/XGBoost/sklearn) trained on historical data
  4. DNF estimation – Separate classifier for retirement probability
  5. Monte Carlo simulation – 5000 draws to get win probability, podium chances, and expected position
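
The last two steps can be sketched as follows. This is a minimal illustration and not the project's simulate.py: the function name simulate_race, the Gaussian noise model, the rule that retirements are classified behind all finishers, and the sample inputs are all assumptions. Each draw perturbs a per-driver pace score, retires drivers at random according to their DNF probability, and ranks whoever is left.

```python
import random


def simulate_race(scores, p_dnf, draws=5000, noise_sd=1.0, seed=0):
    """Estimate p_win, p_top3 and mean_pos from repeated noisy rankings.

    scores: driver -> predicted pace score, lower is better (hypothetical input).
    p_dnf:  driver -> retirement probability; missing drivers default to 0.
    """
    rng = random.Random(seed)
    wins = {d: 0 for d in scores}
    podiums = {d: 0 for d in scores}
    pos_sum = {d: 0 for d in scores}

    for _ in range(draws):
        finishers, retired = [], []
        for driver, score in scores.items():
            if rng.random() < p_dnf.get(driver, 0.0):
                retired.append(driver)
            else:
                # Assumed noise model: independent Gaussian around the score.
                finishers.append((score + rng.gauss(0.0, noise_sd), driver))
        finishers.sort()
        # Finishers in ranked order, retirements placed at the back.
        order = [driver for _, driver in finishers] + retired
        for pos, driver in enumerate(order, start=1):
            pos_sum[driver] += pos
            if pos <= len(finishers):  # only classified finishers can win or podium
                if pos == 1:
                    wins[driver] += 1
                if pos <= 3:
                    podiums[driver] += 1

    return {
        d: {
            "p_win": wins[d] / draws,
            "p_top3": podiums[d] / draws,
            "mean_pos": pos_sum[d] / draws,
        }
        for d in scores
    }


if __name__ == "__main__":
    # Made-up scores and DNF rates for four placeholder drivers.
    scores = {"A": 1.0, "B": 1.5, "C": 3.0, "D": 4.0}
    p_dnf = {"A": 0.05, "B": 0.05, "C": 0.10, "D": 0.10}
    for driver, stats in simulate_race(scores, p_dnf).items():
        print(driver, stats)
```

More draws reduce the sampling noise in the estimated probabilities at the cost of run time, which is the trade-off behind the draws setting.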

Configuration

All settings live in config.yaml. The main things you might want to tweak:

modelling:
  recency_half_life_days:
    base: 120      # How quickly old results fade in importance
    weather: 180   # Weather skill memory
    team: 240      # Team performance memory
  monte_carlo:
    draws: 5000    # Simulation iterations (more = slower but smoother)

data_sources:
  open_meteo:
    temperature_unit: "celsius"  # or fahrenheit
    windspeed_unit: "kmh"        # kmh, ms, mph, kn
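
The recency_half_life_days values read most naturally as exponential-decay half-lives. Assuming that interpretation (the project's actual weighting formula is not shown here), a result that is one half-life old counts half as much as one from today. The helper name recency_weight is illustrative only.

```python
def recency_weight(age_days: float, half_life_days: float) -> float:
    """Exponential decay: the weight halves every `half_life_days`."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    # Results dated in the future are treated as age 0.
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


if __name__ == "__main__":
    for age in (0, 120, 240, 365):
        print(age, round(recency_weight(age, half_life_days=120), 3))
```

Under this assumption, with base: 120 a result from 240 days ago carries a quarter of the weight of one from today, while the longer team half-life of 240 would still give it half.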

Output

CSV (output/predictions.csv):

season, round, event, driver_id, driver, team,
predicted_pos, mean_pos, p_top3, p_win, p_dnf,
actual_pos, delta, generated_at, model_version
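
Because the output is a flat CSV, it can be loaded with the standard library. The sketch below assumes the column names above form the header row (without spaces after the commas) and that a blank actual_pos means no result is available yet. The sample rows are invented placeholders, and load_predictions is a hypothetical helper.

```python
import csv
import io

# Invented sample standing in for output/predictions.csv.
SAMPLE = """season,round,event,driver_id,driver,team,predicted_pos,mean_pos,p_top3,p_win,p_dnf,actual_pos,delta,generated_at,model_version
2024,5,Example GP,drv_a,Driver A,Team X,1,1.8,0.81,0.46,0.04,,,2024-04-20T10:00:00Z,v1
2024,5,Example GP,drv_b,Driver B,Team Y,2,2.9,0.62,0.21,0.05,,,2024-04-20T10:00:00Z,v1
"""

NUMERIC = ("predicted_pos", "mean_pos", "p_top3", "p_win", "p_dnf",
           "actual_pos", "delta")


def load_predictions(handle):
    """Parse rows and convert numeric columns; blank cells become None."""
    rows = []
    for row in csv.DictReader(handle):
        for key in NUMERIC:
            row[key] = float(row[key]) if row[key] else None
        rows.append(row)
    return rows


if __name__ == "__main__":
    rows = load_predictions(io.StringIO(SAMPLE))
    favourite = max(rows, key=lambda r: r["p_win"])
    print(favourite["driver"], favourite["p_win"])
```

To read the real file, pass open("output/predictions.csv", newline="") instead of the in-memory sample.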

HTML reports (output/reports/):

  • Per-event predictions with probabilities
  • Movement indicators when actuals are available
  • Backtest summaries with accuracy metrics

Usage Modes

Standard Prediction

python main.py --season 2024 --round 10

Live Mode

Re-runs predictions periodically and updates when results come in:

python main.py --round next --live --refresh 30 --html

Backtesting

Evaluate model accuracy across historical seasons:

python main.py --backtest
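
This README does not say which accuracy metrics the backtest reports. As one plausible example, mean absolute position error compares predicted and actual finishing positions. The function below is a hypothetical helper for illustration, not part of backtest.py.

```python
def mean_abs_position_error(predicted, actual):
    """Average |predicted - actual| over drivers that have a known result.

    predicted: driver -> predicted position.
    actual:    driver -> actual position, or None when unavailable.
    """
    errors = [
        abs(predicted[driver] - actual[driver])
        for driver in predicted
        if actual.get(driver) is not None
    ]
    if not errors:
        raise ValueError("no drivers with both a prediction and a result")
    return sum(errors) / len(errors)


if __name__ == "__main__":
    predicted = {"A": 1, "B": 2, "C": 3}
    actual = {"A": 2, "B": 2, "C": None}  # C has no recorded result
    print(mean_abs_position_error(predicted, actual))
```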

Known Limitations

  • No real-time data – OpenF1 is used for historical data only, not live timing
  • Weather is approximate – Forecasts are aggregated around session windows
  • DNF model is basic – Uses historical base rates, not detailed reliability analysis
  • First race of season – Limited data for brand new driver/team combinations

Project Structure

f1pred/
├── predict.py      # Main prediction pipeline
├── features.py     # Feature engineering
├── models.py       # ML model training
├── simulate.py     # Monte Carlo simulation
├── roster.py       # Entry list inference
├── backtest.py     # Historical evaluation
├── report.py       # HTML generation
└── data/           # API clients
    ├── jolpica.py
    ├── open_meteo.py
    ├── openf1.py
    └── fastf1_backend.py

Requirements

  • Python 3.11+
  • See requirements.txt for dependencies

Troubleshooting

Predictions seem random or uniform? Clear the cache and re-run:

rm -rf .cache/
python main.py --round next

Missing actuals for sprint qualifying? Enable OpenF1 in config.yaml, install FastF1 (pip install fastf1), or both.

Rate limiting errors? The built-in cache and retry logic should handle most cases. Try increasing live_refresh_seconds or clearing the .cache/ directory.
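
For context, retry logic of this kind usually means retrying with exponential backoff. The sketch below is generic and not taken from the project's API clients; with_retries and its parameters are hypothetical names.

```python
import time


def with_retries(func, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            # A real client would catch only rate-limit or network errors.
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Fails twice, then succeeds, to exercise the retry path.
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("simulated rate limit")
        return "ok"

    print(with_retries(flaky, base_delay=0.01))
```

Doubling the delay after each failure gives a rate-limited API progressively more time to recover before the next request.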

HTML report not opening? Pass --html --open-browser explicitly.

Import errors for LightGBM on macOS?

pip uninstall lightgbm
pip install lightgbm --no-binary lightgbm

The system will fall back to XGBoost or scikit-learn if LightGBM is unavailable.
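
A fallback like this is commonly implemented by trying each import in order of preference. The sketch below shows the general pattern under that assumption and is not the code in models.py; pick_backend is a hypothetical name.

```python
import importlib


def pick_backend(preferred=("lightgbm", "xgboost", "sklearn")):
    """Return the first package name that imports cleanly, or None."""
    for name in preferred:
        try:
            importlib.import_module(name)
        except Exception:
            # ImportError if not installed; a broken native build can
            # raise other errors (for example OSError) at import time.
            continue
        return name
    return None


if __name__ == "__main__":
    print("Using backend:", pick_backend())
```

Actually importing the package, rather than only checking that it is installed, matters here because a LightGBM install with a missing native library fails at import time.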

Running Tests

If you want to verify the code or contribute:

pip install pytest
pytest tests/ -v

License

MIT – see LICENSE
