Race Decoder

A Formula 1 race analysis dashboard that turns raw timing data into interactive visualizations. Built with a data engineering backend where data flows from the FastF1 API through a multi-layer PostgreSQL data warehouse and surfaces in a Streamlit dashboard with Plotly charts.

Covers every race from 2018 to 2025: 8 seasons, 173 grands prix, 189,000+ lap records.

Race Story	Strategy	Driver Deep Dive

Why I Built This

As an F1 fan, I always found myself jumping between multiple sites to piece together what actually happened in a race. So I built my own tool for it. Along the way, it turned into an excuse to work with real data engineering patterns: a proper layered warehouse, idempotent pipelines, fault-tolerant backfills across 170+ races, and a dashboard that tells the full story of each grand prix.

Tech Stack

Layer	Technology
Data Source	FastF1 (unofficial F1 timing API)
Database	PostgreSQL 18, 4-schema warehouse (staging, curated, marts)
Pipeline	Python 3.12, psycopg, idempotent upserts with SHA1-based stable IDs
Dashboard	Streamlit 1.44, Plotly 6.0
Infrastructure	Docker Compose (3 services), Makefile automation

Dashboard

5 analysis tabs covering every angle of a race:

Race Story - See exactly when and where overtakes happened, with position changes and gap dynamics lap by lap
Race Pace - Compare driver speed across the race with smoothed pace lines, consistency box plots, and sector-by-sector heatmaps
Strategy - Visualize each team's tyre strategy as color-coded stint bars, and see how quickly each compound degrades
Driver Deep Dive - Pick any two drivers for a head-to-head breakdown: per-lap time deltas, sector comparisons, and gap-to-leader overlays
Full Results - Grid-vs-finish dumbbell chart showing who gained or lost the most, plus full race classification with CSV export

At a glance: Podium cards with team colors, 8 stat KPIs (fastest lap, safety car laps, lead changes, biggest mover, track conditions), and SC/VSC shading on every chart. Dark theme throughout.

Architecture

FastF1 API
    |
 Extract --> Staging (raw typed) --> Curated (facts + dims) --> Marts (precomputed)
                                                                      |
                                                                 Streamlit App

The pipeline processes 173 races across 8 seasons, ingesting ~1,100 laps per race, 3 sectors per lap, 20 drivers, weather readings, and race control events into a 17-table PostgreSQL warehouse. The Streamlit app is read-only. It queries marts and curated tables via SQLAlchemy and never touches the external API.

Technical Highlights

Idempotent ingestion - SHA1-based deterministic IDs mean re-running the pipeline for any race produces identical results with no duplicates
Fault-tolerant backfill - Multi-pass retry logic handles FastF1 API rate limits and transient failures across 173 races without manual intervention
Data quality gates - Post-ingestion checks (primary key uniqueness, row count sanity, winner existence) prevent bad data from reaching the dashboard
Clean mart regeneration - DELETE + INSERT per race ensures no stale aggregations survive a re-ingestion
4-schema warehouse pattern - Staging, curated, and marts mirrors production data warehouse practices, with a metadata schema tracking every ingestion run
Graceful data gaps - The pipeline synthesizes "UNKNOWN" stint records when the upstream API has missing tyre data, so the dashboard always shows complete race coverage

Project Structure

app/
├── main.py              # Slim orchestrator (~200 lines)
├── data_access.py       # SQLAlchemy queries against curated/marts
├── charts/              # Plotly chart builders (6 modules)
├── components/          # Banner, metric cards, driver selector
├── tabs/                # One render module per tab
└── theme/               # Dark theme CSS + Phosphor icons

pipeline/
├── extract.py           # FastF1 fetch with exponential backoff
├── transform.py         # Staging + curated dataframe builders
├── load.py              # Upserts and mart replacements
├── marts.py             # Precomputed aggregations
└── quality.py           # Post-ingestion data quality checks

sql/                     # DDL for all 4 schemas (17 tables)
scripts/                 # Ingestion CLI + fault-tolerant backfill
docker/                  # Dockerfiles for streamlit + ingest containers

Quick Start

# Clone and configure
git clone https://github.com/shivaaang/f1-race-decoder-dashboard.git
cd f1-race-decoder-dashboard
cp .env.example .env              # defaults work out of the box

# Start services and initialize the database
make up                           # postgres + streamlit
make db-bootstrap                 # create schemas and tables

# Ingest your first race
make ingest-single                # loads 2024 Bahrain GP

# Open the dashboard
open http://localhost:8501

Backfill more data:

make backfill-season SEASON=2024                        # one season
make backfill-safe SEASON_START=2018 SEASON_END=2025    # all seasons with automatic retries

Database Schema

Four PostgreSQL schemas forming a classic data warehouse:

metadata - Ingestion run tracking and data quality check results
staging - Raw typed data (laps, results, weather) loaded as-is from FastF1
curated - Star schema with fact tables (fact_lap, fact_session_results, fact_race_control, fact_weather_minute) and dimensions (dim_race, dim_driver, dim_team)
marts - Dashboard-optimized tables: mart_gap_timeline, mart_position_chart, mart_stint_summary

Development

Code quality: Ruff + Black with 100-character line length. Three separate requirements files (requirements-ingest.txt, requirements-streamlit.txt, requirements-dev.txt) for build isolation.

Data quality: Post-ingestion checks validate every race: primary key uniqueness, row count sanity, lap number continuity, and winner existence. Failed checks are logged to the metadata schema for auditability.

make lint                # ruff + black (check mode)
make format              # auto-format
make logs                # stream Docker logs
make reset-db            # hard reset, drops all data

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.streamlit		.streamlit
app		app
docker		docker
docs/screenshots		docs/screenshots
pipeline		pipeline
scripts		scripts
sql		sql
.env.example		.env.example
.gitignore		.gitignore
Makefile		Makefile
Procfile		Procfile
README.md		README.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
requirements-dev.txt		requirements-dev.txt
requirements-ingest.txt		requirements-ingest.txt
requirements-streamlit.txt		requirements-streamlit.txt
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Race Decoder

Why I Built This

Tech Stack

Dashboard

Architecture

Technical Highlights

Project Structure

Quick Start

Database Schema

Development

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

shivaaang/f1-race-decoder-dashboard

Folders and files

Latest commit

History

Repository files navigation

Race Decoder

Why I Built This

Tech Stack

Dashboard

Architecture

Technical Highlights

Project Structure

Quick Start

Database Schema

Development

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages