Teamster is the data engineering platform powering analytics and reporting across KIPP Newark, Camden, Miami, and Paterson. It ingests data from 30+ source systems, transforms it through dbt, and delivers it to Tableau, Google Sheets, PowerSchool, and other consumers — all orchestrated by Dagster.
- 🎻 Dagster — orchestrates every ETL step across five code locations, one per school network; observe and run pipelines in Dagster Cloud
- 🔧 dbt — transforms raw source data into staging, intermediate, mart, and extract models in Google BigQuery
- 🚿 dlt — loads data from API sources into BigQuery alongside dbt
- 🔀 Airbyte — managed connector pipelines for select integrations
- 🪣 Google Cloud Storage — intermediate storage layer between pipeline steps
- ☸️ Google Kubernetes Engine — runs each code location in its own container in production
- ⚙️ GitHub Actions — CI/CD for building and deploying code locations
- 📊 Tableau — primary BI consumer; Dagster manages workbook extract refreshes
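The core idea behind the asset-based orchestration above is that each dataset declares its upstream dependencies, and the orchestrator derives the execution order from that graph. As an illustration only — this is not Teamster or Dagster code, and the asset names are hypothetical — here is a minimal stdlib sketch of that dependency-resolution idea using Python's `graphlib`:

```python
from graphlib import TopologicalSorter

# Hypothetical asset graph: each key maps to the set of assets it depends on.
# Mirrors the flow described above: source system -> staging -> mart -> BI extract.
assets = {
    "raw_powerschool": set(),                # ingested from a source system
    "stg_students": {"raw_powerschool"},     # staging model built from the raw load
    "mart_enrollment": {"stg_students"},     # mart model built from staging
    "tableau_extract": {"mart_enrollment"},  # extract refresh for the BI layer
}

# A topological order of the graph is a valid execution order for the pipeline.
run_order = list(TopologicalSorter(assets).static_order())
print(run_order)
# -> ['raw_powerschool', 'stg_students', 'mart_enrollment', 'tableau_extract']
```

Because downstream assets only run once their dependencies have materialized, a slow or failed upstream pull stops its own branch instead of silently cascading stale data — the failure mode the old synchronous scheduling suffered from.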

KIPP's data infrastructure was previously a patchwork of Python scripts, cron jobs, stored procedures, Fivetran, and Selenium automation spread across multiple databases. Synchronous scheduling meant a slow pull from one system would cascade into downstream failures. A single data engineer spent more time firefighting than building.

Teamster replaced all of it with a unified, asset-based platform. The results:
- ⚡ Pipeline development time dropped from weeks to days
- 🎫 Data-related support tickets fell 30% year-over-year
- 🧑‍💻 Analysts gained Git, SQL, and DevOps skills through shared PR workflows
- 🔔 Real-time Slack alerts replaced reactive debugging
"The visibility into the pipelines is a game changer. We know as soon as something fails and why."
Read the full story in the Dagster case study.
New to the project? Start here:
- Guides — account setup and task-focused walkthroughs
- Architecture — how the code is organized
- Contributing — workflow and PR guidelines

| Topic | Description |
|---|---|
| Automations | All schedules and sensors across every code location |
| Automation Conditions | How asset auto-materialization works |
| Adding an Integration | Step-by-step guide for new data sources |
| dbt Conventions | Model naming, contracts, and testing standards |
| IO Managers | How intermediate data is stored in GCS |
| Fiscal Year & Partitioning | Partition strategy for historical loads |
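The fiscal-year partitioning referenced above can be illustrated with a small, self-contained sketch. This is an assumption for illustration, not Teamster's actual scheme: it assumes a July–June fiscal year (common for US school districts) and a hypothetical `partition_key` format; the real strategy is defined in the linked doc.

```python
from datetime import date

def fiscal_year(d: date, start_month: int = 7) -> int:
    """Return the fiscal year a date falls in, labeled by its ending calendar year.

    Assumes a July-June fiscal year by default, e.g. 2024-08-15 -> FY2025.
    """
    return d.year + 1 if d.month >= start_month else d.year

def partition_key(d: date) -> str:
    """Hypothetical partition key pairing the fiscal year with the calendar date."""
    return f"FY{fiscal_year(d)}|{d.isoformat()}"

print(partition_key(date(2024, 8, 15)))  # -> FY2025|2024-08-15
print(partition_key(date(2025, 3, 1)))   # -> FY2025|2025-03-01
```

Keying partitions by fiscal year rather than calendar year keeps a full school year's historical loads in one partition range, which makes year-scoped backfills a single contiguous slice.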

| Topic | Description |
|---|---|
| Dagster Guide | Tableau scheduling, backfills, branch deployments |
| Google Sheets & Forms | Adding and updating Google Sheets sources |
| Troubleshooting: Dagster | Pipeline failures, partitions, unsynced views |
| Troubleshooting: dbt | Contract violations, compilation errors, test failures |
| Troubleshooting: VS Code | Interpreter, secrets, Trunk, container issues |
