Dagster + CML MLOps Template 🤖🔬🛠️

This repository is a minimal example of how to integrate CML (Continuous Machine Learning) and Dagster into your MLOps workflow. It uses GitHub Actions to orchestrate and automate a simple machine learning pipeline with reporting directly on your pull requests. 🚀🔁

Introduction 📄🚀

This project demonstrates how to implement a CI/CD pipeline using GitHub Actions that:

- Executes a Dagster job to train and evaluate a machine learning model (Random Forest).
- Uses CML to report metrics (e.g., accuracy) back to GitHub via PR comments.

By combining these tools, you get a reproducible, automated, and collaborative machine learning workflow in a fully Git-based environment. 🧪🛠️📊

What's Inside 🧬📁

- Dagster Pipeline: A simple ML job defined in Python using Dagster to:
  - Load and split the Iris dataset.
  - Train a RandomForestClassifier.
  - Evaluate the model’s accuracy.
- GitHub Actions Workflow: A CI/CD pipeline in .github/workflows/pipeline.yml that:
  - Sets up Python.
  - Installs dependencies.
  - Executes the Dagster job via CLI.
  - Logs accuracy to a file.
  - Posts a comment on the PR using CML.

Getting Started 🛠️🏁

To get this template running in your own repo:

Create a new repository using this template:
- Click “Use this template” at the top right.
- Name your new repository.

Clone your new repo locally:

git clone https://github.com/yourusername/your-mlops-repo.git
cd your-mlops-repo

Customize and test your workflow:
- The GitHub Actions workflow is already set to trigger on pull_request with the main branch as a target.
- To trigger a run, open or update a pull request that targets main.

View the results:
- Once the CI job completes, CML posts the accuracy score as a comment on the PR.
- This gives quick visibility into model performance without switching tools.

CML Reporting 📢📊

CML is used to post model evaluation results back to GitHub. Here's how it's done:

During the GitHub Actions run, the Dagster job writes model accuracy to a metrics.txt file.
CML reads this file and posts the contents as a PR comment using:
```
cml comment create metrics.txt
```
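
The step that produces metrics.txt can be sketched as a small helper. This is an illustrative assumption, not the template's actual code: the function name `write_metrics_report` and the report layout are hypothetical. Because CML renders the file as Markdown in the PR comment, the sketch writes a short Markdown heading and list.

```python
from pathlib import Path


def write_metrics_report(accuracy: float, path: str = "metrics.txt") -> str:
    """Write a small Markdown report for CML to post and return its text."""
    # Four decimal places keeps the comment readable; adjust as needed.
    report = f"## Model metrics\n\n- Accuracy: {accuracy:.4f}\n"
    Path(path).write_text(report, encoding="utf-8")
    return report


if __name__ == "__main__":
    # Example value only; in the pipeline this would come from the evaluation step.
    print(write_metrics_report(0.9667))
```

Keeping the report in a plain file means the same output can be inspected locally before it is posted from CI.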

You can expand on this by:

- Adding plots (e.g., confusion matrix).
- Tracking experiments with DVC or MLflow.
- Exporting models or data as artifacts.

More info in the CML Docs.

License 📜

This project is licensed under the MIT License. Use, modify, and share freely!

Acknowledgments 🙏

- Huge thanks to the CML and Dagster communities.
- Inspiration from MLOps best practices and real-world workflows.
