vsanyanov-ux/mlops_challenge
# 🚀 MLOps Challenge: Automated Model Delivery

This project demonstrates a production-ready MLOps pipeline for automated training, logging, and delivery of Machine Learning models using MLflow and GitHub Actions.

## 🏗 Architecture & Workflow

1. **Code Push:** A developer pushes code to the `main` branch.
2. **CI/CD Pipeline:**
   - **Training:** `train.py` executes, training a `RandomForestClassifier` on the Iris dataset.
   - **Tracking:** Hyperparameters, metrics (accuracy), and tags are logged via MLflow Tracking.
   - **Artifacts:** Confusion matrices and classification reports are saved as experiment artifacts.
   - **Registration:** The model is automatically registered in the MLflow Model Registry as `IrisClassifierFinal`.
3. **Automated Testing:** `pytest` validates the registered model's performance and data consistency.
4. **Deployment (Staging):** If the tests pass, the model is prepared for transition to the Staging stage.
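The training and registration step can be sketched roughly as follows. This is a simplified, hypothetical version of `train.py` (names and hyperparameters are illustrative; the real script also saves the confusion-matrix and classification-report artifacts):

```python
# Simplified sketch of the training/registration flow -- assumed, not the
# verbatim contents of train.py.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Train a RandomForestClassifier on the Iris dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

try:
    import mlflow
    import mlflow.sklearn

    with mlflow.start_run():
        # Log hyperparameters and metrics to MLflow Tracking.
        mlflow.log_param("n_estimators", 100)
        mlflow.log_metric("accuracy", accuracy)
        # Passing registered_model_name logs and registers in one step.
        # The registry needs a DB-backed tracking server (e.g. sqlite).
        mlflow.sklearn.log_model(
            model, "model", registered_model_name="IrisClassifierFinal"
        )
except Exception as exc:
    # Keep the sketch runnable even without an MLflow backend available.
    print(f"MLflow logging skipped: {exc}")
```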

## 🛠 Tech Stack

- **Language:** Python 3.9+
- **ML Engine:** scikit-learn
- **Operations:** MLflow (Tracking & Registry)
- **CI/CD:** GitHub Actions
- **Testing:** pytest
- **Data:** pandas, NumPy

## 📂 Project Structure

```
mlops_challenge/
├── .github/workflows/ml_pipeline.yml   # CI/CD pipeline definition
├── tests/
│   └── test_model.py                   # Validation tests for registered models
├── train.py                            # Training & registration script
├── requirements.txt                    # Dependency list
├── README.md                           # Documentation
└── .gitignore                          # Git exclusions
```

## 🚀 Running Locally

### 1. Set Up the Environment

```bash
pip install -r requirements.txt
```

### 2. Run the MLflow Server

In a separate terminal:

```bash
mlflow ui
```

### 3. Execute Training & Registration

```bash
python train.py
```

### 4. Run Model Validations

Point the tests at the registered model via the `MODEL_URI` environment variable (copy the URI from the MLflow UI). In PowerShell:

```powershell
$env:MODEL_URI="models:/IrisClassifierFinal/latest"; python -m pytest tests/test_model.py
```

In bash, the equivalent is `MODEL_URI="models:/IrisClassifierFinal/latest" python -m pytest tests/test_model.py`.

## 📈 Pro Tip: Digital Traceability

We use `mlflow.set_tag()` to attach the Git commit hash and user metadata to every run, making each production run fully traceable and auditable.
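A sketch of how such tags can be attached (the tag names here are illustrative, not the repository's actual conventions):

```python
# Illustrative traceability helpers -- tag names are assumptions.
import subprocess


def current_commit() -> str:
    """Return the current Git commit hash, or 'unknown' outside a repo."""
    try:
        return subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True
        ).strip()
    except Exception:
        return "unknown"


def tag_run() -> None:
    """Attach traceability tags to the active MLflow run."""
    import mlflow  # imported lazily so current_commit() stays testable alone

    mlflow.set_tag("git_commit", current_commit())
    mlflow.set_tag("triggered_by", "ci")
```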
