
πŸš€ MLOps Challenge: Automated Model Delivery

This project demonstrates a production-ready MLOps pipeline for automated training, logging, and delivery of Machine Learning models using MLflow and GitHub Actions.

πŸ— Architecture & Workflow

  1. Code Push: Developer pushes code to the main branch.
  2. CI/CD Pipeline:
    • Training: train.py executes, training a RandomForestClassifier on the Iris dataset.
    • Tracking: Hyperparameters, metrics (accuracy), and tags are logged via MLflow Tracking.
    • Artifacts: Confusion matrices and classification reports are saved as experiment artifacts.
    • Registration: The model is automatically registered in the MLflow Model Registry as IrisClassifierFinal.
  3. Automated Testing: pytest validates the registered model's performance and data consistency.
  4. Deployment (Staging): If tests pass, the model is promoted to the Staging stage.
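The workflow file itself is not reproduced here; a minimal sketch of what .github/workflows/ml_pipeline.yml might contain, based on the steps above (the job name, action versions, and the MODEL_URI value are illustrative assumptions):

```yaml
name: ML Pipeline
on:
  push:
    branches: [main]          # 1. Trigger on push to main

jobs:
  train-test-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.9"
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Train, track, and register model   # 2. Training / tracking / registration
        run: python train.py
      - name: Validate registered model          # 3. Automated testing
        run: python -m pytest tests/test_model.py
        env:
          MODEL_URI: "models:/IrisClassifierFinal/latest"
```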

πŸ›  Tech Stack

  • Language: Python 3.9+
  • ML Engine: scikit-learn
  • Operations: MLflow (Tracking & Registry)
  • CI/CD: GitHub Actions
  • Testing: Pytest
  • Data: Pandas, NumPy

πŸ“‚ Project Structure

mlops_challenge/
β”œβ”€β”€ .github/workflows/ml_pipeline.yml  # CI/CD Pipeline definition
β”œβ”€β”€ tests/
β”‚   └── test_model.py                # Validation tests for registry models
β”œβ”€β”€ train.py                          # Training & Registration script
β”œβ”€β”€ requirements.txt                  # Dependency list
β”œβ”€β”€ README.md                         # Documentation
└── .gitignore                        # Git exclusions

πŸš€ Running Locally

1. Setup Environment

pip install -r requirements.txt

2. Run MLflow Server

In a separate terminal:

mlflow ui

3. Execute Training & Registration

python train.py

4. Run Model Validations

Ensure you provide the correct Model URI from the MLflow UI. On PowerShell:

$env:MODEL_URI="models:/IrisClassifierFinal/latest"; python -m pytest tests/test_model.py

On bash/zsh:

MODEL_URI="models:/IrisClassifierFinal/latest" python -m pytest tests/test_model.py

πŸ“ˆ Pro Tip: Digital Traceability

We use mlflow.set_tag() to attach Git commit hashes and user metadata to every run, so each result can be traced back to the exact commit and actor that produced it — supporting reproducibility and audits in production.