pinned: false
license: apache-2.0
---

# ZenML MLOps Breast Cancer Classification Demo

## 🌍 Project Overview

This is a minimalistic MLOps project demonstrating how to put machine learning
workflows into production using ZenML. The project focuses on building a breast
cancer classification model with end-to-end ML pipeline management.

### Key Features

- 🔬 Feature engineering pipeline
- 🤖 Model training pipeline
- 🧪 Batch inference pipeline
- 📊 Artifact and model lineage tracking
- 🔗 Integration with Weights & Biases for experiment tracking

## 🚀 Installation

1. Clone the repository
2. Install requirements:

   ```bash
   pip install -r requirements.txt
   ```

3. Install ZenML integrations:

   ```bash
   zenml integration install sklearn xgboost wandb -y
   zenml login
   zenml init
   ```

4. Register a stack with a [Weights & Biases Experiment Tracker](https://docs.zenml.io/stack-components/experiment-trackers/wandb).
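For step 4, the registration might look like the following sketch. The component name `wandb_tracker` is an illustrative choice, and the `<...>` placeholders must be replaced with your own Weights & Biases credentials:

```bash
# Sketch: register a W&B experiment tracker and a stack that uses it.
# Replace the <...> placeholders with your own W&B credentials.
zenml experiment-tracker register wandb_tracker --flavor=wandb \
    --entity=<your-wandb-entity> --project_name=zenml-demo --api_key=<your-api-key>

# Bundle the tracker with the default local orchestrator and artifact store.
zenml stack register stack-with-wandb -o default -a default -e wandb_tracker
```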

## 🧠 Project Structure

- `steps/`: Contains individual pipeline steps
- `pipelines/`: Pipeline definitions
- `run.py`: Main script to execute pipelines

## 🔍 Workflow and Execution

First, you need to set your stack:

```bash
zenml stack set stack-with-wandb
```

### 1. Data Loading and Feature Engineering

- Uses the Breast Cancer dataset from scikit-learn
- Splits data into training and inference sets
- Preprocesses data for model training

```bash
python run.py --feature-pipeline
```
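In plain scikit-learn terms, stripped of the ZenML step decorators, the feature pipeline amounts to roughly the following. This is a minimal sketch: the split ratio and random seed are assumptions, not values taken from the project.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the Breast Cancer dataset as a DataFrame (30 features + "target").
data = load_breast_cancer(as_frame=True)
df = data.frame

# Hold out a slice to simulate unseen rows for batch inference later.
train_df, inference_df = train_test_split(df, test_size=0.2, random_state=42)

# Fit preprocessing on the training split only, to avoid leakage.
scaler = StandardScaler()
X_train = scaler.fit_transform(train_df.drop(columns=["target"]))
y_train = train_df["target"]

print(X_train.shape)  # (455, 30): 455 training rows, 30 features
```

In the real pipeline these steps are ZenML `@step` functions, so the splits and the fitted scaler are versioned as artifacts rather than held in local variables.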

### 2. Model Training

- Supports multiple model types (SGD, XGBoost)
- Evaluates and compares model performance
- Tracks model metrics with Weights & Biases

```bash
python run.py --training-pipeline
```
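The train-and-compare logic can be sketched as below. Note this is a simplified stand-in: a `DecisionTreeClassifier` substitutes for XGBoost so the snippet needs only scikit-learn, and the W&B metric logging is omitted.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Candidate models; scaling matters for SGD, so wrap it in a pipeline.
candidates = {
    "sgd": make_pipeline(StandardScaler(), SGDClassifier(random_state=42)),
    "tree": DecisionTreeClassifier(random_state=42),
}

# Train each candidate and record its held-out accuracy.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Promote the best-scoring model, mirroring the pipeline's comparison step.
best = max(scores, key=scores.get)
print(best, scores)
```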

### 3. Batch Inference

- Loads production model
- Generates predictions on new data

```bash
python run.py --inference-pipeline
```
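Conceptually, the inference step does something like the sketch below, with a freshly fitted model standing in for the production model that ZenML would load from the model registry:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
# The feature pipeline already held out this slice; here we simply re-split.
X_train, X_new, y_train, _ = train_test_split(X, y, test_size=0.2, random_state=42)

# Stand-in for loading the promoted production model from the registry.
production_model = make_pipeline(StandardScaler(), SGDClassifier(random_state=42))
production_model.fit(X_train, y_train)

# Batch inference: score all unseen rows in one pass.
predictions = production_model.predict(X_new)
print(predictions.shape)  # (114,): one prediction per held-out row
```

In the real pipeline the predictions are stored as a ZenML artifact, linked to the model version that produced them.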