Commit 2946aea

Merge branch 'main' into feature/vertex
2 parents 30f6bd3 + 3107915 commit 2946aea

83 files changed, +2769 -1120 lines changed

.github/workflows/production_run_complete_llm.yml

Lines changed: 11 additions & 10 deletions
@@ -11,20 +11,21 @@ concurrency:
   cancel-in-progress: true
 
 jobs:
-  run-staging-workflow:
+  run-production-workflow:
     runs-on: ubuntu-latest
     env:
-      ZENML_HOST: ${{ secrets.ZENML_HOST }}
-      ZENML_API_KEY: ${{ secrets.ZENML_API_KEY }}
-      ZENML_PRODUCTION_STACK : 51a49786-b82a-4646-bde7-a460efb0a9c5
+      ZENML_STORE_URL: ${{ secrets.ZENML_PROJECTS_HOST }}
+      ZENML_STORE_API_KEY: ${{ secrets.ZENML_PROJECTS_API_KEY }}
+      ZENML_PRODUCTION_STACK: b3951d43-0fb2-4d32-89c5-3399374e7c7e # Set this to your production stack ID
       ZENML_GITHUB_SHA: ${{ github.event.pull_request.head.sha }}
       ZENML_GITHUB_URL_PR: ${{ github.event.pull_request._links.html.href }}
       ZENML_DEBUG: true
       ZENML_ANALYTICS_OPT_IN: false
       ZENML_LOGGING_VERBOSITY: INFO
       ZENML_PROJECT_SECRET_NAME: llm-complete
       ZENML_DISABLE_CLIENT_SERVER_MISMATCH_WARNING: True
-      ZENML_ACTION_ID: 23a4d58c-bd2b-47d5-a41d-0a845d2982f8
+      ZENML_EVENT_SOURCE_ID: ae6ae536-d811-4838-a44b-744b768a0f31 # Set this to your preferred event source ID
+      ZENML_SERVICE_ACCOUNT_ID: fef76af2-382f-4ab2-9e6b-5eb85a303f0e # Set this to your service account ID or delete
 
     steps:
       - name: Check out repository code
@@ -37,15 +38,15 @@ jobs:
       - name: Install requirements
         working-directory: ./llm-complete-guide
         run: |
-          pip3 install -r requirements.txt
-          pip3 install -r requirements-argilla.txt
-          zenml integration install gcp -y
+          pip3 install uv
+          uv pip install -r requirements.txt --system
+          uv pip install -r requirements-argilla.txt --system
+          zenml integration install gcp -y --uv
 
       - name: Connect to ZenML server
         working-directory: ./llm-complete-guide
         run: |
           zenml init
-          zenml connect --url $ZENML_HOST --api-key $ZENML_API_KEY
 
       - name: Set stack (Production)
         working-directory: ./llm-complete-guide
@@ -55,4 +56,4 @@ jobs:
       - name: Run pipeline, create pipeline, configure trigger (Production)
         working-directory: ./llm-complete-guide
         run: |
-          python gh_action_rag.py --no-cache --create-template --action-id ${{ env.ZENML_ACTION_ID }} --config rag_gcp.yaml
+          python gh_action_rag.py --no-cache --create-template --event-source-id ${{ env.ZENML_EVENT_SOURCE_ID }} --service-account-id ${{ env.ZENML_SERVICE_ACCOUNT_ID }} --config production/rag.yaml --zenml-model-version production
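
Note on the hunk above: the explicit `zenml connect` call is removed, and the workflow now appears to rely on the `ZENML_STORE_URL` and `ZENML_STORE_API_KEY` environment variables to reach the server. A minimal sketch of a pre-flight check that could run before the pipeline step is shown below. The helper name `missing_zenml_vars` is hypothetical and is not part of this commit or of ZenML.

```python
import os
from typing import List, Mapping

# Variable names copied from the workflow's `env` block in this diff.
REQUIRED_VARS = ("ZENML_STORE_URL", "ZENML_STORE_API_KEY")


def missing_zenml_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Report (without failing) which variables are absent in the current process.
print("missing:", missing_zenml_vars(os.environ))
```

In a CI job the returned list could be turned into a hard failure, so a missing secret surfaces before any pipeline code runs.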

.github/workflows/staging_run_complete_llm.yml

Lines changed: 8 additions & 8 deletions
@@ -12,9 +12,9 @@ jobs:
   run-staging-workflow:
     runs-on: ubuntu-latest
     env:
-      ZENML_HOST: ${{ secrets.ZENML_HOST }}
-      ZENML_API_KEY: ${{ secrets.ZENML_API_KEY }}
-      ZENML_STAGING_STACK: 51a49786-b82a-4646-bde7-a460efb0a9c5
+      ZENML_STORE_URL: ${{ secrets.ZENML_PROJECTS_HOST }}
+      ZENML_STORE_API_KEY: ${{ secrets.ZENML_PROJECTS_API_KEY }}
+      ZENML_STAGING_STACK : 67166d73-a44e-42f9-b67f-011e9afab9b5 # Set this to your staging stack ID
       ZENML_GITHUB_SHA: ${{ github.event.pull_request.head.sha }}
       ZENML_GITHUB_URL_PR: ${{ github.event.pull_request._links.html.href }}
       ZENML_DEBUG: true
@@ -34,15 +34,15 @@ jobs:
       - name: Install requirements
         working-directory: ./llm-complete-guide
         run: |
-          pip3 install -r requirements.txt
-          pip3 install -r requirements-argilla.txt
-          zenml integration install gcp -y
+          pip3 install uv
+          uv pip install -r requirements.txt --system
+          uv pip install -r requirements-argilla.txt --system
+          zenml integration install aws s3 -y --uv
 
       - name: Connect to ZenML server
         working-directory: ./llm-complete-guide
         run: |
           zenml init
-          zenml connect --url $ZENML_HOST --api-key $ZENML_API_KEY
 
       - name: Set stack (Staging)
         working-directory: ./llm-complete-guide
@@ -52,4 +52,4 @@ jobs:
       - name: Run pipeline (Staging)
         working-directory: ./llm-complete-guide
         run: |
-          python gh_action_rag.py --no-cache --config rag_gcp.yaml
+          python gh_action_rag.py --no-cache --config staging/rag.yaml --zenml-model-version staging

classifier-e2e/README.md

Lines changed: 59 additions & 41 deletions
@@ -11,58 +11,76 @@ pinned: false
 license: apache-2.0
 ---
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# ZenML MLOps Breast Cancer Classification Demo
 
-# 📜 ZenML Stack Show Case
+## 🌍 Project Overview
 
-This project aims to demonstrate the power of stacks. The code in this
-project assumes that you have quite a few stacks registered already.
+This is a minimalistic MLOps project demonstrating how to put machine learning
+workflows into production using ZenML. The project focuses on building a breast
+cancer classification model with end-to-end ML pipeline management.
 
-## default
-* `default` Orchestrator
-* `default` Artifact Store
+### Key Features
 
-```commandline
-zenml stack set default
-python run.py --training-pipeline
+- 🔬 Feature engineering pipeline
+- 🤖 Model training pipeline
+- 🧪 Batch inference pipeline
+- 📊 Artifact and model lineage tracking
+- 🔗 Integration with Weights & Biases for experiment tracking
+
+## 🚀 Installation
+
+1. Clone the repository
+2. Install requirements:
+```bash
+pip install -r requirements.txt
+```
+3. Install ZenML integrations:
+```bash
+zenml integration install sklearn xgboost wandb -y
+zenml login
+zenml init
+```
+4. You need to register a stack with a [Weights & Biases Experiment Tracker](https://docs.zenml.io/stack-components/experiment-trackers/wandb).
+
+## 🧠 Project Structure
+
+- `steps/`: Contains individual pipeline steps
+- `pipelines/`: Pipeline definitions
+- `run.py`: Main script to execute pipelines
+
+## 🔍 Workflow and Execution
+
+First, you need to set your stack:
+
+```bash
+zenml stack set stack-with-wandb
 ```
 
-## local-sagemaker-step-operator-stack
-* `default` Orchestrator
-* `s3` Artifact Store
-* `local` Image Builder
-* `aws` Container Registry
-* `Sagemaker` Step Operator
+### 1. Data Loading and Feature Engineering
 
-```commandline
-zenml stack set local-sagemaker-step-operator-stack
-zenml integration install aws -y
-python run.py --training-pipeline
+- Uses the Breast Cancer dataset from scikit-learn
+- Splits data into training and inference sets
+- Preprocesses data for model training
+
+```bash
+python run.py --feature-pipeline
 ```
 
-## sagemaker-airflow-stack
-* `Airflow` Orchestrator
-* `s3` Artifact Store
-* `local` Image Builder
-* `aws` Container Registry
-* `Sagemaker` Step Operator
-
-```commandline
-zenml stack set sagemaker-airflow-stack
-zenml integration install airflow -y
-pip install apache-airflow-providers-docker apache-airflow~=2.5.0
-zenml stack up
+### 2. Model Training
+
+- Supports multiple model types (SGD, XGBoost)
+- Evaluates and compares model performance
+- Tracks model metrics with Weights & Biases
+
+```bash
 python run.py --training-pipeline
 ```
 
-## sagemaker-stack
-* `Sagemaker` Orchestrator
-* `s3` Artifact Store
-* `local` Image Builder
-* `aws` Container Registry
-* `Sagemaker` Step Operator
+### 3. Batch Inference
 
-```commandline
-zenml stack set sagemaker-stack
-python run.py --training-pipeline
+- Loads production model
+- Generates predictions on new data
+
+```bash
+python run.py --inference-pipeline
 ```
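
Note on the README hunk above: the new text runs three pipelines through `run.py`, one flag per numbered section. As a rough sketch, assuming the three flags are meant to be run in that order from the `classifier-e2e` directory, a small driver could chain them. The names `build_commands` and `run_all` are hypothetical and do not exist in the repository.

```python
import subprocess
import sys
from typing import List

# Flags as listed in the new README; the order follows its numbered sections.
PIPELINE_FLAGS = ("--feature-pipeline", "--training-pipeline", "--inference-pipeline")


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Build one `run.py` invocation per pipeline flag (hypothetical helper)."""
    return [[python, "run.py", flag] for flag in PIPELINE_FLAGS]


def run_all(dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError and stops at the first failure.
            subprocess.run(cmd, check=True)
```

Calling `run_all()` only prints the commands; `run_all(dry_run=False)` would execute them and assumes the stack from the README has already been set.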

classifier-e2e/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-zenml[server]>=0.55.2
+zenml[server]>=0.70.0
 notebook
 scikit-learn<1.3
 s3fs>2022.3.0,<=2023.4.0

classifier-e2e/run_full.ipynb

Lines changed: 17 additions & 5 deletions
@@ -38,7 +38,7 @@
 "source": [
 "! pip3 install -r requirements.txt\n",
 "! zenml integration install sklearn xgboost -y\n",
-"! zenml connect --url https://1cf18d95-zenml.cloudinfra.zenml.io \n",
+"! zenml login https://1cf18d95-zenml.cloudinfra.zenml.io \n",
 "\n",
 "import IPython\n",
 "IPython.Application.instance().kernel.do_shutdown(restart=True)"
@@ -941,10 +941,17 @@
 " .ravel()\n",
 " .tolist(),\n",
 " }\n",
-" log_model_metadata(metadata={\"wandb_url\": wandb.run.url})\n",
-" log_artifact_metadata(\n",
+"\n",
+" try:\n",
+" if get_step_context().model:\n",
+" log_metadata(metadata=metadata, infer_model=True)\n",
+" except StepContextError:\n",
+" # If a model is not configured, it is not able to log metadata\n",
+" pass\n",
+"\n",
+" log_metadata(\n",
 " metadata=metadata,\n",
-" artifact_name=\"breast_cancer_classifier\",\n",
+" artifact_version_id=get_step_context().inputs[\"model\"].id,\n",
 " )\n",
 "\n",
 " wandb.log({\"train_accuracy\": metadata[\"train_accuracy\"]})\n",
@@ -1073,6 +1080,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "1e2130b9",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -1083,6 +1091,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "476cbf5c",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -1091,6 +1100,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "75df10e7",
 "metadata": {},
 "source": [
 "Now full run executed on local stack and experiment is tracked using Model Control Plane and Weights&Biases.\n",
@@ -1103,6 +1113,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "bfd6345f",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -1113,6 +1124,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "24358031",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -1136,7 +1148,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.9.18"
+"version": "3.11.3"
 }
 },
 "nbformat": 4,

classifier-e2e/run_skip_basics.ipynb

Lines changed: 12 additions & 5 deletions
@@ -38,7 +38,7 @@
 "source": [
 "! pip3 install -r requirements.txt\n",
 "! zenml integration install sklearn xgboost -y\n",
-"! zenml connect --url https://1cf18d95-zenml.cloudinfra.zenml.io \n",
+"! zenml login https://1cf18d95-zenml.cloudinfra.zenml.io \n",
 "\n",
 "import IPython\n",
 "IPython.Application.instance().kernel.do_shutdown(restart=True)"
@@ -829,10 +829,17 @@
 " .ravel()\n",
 " .tolist(),\n",
 " }\n",
-" log_model_metadata(metadata={\"wandb_url\": wandb.run.url})\n",
-" log_artifact_metadata(\n",
+"\n",
+" try:\n",
+" if get_step_context().model:\n",
+" log_metadata(metadata=metadata, infer_model=True)\n",
+" except StepContextError:\n",
+" # If a model is not configured, it is not able to log metadata\n",
+" pass\n",
+"\n",
+" log_metadata(\n",
 " metadata=metadata,\n",
-" artifact_name=\"breast_cancer_classifier\",\n",
+" artifact_version_id=get_step_context().inputs[\"model\"].id,\n",
 " )\n",
 "\n",
 " wandb.log({\"train_accuracy\": metadata[\"train_accuracy\"]})\n",
@@ -1211,7 +1218,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.9.18"
+"version": "3.11.3"
 }
 },
 "nbformat": 4,

classifier-e2e/steps/deploy_endpoint.py

Lines changed: 5 additions & 1 deletion
@@ -7,6 +7,7 @@
 from utils.aws import get_aws_config
 from utils.sagemaker_materializer import SagemakerPredictorMaterializer
 from zenml import ArtifactConfig, get_step_context, log_artifact_metadata, step
+from zenml.enums import ArtifactType
 
 
 @step(
@@ -16,7 +17,10 @@
 def deploy_endpoint() -> (
     Annotated[
         Predictor,
-        ArtifactConfig(name="sagemaker_endpoint", is_deployment_artifact=True),
+        ArtifactConfig(
+            name="sagemaker_endpoint",
+            artifact_type=ArtifactType.SERVICE
+        ),
     ]
 ):
     role, session, region = get_aws_config()
