* add mlops starter and readme
* update minimal_agent
* quickstart
* format
* Auto-update of Starter template
* New quickstart
* Update deployer and stack registration for Docker
* Update agent serving pipeline to remove confidence threshold
* Add missing comma in classify_intent function parameters
* Add ZenML Quickstart example notebook
* Update README with ZenML pipeline modes and OpenAI key
- Add support for OpenAI in agent.yaml
- Integrate LLM responses and classifier upgrades in the agent pipeline
- Log metadata for training and response generation
* Update quickstart README with key ZenML features
* Upgrade to structured, targeted agent responses
* lint and docstring
* Update pipeline docstrings with Args and Returns
* Refactor typing to remove Optional for boolean inputs
* Rename agent_serving_pipeline.py to support_agent.py
* Update import statement in visualization module
* Update quickstart README with additional details
* lint and docstring
* Update payload structure for ZenML deployment API
* Remove unnecessary code and display analysis results in a structured layout
* Add DocumentAnalysis to pipelines/__init__.py & refactor code
* Remove unused static files and associated styles
* New quickstart
* Code review
* README updates
* README updates
* Removed future imports
* Removed future imports
* README updates
* README updates
* Update import order and use requirements file for installation
* Update README.md for direct agent pipeline run explanation
* Add weather agent example
* Update examples/minimal_agent_production/streamlit_app.py
* Update examples/quickstart/utils.py
* Update examples/quickstart/steps/evaluate.py
* Update examples/quickstart/steps/data.py
* Update examples/minimal_agent_production/README.md
* Update examples/minimal_agent_production/steps/analyze.py
* Update examples/minimal_agent_production/steps/ingest.py
* Update examples/minimal_agent_production/requirements.txt
* Update examples/quickstart/steps/train.py
* Update examples/minimal_agent_production/pipelines/__init__.py
* Update README for minimal agent production examples
This commit updates the README.md file for the minimal agent production examples to reflect the current structure and content of the example files. The changes ensure that the descriptions for each script are accurate and up-to-date.
No functional changes were made to the code; this is purely a documentation update.
* Extract HTML and CSS into template files for render step
Refactors the render_analysis_report_step to use external template files
instead of inline strings for better maintainability and learning patterns.
- Created templates/report.css with all report styling
- Created templates/report.html with HTML structure and format placeholders
- Updated render.py to load and populate templates using Path-based resolution
- Reduced render.py from 302 to 89 lines for improved readability
This makes the templates easier to customize and provides a clearer
separation between presentation logic and data processing.
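For readers following along, the Path-based template loading described above might look roughly like the sketch below. The templates/report.css and templates/report.html file names come from the commit; the function name, signature, and placeholder names are illustrative assumptions, not the example's actual render.py API.

```python
from pathlib import Path

# Resolve templates relative to this module so loading works regardless of the
# current working directory (hypothetical layout mirroring the commit).
TEMPLATES_DIR = Path(__file__).parent / "templates"


def render_analysis_report(summary: str, sentiment: str) -> str:
    """Load the CSS and HTML templates and fill in analysis fields (illustrative)."""
    css = (TEMPLATES_DIR / "report.css").read_text(encoding="utf-8")
    html_template = (TEMPLATES_DIR / "report.html").read_text(encoding="utf-8")
    # The real template uses format placeholders; these particular names are guesses.
    return html_template.format(styles=css, summary=summary, sentiment=sentiment)
```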
* Fix model label to reflect actual analysis method used
Previously, the model field was always set to "openai-gpt-4o-mini" even
when the deterministic fallback was used, which doesn't use OpenAI at all.
Now correctly sets:
- "openai-gpt-4o-mini (llm)" when LLM analysis succeeds
- "rule-based (deterministic)" when fallback is used
This ensures accurate metadata for lineage tracking, cost attribution,
and quality assessment of pipeline runs.
* Remove deleted functions from __all__ export list
Cleaned up __all__ to only include functions that are actually imported
and available in the module. Removed references to:
- aggregate_evaluation_results_step
- annotate_analyses_step
- load_recent_analyses
- render_evaluation_report_step
These functions were previously deleted but their names remained in the
export list, which could cause import errors.
* Add Literal types for sentiment and readability values
Introduces type-safe constraints for sentiment and readability fields:
- SentimentType: Literal["positive", "negative", "neutral"]
- ReadabilityType: Literal["easy", "medium", "hard"]
Applied to:
- AnalysisResponse.sentiment and .readability fields
- DocumentAnalysis.sentiment field
This provides IDE autocomplete, type checking at development time, and
Pydantic validation at runtime to prevent invalid values from being
assigned to these fields throughout the pipeline.
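A minimal sketch of what these Literal-typed fields look like, assuming Pydantic models as described; the schema here is trimmed to the two constrained fields and is not the full AnalysisResponse definition.

```python
from typing import Literal

from pydantic import BaseModel

SentimentType = Literal["positive", "negative", "neutral"]
ReadabilityType = Literal["easy", "medium", "hard"]


class AnalysisResponse(BaseModel):
    # Type checkers flag invalid assignments at development time, and Pydantic
    # rejects any value outside the Literal set at runtime.
    sentiment: SentimentType
    readability: ReadabilityType


AnalysisResponse(sentiment="positive", readability="easy")   # ok
# AnalysisResponse(sentiment="great", readability="easy")    # raises ValidationError
```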
* Remove duplicate stop words from get_common_stop_words
Removed duplicate entries in the stop words set:
- 'will' (was listed twice)
- 'were' (was listed twice)
Sets automatically deduplicate in Python, but having duplicates in the
source code is confusing and suggests copy-paste errors.
* Add code fences
* Fix model label tracking and deduplicate constants
Implements compliance fixes to ensure model labels reflect actual models used
and eliminate duplicate constant definitions across the codebase.
Key changes:
- Return used_model from perform_llm_analysis to track actual model
- Use actual model from analysis result for label formatting
- Deduplicate extension→type mapping in Streamlit using shared constants
- Update Streamlit banner to show actual model from analysis object
- Fix pipelines/__init__.py by removing unimported DocumentAnalysis export
These changes ensure runtime accuracy over build-time constants, preventing
scenarios where displayed model names don't match what was actually used for
analysis (e.g., when users override default model parameters).
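The label logic described above amounts to recording which path actually produced the result. A rough, self-contained sketch follows; perform_llm_analysis and perform_deterministic_analysis are stand-ins for the example's real helpers, and the tuple return shape is an assumption.

```python
def perform_llm_analysis(text: str, model: str) -> tuple[dict, str]:
    """Stand-in for the LLM call; would return (analysis, model actually used)."""
    raise RuntimeError("no API key")  # simulate the LLM path failing


def perform_deterministic_analysis(text: str) -> dict:
    """Stand-in rule-based fallback that never calls an LLM."""
    return {"summary": text[:50], "sentiment": "neutral"}


def analyze_document(text: str, model: str = "gpt-4o-mini") -> dict:
    """Label the result with whichever method actually ran (illustrative)."""
    try:
        result, used_model = perform_llm_analysis(text, model=model)
        label = f"{used_model} (llm)"
    except Exception:
        result = perform_deterministic_analysis(text)
        label = "rule-based (deterministic)"
    return {"analysis": result, "model": label}


print(analyze_document("ZenML makes pipelines portable.")["model"])  # rule-based (deterministic)
```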
* Update default LLM model to gpt-5-mini
* Update the README so it's accurate
* cleanup
* Formatting
* Update production notes in README for clarity
Enhanced the production notes section in the README to provide clearer instructions on deploying with Docker settings. Updated the example configuration file name and deployment command to reflect optional user-defined configurations.
This change aims to improve user experience by ensuring accurate and helpful deployment guidance.
* Update README examples for consistency in deployment commands
Replaced hyphens with underscores in the deployment command examples within the README to ensure consistency and accuracy. This change aligns with the updated naming conventions for deployment identifiers, enhancing clarity for users.
No functional changes were made; this is purely a documentation update to improve user experience.
* Extract get_winner_info to module level
Makes the function independently testable and reusable rather than
being nested inside render_evaluation_template. The function now
takes explicit parameters instead of relying on closure scope.
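In practice the refactor above means going from a closure that reads surrounding variables to a module-level function that receives them as arguments; the parameter and return types below are guesses, not the actual signature.

```python
def get_winner_info(results: dict[str, float]) -> tuple[str, float]:
    """Return the (architecture, score) pair with the highest score (illustrative)."""
    winner = max(results, key=results.get)
    return winner, results[winner]


# Callable and testable without rendering anything:
print(get_winner_info({"single_agent": 0.78, "multi_agent": 0.91}))  # ('multi_agent', 0.91)
```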
* Extract LLM prompts to dedicated prompts module
Move all LLM prompt content from utils.py into a new prompts.py module
to improve code organization and maintainability.
Changes:
- Create examples/quickstart/prompts.py with:
- INTENT_CLASSIFICATION_PROMPT constant
- INTENT_CONTEXTS mapping
- TEMPLATE_RESPONSES mapping
- RESPONSE_PROMPT_TEMPLATE constant
- build_intent_classification_prompt(text) builder
- build_response_prompt(original_text, intent) builder
- Update examples/quickstart/utils.py:
- Add dual-path imports to support both local and orchestrated execution
- Remove inline prompt constants
- Replace manual prompt formatting with builder functions
- call_llm_for_intent now uses build_intent_classification_prompt()
- generate_llm_response now uses build_response_prompt()
The prompts module is dependency-free (no OpenAI/ZenML imports) and
centralizes all prompt text in one place for easier maintenance.
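A rough sketch of the builder-function pattern described above; the prompt text and intent labels are abbreviated stand-ins, not the actual contents of examples/quickstart/prompts.py.

```python
# prompts.py-style module (illustrative): dependency-free, all prompt text in one place.
INTENT_CLASSIFICATION_PROMPT = (
    "Classify the intent of this support message as one of {intents}.\n\n"
    "Message: {text}"
)


def build_intent_classification_prompt(
    text: str, intents: tuple[str, ...] = ("billing", "bug", "other")
) -> str:
    """Return the fully formatted classification prompt."""
    return INTENT_CLASSIFICATION_PROMPT.format(intents=", ".join(intents), text=text)


print(build_intent_classification_prompt("I was charged twice this month"))
```

The dual-path import mentioned above is typically a small guard such as `try: from .prompts import build_response_prompt` falling back to `from prompts import build_response_prompt` on ImportError, so the module resolves both when run locally as a script and when packaged for an orchestrator.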
* Remove unused call_llm_for_intent function
The function was never imported or called anywhere in the codebase.
The ruff formatter also removed the now-unused build_intent_classification_prompt import.
* Remove unused shebang from run.py
The shebang was incorrectly placed after the license header (line 17)
where it has no effect. Shebangs must be on line 1 to work. The file
is always invoked as 'python run.py' per the documentation.
* Escape HTML template variables to prevent rendering issues
Add html.escape() to string variables before inserting them into HTML
templates. This prevents issues with special characters like <, >, &,
and quotes. CSS styles are explicitly excluded from escaping since they
need to be raw CSS.
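A minimal sketch of that escaping step, with illustrative variable names; only the string fields are escaped, while the CSS is inserted raw as described.

```python
import html

css = "body { font-family: sans-serif; }"  # raw CSS, deliberately not escaped
fields = {"title": "Q3 <draft> report", "summary": "Revenue & churn look fine."}

# Escape content strings so <, >, & and quotes render literally in the report.
safe_fields = {key: html.escape(value) for key, value in fields.items()}

report = "<style>{styles}</style><h1>{title}</h1><p>{summary}</p>".format(
    styles=css, **safe_fields
)
print(report)
```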
* Update "Your First AI Pipeline" documentation
Revised the "Your First AI Pipeline" guide to reflect changes in terminology and functionality. The section now emphasizes deploying a ZenML pipeline as a managed HTTP endpoint, with options for invoking via CLI or curl. Additional updates include clarifying the architecture, prerequisites, and deployment instructions, as well as enhancing the descriptions of tracked artifacts and metadata.
This update aims to improve clarity and user experience for new users starting with ZenML.
* Update quickstart README with agent pipeline execution instructions
Added instructions for running the agent pipeline directly in batch mode within the quickstart README. This update aims to enhance user understanding of executing the pipeline without deployment, providing a clear example command for users to follow.
No functional changes were made; this is a documentation improvement to support user experience.
* Fix minimal agent README instructions and add image
* Update "Your First AI Pipeline" documentation for clarity
Revised the "Your First AI Pipeline" guide to enhance clarity and user experience. Changes include improved phrasing for invoking the deployment via the ZenML CLI and curl, as well as the addition of a Streamlit web interface section. This update aims to provide clearer instructions for new users on building and deploying a ZenML pipeline as a managed HTTP endpoint.
No functional changes were made; this is a documentation improvement to support user understanding.
* Add cloud training configs and automated release workflow for quickstart
Enhances the quickstart example with cloud orchestrator support and automates
the release validation workflow for AWS, Azure, and GCP deployments.
Changes:
- Add cloud-specific training configs (training_{aws,azure,gcp,default}.yaml)
- Generate cloud-specific requirements files with integration dependencies
- Support --config flag in run.py for both training and evaluation steps
- Create generate_cloud_requirements.sh to automate requirements generation
- Create qs_run_release_flow.sh to orchestrate release validation flow
- Update release workflow to use new automated scripts
- Add Docker availability checks in release workflow
- Document that cloud configs are for CI validation only
The cloud configs enable testing of the quickstart example on production cloud
orchestrators during the release process while keeping the local development
experience simple.
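The `--config` flag described above presumably just forwards a YAML path into the pipeline run. A sketch under stated assumptions: the pipeline import path and default config location are hypothetical, and the `with_options(config_path=...)` hook is assumed to be how run.py applies the file.

```python
import argparse

from pipelines import intent_training_pipeline  # hypothetical import path

parser = argparse.ArgumentParser()
parser.add_argument(
    "--config",
    default="configs/training_default.yaml",  # assumed default; cloud variants: training_{aws,azure,gcp}.yaml
    help="Path to a training config file",
)
args = parser.parse_args()

# Apply the YAML settings (Docker, resources, environment) to this run, then execute.
intent_training_pipeline.with_options(config_path=args.config)()
```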
* Add featured companies logos to README.md
* Update company logos with border-radius and padding
* Update architecture overview link to documentation
* Update featured company logos in README.md
* Update company logos in README.md
* Update featured company logos in README.md
* Update featured company links in README.md
* Add streamlined quickstart testing workflow
Create testing.yml workflow to validate new quickstart implementation on PR.
Simplified from release_prepare.yml by removing all release automation
(version bumping, PR creation, Discord notifications, file updates).
Focus solely on testing quickstart pipelines across AWS/Azure/GCP:
- Extract version from VERSION file (works with any branch)
- Build test Docker images
- Setup release tenant
- Run quickstart on all three cloud providers
This allows us to validate the new sklearn-based quickstart before merging.
* Remove OPENAI_API_KEY from training pipeline configs
The sklearn-based training pipeline doesn't use OpenAI, so remove
OPENAI_API_KEY from:
- Training pipeline definition (intent_training_pipeline.py)
- All training config files (training_*.yaml)
Agent pipeline correctly retains OPENAI_API_KEY in agent.yaml.
This fixes CI failures where OPENAI_API_KEY environment variable
substitution was failing during training pipeline compilation.
* Add OPENAI_API_KEY secret to quickstart pipeline workflows
The agent deployment phase requires OPENAI_API_KEY to deploy the
OpenAI-powered support agent pipeline. Add the secret to both:
- testing.yml (test workflow)
- release_prepare.yml (release workflow)
The repository already has this secret configured (used in
weekly-agent-pipelines-test.yml); it just wasn't being passed
to the run_quickstart_pipelines job.
This fixes agent deployment failures with:
"Unable to substitute environment variable placeholder 'OPENAI_API_KEY'"
* Fix
* Delete test workflow
---------
Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Hamza Tahir <[email protected]>
Co-authored-by: Hamza Tahir <[email protected]>
Co-authored-by: Alex Strick van Linschoten <[email protected]>
Co-authored-by: Stefan Nica <[email protected]>
Co-authored-by: Alex Strick van Linschoten <[email protected]>
(cherry picked from commit 118c799)
-<h3 align="center">Your unified toolkit for shipping everything from decision trees to complex AI agents, built on the MLOps principles you already trust.</h3>
+<h3 align="center">Your unified toolkit for shipping everything from decision trees to complex AI agents.</h3>
<a href="https://zenml.io/pro">Sign up for ZenML Pro</a> •
<a href="https://www.zenml.io/blog">Blog</a> •
-<a href="https://zenml.io/podcast">Podcast</a>
<br />
<br />
🎉 For the latest release, see the <a href="https://github.com/zenml-io/zenml/releases">release notes</a>.
@@ -45,112 +44,43 @@

---

-ZenML is a unified MLOps framework that extends the battle-tested principles you rely on for classical ML to the new world of AI agents. It's one platform to develop, evaluate, and deploy your entire AI portfolio - from decision trees to complex multi-agent systems. By providing a single framework for your entire AI stack, ZenML enables developers across your organization to collaborate more effectively without maintaining separate toolchains for models and agents.
+ZenML is built for ML or AI Engineers working on traditional ML use-cases, LLM workflows, or agents, in a company setting.
+
+At its core, ZenML allows you to write **workflows (pipelines)** that run on any **infrastructure backend (stacks)**. You can embed any Pythonic logic within these pipelines, like training a model or running an agentic loop. ZenML then operationalizes your application by:

-## 🚨 The Problem: MLOps Works for Models, But What About AI?

+1. Automatically containerizing and tracking your code.
+2. Tracking individual runs with metrics, logs, and metadata.
+3. Abstracting away infrastructure complexity.
+4. Integrating your existing tools and infrastructure, e.g. MLflow, Langgraph, Langfuse, Sagemaker, GCP Vertex, etc.
+5. Allowing you to quickly iterate on experiments with an observable layer, in development and in production.
+
+...amongst many other features.

-You're an ML engineer. You've perfected deploying `scikit-learn` models and wrangling PyTorch jobs. Your MLOps stack is dialed in. But now, you're being asked to build and ship AI agents, and suddenly your trusted toolkit is starting to crack.

+ZenML is used by thousands of companies to run their AI workflows. Here are some featured ones:

-**The Adaptation Struggle:** Your MLOps habits (rigorous testing, versioning, CI/CD) don’t map cleanly onto agent development. How do you version a prompt? How do you regression test a non-deterministic system? The tools that gave you confidence for models now create friction for agents.
-
-**The Divided Stack:** To cope, teams are building a second, parallel stack just for LLM-based systems. Now you’re maintaining two sets of tools, two deployment pipelines, and two mental models. Your classical models live in one world, your agents in another. It's expensive, complex, and slows everyone down.
-
-**The Broken Feedback Loop:** Getting an agent from your local environment to production is a slow, painful journey. By the time you get feedback on performance, cost, or quality, the requirements have already changed. Iteration is a guessing game, not a data-driven process.
-
-## 💡 The Solution: One Framework for your Entire AI Stack
-
-Stop maintaining two separate worlds. ZenML is a unified MLOps framework that extends the battle-tested principles you rely on for classical ML to the new world of AI agents. It’s one platform to develop, evaluate, and deploy your entire AI portfolio.
-
-```python
-# Morning: Your sklearn pipeline is still versioned and reproducible.
-train_and_deploy_classifier()
-
-# Afternoon: Your new agent evaluation pipeline uses the same logic.
-evaluate_and_deploy_agent()
-
-# Same platform. Same principles. New possibilities.
-```
-
-With ZenML, you're not replacing your knowledge; you're extending it. Use the pipelines and practices you already know to version, test, deploy, and monitor everything from classic models to the most advanced agents.
-
-## 💻 See It In Action: Multi-Agent Architecture Comparison
-
-**The Challenge:** Your team built three different customer service agents. Which one should go to production? With ZenML, you can build a reproducible pipeline to test them on real data and make a data-driven decision, with full observability via Langgraph, LiteLLM & Langfuse.

-    """Generate beautiful HTML report with winner selection."""
-    return create_styled_comparison_report(results)
-
-@pipeline
-def compare_agent_architectures():
-    """Data-driven agent architecture decisions with full MLOps tracking."""
-    queries = load_real_conversations()
-    prompts = load_prompts()  # Prompts as versioned artifacts
-    classifier = train_intent_classifier(queries)
-    results, viz = run_architecture_comparison(queries, classifier, prompts)
-    report = evaluate_and_decide(queries, results)
-
-if __name__ == "__main__":
-    compare_agent_architectures()
-    # 🎯 Rich visualizations automatically appear in ZenML dashboard
-```
-
-**🚀 [See the complete working example →](examples/agent_comparison/)**
-Prefer a smaller end-to-end template? Check out the [Minimal Agent Production](examples/minimal_agent_production/) example — a lightweight document analysis service with pipelines, evaluation, and a simple web UI.
-
-**The Result:** A clear winner is selected based on data, not opinions. You have full lineage from the test data and agent versions to the final report and deployment decision.
<sub><i>(please email [email protected] if you want to be featured)</i></sub>

## 🚀 Get Started (5 minutes)

### 🏗️ Architecture Overview

-ZenML uses a **client-server architecture** with an integrated web dashboard ([zenml-io/zenml-dashboard](https://github.com/zenml-io/zenml-dashboard)) for pipeline visualization and management:
+ZenML uses a [**client-server architecture**](https://docs.zenml.io/getting-started/system-architectures) with an integrated web dashboard ([zenml-io/zenml-dashboard](https://github.com/zenml-io/zenml-dashboard)):

-**Local Development**: `pip install "zenml[server]"` - runs both client and server locally
-**Production**: Deploy server separately, connect with `pip install zenml` + `zenml login <server-url>`
@@ -167,87 +97,11 @@ zenml init

# Start local server or connect to a remote one
zenml login
-
-# Set OpenAI API key (optional)
-export OPENAI_API_KEY=sk-svv....
```

-### Your First Pipeline (2 minutes)
-
-```python
-# simple_pipeline.py
-from zenml import pipeline, step
-from sklearn.ensemble import RandomForestClassifier
-from sklearn.datasets import make_classification
-from sklearn.model_selection import train_test_split
-from sklearn.metrics import accuracy_score
-from typing import Tuple
-from typing_extensions import Annotated
-import numpy as np
-
-@step
-def create_dataset() -> Tuple[
-    Annotated[np.ndarray, "X_train"],
-    Annotated[np.ndarray, "X_test"],
-    Annotated[np.ndarray, "y_train"],
-    Annotated[np.ndarray, "y_test"]
-]:
-    """Generate a simple classification dataset."""
-    X, y = make_classification(n_samples=100, n_features=4, n_classes=2, random_state=42)
-    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
1. **[Agent Architecture Comparison](examples/agent_comparison/)** - Compare AI agents with LangGraph workflows, LiteLLM integration, and automatic visualizations via custom materializers
@@ -291,18 +141,6 @@ For visual learners, start with this 11-minute introduction:
5. **[Agentic Workflow (Deep Research)](https://github.com/zenml-io/zenml-projects/tree/main/deep_research)** - Orchestrate your agents with ZenML
6. **[Fine-tuning Pipeline](https://github.com/zenml-io/zenml-projects/tree/main/gamesense)** - Fine-tune and deploy LLMs

-### 🏢 Deployment Options
-
-**For Teams:**
-**[Self-hosted](https://docs.zenml.io/getting-started/deploying-zenml)** - Deploy on your infrastructure with Helm/Docker
-**[ZenML Pro](https://cloud.zenml.io/?utm_source=readme)** - Managed service with enterprise support (free trial)
-
-**Infrastructure Requirements:**
- Docker (or Kubernetes for production)
- Object storage (S3/GCS/Azure)
- MySQL-compatible database (MySQL 8.0+ or MariaDB)
@@ -354,7 +192,7 @@ A: ZenML integrates with Kubernetes through the native Kubernetes orchestrator,

A: ZenML's open-source version is free forever. You likely already have the required infrastructure (like a Kubernetes cluster and object storage). We just help you make better use of it for MLOps.