Commit 39f0e07

htahir1, claude, and strickvl authored
Huggingface deployer (#4119)
* Add Huggingface deployer for ZenML Implements a new Huggingface deployer that allows deploying ZenML pipelines as Docker-based Huggingface Spaces. This deployer extends ContainerizedDeployer and uses the huggingface_hub API to manage Space lifecycles. Key features: - Create and update Huggingface Spaces with Docker SDK - Support for hardware tier selection (CPU, GPU options) - Support for persistent storage tiers - Automatic Space naming with configurable prefixes - Full deployment lifecycle management (provision, status, deprovision) Implementation includes: - HuggingfaceDeployer: Main deployer class with Space management - HuggingfaceDeployerFlavor: Flavor registration for the stack - HuggingfaceDeployerConfig: Configuration including token and defaults - HuggingfaceDeployerSettings: Per-deployment settings for hardware/storage - HuggingfaceDeploymentMetadata: Metadata tracking for Space deployments The deployer automatically generates Dockerfile and README.md for Spaces, handles authentication via token or environment variables, and provides proper error handling for Space operations. * Fix logical issues in Huggingface deployer Addressed several logical issues found during code review: 1. Fixed environment variable name: Changed from HUGGING_FACE_HUB_TOKEN to the correct HF_TOKEN as per huggingface_hub documentation 2. Removed unused 'hardware' variable in _create_readme method 3. Improved SpaceStage status mapping to handle all possible states: - Added support for RUNNING_BUILDING, BUILD_ERROR, RUNTIME_ERROR - Added support for CONFIG_ERROR, NO_APP_FILE, DELETING - Better categorization of error vs pending states 4. Fixed Dockerfile environment variable escaping to properly handle backslashes followed by quotes 5. Added safety check for empty deployment names in _sanitize_space_name to prevent edge case failures 6. 
Fixed potential None access in log error message by checking if space_url exists before using it These fixes ensure more robust error handling and better compatibility with the Huggingface Spaces API. * Simplify Huggingface deployer implementation Removed over-engineered code to make it minimal and maintainable: - Removed HuggingfaceDeploymentMetadata class, use simple dict instead - Removed _sanitize_space_name complexity, simplified to basic regex - Inlined _create_readme and _create_dockerfile methods - Simplified status mapping from 10+ cases to 3 simple cases - Removed unnecessary exception wrapping and verbose logging - Removed unused imports and helper methods The code is now ~200 lines instead of ~500 lines, much easier to understand and maintain while keeping all core functionality. * Fix critical bugs in Huggingface deployer Fixed several bugs found during code review: 1. CRITICAL: Fixed Dockerfile env var escaping - values with quotes or backslashes would break the Dockerfile. Now properly escapes both. 2. Fixed empty space name handling - if deployment name only contains special characters, now defaults to 'deployment' instead of empty string 3. Added try/except around hardware and storage API calls to prevent them from crashing the entire provisioning if they fail 4. Changed flavor name from 'huggingface' to 'huggingface-spaces' to avoid confusion with the existing model deployer flavor These fixes make the deployer more robust and handle edge cases properly. * Standardize Hugging Face deployer with existing integration Aligned the new deployer with existing Hugging Face integration patterns: 1. Naming consistency: Changed 'Huggingface' to 'HuggingFace' (capital F) to match existing classes like HuggingFaceModelDeployer 2. Token security: Use SecretField for token to mark it as sensitive, following the same pattern as HuggingFaceModelDeployerConfig 3. 
Secret support: Added secret_name field to fetch tokens from ZenML secrets, providing flexibility like the model deployer 4. Stack validation: Added validator to ensure either token or secret_name is configured before deployment 5. Token retrieval: Implemented _get_token() method with priority: config.token > secret_name > HF_TOKEN env var All class names, patterns, and conventions now match the existing Hugging Face integration for consistency and maintainability. * Change deployer flavor name to 'huggingface' The flavor name should be 'huggingface' not 'huggingface-spaces'. Model deployer and deployer are different component types so there's no naming collision. * Fix linting and docstring errors Fixed all linting issues found by scripts/lint.sh and scripts/docstring.sh: 1. F821 - Added HfApi to TYPE_CHECKING imports to fix undefined name error 2. DAR302 - Removed 'Yields' section from do_get_deployment_state_logs since it only raises an exception 3. DAR401 - Added Exception to Raises section in do_deprovision_deployment to document the re-raised exception All linting checks now pass. * Add Hugging Face deployer documentation - Created comprehensive documentation for the Hugging Face deployer in docs/book/component-guide/deployers/huggingface.md - Added entry to deployer table of contents (toc.md) - Updated deployers README.md with Hugging Face deployer information - Documented important limitation about Docker image accessibility and private registries - Included configuration examples, settings, and usage instructions * Implement two-mode deployment and add entrypoint to Dockerfile ## Major Changes ### 1. Added Deployment Server Entrypoint - Dockerfiles now include proper ENTRYPOINT and CMD instructions - Uses DeploymentEntrypointConfiguration to generate correct startup command - Deployment server now starts automatically with deployment ID parameter ### 2. 
Implemented Two-Mode Deployment System #### Mode 1: Image Reference (with container registry) - References pre-built Docker image from container registry - Lightweight Dockerfile with FROM, ENV, USER, ENTRYPOINT, CMD - Image must be publicly accessible (documented limitation) #### Mode 2: Full Build (without container registry) - Builds complete image from scratch in Hugging Face Spaces - Uploads source code, requirements.txt, and Dockerfile to Space - Generates full Dockerfile with dependency installation - **Solves private registry authentication problem!** ### 3. Implementation Details - Added _get_entrypoint_and_command() helper method - Added _generate_image_reference_dockerfile() for Mode 1 - Added _generate_full_build_dockerfile() for Mode 2 - Added _get_requirements_for_deployment() to gather dependencies - Modified do_provision_deployment() to detect stack configuration and choose mode - Mode selection based on stack.container_registry presence ### 4. Documentation Updates - Replaced "Important Limitations" section with "Deployment Modes" - Documented both deployment modes with use cases - Added clear workarounds for private registry issues - Provided examples for stack configuration ## Benefits - Fixes missing entrypoint issue - deployments can now start properly - Provides solution for private registry problem via full-build mode - Maintains backward compatibility for users with public registries - Automatic mode selection based on stack configuration * Refactor to use ZenML's internal Dockerfile generation Replace custom _generate_full_build_dockerfile implementation with ZenML's internal PipelineDockerImageBuilder._generate_zenml_pipeline_dockerfile method to: - Eliminate code duplication - Ensure consistency with how ZenML builds Docker images elsewhere - Automatically benefit from future improvements to internal method - Reduce maintenance burden Changes: - Import PipelineDockerImageBuilder and json module - Modify _generate_full_build_dockerfile to 
call internal method - Create requirements_files in format expected by internal method - Merge environment/secrets into docker_settings.environment - Internal method handles: FROM, WORKDIR, ENV, apt packages, requirements, COPY, USER - Manually append ENTRYPOINT and CMD using json.dumps() for exec form - Update docstring to document use of internal method Benefits: - ~50 lines of code eliminated - Consistent with ZenML patterns (python_package_installer, installer_args, etc.) - Proper handling of docker_settings.local_project_install_command - Correct WORKDIR and permissions setup * Fix linting and docstring errors Based on testing feedback, made the following critical fixes: ## Code Changes 1. **Fixed ENTRYPOINT/CMD Serialization (Line 217-218)** - Changed from str() to json.dumps() for proper Docker exec form - Before: str(entrypoint) → ['python', ...] (single quotes, invalid) - After: json.dumps(entrypoint) → ["python", ...] (double quotes, valid) - Fixes: "/bin/sh: 1: [python,: not found" error 2. **Fixed Default Port (Line 78, 107)** - Changed app_port default from 7860 to 8000 - 7860 is HF Spaces default, but ZenML server runs on 8000 - Updated both code and documentation 3. **Added Container Registry Requirement to Validator (Lines 102-140)** - Stack validator now requires container registry - Prevents deployment without pre-built image - Clear error message explains requirement 4. **Removed Full Build Mode (Lines 250-376 deleted)** - Deleted _generate_full_build_dockerfile method - Deleted _get_requirements_for_deployment method - Simplified do_provision_deployment to single mode - Removed unused imports (source_utils, PipelineDockerImageBuilder) - Full build mode caused issues: * Uploaded entire codebase (inefficient) * Configuration problems * Recommended to always use container registry ## Documentation Changes 1. **Updated Port Documentation** - Changed default from 7860 to 8000 in docs - Added note that 8000 is ZenML server default 2. 
**Removed Deployment Modes Section** - Deleted entire two-mode explanation - Simplified to single-mode operation 3. **Added Important Requirements Section** - Container Registry Requirement subsection - Explains why public access is needed - Lists recommended registries (Docker Hub, GHCR) - Example setup with GitHub Container Registry 4. **Added X-Frame-Options Configuration Section** - Documents iframe embedding issue - Provides complete code example - Shows DeploymentSettings with SecureHeadersConfig - Explains: xfo=False disables X-Frame-Options header - Without this, HF Spaces shows blank page ## Summary of Fixes ✅ ENTRYPOINT/CMD serialization (json.dumps) ✅ Default port 7860 → 8000 ✅ Container registry now required by validator ✅ Full build mode removed ✅ Documentation updated with requirements ✅ X-Frame-Options configuration documented These changes address all issues discovered during testing. * CRITICAL SECURITY FIX: Use HF Space Secrets/Variables API ## Security Issue Previously, environment variables and secrets were written directly into the Dockerfile using ENV instructions. This meant that: - ❌ ALL secrets were exposed in the Dockerfile - ❌ Anyone with access to the Space could view credentials - ❌ Especially dangerous since Spaces are PUBLIC by default (private: False) - ❌ Secrets were permanently in the repository history This was a critical security vulnerability that could leak API keys, database credentials, and other sensitive information. ## Security Fix ### Code Changes (huggingface_deployer.py) 1. **Updated _generate_image_reference_dockerfile (Lines 217-247)** - Removed environment and secrets parameters - No longer generates ENV lines in Dockerfile - Added security note in docstring - Dockerfile now only contains: FROM, USER, ENTRYPOINT, CMD 2. 
**Updated do_provision_deployment (Lines 312-363)** - Changed method call: removed environment, secrets args - Added secure environment variable handling (lines 337-350): * Uses api.add_space_variable() for each environment variable * Variables stored securely by Hugging Face * Not exposed in Dockerfile or repository - Added secure secrets handling (lines 352-363): * Uses api.add_space_secret() for each secret * Secrets encrypted and never exposed * Safe even with public Spaces ### Documentation Changes (huggingface.md) Added "Secure Secrets and Environment Variables" section (Lines 158-175): - Explains security approach with success hint - Documents use of HF Space Secrets/Variables API - Clarifies that nothing is written to Dockerfile - Emphasizes safety with public Spaces (the default) - Lists security benefits with checkmarks ## Security Improvements ✅ Secrets encrypted and never exposed in repository ✅ Environment variables managed through secure API ✅ No credentials in Dockerfile ✅ Safe to use with public Spaces (default behavior) ✅ No risk of credential leakage even if Space is public ✅ Follows security best practices ## How It Works Now 1. Dockerfile is generated WITHOUT any ENV instructions 2. After uploading Dockerfile, deployer calls: - `api.add_space_variable(repo_id, key, value)` for each env var - `api.add_space_secret(repo_id, key, value)` for each secret 3. Hugging Face injects these at runtime securely 4. No credentials ever appear in repository files This approach is the recommended way to handle secrets/env vars with Hugging Face Spaces according to their API documentation. * Add reference to HF Spaces GPU documentation - Added link to https://huggingface.co/docs/hub/spaces-gpus for space_hardware setting - Helps users understand available GPU options and pricing - Addresses documentation feedback * Address code review feedback for HuggingFace deployer Implements 8 improvements from code review: 1. 
Add organization support for deploying to HF organizations - Added 'organization' config parameter to HuggingFaceDeployerConfig - Updated _get_space_id() to check organization before falling back to username 2. Add space name length validation - Added HF_SPACE_NAME_MAX_LENGTH constant (96 chars) - Validate space name length and raise DeployerError if exceeded 3. Handle space visibility updates - Check if space_info.private != settings.private when updating - Call api.update_repo_visibility() to apply visibility changes 4. Fail on invalid hardware/storage instead of warning - Changed hardware/storage errors to raise DeploymentProvisionError - Added clear error messages with documentation links 5. Use proper error types for 404 detection - Import HfHubHTTPError from huggingface_hub.utils - Check e.response.status_code == 404 instead of string matching 6. Document timeout parameter in deprovision - Added docstring note that timeout is unused (deletion is immediate) 7. Remove unused space_exists variable - Removed space_exists assignments to fix lint error 8. Update documentation terminology - Changed "Docker applications" to "Docker Spaces" in deployers README All changes maintain backward compatibility and improve code quality. * Remove redundant secret_name parameter from HuggingFace deployer The secret_name parameter was redundant since token is already a SecretField that supports ZenML's secret reference syntax ({{secret.key}}). Changes: - Removed secret_name from HuggingFaceDeployerConfig - Simplified _get_token() to just return config.token or environment variable - Updated validator to only check for token (not token or secret_name) - Updated documentation to show proper secret reference syntax: {{hf_token.token}} - Removed unused Client import (auto-removed by formatter) This simplifies the API and follows ZenML's standard pattern for secret handling. 
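The simplified token lookup described above can be sketched in isolation. This is a minimal illustration, not the deployer's actual code: `resolve_token` and its parameters are hypothetical names, and only the `HF_TOKEN` variable name and the "config token first, environment second" order come from this commit message.

```python
import os
from typing import Mapping, Optional


def resolve_token(
    config_token: Optional[str],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the configured token, falling back to the HF_TOKEN env var.

    `config_token` stands in for the deployer's `config.token` value
    (hypothetical parameter name used only in this sketch).
    """
    env = os.environ if environ is None else environ
    # An explicitly configured token always wins over the environment.
    if config_token:
        return config_token
    # Returns None when neither source provides a token.
    return env.get("HF_TOKEN")
```

Passing the environment as a parameter is a choice made for this sketch so the lookup can be exercised without touching the real process environment.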
* Add comprehensive Field descriptions to HuggingFace deployer config Added detailed descriptions to all config fields following ZenML standards: - Minimum 30 characters - Action-oriented language - Concrete examples with realistic values - Clear format specifications and constraints Field descriptions added: - token: Authentication, secret syntax, permissions, example - organization: Purpose, example URL, permission requirements - space_hardware: Options with specs, GPU tiers, documentation link - space_storage: Tiers with sizes, persistence behavior - space_prefix: Purpose, naming example, length constraint All descriptions include practical examples and relevant documentation links to help users configure the deployer correctly. * Change HuggingFace Space default visibility to private for security Changed the default value of `private` parameter from False to True to follow security best practices. Private by default prevents accidental exposure of deployment information. Changes: - Set private=True as default in HuggingFaceDeployerSettings - Updated docstring to indicate default is True for security - Updated documentation to reflect new default value - Removed redundant private=True from code example (now default) - Updated security section to mention both private and public Spaces - Clarified that secure secrets handling protects credentials even in public Spaces Users can still explicitly set private=False to make Spaces publicly visible, but the safer default protects users who don't explicitly configure visibility. * Update src/zenml/integrations/huggingface/deployers/huggingface_deployer.py * Address PR review feedback Implements three improvements requested in code review: 1. 
Add UUID suffix to Space names for uniqueness (@stefannica) - Added first 8 chars of deployment UUID as suffix: {prefix}-{name}-{uuid} - Updated docstring to mention UUID suffix for uniqueness - Follows same pattern as GCP and AWS deployers - Ensures Space names are unique at the project level 2. Add documentation link for secret syntax (@strickvl) - Added link to ZenML secret reference syntax documentation - Links to: https://docs.zenml.io/how-to/project-setup-and-management/interact-with-secrets - Helps users understand how to use {{secret.key}} syntax 3. Replace deprecated update_repo_visibility API (@strickvl) - Changed from update_repo_visibility to update_repo_settings - update_repo_visibility was removed in huggingface_hub v1.0.0.rc7 - New API: api.update_repo_settings(repo_id, private, repo_type) - Maintains same functionality with updated API All changes maintain backward compatibility and improve code quality. * Address comprehensive PR review feedback from @stefannica Implements all requested improvements from second round of code review: 1. Use specific HfHubHTTPError exception instead of broad Exception - Check for 404 status code when space_info fails - Raise DeploymentProvisionError for other HTTP errors - Provides better error messages and debugging 2. Fail early on environment variable failures - Changed from logger.warning to raise DeploymentProvisionError - Deployment will fail immediately if env vars can't be set - Prevents deployment from starting without required configuration 3. Fail early on secret failures - Changed from logger.warning to raise DeploymentProvisionError - Deployment will fail immediately if secrets can't be set - Critical since secrets are needed for ZenML server connection 4. Fix RUNNING_BUILDING state mapping - Moved RUNNING_BUILDING from RUNNING to PENDING status - Only fully running Spaces return RUNNING status - Health endpoint not available during RUNNING_BUILDING 5. 
Store external state in metadata for debugging - Added runtime.stage to DeploymentOperationalState metadata - Helps with debugging deployment status issues - Provides visibility into HF Space's internal state 6. Use correct exception type in deprovision - Changed from DeploymentNotFoundError to DeploymentDeprovisionError - Signals backend errors vs successful deletion - Updated docstring to document both exception types 7. Document API behavior with clarifying comments - add_space_variable/secret are upsert operations (add or update) - request_space_hardware/storage replace the current tier - Clarifies behavior for redeployments 8. Handle space_id mismatch corner case - Detect when space_id changes (renamed deployment/changed prefix) - Automatically deprovision old Space before creating new one - Prevents orphaned Spaces from accumulating All changes improve error handling, debuggability, and resource cleanup. * Fix Space stage mapping to use only ZenML standard deployment states Addresses feedback about using non-standard deployment states. The previous implementation introduced HuggingFace-specific stages (RUNNING_BUILDING, NO_APP_FILE) into ZenML status mapping without properly handling all cases. Changes: - Import and use SpaceStage enum instead of string matching - Map all 10 HuggingFace Space stages to ZenML's 5 standard states: * RUNNING → RUNNING (only when fully provisioned) * BUILDING, RUNNING_BUILDING → PENDING (health endpoint not available) * BUILD_ERROR, RUNTIME_ERROR, CONFIG_ERROR, NO_APP_FILE → ERROR * STOPPED, PAUSED, DELETING → ABSENT (exists but not running) * Unknown stages → UNKNOWN (future-proofing) Key fix: RUNNING_BUILDING now correctly maps to PENDING, not RUNNING, because the health endpoint is not available during this rebuild phase. Follows same pattern as GCP/AWS deployers which map external service states to ZenML's standard deployment lifecycle states. 
* Add missing DeploymentDeprovisionError import Fixed lint error - DeploymentDeprovisionError was used but not imported. This exception is used in do_deprovision_deployment when deletion fails. * Fix deployment URL to use actual Space domain instead of HF page CRITICAL BUG FIX: The deployer was returning the HuggingFace Space page URL (https://huggingface.co/spaces/{space_id}) instead of the actual deployment endpoint URL. This caused the base deployer to continuously poll because the health check was hitting the HF page instead of the deployment server. Changes: - Extract the actual domain from runtime.raw['domains'] - Construct proper deployment URL: https://{domain} - Only set URL when status is RUNNING (follows GCP/AWS pattern) - URL is None for non-RUNNING states Example: - Before: https://huggingface.co/spaces/zenml/zenml-weather_agent-5917ffec - After: https://zenml-zenml-weather_agent-5917ffec.hf.space This allows the base deployer's health check to succeed and stop polling once the deployment is fully ready. * Update docs/book/component-guide/deployers/huggingface.md * Check domain stage before returning RUNNING status Additional fix for continuous polling issue. The Space can be in RUNNING stage (Docker container started) but the domain might not be ready yet (DNS propagating, routing not configured). This caused premature RUNNING status reports. Changes: - Only return DeploymentStatus.RUNNING when BOTH conditions are met: 1. runtime.stage == SpaceStage.RUNNING 2. domains[0]['stage'] == "READY" - Space RUNNING + domain not ready → PENDING status - Only set deployment URL when domain stage is READY - Added domain_stage to metadata for debugging This ensures health checks only run when the domain is actually ready to receive traffic, not just when the Docker container has started. Note: If polling continues after domain is READY, it likely means the FastAPI app inside the container is still initializing. 
The base deployer will continue polling until the /health endpoint responds with 200 OK. * Fix HfHubHTTPError import for mypy compatibility Changed import from huggingface_hub.utils to huggingface_hub.errors to fix mypy errors about implicit re-exports. HfHubHTTPError is defined in the errors module and should be imported from there directly. * Refactor settings to follow deployer pattern Move HuggingFaceDeployerSettings to flavor module and make HuggingFaceDeployerConfig inherit from it, following the pattern used by other deployers in the codebase. Key changes: - Move HuggingFaceDeployerSettings from deployer to flavor module - Make HuggingFaceDeployerConfig inherit from HuggingFaceDeployerSettings - Remove duplicate space_hardware and space_storage fields from config - Remove app_port setting and use uvicorn_port from DeploymentSettings instead, consistent with docker deployer pattern - Add comprehensive Field descriptions for settings fields This addresses PR feedback from @stefannica. * Add debug logging for deployment state detection Added debug logging to help diagnose why deployments might get stuck in pending state even when Space and domain are ready. This will log the Space stage, domain stage, and domain availability to help identify any issues with state detection logic. Also cleaned up redundant domain_stage variable assignments. * Remove redundant import statements in HuggingFace integration * Restart sleeping Spaces during provisioning When provisioning a deployment to an existing Space that is STOPPED or PAUSED, the deployer now automatically restarts it. This ensures that deployments work correctly even when reusing Spaces that have been put to sleep by HuggingFace's auto-sleep mechanism. 
Key changes: - Check Space runtime state when updating existing Space - Call restart_space() API if Space is STOPPED or PAUSED - Add logging to indicate when Space is being restarted This fixes the bug where deployments would fail silently when the target Space was in a sleeping state. * Add logging for deployment URL when Space is ready Added logging to show the deployment URL when the Space domain is READY. This helps diagnose health check issues during polling by showing exactly what URL the base deployer will attempt to connect to for health checks. * Fix critical bug: use SpaceStage enum values for comparison Fixed a critical bug where we were comparing runtime.stage (a string) directly to SpaceStage enum objects, which always evaluated to False. This caused deployments to never reach RUNNING status and continue polling forever. The HuggingFace API returns runtime.stage as a string (e.g., "RUNNING"), so we must use enum.value for comparison (e.g., SpaceStage.RUNNING.value). Changes: - Use SpaceStage.RUNNING.value instead of SpaceStage.RUNNING - Use enum.value for all stage comparisons - Added comments explaining that runtime.stage is a string - Maintains type safety by using enum values instead of hardcoded strings This fixes the infinite polling issue where deployments would never stop polling even when the Space was running and healthy. * Add comprehensive logging for debugging polling issues Added detailed logging to diagnose two issues: 1. **Unknown status detection**: When the deployment status shows as "unknown", we now log the actual stage value received from HuggingFace and list all known stages. This helps identify if HuggingFace has introduced new stages we don't recognize yet. 2. 
**Health check failures**: Override _check_deployment_health with better logging to show: - The exact health check URL being tested - Whether health check passed or failed - Specific error messages when health check fails - Status codes returned from the endpoint These logs will help diagnose why deployments continue polling even when the Space appears to be running and the health endpoint is working. The logs now show stage transitions and health check results at INFO level for easier debugging without requiring DEBUG logging. * Add support for RUNNING_APP_STARTING stage and fix import Fixed two critical issues discovered in deployment logs: 1. **Added RUNNING_APP_STARTING stage support**: HuggingFace introduced a new intermediate stage "RUNNING_APP_STARTING" that occurs when the container is running but the application inside is still starting up. Map this to PENDING status since the health endpoint isn't ready yet. 2. **Fixed DeploymentDefaultEndpoints import**: Corrected import path from `zenml.enums` to `zenml.config.deployment_settings` to fix ImportError that was preventing health checks from running. These fixes allow deployments to properly transition through all HuggingFace Space stages and reach RUNNING status once the app is fully started. * Skip HTTP health check for HuggingFace Spaces Fixed the issue where private HuggingFace Spaces would continuously poll because the HTTP health check endpoint returns 404 for unauthenticated requests. Why this is the right solution: 1. Private Spaces block unauthenticated HTTP requests (returns 404) 2. We already have reliable health state from HuggingFace API 3. When Space stage is RUNNING and domain is READY, we know the deployment is genuinely healthy 4. HuggingFace platform handles internal health checks for us The health check method now simply returns True, relying on the comprehensive Space runtime state validation we perform in do_get_deployment_state(). 
This allows deployments to complete successfully for both public and private Spaces.

* Remove verbose debug and polling logs

Cleaned up excessive logging that was added for debugging:
- Removed deployment URL log that printed on every poll
- Removed all debug logs showing space state details
- Simplified unknown stage warning message
- Removed health check debug log

Kept important informational logs:
- Creating/updating Space
- Restarting sleeping Spaces
- Unknown stage warnings (simplified)

This significantly reduces log verbosity while maintaining visibility into key deployment lifecycle events.

Note: HTTP Request logs visible in output are from huggingface_hub library's own logging, not from our deployer code.

* Fix bandit B108 false positive warnings in agent examples (#4107)

---------

Co-authored-by: Claude
Co-authored-by: Alex Strick van Linschoten
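The ENTRYPOINT/CMD serialization fix described earlier in this commit message can be reproduced in isolation. The sketch below only shows why `json.dumps()` produces valid Docker exec form where `str()` does not. `dockerfile_exec_line` and the example command are hypothetical and are not taken from the deployer's code.

```python
import json
from typing import List


def dockerfile_exec_line(instruction: str, args: List[str]) -> str:
    """Render an exec-form Dockerfile instruction such as ENTRYPOINT or CMD."""
    # json.dumps emits double-quoted strings, which Docker's exec form requires.
    return f"{instruction} {json.dumps(args)}"


entrypoint = ["python", "-m", "my_module"]  # hypothetical command

# str() yields a Python list repr with single quotes, which Docker does not
# parse as exec form and instead hands to the shell as a plain string.
broken = f"ENTRYPOINT {str(entrypoint)}"
fixed = dockerfile_exec_line("ENTRYPOINT", entrypoint)

print(broken)  # ENTRYPOINT ['python', '-m', 'my_module']
print(fixed)   # ENTRYPOINT ["python", "-m", "my_module"]
```

The single-quoted form is what leads to the `/bin/sh: 1: [python,: not found` error quoted in the commit message.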
1 parent deaedab commit 39f0e07

8 files changed: +1096 -2 lines changed


docs/book/component-guide/deployers/README.md

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ Out of the box, ZenML comes with a `local` deployer already part of the default
| [Docker](docker.md) | `docker` | Built-in | Deploys pipelines as locally running Docker containers |
| [GCP Cloud Run](gcp-cloud-run.md) | `gcp` | `gcp` | Deploys pipelines to Google Cloud Run for serverless execution |
| [AWS App Runner](aws-app-runner.md) | `aws` | `aws` | Deploys pipelines to AWS App Runner for serverless execution |
+| [Hugging Face](huggingface.md) | `huggingface` | `huggingface` | Deploys pipelines to Hugging Face Spaces as Docker Spaces |

If you would like to see the available flavors of deployers, you can use the command:

docs/book/component-guide/deployers/huggingface.md

Lines changed: 250 additions & 0 deletions
@@ -0,0 +1,250 @@
---
description: Deploying your pipelines to Hugging Face Spaces.
---

# Hugging Face Deployer

[Hugging Face Spaces](https://huggingface.co/spaces) is a platform for hosting and sharing machine learning applications. The Hugging Face deployer is a [deployer](./) flavor included in the ZenML Hugging Face integration that deploys your pipelines to Hugging Face Spaces as Docker-based applications.

{% hint style="warning" %}
This component is only meant to be used within the context of a [remote ZenML installation](https://docs.zenml.io/getting-started/deploying-zenml). Usage with a local ZenML setup may lead to unexpected behavior!
{% endhint %}

## When to use it

You should use the Hugging Face deployer if:

* you're already using Hugging Face for model hosting or datasets.
* you want to share your AI pipelines as publicly accessible or private Spaces.
* you're looking for a simple, managed platform for deploying Docker-based applications.
* you want to leverage Hugging Face's infrastructure for hosting your pipeline deployments.
* you need an easy way to showcase ML workflows to the community.

## How to deploy it

{% hint style="info" %}
The Hugging Face deployer requires a remote ZenML installation. You must ensure that you are connected to the remote ZenML server before using this stack component.
{% endhint %}

To use a Hugging Face deployer, you first need to deploy [ZenML to the cloud](https://docs.zenml.io/getting-started/deploying-zenml/).

The only other requirement is a Hugging Face account and an access token with write permissions.

## How to use it

To use the Hugging Face deployer, you need:

* The ZenML `huggingface` integration installed. If you haven't done so, run

  ```shell
  zenml integration install huggingface
  ```
* [Docker](https://www.docker.com) installed and running.
* A [remote artifact store](https://docs.zenml.io/stacks/artifact-stores/) as part of your stack.
* A [remote container registry](https://docs.zenml.io/stacks/container-registries/) as part of your stack.
* A [Hugging Face access token with write permissions](https://huggingface.co/settings/tokens).

### Hugging Face credentials

You need a Hugging Face access token with write permissions to deploy pipelines. You can create one at [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens).

You can provide credentials to the Hugging Face deployer in two ways:

* Pass the token directly when registering the deployer using the `--token` parameter.
* (recommended) Store the token in a ZenML secret and reference it using [secret reference syntax](https://docs.zenml.io/how-to/project-setup-and-management/interact-with-secrets).

### Registering the deployer

The deployer can be registered as follows:

```shell
# Option 1: Direct token (not recommended for production)
zenml deployer register <DEPLOYER_NAME> \
    --flavor=huggingface \
    --token=<YOUR_HF_TOKEN>

# Option 2: Using a secret (recommended)
zenml secret create hf_token --token=<YOUR_HF_TOKEN>
zenml deployer register <DEPLOYER_NAME> \
    --flavor=huggingface \
    --token='{{hf_token.token}}'
```

### Configuring the stack

With the deployer registered, it can be used in the active stack:

```shell
# Register and activate a stack with the new deployer
zenml stack register <STACK_NAME> -D <DEPLOYER_NAME> ... --set
```

{% hint style="info" %}
ZenML will build a Docker image called `<CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME>` which will be referenced in a Dockerfile deployed to your Hugging Face Space. Check out [this page](https://docs.zenml.io/how-to/customize-docker-builds/) if you want to learn more about how ZenML builds these images and how you can customize them.
{% endhint %}

You can now [deploy any ZenML pipeline](https://docs.zenml.io/concepts/deployment) using the Hugging Face deployer:

```shell
zenml pipeline deploy --name my_deployment my_module.my_pipeline
```

### Additional configuration

To further configure the Hugging Face deployer, you can pass any of the following `HuggingFaceDeployerSettings` attributes (defined in the `zenml.integrations.huggingface.flavors.huggingface_deployer_flavor` module) when configuring the deployer, or when defining or deploying your pipeline:

* Basic settings common to all Deployers:

  * `auth_key`: A user-defined authentication key used to authenticate deployment API calls.
  * `generate_auth_key`: Whether to generate and use a random authentication key instead of the user-defined one.
  * `lcm_timeout`: The maximum time in seconds to wait for the deployment lifecycle management to complete.

* Hugging Face Spaces-specific settings:

  * `space_hardware` (default: `None`): Hardware tier for the Space (e.g., `'cpu-basic'`, `'cpu-upgrade'`, `'t4-small'`, `'t4-medium'`, `'a10g-small'`, `'a10g-large'`). If not specified, the free CPU tier is used. See the [Hugging Face Spaces GPU documentation](https://huggingface.co/docs/hub/spaces-gpus) for available options and pricing.
  * `space_storage` (default: `None`): Persistent storage tier for the Space (e.g., `'small'`, `'medium'`, `'large'`). If not specified, no persistent storage is allocated.
  * `private` (default: `True`): Whether to create the Space as private. Set to `False` to make the Space publicly visible to everyone.
  * `app_port` (default: `8000`): Port on which your deployment server listens. The default of 8000 is the ZenML server default. Hugging Face Spaces will route traffic to this port.

Check out [this docs page](https://docs.zenml.io/concepts/steps_and_pipelines/configuration) for more information on how to specify settings.
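
To make the role of `app_port` concrete: a Docker Space declares its SDK and the port to route traffic to in the YAML front matter of its `README.md`. The sketch below renders such a front matter. It is only an illustration; `render_space_readme` is a hypothetical helper, not part of ZenML, and the way the deployer actually generates this file may differ.

```python
def render_space_readme(title: str, app_port: int = 8000) -> str:
    """Render README.md front matter for a Docker Space (hypothetical helper)."""
    if not 1 <= app_port <= 65535:
        raise ValueError(f"app_port must be a valid TCP port, got {app_port}")
    front_matter = [
        "---",
        f"title: {title}",
        "sdk: docker",  # marks the Space as a Docker Space
        f"app_port: {app_port}",  # port that Spaces routes traffic to
        "---",
    ]
    return "\n".join(front_matter) + "\n"


print(render_space_readme("my-deployment"))
```

The title is interpolated as-is, so this sketch assumes it contains no characters that need YAML quoting.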

For example, if you wanted to deploy on GPU hardware with persistent storage, you would configure settings as follows:

```python
from zenml import pipeline
from zenml.integrations.huggingface.flavors.huggingface_deployer_flavor import (
    HuggingFaceDeployerSettings,
)

huggingface_settings = HuggingFaceDeployerSettings(
    space_hardware="t4-small",
    space_storage="small",
    # private=True is the default for security
)

@pipeline(
    settings={
        "deployer": huggingface_settings
    }
)
def my_pipeline():
    ...
```

### Managing deployments

Once deployed, you can manage your deployments using the ZenML CLI:

```shell
# List all deployments
zenml deployment list

# Get deployment status
zenml deployment describe <DEPLOYMENT_NAME>

# Get deployment logs
zenml deployment logs <DEPLOYMENT_NAME>

# Delete a deployment
zenml deployment delete <DEPLOYMENT_NAME>
```

The deployed pipeline will be available as a Hugging Face Space at:
```
https://huggingface.co/spaces/<YOUR_USERNAME>/<SPACE_PREFIX>-<DEPLOYMENT_NAME>
```

By default, the space prefix is `zenml` but this can be configured using the `space_prefix` parameter when registering the deployer.
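
As a rough illustration of the URL pattern above, the following sketch assembles the Space URL from its parts. `space_url` is a hypothetical helper, and the name clean-up it performs (replacing unsupported characters with `-` and falling back to `deployment` for an empty result) is an assumption for this example, not a description of the deployer's exact rules.

```python
import re


def space_url(
    username: str, deployment_name: str, space_prefix: str = "zenml"
) -> str:
    """Build the expected Space URL (hypothetical helper)."""
    # Assumption: anything outside letters, digits, "_", "." and "-" becomes "-".
    cleaned = re.sub(r"[^A-Za-z0-9_.-]+", "-", deployment_name).strip("-")
    # Assumption: fall back to a fixed name if nothing usable is left.
    cleaned = cleaned or "deployment"
    return f"https://huggingface.co/spaces/{username}/{space_prefix}-{cleaned}"


print(space_url("<YOUR_USERNAME>", "my_deployment"))
```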

## Important Requirements

### Secure Secrets and Environment Variables

{% hint style="success" %}
The Hugging Face deployer handles secrets and environment variables **securely** using Hugging Face's Space Secrets and Variables API. Credentials are **never** written to the Dockerfile.
{% endhint %}

**How it works:**
- Environment variables are set using `HfApi.add_space_variable()` and stored by Hugging Face rather than in the Space repository.
- Secrets are set using `HfApi.add_space_secret()`; they are encrypted and never exposed in the Space repository.
- **Nothing is baked into the Dockerfile**, so credentials cannot leak through it, even in public Spaces.

**What this means:**
- ✅ Safe to use with both private and public Spaces
- ✅ Secrets remain encrypted and hidden from view
- ✅ Environment variables are managed through the Hugging Face API
- ✅ No credentials exposed in Dockerfile or repository files

This secure approach ensures that if you choose to make your Space public (`private=False`), credentials remain protected and are never visible to anyone viewing your Space's repository.
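
The two API calls named above can be sketched as follows. `push_space_config` is a hypothetical helper written for this example, not the deployer's implementation; it accepts any object exposing `add_space_variable` and `add_space_secret`, such as a `huggingface_hub.HfApi` instance.

```python
from typing import Any, Dict


def push_space_config(
    api: Any,
    repo_id: str,
    variables: Dict[str, str],
    secrets: Dict[str, str],
) -> None:
    """Send plain variables and secrets to a Space (hypothetical helper)."""
    # Non-sensitive configuration goes through the variables endpoint.
    for key, value in variables.items():
        api.add_space_variable(repo_id=repo_id, key=key, value=value)
    # Sensitive values go through the secrets endpoint.
    for key, value in secrets.items():
        api.add_space_secret(repo_id=repo_id, key=key, value=value)


# Usage (requires network access and a write token):
# from huggingface_hub import HfApi
# push_space_config(
#     HfApi(token="<YOUR_HF_TOKEN>"),
#     "<YOUR_USERNAME>/<SPACE_PREFIX>-<DEPLOYMENT_NAME>",
#     variables={"LOG_LEVEL": "info"},
#     secrets={"API_KEY": "<SECRET_VALUE>"},
# )
```

Keeping the two dictionaries separate makes it explicit which values are treated as sensitive. The variable and secret names in the usage comment are placeholders.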

### Container Registry Requirement

{% hint style="warning" %}
The Hugging Face deployer **requires** a container registry to be part of your ZenML stack. The Docker image must be pre-built and pushed to a **publicly accessible** container registry.
{% endhint %}

**Why public access is required:**
Hugging Face Spaces cannot authenticate with private Docker registries when building Docker Spaces. The platform pulls your Docker image during the build process, which means it needs public access.

**Recommended registries:**
- [Docker Hub](https://hub.docker.com/) public repositories
- [GitHub Container Registry (GHCR)](https://ghcr.io) with public images
- Any other public container registry

**Example setup with GitHub Container Registry:**
```shell
# Register a public container registry
zenml container-registry register ghcr_public \
    --flavor=default \
    --uri=ghcr.io/<your-github-username>

# Add it to your stack
zenml stack update <STACK_NAME> --container-registry=ghcr_public
```

### Configuring iframe Embedding (X-Frame-Options)

By default, ZenML's deployment server sends an `X-Frame-Options` header that prevents the deployment UI from being embedded in iframes. This causes issues with Hugging Face Spaces, which displays deployments in an iframe.

**To fix this**, you must configure your pipeline's `DeploymentSettings` to disable the `X-Frame-Options` header:

```python
from zenml import pipeline
from zenml.config import DeploymentSettings, SecureHeadersConfig

# Configure deployment settings
deployment_settings = DeploymentSettings(
    app_title="My ZenML Pipeline",
    app_description="ML pipeline deployed to Hugging Face Spaces",
    app_version="1.0.0",
    secure_headers=SecureHeadersConfig(
        xfo=False,  # Disable X-Frame-Options to allow iframe embedding
        server=True,
        hsts=False,
        content=True,
        referrer=True,
        cache=True,
        permissions=True,
    ),
    cors={
        "allow_origins": ["*"],
        "allow_methods": ["GET", "POST", "OPTIONS"],
        "allow_headers": ["*"],
        "allow_credentials": False,
    },
)

@pipeline(
    name="my_hf_pipeline",
    settings={"deployment": deployment_settings}
)
def my_pipeline():
    # Your pipeline steps here
    pass
```

Without this configuration, the Hugging Face Spaces UI will show a blank page or errors when trying to display your deployment.
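
If you want to confirm that the header is really gone, you can inspect the response headers of your deployment. The sketch below is a minimal check, assuming you have already fetched the headers with an HTTP client of your choice; `blocks_iframe_embedding` is a hypothetical helper that only looks at `X-Frame-Options` and ignores other mechanisms such as a `Content-Security-Policy` `frame-ancestors` directive.

```python
from typing import Mapping


def blocks_iframe_embedding(headers: Mapping[str, str]) -> bool:
    """Return True if X-Frame-Options forbids cross-origin framing."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {k.lower(): v.strip().lower() for k, v in headers.items()}
    xfo = normalized.get("x-frame-options", "")
    # Both values prevent a page on another origin from framing the response.
    return xfo in {"deny", "sameorigin"}


print(blocks_iframe_embedding({"X-Frame-Options": "SAMEORIGIN"}))
print(blocks_iframe_embedding({"Content-Type": "text/html"}))
```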

## Additional Resources

* [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
* [Docker Spaces Guide](https://huggingface.co/docs/hub/spaces-sdks-docker)
* [Hugging Face Hardware Options](https://huggingface.co/docs/hub/spaces-gpus)
* [ZenML Deployment Concepts](https://docs.zenml.io/concepts/deployment)

docs/book/component-guide/toc.md

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@
   * [Docker Deployer](deployers/docker.md)
   * [AWS App Runner Deployer](deployers/aws-app-runner.md)
   * [GCP Cloud Run Deployer](deployers/gcp-cloud-run.md)
+  * [Hugging Face Deployer](deployers/huggingface.md)
 * [Artifact Stores](artifact-stores/README.md)
   * [Local Artifact Store](artifact-stores/local.md)
   * [Amazon Simple Cloud Storage (S3)](artifact-stores/s3.md)

src/zenml/integrations/huggingface/__init__.py

Lines changed: 4 additions & 2 deletions
@@ -20,6 +20,7 @@
 from zenml.stack import Flavor

 HUGGINGFACE_MODEL_DEPLOYER_FLAVOR = "huggingface"
+HUGGINGFACE_DEPLOYER_FLAVOR = "huggingface"
 HUGGINGFACE_SERVICE_ARTIFACT = "hf_deployment_service"


@@ -65,15 +66,16 @@ def get_requirements(cls, target_os: Optional[str] = None, python_version: Optio

     @classmethod
     def flavors(cls) -> List[Type[Flavor]]:
-        """Declare the stack component flavors for the Huggingface integration.
+        """Declare the stack component flavors for the Hugging Face integration.

         Returns:
             List of stack component flavors for this integration.
         """
         from zenml.integrations.huggingface.flavors import (
+            HuggingFaceDeployerFlavor,
             HuggingFaceModelDeployerFlavor,
         )

-        return [HuggingFaceModelDeployerFlavor]
+        return [HuggingFaceDeployerFlavor, HuggingFaceModelDeployerFlavor]


Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# Copyright (c) ZenML GmbH 2025. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
# or implied. See the License for the specific language governing
# permissions and limitations under the License.
"""Hugging Face deployers."""

from zenml.integrations.huggingface.deployers.huggingface_deployer import (
    HuggingFaceDeployer,
)

__all__ = [
    "HuggingFaceDeployer",
]
