diff --git a/docs/book/component-guide/deployers/README.md b/docs/book/component-guide/deployers/README.md index 173423301b3..4cc7a67af08 100644 --- a/docs/book/component-guide/deployers/README.md +++ b/docs/book/component-guide/deployers/README.md @@ -33,6 +33,7 @@ Out of the box, ZenML comes with a `local` deployer already part of the default | [Docker](docker.md) | `docker` | Built-in | Deploys pipelines as locally running Docker containers | | [GCP Cloud Run](gcp-cloud-run.md) | `gcp` | `gcp` | Deploys pipelines to Google Cloud Run for serverless execution | | [AWS App Runner](aws-app-runner.md) | `aws` | `aws` | Deploys pipelines to AWS App Runner for serverless execution | +| [Hugging Face](huggingface.md) | `huggingface` | `huggingface` | Deploys pipelines to Hugging Face Spaces as Docker Spaces | If you would like to see the available flavors of deployers, you can use the command: diff --git a/docs/book/component-guide/deployers/huggingface.md b/docs/book/component-guide/deployers/huggingface.md new file mode 100644 index 00000000000..f7e7560de1c --- /dev/null +++ b/docs/book/component-guide/deployers/huggingface.md @@ -0,0 +1,250 @@ +--- +description: Deploying your pipelines to Hugging Face Spaces. +--- + +# Hugging Face Deployer + +[Hugging Face Spaces](https://huggingface.co/spaces) is a platform for hosting and sharing machine learning applications. The Hugging Face deployer is a [deployer](./) flavor included in the ZenML Hugging Face integration that deploys your pipelines to Hugging Face Spaces as Docker-based applications. + +{% hint style="warning" %} +This component is only meant to be used within the context of a [remote ZenML installation](https://docs.zenml.io/getting-started/deploying-zenml). Usage with a local ZenML setup may lead to unexpected behavior! +{% endhint %} + +## When to use it + +You should use the Hugging Face deployer if: + +* you're already using Hugging Face for model hosting or datasets. 
+* you want to share your AI pipelines as publicly accessible or private Spaces. +* you're looking for a simple, managed platform for deploying Docker-based applications. +* you want to leverage Hugging Face's infrastructure for hosting your pipeline deployments. +* you need an easy way to showcase ML workflows to the community. + +## How to deploy it + +{% hint style="info" %} +The Hugging Face deployer requires a remote ZenML installation. You must ensure that you are connected to the remote ZenML server before using this stack component. +{% endhint %} + +In order to use a Hugging Face deployer, you need to first deploy [ZenML to the cloud](https://docs.zenml.io/getting-started/deploying-zenml/). + +The only other requirement is having a Hugging Face account and generating an access token with write permissions. + +## How to use it + +To use the Hugging Face deployer, you need: + +* The ZenML `huggingface` integration installed. If you haven't done so, run + + ```shell + zenml integration install huggingface + ``` +* [Docker](https://www.docker.com) installed and running. +* A [remote artifact store](https://docs.zenml.io/stacks/artifact-stores/) as part of your stack. +* A [remote container registry](https://docs.zenml.io/stacks/container-registries/) as part of your stack. +* A [Hugging Face access token with write permissions](https://huggingface.co/settings/tokens) + +### Hugging Face credentials + +You need a Hugging Face access token with write permissions to deploy pipelines. You can create one at [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens). 
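Besides these two options, the deployer implementation also falls back to the `HF_TOKEN` environment variable when no token is configured on the component (see `_get_token` in the deployer source below). A minimal stdlib sketch of that lookup order — the function name here is illustrative, not part of the ZenML API:

```python
import os
from typing import Optional


def resolve_hf_token(config_token: Optional[str] = None) -> Optional[str]:
    # Mirrors the deployer's lookup order: an explicitly configured
    # token (or an already-resolved secret reference) wins; otherwise
    # fall back to the HF_TOKEN environment variable.
    return config_token or os.environ.get("HF_TOKEN")


# An explicit token takes precedence over the environment.
os.environ["HF_TOKEN"] = "hf_from_env"
assert resolve_hf_token("hf_explicit") == "hf_explicit"
assert resolve_hf_token() == "hf_from_env"
```

Note that the environment variable is only consulted when the component has no token of its own, so a secret-backed `--token` setting is never silently overridden.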
+ +You have two options to provide credentials to the Hugging Face deployer: + +* Pass the token directly when registering the deployer using the `--token` parameter +* (recommended) Store the token in a ZenML secret and reference it using [secret reference syntax](https://docs.zenml.io/how-to/project-setup-and-management/interact-with-secrets) + +### Registering the deployer + +The deployer can be registered as follows: + +```shell +# Option 1: Direct token (not recommended for production) +zenml deployer register \ + --flavor=huggingface \ + --token= + +# Option 2: Using a secret (recommended) +zenml secret create hf_token --token= +zenml deployer register \ + --flavor=huggingface \ + --token='{{hf_token.token}}' +``` + +### Configuring the stack + +With the deployer registered, it can be used in the active stack: + +```shell +# Register and activate a stack with the new deployer +zenml stack register -D ... --set +``` + +{% hint style="info" %} +ZenML will build a Docker image called `/zenml:` which will be referenced in a Dockerfile deployed to your Hugging Face Space. Check out [this page](https://docs.zenml.io/how-to/customize-docker-builds/) if you want to learn more about how ZenML builds these images and how you can customize them. +{% endhint %} + +You can now [deploy any ZenML pipeline](https://docs.zenml.io/concepts/deployment) using the Hugging Face deployer: + +```shell +zenml pipeline deploy --name my_deployment my_module.my_pipeline +``` + +### Additional configuration + +For additional configuration of the Hugging Face deployer, you can pass the following `HuggingFaceDeployerSettings` attributes defined in the `zenml.integrations.huggingface.flavors.huggingface_deployer_flavor` module when configuring the deployer or defining or deploying your pipeline: + +* Basic settings common to all Deployers: + + * `auth_key`: A user-defined authentication key to use to authenticate with deployment API calls. 
+ * `generate_auth_key`: Whether to generate and use a random authentication key instead of the user-defined one. + * `lcm_timeout`: The maximum time in seconds to wait for the deployment lifecycle management to complete. + +* Hugging Face Spaces-specific settings: + + * `space_hardware` (default: `None`): Hardware tier for the Space (e.g., `'cpu-basic'`, `'cpu-upgrade'`, `'t4-small'`, `'t4-medium'`, `'a10g-small'`, `'a10g-large'`). If not specified, uses free CPU tier. See [Hugging Face Spaces GPU documentation](https://huggingface.co/docs/hub/spaces-gpus) for available options and pricing. + * `space_storage` (default: `None`): Persistent storage tier for the Space (e.g., `'small'`, `'medium'`, `'large'`). If not specified, no persistent storage is allocated. + * `private` (default: `True`): Whether to create the Space as private. Set to `False` to make the Space publicly visible to everyone. + * `app_port` (default: `8000`): Port number where your deployment server listens. Defaults to 8000 (ZenML server default). Hugging Face Spaces will route traffic to this port. + +Check out [this docs page](https://docs.zenml.io/concepts/steps_and_pipelines/configuration) for more information on how to specify settings. + +For example, if you wanted to deploy on GPU hardware with persistent storage, you would configure settings as follows: + +```python +from zenml.integrations.huggingface.deployers import HuggingFaceDeployerSettings + +huggingface_settings = HuggingFaceDeployerSettings( + space_hardware="t4-small", + space_storage="small", + # private=True is the default for security +) + +@pipeline( + settings={ + "deployer": huggingface_settings + } +) +def my_pipeline(...): + ... 
+``` + +### Managing deployments + +Once deployed, you can manage your deployments using the ZenML CLI: + +```shell +# List all deployments +zenml deployment list + +# Get deployment status +zenml deployment describe + +# Get deployment logs +zenml deployment logs + +# Delete a deployment +zenml deployment delete +``` + +The deployed pipeline will be available as a Hugging Face Space at: +``` +https://huggingface.co/spaces//- +``` + +By default, the space prefix is `zenml` but this can be configured using the `space_prefix` parameter when registering the deployer. + +## Important Requirements + +### Secure Secrets and Environment Variables + +{% hint style="success" %} +The Hugging Face deployer handles secrets and environment variables **securely** using Hugging Face's Space Secrets and Variables API. Credentials are **never** written to the Dockerfile. +{% endhint %} + +**How it works:** +- Environment variables are set using `HfApi.add_space_variable()` - stored securely by Hugging Face +- Secrets are set using `HfApi.add_space_secret()` - encrypted and never exposed in the Space repository +- **Nothing is baked into the Dockerfile** - no risk of leaked credentials even in public Spaces + +**What this means:** +- ✅ Safe to use with both private and public Spaces +- ✅ Secrets remain encrypted and hidden from view +- ✅ Environment variables are managed through HF's secure API +- ✅ No credentials exposed in Dockerfile or repository files + +This secure approach ensures that if you choose to make your Space public (`private=False`), credentials remain protected and are never visible to anyone viewing your Space's repository. + +### Container Registry Requirement + +{% hint style="warning" %} +The Hugging Face deployer **requires** a container registry to be part of your ZenML stack. The Docker image must be pre-built and pushed to a **publicly accessible** container registry. 
+{% endhint %} + +**Why public access is required:** +Hugging Face Spaces cannot authenticate with private Docker registries when building Docker Spaces. The platform pulls your Docker image during the build process, which means it needs public access. + +**Recommended registries:** +- [Docker Hub](https://hub.docker.com/) public repositories +- [GitHub Container Registry (GHCR)](https://ghcr.io) with public images +- Any other public container registry + +**Example setup with GitHub Container Registry:** +```shell +# Register a public container registry +zenml container-registry register ghcr_public \ + --flavor=default \ + --uri=ghcr.io/ + +# Add it to your stack +zenml stack update --container-registry=ghcr_public +``` + +### Configuring iframe Embedding (X-Frame-Options) + +By default, ZenML's deployment server sends an `X-Frame-Options` header that prevents the deployment UI from being embedded in iframes. This causes issues with Hugging Face Spaces, which displays deployments in an iframe. 
+ +**To fix this**, you must configure your pipeline's `DeploymentSettings` to disable the `X-Frame-Options` header: + +```python +from zenml import pipeline +from zenml.config import DeploymentSettings, SecureHeadersConfig + +# Configure deployment settings +deployment_settings = DeploymentSettings( + app_title="My ZenML Pipeline", + app_description="ML pipeline deployed to Hugging Face Spaces", + app_version="1.0.0", + secure_headers=SecureHeadersConfig( + xfo=False, # Disable X-Frame-Options to allow iframe embedding + server=True, + hsts=False, + content=True, + referrer=True, + cache=True, + permissions=True, + ), + cors={ + "allow_origins": ["*"], + "allow_methods": ["GET", "POST", "OPTIONS"], + "allow_headers": ["*"], + "allow_credentials": False, + }, +) + +@pipeline( + name="my_hf_pipeline", + settings={"deployment": deployment_settings} +) +def my_pipeline(): + # Your pipeline steps here + pass +``` + +Without this configuration, the Hugging Face Spaces UI will show a blank page or errors when trying to display your deployment. 
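The effect of this header can be demonstrated with a stdlib-only sketch that is independent of ZenML and Hugging Face: a throwaway local HTTP server either sends or omits `X-Frame-Options`, and a client inspects the response. (The `DENY` value is used purely for illustration of the blocking behavior; browsers also refuse cross-origin framing when the header is set to `SAMEORIGIN`.)

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional


def make_handler(send_xfo: bool):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            if send_xfo:
                # With this header present, browsers refuse to render
                # the page inside an iframe -- the Spaces UI would show
                # a blank frame.
                self.send_header("X-Frame-Options", "DENY")
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"ok")

        def log_message(self, *args):
            pass  # silence per-request logging

    return Handler


def probe(send_xfo: bool) -> Optional[str]:
    """Serve one configuration locally and return the header the client sees."""
    server = HTTPServer(("127.0.0.1", 0), make_handler(send_xfo))
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        with urllib.request.urlopen(url) as resp:
            return resp.headers.get("X-Frame-Options")
    finally:
        server.shutdown()
        server.server_close()


assert probe(send_xfo=True) == "DENY"  # embedding blocked
assert probe(send_xfo=False) is None   # embeddable in the Spaces iframe
```

Disabling `xfo` in `SecureHeadersConfig`, as shown above, corresponds to the second case: the header is simply omitted, so the Spaces iframe can render the deployment UI.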
+ +## Additional Resources + +* [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces) +* [Docker Spaces Guide](https://huggingface.co/docs/hub/spaces-sdks-docker) +* [Hugging Face Hardware Options](https://huggingface.co/docs/hub/spaces-gpus) +* [ZenML Deployment Concepts](https://docs.zenml.io/concepts/deployment) diff --git a/docs/book/component-guide/toc.md b/docs/book/component-guide/toc.md index 6ee235b169c..53a867ed184 100644 --- a/docs/book/component-guide/toc.md +++ b/docs/book/component-guide/toc.md @@ -25,6 +25,7 @@ * [Docker Deployer](deployers/docker.md) * [AWS App Runner Deployer](deployers/aws-app-runner.md) * [GCP Cloud Run Deployer](deployers/gcp-cloud-run.md) + * [Hugging Face Deployer](deployers/huggingface.md) * [Artifact Stores](artifact-stores/README.md) * [Local Artifact Store](artifact-stores/local.md) * [Amazon Simple Cloud Storage (S3)](artifact-stores/s3.md) diff --git a/src/zenml/integrations/huggingface/__init__.py b/src/zenml/integrations/huggingface/__init__.py index 0fb3ce74214..83e3908af02 100644 --- a/src/zenml/integrations/huggingface/__init__.py +++ b/src/zenml/integrations/huggingface/__init__.py @@ -20,6 +20,7 @@ from zenml.stack import Flavor HUGGINGFACE_MODEL_DEPLOYER_FLAVOR = "huggingface" +HUGGINGFACE_DEPLOYER_FLAVOR = "huggingface" HUGGINGFACE_SERVICE_ARTIFACT = "hf_deployment_service" @@ -65,15 +66,16 @@ def get_requirements(cls, target_os: Optional[str] = None, python_version: Optio @classmethod def flavors(cls) -> List[Type[Flavor]]: - """Declare the stack component flavors for the Huggingface integration. + """Declare the stack component flavors for the Hugging Face integration. Returns: List of stack component flavors for this integration. 
""" from zenml.integrations.huggingface.flavors import ( + HuggingFaceDeployerFlavor, HuggingFaceModelDeployerFlavor, ) - return [HuggingFaceModelDeployerFlavor] + return [HuggingFaceDeployerFlavor, HuggingFaceModelDeployerFlavor] diff --git a/src/zenml/integrations/huggingface/deployers/__init__.py b/src/zenml/integrations/huggingface/deployers/__init__.py new file mode 100644 index 00000000000..f757b988979 --- /dev/null +++ b/src/zenml/integrations/huggingface/deployers/__init__.py @@ -0,0 +1,22 @@ +# Copyright (c) ZenML GmbH 2025. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# https://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express +# or implied. See the License for the specific language governing +# permissions and limitations under the License. +"""Hugging Face deployers.""" + +from zenml.integrations.huggingface.deployers.huggingface_deployer import ( + HuggingFaceDeployer, +) + +__all__ = [ + "HuggingFaceDeployer", +] diff --git a/src/zenml/integrations/huggingface/deployers/huggingface_deployer.py b/src/zenml/integrations/huggingface/deployers/huggingface_deployer.py new file mode 100644 index 00000000000..6d39f31a3dc --- /dev/null +++ b/src/zenml/integrations/huggingface/deployers/huggingface_deployer.py @@ -0,0 +1,620 @@ +# Copyright (c) ZenML GmbH 2025. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at: +# +# https://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express +# or implied. See the License for the specific language governing +# permissions and limitations under the License. +"""Implementation of the ZenML Hugging Face deployer.""" + +import json +import os +import re +import tempfile +from typing import ( + TYPE_CHECKING, + Dict, + Generator, + Optional, + Tuple, + Type, + cast, +) + +from zenml.config.base_settings import BaseSettings +from zenml.deployers.containerized_deployer import ContainerizedDeployer +from zenml.deployers.exceptions import ( + DeployerError, + DeploymentDeprovisionError, + DeploymentLogsNotFoundError, + DeploymentNotFoundError, + DeploymentProvisionError, +) +from zenml.deployers.server.entrypoint_configuration import ( + DEPLOYMENT_ID_OPTION, + DeploymentEntrypointConfiguration, +) +from zenml.enums import DeploymentStatus +from zenml.integrations.huggingface.flavors.huggingface_deployer_flavor import ( + HuggingFaceDeployerConfig, + HuggingFaceDeployerSettings, +) +from zenml.logger import get_logger +from zenml.models import DeploymentOperationalState, DeploymentResponse +from zenml.stack.stack_validator import StackValidator + +if TYPE_CHECKING: + from huggingface_hub import HfApi + + from zenml.stack import Stack + +# HF Space name max length (repo name limit) +HF_SPACE_NAME_MAX_LENGTH = 96 + +logger = get_logger(__name__) + + +class HuggingFaceDeployer(ContainerizedDeployer): + """Deployer that runs deployments as Hugging Face Spaces.""" + + @property + def settings_class(self) -> Optional[Type[BaseSettings]]: + """Settings class for the Hugging Face deployer. + + Returns: + The settings class. 
+ """ + return HuggingFaceDeployerSettings + + @property + def config(self) -> HuggingFaceDeployerConfig: + """Returns the `HuggingFaceDeployerConfig` config. + + Returns: + The configuration. + """ + return cast(HuggingFaceDeployerConfig, self._config) + + @property + def validator(self) -> Optional[StackValidator]: + """Validates the stack. + + Returns: + A validator that checks required stack components. + """ + + def _validate_requirements( + stack: "Stack", + ) -> Tuple[bool, str]: + """Check if all requirements are met. + + Args: + stack: The stack to validate. + + Returns: + Tuple of (is_valid, message). + """ + # Check token + if not self.config.token: + return False, ( + "The Hugging Face deployer requires a token to be " + "configured. Use --token parameter with a direct token " + "value or reference a ZenML secret with {{secret.key}} syntax." + ) + + # Check container registry + if not stack.container_registry: + return False, ( + "The Hugging Face deployer requires a container registry " + "to be part of the stack. The Docker image must be " + "pre-built and pushed to a publicly accessible registry." + ) + + return True, "" + + return StackValidator( + custom_validation_function=_validate_requirements, + ) + + def _get_token(self) -> Optional[str]: + """Get the Hugging Face token. + + Returns: + The token from config or environment. If config.token uses secret + reference syntax like {{secret.key}}, ZenML automatically resolves it. + """ + return self.config.token or os.environ.get("HF_TOKEN") + + def _get_hf_api(self) -> "HfApi": + """Get the Hugging Face API client. + + Returns: + The Hugging Face API client. + + Raises: + DeployerError: If huggingface_hub is not installed. + """ + try: + from huggingface_hub import HfApi + except ImportError: + raise DeployerError( + "huggingface_hub is required. 
Install with: pip install huggingface_hub" + ) + + token = self._get_token() + return HfApi(token=token) + + def _get_space_id(self, deployment: DeploymentResponse) -> str: + """Get the Space ID for a deployment. + + Supports deploying to either a user account or organization. + + Args: + deployment: The deployment. + + Returns: + The Space ID in format 'owner/space-name' where owner is either + the username or organization name, and space-name includes a + UUID suffix for uniqueness. + + Raises: + DeployerError: If the space name exceeds HF's maximum length. + """ + # Get owner (organization or username) + if self.config.organization: + owner = self.config.organization + else: + api = self._get_hf_api() + owner = api.whoami()["name"] + + # Sanitize deployment name: alphanumeric, hyphens, underscores only + sanitized = re.sub(r"[^a-zA-Z0-9\-_]", "-", deployment.name).lower() + sanitized = sanitized.strip("-") or "deployment" + + # Add UUID suffix for uniqueness (first 8 chars of deployment ID) + uuid_suffix = str(deployment.id)[:8] + space_name = f"{self.config.space_prefix}-{sanitized}-{uuid_suffix}" + + # Validate length (HF has 96 char limit for repo names) + if len(space_name) > HF_SPACE_NAME_MAX_LENGTH: + raise DeployerError( + f"Space name '{space_name}' exceeds Hugging Face's " + f"maximum length of {HF_SPACE_NAME_MAX_LENGTH} characters. " + f"Please use a shorter deployment name or space_prefix." + ) + + return f"{owner}/{space_name}" + + def _get_entrypoint_and_command( + self, deployment: DeploymentResponse + ) -> Tuple[str, str]: + """Generate ENTRYPOINT and CMD for the Dockerfile. + + Args: + deployment: The deployment. + + Returns: + Tuple of (ENTRYPOINT line, CMD line) for Dockerfile. 
+ """ + # Get entrypoint command: ["python", "-m", "zenml.entrypoints.entrypoint"] + entrypoint = DeploymentEntrypointConfiguration.get_entrypoint_command() + + # Get arguments with deployment ID + arguments = DeploymentEntrypointConfiguration.get_entrypoint_arguments( + **{DEPLOYMENT_ID_OPTION: deployment.id} + ) + + # Format as JSON arrays for Dockerfile exec form + # Use json.dumps() to ensure proper JSON with double quotes + entrypoint_line = f"ENTRYPOINT {json.dumps(entrypoint)}" + cmd_line = f"CMD {json.dumps(arguments)}" + + return entrypoint_line, cmd_line + + def _generate_image_reference_dockerfile( + self, + image: str, + deployment: DeploymentResponse, + ) -> str: + """Generate Dockerfile that references a pre-built image. + + Note: Environment variables and secrets are NOT included in the + Dockerfile for security reasons. They are set using Hugging Face's + Space secrets and variables API instead. + + Args: + image: The pre-built image to reference. + deployment: The deployment. + + Returns: + The Dockerfile content. + """ + lines = [f"FROM {image}"] + + # Add user + lines.append("USER 1000") + + # Add entrypoint and command + entrypoint_line, cmd_line = self._get_entrypoint_and_command( + deployment + ) + lines.append(entrypoint_line) + lines.append(cmd_line) + + return "\n".join(lines) + + def do_provision_deployment( + self, + deployment: DeploymentResponse, + stack: "Stack", + environment: Dict[str, str], + secrets: Dict[str, str], + timeout: int, + ) -> DeploymentOperationalState: + """Provision a Huggingface Space deployment. + + Args: + deployment: The deployment to run. + stack: The active stack. + environment: Environment variables for the app. + secrets: Secret environment variables for the app. + timeout: Maximum time to wait for deployment (unused). + + Returns: + Operational state of the provisioned deployment. + + Raises: + DeploymentProvisionError: If the deployment cannot be provisioned. 
+ """ + assert deployment.snapshot, "Pipeline snapshot not found" + + settings = cast( + HuggingFaceDeployerSettings, + self.get_settings(deployment.snapshot), + ) + + api = self._get_hf_api() + space_id = self._get_space_id(deployment) + image = self.get_image(deployment.snapshot) + + # Handle space_id mismatch (e.g., renamed deployment or changed prefix) + old_space_id = deployment.deployment_metadata.get("space_id") + if old_space_id and old_space_id != space_id: + logger.info( + f"Space ID changed from {old_space_id} to {space_id}. " + f"Cleaning up old Space..." + ) + try: + self.do_deprovision_deployment(deployment, timeout=0) + except Exception as e: + logger.warning( + f"Failed to clean up old Space {old_space_id}: {e}" + ) + + logger.info( + f"Deploying image {image} to Hugging Face Space. " + "Ensure the image is publicly accessible." + ) + + try: + from huggingface_hub.errors import HfHubHTTPError + + # Create Space if it doesn't exist, or update visibility if needed + try: + space_info = api.space_info(space_id) + logger.info(f"Updating existing Space: {space_id}") + + # Update visibility if changed + if space_info.private != settings.private: + logger.info( + f"Updating Space visibility: " + f"{'private' if settings.private else 'public'}" + ) + api.update_repo_settings( + repo_id=space_id, + private=settings.private, + repo_type="space", + ) + except HfHubHTTPError as e: + if e.response.status_code != 404: + raise DeploymentProvisionError( + f"Failed to check Space {space_id}: {e}" + ) from e + logger.info(f"Creating new Space: {space_id}") + api.create_repo( + repo_id=space_id, + repo_type="space", + space_sdk="docker", + private=settings.private, + ) + + # Upload Dockerfile and README + with tempfile.TemporaryDirectory() as tmpdir: + # Create README + readme = os.path.join(tmpdir, "README.md") + # Get port from deployment settings + port = deployment.snapshot.pipeline_configuration.deployment_settings.uvicorn_port + with open(readme, "w") as f: + 
f.write( + f"---\ntitle: {deployment.name}\nsdk: docker\n" + f"app_port: {port}\n---\n" + ) + + # Create Dockerfile + dockerfile = os.path.join(tmpdir, "Dockerfile") + dockerfile_content = self._generate_image_reference_dockerfile( + image, deployment + ) + + with open(dockerfile, "w") as f: + f.write(dockerfile_content) + + # Upload README + api.upload_file( + path_or_fileobj=readme, + path_in_repo="README.md", + repo_id=space_id, + repo_type="space", + ) + + # Upload Dockerfile + api.upload_file( + path_or_fileobj=dockerfile, + path_in_repo="Dockerfile", + repo_id=space_id, + repo_type="space", + ) + + # Set environment variables using Space variables API + # This is secure - variables are not exposed in the Dockerfile + # Note: add_space_variable is an upsert operation (adds or updates) + logger.info(f"Setting {len(environment)} environment variables...") + for key, value in environment.items(): + try: + api.add_space_variable( + repo_id=space_id, + key=key, + value=value, + ) + except Exception as e: + raise DeploymentProvisionError( + f"Failed to set environment variable {key}: {e}" + ) from e + + # Set secrets using Space secrets API + # This is secure - secrets are encrypted and not exposed + # Note: add_space_secret is an upsert operation (adds or updates) + logger.info(f"Setting {len(secrets)} secrets...") + for key, value in secrets.items(): + try: + api.add_space_secret( + repo_id=space_id, + key=key, + value=value, + ) + except Exception as e: + raise DeploymentProvisionError( + f"Failed to set secret {key}: {e}" + ) from e + + # Set hardware if specified (fail if this doesn't work) + # Note: request_space_hardware replaces the current hardware tier + hardware = settings.space_hardware or self.config.space_hardware + if hardware: + from huggingface_hub import SpaceHardware + + try: + api.request_space_hardware( + repo_id=space_id, + hardware=getattr( + SpaceHardware, hardware.upper().replace("-", "_") + ), + ) + logger.info(f"Requested hardware: 
{hardware}") + except AttributeError: + raise DeploymentProvisionError( + f"Invalid hardware tier '{hardware}'. " + f"See https://huggingface.co/docs/hub/spaces-gpus" + ) + except Exception as e: + raise DeploymentProvisionError( + f"Failed to set hardware {hardware}: {e}" + ) from e + + # Set storage if specified (fail if this doesn't work) + # Note: request_space_storage replaces the current storage tier + storage = settings.space_storage or self.config.space_storage + if storage: + from huggingface_hub import SpaceStorage + + try: + api.request_space_storage( + repo_id=space_id, + storage=getattr(SpaceStorage, storage.upper()), + ) + logger.info(f"Requested storage: {storage}") + except AttributeError: + raise DeploymentProvisionError( + f"Invalid storage tier '{storage}'. " + f"Valid options: small, medium, large" + ) + except Exception as e: + raise DeploymentProvisionError( + f"Failed to set storage {storage}: {e}" + ) from e + + space_url = f"https://huggingface.co/spaces/{space_id}" + return DeploymentOperationalState( + status=DeploymentStatus.PENDING, + url=space_url, + metadata={"space_id": space_id}, + ) + + except Exception as e: + raise DeploymentProvisionError( + f"Failed to provision Space: {e}" + ) from e + + def do_get_deployment_state( + self, deployment: DeploymentResponse + ) -> DeploymentOperationalState: + """Get information about a Huggingface Space deployment. + + Args: + deployment: The deployment to inspect. + + Returns: + Operational state of the deployment. + + Raises: + DeploymentNotFoundError: If the Space is not found. 
+ """ + space_id = deployment.deployment_metadata.get("space_id") + if not space_id: + raise DeploymentNotFoundError("Space ID not found in metadata") + + api = self._get_hf_api() + + try: + from huggingface_hub import SpaceStage + + runtime = api.get_space_runtime(repo_id=space_id) + + # Debug logging + domains = runtime.raw.get("domains", []) + domain_stage = domains[0].get("stage") if domains else None + logger.debug( + f"Space {space_id} state: stage={runtime.stage}, " + f"domain_stage={domain_stage}, has_domains={bool(domains)}" + ) + + # Map HuggingFace Space stages to ZenML standard deployment states + # Only RUNNING + domain READY means fully provisioned with health endpoint available + if runtime.stage == SpaceStage.RUNNING: + # Check if domain is also ready (not just Space running) + if domains and domains[0].get("stage") == "READY": + status = DeploymentStatus.RUNNING + logger.debug( + f"Space {space_id} is fully ready: " + f"stage=RUNNING, domain_stage=READY" + ) + else: + # Space is running but domain not ready yet (DNS propagating, etc.) 
+ status = DeploymentStatus.PENDING + logger.debug( + f"Space {space_id} is running but domain not ready: " + f"domain_stage={domain_stage}" + ) + # Building/updating states - health endpoint not yet available + elif runtime.stage in [ + SpaceStage.BUILDING, + SpaceStage.RUNNING_BUILDING, # Rebuilding, not fully ready + ]: + status = DeploymentStatus.PENDING + # Error states - deployment failed or misconfigured + elif runtime.stage in [ + SpaceStage.BUILD_ERROR, + SpaceStage.RUNTIME_ERROR, + SpaceStage.CONFIG_ERROR, + SpaceStage.NO_APP_FILE, + ]: + status = DeploymentStatus.ERROR + # Stopped/paused states - deployment exists but not running + elif runtime.stage in [ + SpaceStage.STOPPED, + SpaceStage.PAUSED, + SpaceStage.DELETING, + ]: + status = DeploymentStatus.ABSENT + else: + # Unknown/future stages + status = DeploymentStatus.UNKNOWN + + # Get deployment URL from Space domains (only when fully ready) + url = None + # Only set URL if domain is ready for traffic + if ( + domain_stage == "READY" + and domains + and domains[0].get("domain") + ): + url = f"https://{domains[0]['domain']}" + + return DeploymentOperationalState( + status=status, + url=url, + metadata={ + "space_id": space_id, + "external_state": runtime.stage, + "domain_stage": domain_stage, + }, + ) + + except Exception as e: + raise DeploymentNotFoundError( + f"Space {space_id} not found: {e}" + ) from e + + def do_get_deployment_state_logs( + self, + deployment: DeploymentResponse, + follow: bool = False, + tail: Optional[int] = None, + ) -> Generator[str, bool, None]: + """Get logs from a Huggingface Space deployment. + + Args: + deployment: The deployment to read logs for. + follow: Stream logs if True (not supported). + tail: Return only last N lines if set (not supported). + + Raises: + DeploymentLogsNotFoundError: Always, as logs are not available via API. + """ + space_id = deployment.deployment_metadata.get("space_id") + raise DeploymentLogsNotFoundError( + f"Logs not available via API. 
View at: " + f"https://huggingface.co/spaces/{space_id}/logs" + ) + + def do_deprovision_deployment( + self, deployment: DeploymentResponse, timeout: int + ) -> Optional[DeploymentOperationalState]: + """Deprovision a Huggingface Space deployment. + + Args: + deployment: The deployment to stop. + timeout: Maximum time to wait for deprovision (unused - deletion is immediate). + + Returns: + None, indicating immediate deletion completed. + + Raises: + DeploymentNotFoundError: If the Space ID is not in metadata. + DeploymentDeprovisionError: If deletion fails. + """ + space_id = deployment.deployment_metadata.get("space_id") + if not space_id: + raise DeploymentNotFoundError("Space ID not found in metadata") + + api = self._get_hf_api() + + try: + from huggingface_hub.errors import HfHubHTTPError + + api.delete_repo(repo_id=space_id, repo_type="space") + logger.info(f"Deleted Space: {space_id}") + return None + except HfHubHTTPError as e: + if e.response.status_code == 404: + logger.info(f"Space {space_id} already deleted") + return None + raise DeploymentDeprovisionError( + f"Failed to delete Space {space_id}: {e}" + ) from e diff --git a/src/zenml/integrations/huggingface/flavors/__init__.py b/src/zenml/integrations/huggingface/flavors/__init__.py index e963d202170..c55e46840f3 100644 --- a/src/zenml/integrations/huggingface/flavors/__init__.py +++ b/src/zenml/integrations/huggingface/flavors/__init__.py @@ -13,6 +13,11 @@ # permissions and limitations under the License. 
"""Hugging Face integration flavors.""" +from zenml.integrations.huggingface.flavors.huggingface_deployer_flavor import ( # noqa + HuggingFaceDeployerConfig, + HuggingFaceDeployerFlavor, + HuggingFaceDeployerSettings, +) from zenml.integrations.huggingface.flavors.huggingface_model_deployer_flavor import ( # noqa HuggingFaceModelDeployerConfig, HuggingFaceModelDeployerFlavor, @@ -20,7 +25,10 @@ ) __all__ = [ + "HuggingFaceDeployerConfig", + "HuggingFaceDeployerFlavor", "HuggingFaceModelDeployerConfig", "HuggingFaceModelDeployerFlavor", "HuggingFaceBaseConfig", + "HuggingFaceDeployerSettings", ] diff --git a/src/zenml/integrations/huggingface/flavors/huggingface_deployer_flavor.py b/src/zenml/integrations/huggingface/flavors/huggingface_deployer_flavor.py new file mode 100644 index 00000000000..f064134de44 --- /dev/null +++ b/src/zenml/integrations/huggingface/flavors/huggingface_deployer_flavor.py @@ -0,0 +1,165 @@ +# Copyright (c) ZenML GmbH 2025. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# https://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express +# or implied. See the License for the specific language governing +# permissions and limitations under the License. 
+"""Huggingface deployer flavor.""" + +from typing import TYPE_CHECKING, Optional, Type + +from pydantic import Field + +from zenml.deployers.base_deployer import ( + BaseDeployerConfig, + BaseDeployerFlavor, + BaseDeployerSettings, +) +from zenml.integrations.huggingface import HUGGINGFACE_DEPLOYER_FLAVOR +from zenml.models import ServiceConnectorRequirements +from zenml.utils.secret_utils import SecretField + +if TYPE_CHECKING: + from zenml.integrations.huggingface.deployers import HuggingFaceDeployer + + +class HuggingFaceDeployerSettings(BaseDeployerSettings): + """Hugging Face deployer settings. + + Attributes: + space_hardware: Hardware tier for the Space (e.g., 'cpu-basic', 't4-small') + space_storage: Persistent storage tier (e.g., 'small', 'medium', 'large') + private: Whether to create a private Space (default: True for security) + """ + + space_hardware: Optional[str] = Field( + default=None, + description="Hardware tier for Space execution. Controls compute resources " + "available to the deployed pipeline. Options: 'cpu-basic' (2 vCPU, 16GB RAM), " + "'cpu-upgrade' (8 vCPU, 32GB RAM), 't4-small' (4 vCPU, 15GB RAM, NVIDIA T4), " + "'t4-medium' (8 vCPU, 30GB RAM, NVIDIA T4). See " + "https://huggingface.co/docs/hub/spaces-gpus for full list. Defaults to " + "cpu-basic if not specified", + ) + space_storage: Optional[str] = Field( + default=None, + description="Persistent storage tier for Space data. Determines available disk " + "space for artifacts and logs. Options: 'small' (20GB), 'medium' (150GB), " + "'large' (1TB). Storage persists across Space restarts. If not specified, " + "uses ephemeral storage that resets on restart", + ) + private: bool = Field( + default=True, + description="Controls whether the deployed Space is private or public. " + "Private Spaces are only accessible to the owner and authorized users. " + "Public Spaces are visible to anyone with the URL. 
Defaults to True for security", + ) + + +class HuggingFaceDeployerConfig( + BaseDeployerConfig, HuggingFaceDeployerSettings +): + """Configuration for the Hugging Face deployer.""" + + token: Optional[str] = SecretField( + default=None, + description="Hugging Face API token for authentication with write permissions. " + "Can reference a ZenML secret using {{secret_name.key}} syntax or provide " + "the token directly. Create tokens at https://huggingface.co/settings/tokens " + "with 'write' access enabled. Example: '{{hf_token.token}}' references the " + "'token' key in the 'hf_token' secret", + ) + organization: Optional[str] = Field( + default=None, + description="Hugging Face organization name to deploy Spaces under. If not " + "specified, Spaces are created under the authenticated user's account. " + "Example: 'zenml' deploys to https://huggingface.co/spaces/zenml/. " + "Requires organization membership with appropriate permissions", + ) + space_prefix: str = Field( + default="zenml", + description="Prefix for Space names to organize deployments and avoid naming " + "conflicts. Combined with deployment name to form the full Space ID. " + "Example: prefix 'zenml' with deployment 'my-pipeline' creates Space " + "'zenml-my-pipeline'. Maximum combined length is 96 characters", + ) + + +class HuggingFaceDeployerFlavor(BaseDeployerFlavor): + """Flavor for the Hugging Face deployer.""" + + @property + def name(self) -> str: + """Name of the flavor. + + Returns: + The flavor name. + """ + return HUGGINGFACE_DEPLOYER_FLAVOR + + @property + def docs_url(self) -> Optional[str]: + """A URL to point at docs explaining this flavor. + + Returns: + A flavor docs url. + """ + return self.generate_default_docs_url() + + @property + def sdk_docs_url(self) -> Optional[str]: + """A URL to point at SDK docs explaining this flavor. + + Returns: + A flavor SDK docs url. 
+ """ + return self.generate_default_sdk_docs_url() + + @property + def logo_url(self) -> str: + """A URL to represent the flavor in the dashboard. + + Returns: + The flavor logo. + """ + return "https://public-flavor-logos.s3.eu-central-1.amazonaws.com/deployer/huggingface.png" + + @property + def config_class(self) -> Type[BaseDeployerConfig]: + """Returns `HuggingFaceDeployerConfig` config class. + + Returns: + The config class. + """ + return HuggingFaceDeployerConfig + + @property + def implementation_class(self) -> Type["HuggingFaceDeployer"]: + """Implementation class for this flavor. + + Returns: + The implementation class. + """ + from zenml.integrations.huggingface.deployers import ( + HuggingFaceDeployer, + ) + + return HuggingFaceDeployer + + @property + def service_connector_requirements( + self, + ) -> Optional[ServiceConnectorRequirements]: + """Service connector resource requirements for this flavor. + + Returns: + Service connector resource requirements. + """ + return None
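The stage-to-status mapping implemented in `do_get_deployment_state` above can be sketched as a standalone function. This is an illustrative sketch only: plain strings stand in for the `huggingface_hub` `SpaceStage` and ZenML `DeploymentStatus` enums so it runs without either package, the function name `map_space_state` is invented here, and the `RUNNING`-with-ready-domain branch is reconstructed from the surrounding logic in the diff.

```python
# Stage names mirror the SpaceStage members checked in the deployer above.
BUILDING_STAGES = {"BUILDING", "RUNNING_BUILDING"}
ERROR_STAGES = {"BUILD_ERROR", "RUNTIME_ERROR", "CONFIG_ERROR", "NO_APP_FILE"}
ABSENT_STAGES = {"STOPPED", "PAUSED", "DELETING"}


def map_space_state(stage: str, domain_stage: str) -> str:
    """Map a Space runtime stage (plus domain readiness) to a deployment status."""
    if stage == "RUNNING":
        # A running Space only counts as RUNNING once its domain serves traffic;
        # otherwise the health endpoint is not reachable yet.
        return "RUNNING" if domain_stage == "READY" else "PENDING"
    if stage in BUILDING_STAGES:
        # Building or rebuilding: the deployment exists but is not ready.
        return "PENDING"
    if stage in ERROR_STAGES:
        return "ERROR"
    if stage in ABSENT_STAGES:
        return "ABSENT"
    # Unknown or future stages fall through to UNKNOWN.
    return "UNKNOWN"
```

Note the ordering: the `RUNNING` check comes first, so `RUNNING_BUILDING` (a rebuild of a previously running Space) is still reported as pending rather than running, matching the elif chain in the deployer.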