---
description: Orchestrating your pipelines to run on Modal's serverless cloud platform.
---

# Modal Orchestrator

Using the ZenML `modal` integration, you can orchestrate and scale your ML pipelines on [Modal's](https://modal.com/) serverless cloud platform with minimal setup and maximum efficiency.

The Modal orchestrator is designed for speed and cost-effectiveness, running entire pipelines in single serverless functions to minimize cold starts and optimize resource utilization.

{% hint style="warning" %}
This component is only meant to be used within the context of a [remote ZenML deployment scenario](https://docs.zenml.io/getting-started/deploying-zenml/). Usage with a local ZenML deployment may lead to unexpected behavior!
{% endhint %}

## When to use it

You should use the Modal orchestrator if:

* you want a serverless solution that scales to zero when not in use.
* you're looking for fast pipeline execution with minimal cold start overhead.
* you want cost-effective ML pipeline orchestration without managing infrastructure.
* you need easy access to GPUs and high-performance computing resources.
* you prefer a simple setup process without complex Kubernetes configurations.

## How to deploy it

The Modal orchestrator runs on Modal's cloud infrastructure, so you don't need to deploy or manage any servers. You just need:

1. A [Modal account](https://modal.com/) (free tier available)
2. The Modal CLI installed and authenticated
3. A [remote ZenML deployment](https://docs.zenml.io/getting-started/deploying-zenml/) for production use

## How to use it

To use the Modal orchestrator, you need:

* The ZenML `modal` integration installed. If you haven't done so, run:
```shell
zenml integration install modal
```
* [Docker](https://www.docker.com) installed and running.
* A [remote artifact store](../artifact-stores/README.md) as part of your stack.
* A [remote container registry](../container-registries/README.md) as part of your stack.
* The Modal CLI installed and authenticated:
```shell
pip install modal
modal setup
```

### Setting up the orchestrator

You can register the orchestrator with or without explicit Modal credentials:

**Option 1: Using Modal CLI authentication (recommended for development)**

```shell
# Register the orchestrator (uses Modal CLI credentials)
zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=modal \
    --synchronous=true

# Register and activate a stack with the new orchestrator
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> ... --set
```

**Option 2: Using a Modal API token (recommended for production)**

```shell
# Register the orchestrator with explicit credentials
zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=modal \
    --token=<MODAL_TOKEN> \
    --workspace=<MODAL_WORKSPACE> \
    --synchronous=true

# Register and activate a stack with the new orchestrator
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> ... --set
```

You can get your Modal token from the [Modal dashboard](https://modal.com/settings/tokens).

{% hint style="info" %}
ZenML will build a Docker image called `<CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME>` which includes your code, and use it to run your pipeline steps in Modal functions. Check out [this page](https://docs.zenml.io/how-to/customize-docker-builds/) if you want to learn more about how ZenML builds these images and how you can customize them.
{% endhint %}

You can now run any ZenML pipeline using the Modal orchestrator:

```shell
python file_that_runs_a_zenml_pipeline.py
```

### Modal UI

Modal provides a web interface where you can monitor your pipeline runs in real time, view logs, and track resource usage.

You can access the Modal dashboard at [modal.com/apps](https://modal.com/apps) to see your running and completed functions.

### Configuration overview

The Modal orchestrator uses two types of settings, following ZenML's standard pattern:

1. **`ResourceSettings`** (standard ZenML) - for hardware resource quantities:
   - `cpu_count` - Number of CPU cores
   - `memory` - Memory allocation (e.g., "16GB")
   - `gpu_count` - Number of GPUs to allocate

2. **`ModalOrchestratorSettings`** (Modal-specific) - for Modal platform configuration:
   - `gpu` - GPU type specification (e.g., "T4", "A100", "H100")
   - `region` - Cloud region preference
   - `cloud` - Cloud provider selection
   - `execution_mode` - How to run the pipeline (`pipeline` or `per_step`)
   - `timeout`, `min_containers`, `max_containers` - Performance settings

{% hint style="info" %}
**GPU Configuration**: Use `ResourceSettings.gpu_count` to specify how many GPUs you need, and `ModalOrchestratorSettings.gpu` to specify what type of GPU. Modal will combine these automatically (e.g., `gpu_count=2` + `gpu="A100"` becomes `"A100:2"`).
{% endhint %}

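The combination rule described in the hint above can be sketched in plain Python. The helper below is purely illustrative (its name and exact behavior are assumptions, not ZenML's actual implementation):

```python
from typing import Optional


def combine_gpu_spec(gpu_type: Optional[str], gpu_count: int) -> Optional[str]:
    """Illustrative sketch: Modal expects a single GPU spec string,
    so a GPU type plus a count collapse into "TYPE:N"."""
    if gpu_type is None:
        return None  # CPU-only: no GPU requested
    if gpu_count <= 1:
        return gpu_type  # e.g. "A100"
    return f"{gpu_type}:{gpu_count}"  # e.g. "A100:2"
```

For example, `combine_gpu_spec("A100", 2)` yields `"A100:2"`, matching the example in the hint.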
### Additional configuration

Here's how to configure both types of settings:

```python
from zenml import pipeline
from zenml.config import ResourceSettings
from zenml.integrations.modal.flavors.modal_orchestrator_flavor import (
    ModalOrchestratorSettings,
)

# Configure Modal-specific settings
modal_settings = ModalOrchestratorSettings(
    gpu="A100",                 # GPU type (optional)
    region="us-east-1",         # Preferred region
    cloud="aws",                # Cloud provider
    execution_mode="pipeline",  # or "per_step"
    timeout=3600,               # 1 hour timeout
    min_containers=1,           # Keep warm containers
    max_containers=10,          # Scale up to 10 containers
)

# Configure hardware resources (quantities)
resource_settings = ResourceSettings(
    cpu_count=16,   # Number of CPU cores
    memory="32GB",  # 32GB RAM
    gpu_count=1,    # Number of GPUs (combined with the GPU type above)
)

@pipeline(
    settings={
        "orchestrator": modal_settings,
        "resources": resource_settings,
    }
)
def my_modal_pipeline():
    # Your pipeline steps here
    ...
```

### Resource configuration

{% hint style="info" %}
**Pipeline-Level Resources**: The Modal orchestrator uses pipeline-level resource settings to configure the Modal function for the entire pipeline. All steps share the same Modal function resources. Configure resources at the `@pipeline` level for best results.
{% endhint %}

You can configure pipeline-wide resource requirements using `ResourceSettings` for hardware resources and `ModalOrchestratorSettings` for Modal-specific configuration:

```python
from zenml import pipeline, step
from zenml.config import ResourceSettings
from zenml.integrations.modal.flavors.modal_orchestrator_flavor import (
    ModalOrchestratorSettings,
)

@step
def first_step():
    # Uses the pipeline-level resource configuration
    ...

@step
def second_step():
    # Uses the same pipeline-level resource configuration
    ...

# Configure resources at the pipeline level (recommended)
@pipeline(
    settings={
        "resources": ResourceSettings(
            cpu_count=16,
            memory="32GB",
            gpu_count=1,  # These resources apply to the entire pipeline
        ),
        "orchestrator": ModalOrchestratorSettings(
            gpu="A100",  # GPU type for the entire pipeline
            region="us-west-2",
        ),
    }
)
def my_pipeline():
    first_step()   # Runs with pipeline resources: 16 CPUs, 32GB RAM, 1x A100
    second_step()  # Runs with the same resources
```

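Modal's SDK appears to expect memory as an integer number of megabytes, so a `ResourceSettings` string such as `"32GB"` has to be normalized at some point. The following is a rough, hypothetical sketch of such a conversion; the helper name and exact behavior are assumptions for illustration, not ZenML's actual code:

```python
import re

# Illustrative unit table (assumption: target unit is MiB, as Modal's SDK uses).
_UNIT_TO_MB = {"MB": 1, "GB": 1024, "TB": 1024 * 1024}


def memory_to_mb(value: str) -> int:
    """Hypothetical helper: normalize strings like "32GB" to megabytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(MB|GB|TB)\s*", value.upper())
    if not match:
        raise ValueError(f"Unparseable memory value: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _UNIT_TO_MB[unit])
```

For instance, `memory_to_mb("32GB")` returns `32768`.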
### Execution modes

The Modal orchestrator supports two execution modes:

1. **`pipeline` (default)**: Runs the entire pipeline in a single Modal function for maximum speed and cost efficiency.
2. **`per_step`**: Runs each step in a separate Modal function call for granular control and debugging.

{% hint style="info" %}
**Resource Sharing**: Both execution modes use the same Modal function with the same resource configuration (from pipeline-level settings). The difference is whether steps run sequentially in one function call (`pipeline`) or as separate function calls (`per_step`).
{% endhint %}

```python
# Fast execution (default) - entire pipeline in one function
modal_settings = ModalOrchestratorSettings(
    execution_mode="pipeline",
)

# Granular execution - each step separate (useful for debugging)
modal_settings = ModalOrchestratorSettings(
    execution_mode="per_step",
)
```

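To make the trade-off concrete, here is a simplified, hypothetical sketch of the dispatch difference between the two modes. It uses plain Python stand-ins and is not the actual orchestrator code:

```python
from typing import Callable, List


def run_remote(fn: Callable[[], None]) -> None:
    """Stand-in for a single Modal function invocation (cold start + run)."""
    fn()


def orchestrate(steps: List[Callable[[], None]], execution_mode: str) -> int:
    """Run all steps and return how many remote calls were made."""
    if execution_mode == "pipeline":
        # One remote call runs every step sequentially (fewer cold starts).
        def run_all() -> None:
            for step_fn in steps:
                step_fn()
        run_remote(run_all)
        return 1
    if execution_mode == "per_step":
        # One remote call per step (granular retries and per-step logs).
        for step_fn in steps:
            run_remote(step_fn)
        return len(steps)
    raise ValueError(f"Unknown execution mode: {execution_mode!r}")
```

With three steps, `pipeline` mode makes one remote call while `per_step` makes three, which is why the default minimizes cold-start overhead.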
### Using GPUs

Modal makes it easy to use GPUs for your ML workloads. Use `ResourceSettings` to specify the number of GPUs and `ModalOrchestratorSettings` to specify the GPU type:

```python
from zenml import step
from zenml.config import ResourceSettings
from zenml.integrations.modal.flavors.modal_orchestrator_flavor import (
    ModalOrchestratorSettings,
)

@step(
    settings={
        "resources": ResourceSettings(
            gpu_count=1,  # Number of GPUs to allocate
        ),
        "orchestrator": ModalOrchestratorSettings(
            gpu="A100",  # GPU type: "T4", "A10G", "A100", "H100"
            region="us-east-1",
        ),
    }
)
def train_model():
    # Your GPU-accelerated training code
    # Modal will provision 1x A100 GPU (gpu_count=1 + gpu="A100")
    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Using device: {device}")
    ...
```

Available GPU types include:

- `T4` - Cost-effective for inference and light training
- `A10G` - Balanced performance for training and inference
- `A100` - High performance for large model training
- `H100` - Latest generation for maximum performance

**Examples of GPU configurations (applied to the entire pipeline):**

```python
from zenml import pipeline
from zenml.config import ResourceSettings
from zenml.integrations.modal.flavors.modal_orchestrator_flavor import (
    ModalOrchestratorSettings,
)

# Single GPU - configure at the pipeline level
@pipeline(
    settings={
        "resources": ResourceSettings(gpu_count=1),
        "orchestrator": ModalOrchestratorSettings(gpu="A100"),
    }
)
def gpu_pipeline():
    # All steps in this pipeline will have access to 1x A100 GPU
    step_one()
    step_two()

# Multiple GPUs - configure at the pipeline level
@pipeline(
    settings={
        "resources": ResourceSettings(gpu_count=4),
        "orchestrator": ModalOrchestratorSettings(gpu="A100"),
    }
)
def multi_gpu_pipeline():
    # All steps in this pipeline will have access to 4x A100 GPUs
    training_step()
    evaluation_step()
```

### Synchronous vs. asynchronous execution

You can choose whether to wait for pipeline completion or run asynchronously:

```python
# Wait for completion (default)
modal_settings = ModalOrchestratorSettings(
    synchronous=True,
)

# Fire-and-forget execution
modal_settings = ModalOrchestratorSettings(
    synchronous=False,
)
```

### Authentication with different environments

For production deployments, you can specify different Modal environments:

```python
modal_settings = ModalOrchestratorSettings(
    environment="production",  # or "staging", "dev", etc.
    workspace="my-company",
)
```

{% hint style="info" %}
You can register multiple ZenML stacks, each associated with a different Modal environment (e.g., one for production and one for development), and switch between them with `zenml stack set`.
{% endhint %}

### Warm containers for faster execution

The Modal orchestrator uses persistent apps with warm containers to minimize cold starts:

```python
modal_settings = ModalOrchestratorSettings(
    min_containers=2,   # Keep 2 containers warm
    max_containers=20,  # Scale up to 20 containers
)

@pipeline(
    settings={
        "orchestrator": modal_settings,
    }
)
def my_pipeline():
    ...
```

This ensures your pipelines start executing immediately without waiting for container initialization.

## Best practices

1. **Use pipeline mode for production**: The default `pipeline` execution mode runs your entire pipeline in one function, minimizing overhead and cost.
2. **Separate resource and orchestrator settings**: Use `ResourceSettings` for hardware (CPU, memory, GPU count) and `ModalOrchestratorSettings` for Modal-specific configuration (GPU type, region, etc.).
3. **Configure appropriate timeouts**: Set realistic timeouts for your workloads:
   ```python
   modal_settings = ModalOrchestratorSettings(
       timeout=7200,  # 2 hours
   )
   ```
4. **Choose the right region**: Select regions close to your data sources to minimize transfer costs and latency.
5. **Use appropriate GPU types**: Match GPU types to your workload requirements - don't use A100s for simple inference tasks.
6. **Monitor resource usage**: Use Modal's dashboard to track your resource consumption and optimize accordingly.

## Troubleshooting

### Common issues

1. **Authentication errors**: Ensure your Modal token is correctly configured and has the necessary permissions.
2. **Image build failures**: Check that your Docker registry credentials are properly configured in your ZenML stack.
3. **Resource limits**: If you hit resource limits, consider breaking large steps into smaller ones or requesting quota increases from Modal.
4. **Network timeouts**: For long-running steps, ensure your timeout settings are appropriate.

### Getting help

- Check the [Modal documentation](https://modal.com/docs) for platform-specific issues
- Monitor your functions in the [Modal dashboard](https://modal.com/apps)
- Use `zenml logs` to view detailed pipeline execution logs

For more information and a full list of configurable attributes of the Modal orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-modal.html#zenml.integrations.modal.orchestrators).

<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>