Proposal: “CWL → Prefect Flow” converter for simple CWL workflows #19968
Proposal: CWL → Prefect Flow adapter (strict subset, Docker & Kubernetes backends)
Before beginning
The Common Workflow Language (CWL) is an open, community-driven specification for describing command-line tools and workflows in a portable and reproducible way. CWL focuses on explicit inputs, containerized execution, and well-defined filesystem semantics.
Summary
This proposal describes a strict, deterministic adapter that converts a single CWL workflow into a Prefect flow, where:
The goal is not to implement the full CWL specification, but to provide a reliable bridge for “simple CWL” workflows into Prefect’s execution and deployment model.
Scope (intentionally strict)
Supported
Explicitly not supported (v1)
Prefect-aligned execution model
Execution is modeled in two layers, consistent with Prefect’s architecture.
This preserves CWL’s “container-per-step” semantics while using Prefect strictly as an orchestrator.
Execution architecture (Mermaid)
flowchart TD
A[Prefect Deployment] --> B[Worker]
B --> C[Parent Flow Run<br/>]
C --> S1[Step 1<br/>Container / Job]
S1 --> S2[Step 2<br/>Container / Job]
S2 --> S3[Step N<br/>Container / Job]
V[(Shared Workspace<br/>/workspace)] --- S1
V --- S2
V --- S3
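The shape in the diagram (one parent run that launches each step as its own container or job, in order, against a shared workspace) can be sketched in plain Python. This is only an illustration under assumptions: `Step`, `run_workflow`, `fake_backend`, and the `Backend` callable type are hypothetical names, not part of Prefect or of the proposed library. In a real adapter the parent function would presumably be a Prefect flow and each step a task.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Step:
    """One CWL step: a container image plus the command to run in it (hypothetical model)."""
    name: str
    image: str
    command: Sequence[str]


# A backend launches one step against the shared workspace and returns its exit code.
Backend = Callable[[Step, str], int]


def run_workflow(steps: Sequence[Step], backend: Backend, workspace: str = "/workspace") -> list:
    """Run steps strictly in order and stop at the first failing step."""
    completed = []
    for step in steps:
        exit_code = backend(step, workspace)
        if exit_code != 0:
            raise RuntimeError(f"step {step.name!r} failed with exit code {exit_code}")
        completed.append(step.name)
    return completed


def fake_backend(step: Step, workspace: str) -> int:
    """Stand-in backend for illustration; a real one would start a container or job."""
    print(f"[{step.name}] {step.image}: {' '.join(step.command)} (cwd={workspace})")
    return 0


steps = [
    Step("step_1", "alpine:3.20", ["sh", "-c", "echo hello > out.txt"]),
    Step("step_2", "alpine:3.20", ["cat", "out.txt"]),
]
completed = run_workflow(steps, fake_backend)
```

Keeping the backend behind a single callable means the parent loop does not need to know whether a step ends up as a Docker container or a Kubernetes job.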
Backend model
The library distinguishes the step execution backend, not the parent runtime.
Docker backend
Kubernetes backend
Workspace / volume contract (core design)
Filesystem semantics are handled as infrastructure, not CWL semantics.
Canonical workspace layout
All steps see the same workspace:
Environment variables (always set)
All CWL commands are resolved relative to WORKDIR.
Volume implementation by backend
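For the Docker case, one possible way to honor this contract is to mount the same volume at `/workspace` in every step container, make it the working directory, and export `WORKDIR`. The sketch below only builds the `docker run` argument list and does not start anything. The function name `docker_run_argv` and the volume name `cwl-workspace` are assumptions for illustration, and using a named volume is one option among several, not the proposal's stated design.

```python
WORKDIR = "/workspace"  # mount point from the diagram; WORKDIR is the variable named in the text


def docker_run_argv(image: str, command: list, volume: str) -> list:
    """Build a `docker run` command line that mounts the shared workspace (hypothetical helper)."""
    return [
        "docker", "run", "--rm",
        "-v", f"{volume}:{WORKDIR}",   # same volume for every step
        "-w", WORKDIR,                 # commands resolve relative to the workspace
        "-e", f"WORKDIR={WORKDIR}",    # expose the location to the step itself
        image, *command,
    ]


argv = docker_run_argv("alpine:3.20", ["cat", "out.txt"], "cwl-workspace")
print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on a machine with Docker installed.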
Internal architecture
The internal architecture revolves around several components. The idea is simple: validate the provided CWL, parse it, create the flow and tasks, and let the user deploy it:
As a diagram:
flowchart TD
A[CWL document] --> B[Validator]
B -->|valid| C[Parser]
B -->|invalid| E1[Validation error]
C -->|parsed| D[Flow builder]
C -->|parse error| E2[Parse error]
D -->|built| F[Step runner wiring]
D -->|build error| E3[Build error]
F -->|ok| G[Generated flow and tasks]
F -->|config error| E4[Runner configuration error]
G --> N[Ready for deployment]
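The stages in the diagram can be sketched as a chain of small functions, each raising its own error type. Everything here is a hypothetical illustration: the function and exception names are invented, the CWL document is assumed to be already loaded into a dict with `steps` in map form, and the specific checks stand in for whatever rules the real validator, parser, and builder would apply.

```python
import re


class ValidationError(Exception): pass
class ParseError(Exception): pass
class BuildError(Exception): pass
class RunnerConfigError(Exception): pass


SUPPORTED_BACKENDS = {"docker", "kubernetes"}


def validate(doc: dict) -> dict:
    """Reject documents that are not workflows with a non-empty steps mapping."""
    if doc.get("class") != "Workflow":
        raise ValidationError("expected a CWL document with class: Workflow")
    if not isinstance(doc.get("steps"), dict) or not doc["steps"]:
        raise ValidationError("expected a non-empty 'steps' mapping")
    return doc


def parse(doc: dict) -> list:
    """Extract (step name, run target) pairs in document order."""
    parsed = []
    for name, body in doc["steps"].items():
        if not isinstance(body, dict) or "run" not in body:
            raise ParseError(f"step {name!r} has no 'run' field")
        parsed.append((name, body["run"]))
    return parsed


def build(parsed: list) -> dict:
    """Map each step to a task name that is a valid Python identifier."""
    tasks = {}
    for name, run in parsed:
        task_name = re.sub(r"\W", "_", name)
        if task_name in tasks:
            raise BuildError(f"task name collision for step {name!r}")
        tasks[task_name] = run
    return tasks


def wire(tasks: dict, backend: str) -> list:
    """Attach the chosen step execution backend to every task."""
    if backend not in SUPPORTED_BACKENDS:
        raise RunnerConfigError(f"unknown backend {backend!r}")
    return [{"task": n, "run": r, "backend": backend} for n, r in tasks.items()]


def convert(doc: dict, backend: str) -> list:
    return wire(build(parse(validate(doc))), backend)


doc = {
    "cwlVersion": "v1.2",
    "class": "Workflow",
    "inputs": {},
    "outputs": {},
    "steps": {"say-hello": {"run": "hello.cwl", "in": {}, "out": []}},
}
plan = convert(doc, "docker")
```

Separate exception types let a caller tell a malformed document apart from a misconfigured runner without parsing error messages.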
How each CWL step executes
Open questions (just a few)
Conclusion
This approach:
Feedback on architecture, scope, and backend expectations is very welcome.
Replies: 2 comments
Hello @desertaxle, let's get practical: at the moment, the implementation is fairly easy.
Even though it works, this is quite different from Prefect's idea of a pool type. At the same time, since a CWL workflow can be composed of multiple Docker images, how can I bridge this gap? Can we do better?
Thanks for the write up @brunifrancesco! I think a Prefect adapter for CWL would be great as a community-maintained package. What you've described makes sense at a high level, and I think some of our existing integrations in prefect-docker and prefect-kubernetes will be helpful in implementing this adapter. Let us know in this discussion if you hit any roadblocks on Prefect specifics as you're working on the implementation and we can provide guidance!