Backfill Pattern: Should I use subflows (.submit()) or run_deployment() for daily Spark jobs? #19500
Hi Prefect friends & community! I'm building a backfill workflow for Spark jobs and would love some guidance on the best architectural pattern.

**My setup**

I have a `@flow` called `daily_spark_flow(day: date)` that submits a Spark job for a single day.

**Two approaches I'm considering**

```python
@flow
def backfill_flow(start, end):
    for day in date_range(start, end):
        daily_spark_task.submit(day)
```

Runs as nested tasks under one flow run.
```python
@flow
def trigger_backfill(start, end):
    for day in date_range(start, end):
        run_deployment(
            "daily-spark-flow/prod",
            parameters={"day": day.isoformat()},
        )
```

Each day becomes an independent top-level flow run.

**My questions**

For a production backfill scenario (e.g., reprocessing the last 90 days), which pattern is more aligned with Prefect best practices?
Also, will `Flow` gain a `.submit()` method for launching subflows in the future?
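Both snippets above assume a `date_range` helper, which isn't in the standard library. A minimal stdlib-only sketch (the name and the inclusive-end behavior are my assumptions):

```python
from datetime import date, timedelta


def date_range(start: date, end: date):
    """Yield each date from start through end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)
```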
Replies: 2 comments 1 reply
You don't need `run_deployment` or to configure another deployment up front if you don't want to; check out: https://docs.prefect.io/v3/advanced/submit-flows-directl… This should cover your ask.
Hi @zzstoatzz, thanks so much for the thoughtful reply and the pointer to the docs; really appreciate it! I'm actually already using a pattern very close to what you suggested: I have a parent flow that concurrently awaits multiple instances of a child `@flow` (not wrapped as a task) inside an `asyncio.Semaphore`-controlled loop, like this:

```python
import asyncio

from prefect import flow
from prefect.context import get_run_context

# BIGN and MyCustomError are defined elsewhere in my code


@flow(log_prints=True)
async def my_subflow(name: str = "world"):
    ctx = get_run_context()
    print(f"Subflow {name} parent_task_run_id: {ctx.flow_run.parent_task_run_id}")
    await asyncio.sleep(5)
    print("I'm an awake subflow!")
    if name == "test":
        raise MyCustomError("this is a test error")
    print(f"Hello {name}! I'm a subflow (not a task, not a deployment)!")


@flow(log_prints=True)
async def my_parent_flow(n: int = BIGN):
    semaphore = asyncio.Semaphore(2)

    async def _run_with_limit(name: str):
        async with semaphore:
            return await my_subflow(name=name)

    tasks = [_run_with_limit(str(i)) for i in range(n)]
    tasks.append(_run_with_limit("test"))
    tasks.append(_run_with_limit("final"))
    results = await asyncio.gather(*tasks, return_exceptions=True)
```

This gives me true subflows: each shows up as an independent flow run in the UI with a proper parent-child hierarchy (via `parent_task_run_id`), just like `run_deployment`, but without needing a separate deployment. I've also set concurrency limits on the work pool for resource control.

However, I've run into a subtle but important issue around failure propagation: when a subflow fails (e.g., raises a `ValueError`), the parent flow does not fail, even though the subflow is logically part of its execution graph. This makes backfill monitoring tricky: the parent appears "Completed" while some children are "Failed", and you have to manually inspect each run.
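In the meantime, one workaround is to inspect the results returned by `asyncio.gather(..., return_exceptions=True)` and re-raise in the parent, which marks the parent run as failed. A sketch of that check, using plain async functions as stand-ins for the Prefect flows (the `SubflowFailure` name and the `child`/`parent` helpers are hypothetical; the gathering logic itself is independent of Prefect):

```python
import asyncio


class SubflowFailure(Exception):
    """Hypothetical exception used to fail the parent when any child failed."""


async def child(name: str) -> str:
    # Stand-in for the Prefect subflow above.
    if name == "test":
        raise ValueError(f"child {name} failed")
    return name


async def parent(names: list[str]) -> list[str]:
    results = await asyncio.gather(
        *(child(n) for n in names), return_exceptions=True
    )
    # With return_exceptions=True, failures come back as exception objects
    # instead of propagating, so we must check for them explicitly.
    failures = [r for r in results if isinstance(r, BaseException)]
    if failures:
        # Re-raising here is what would mark the parent flow run as Failed.
        raise SubflowFailure(f"{len(failures)} of {len(results)} child runs failed")
    return results
```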
I've documented this behavior and proposed an enhancement (a `raise_on_failure=True` option, for `run_deployment`-like semantics) in this issue: