Runtime: Allow a step to execute only when all upstream steps have completed

Right now, when a step completes, the runtime checks whether the step has any `next` (downstream) steps. If so, each downstream step is added to the queue to be executed immediately. If a step has multiple upstream edges, it'll run multiple times (once after each upstream step completes).

Basically, if a node has multiple upstream edges, they are treated as logical ORs. Each time an upstream step completes, the step is executed again.

![Image](https://github.com/user-attachments/assets/eebf8d22-c4f0-4beb-bf9d-1df2fea1f1e2)
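As a rough illustration of the current behaviour, here is a minimal sketch of a queue-based runner. The `Step` and `Plan` types and the `runWithOrSemantics` function are hypothetical simplifications made up for this example, not the runtime's actual API, and edge conditions are ignored.

```typescript
// Hypothetical, simplified model of a workflow: each step lists its downstream steps.
type Step = { id: string; next?: string[] };
type Plan = Record<string, Step>;

// Runs the plan with "OR" semantics and returns the order in which steps executed.
function runWithOrSemantics(plan: Plan, start: string): string[] {
  const executed: string[] = [];
  const queue: string[] = [start];

  while (queue.length > 0) {
    const id = queue.shift()!;
    executed.push(id); // stand-in for actually executing the step

    // Every downstream step is queued as soon as this step completes,
    // whether or not its other upstream steps have finished.
    for (const nextId of plan[id].next ?? []) {
      queue.push(nextId);
    }
  }
  return executed;
}

// A diamond: a fans out to b and c, which both lead to d.
const plan: Plan = {
  a: { id: "a", next: ["b", "c"] },
  b: { id: "b", next: ["d"] },
  c: { id: "c", next: ["d"] },
  d: { id: "d" },
};

const order = runWithOrSemantics(plan, "a");
// order is ["a", "b", "c", "d", "d"]: step d runs once per upstream step
```

In this diamond-shaped plan, `d` is queued twice, once when `b` completes and once when `c` completes.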

This is often useful, but we also want to support a mode where a step will not execute until ALL upstream steps have completed, like a logical AND.

See https://community.openfn.org/t/allow-a-step-to-run-only-when-all-upstream-ancestor-steps-have-run/738

Things to consider:

* The runtime needs to be more aware of the hierarchy of steps. A step cannot be executed unless all upstream edges have been tested (or all upstream branches have been executed).
* In other words, a step now has dependencies and cannot run until all of them have had a chance to run. Does this mean looking ahead in the queue to see whether any upstream steps (including indirect ones) are still waiting, and if so deferring the step to the back of the queue? I think so - but it may be more complex than this.
* Do we toggle this behaviour on the edge, on the node, or globally? Does it make sense for some branches to be ORs and some ANDs? I kind of hope not, because that's overcomplicated and hard to explain visually.
* How do we reconcile state? Three upstream steps will produce three different state objects. What state does the downstream step receive? By default we should do a shallow first-to-last merge - just squash it all down. But we also need to support a reconcile function which takes all state objects as arguments and returns a single state.
* Don't get blocked if some upstream steps don't execute. The runtime needs to know whether all upstream edges have had a chance to run; once they've all been tried, we can run the downstream step.
* In other words, if two upstream steps say "execute x" and one upstream step says "don't execute x", who wins? I'd suggest that as soon as any step allows step `x` to run, step `x` MUST run. We just have to wait for any other ancestors to run first.
* Remember that when referring to "upstream" steps, the upstream step may be indirect. Consider the whole branch.
* Instead of a reconcile function, should we have a reconcile strategy: deep vs shallow? If deep, we recursively traverse all state objects and arrays and merge them. Otherwise we just spread/assign keys at the top level.
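To make the waiting rules above more concrete, here is one possible sketch of an AND-style scheduler. Everything in it (`Edge`, `Step`, `runWithAndSemantics`, the `condition` callback) is a hypothetical name invented for this example, not existing runtime code. It assumes the workflow is acyclic, that the start step has no incoming edges, and that every step is reachable from the start step. An edge counts as "tried" once its upstream step has run and the condition has been tested, or once the upstream step has been skipped.

```typescript
type State = Record<string, unknown>;
type Edge = { from: string; to: string; condition?: (state: State) => boolean };
type Step = { id: string; run: (state: State) => State };

function runWithAndSemantics(steps: Step[], edges: Edge[], startId: string, initial: State = {}) {
  const byId = new Map(steps.map((s) => [s.id, s] as const));
  const incoming = (id: string) => edges.filter((e) => e.to === id);
  const outgoing = (id: string) => edges.filter((e) => e.from === id);

  const tried = new Set<Edge>(); // edges that were tested, or skipped with their branch
  const inputs = new Map<string, State[]>([[startId, [initial]]]); // states from allowing edges
  const queue: string[] = [startId];
  const order: string[] = [];
  const results: Record<string, State> = {};

  // Record the outcome of an edge, then settle its target step if possible.
  const resolve = (edge: Edge, state?: State): void => {
    tried.add(edge);
    if (state) inputs.set(edge.to, [...(inputs.get(edge.to) ?? []), state]);
    if (!incoming(edge.to).every((e) => tried.has(e))) return; // still waiting on an ancestor
    if ((inputs.get(edge.to) ?? []).length > 0) {
      queue.push(edge.to); // at least one upstream allowed it, so it must run
    } else {
      // No upstream allowed it: skip the step and mark its own edges as tried,
      // so that indirect descendants do not wait forever.
      for (const out of outgoing(edge.to)) resolve(out);
    }
  };

  while (queue.length > 0) {
    const id = queue.shift()!;
    // Default reconciliation: shallow first-to-last merge of upstream states.
    const input: State = Object.assign({}, ...(inputs.get(id) ?? []));
    const output = byId.get(id)!.run(input);
    order.push(id);
    results[id] = output;
    for (const edge of outgoing(id)) {
      const allowed = edge.condition ? edge.condition(output) : true;
      resolve(edge, allowed ? output : undefined);
    }
  }
  return { order, results };
}

// b, c and e all lead to d, but the edge into e never passes, so e is skipped.
const mark = (id: string) => (s: State): State => ({ ...s, from: id, [id]: true });
const demo = runWithAndSemantics(
  ["a", "b", "c", "d", "e"].map((id) => ({ id, run: mark(id) })),
  [
    { from: "a", to: "b" },
    { from: "a", to: "c" },
    { from: "a", to: "e", condition: () => false },
    { from: "b", to: "d" },
    { from: "c", to: "d" },
    { from: "e", to: "d" },
  ],
  "a",
);
// demo.order is ["a", "b", "c", "d"]: d runs once, after b and c, and is not blocked by e
```

Rather than looking ahead in the queue and deferring, this sketch tracks tried edges and only queues a step once every incoming edge has been settled. That is just one way to approach the problem, and it may not cover everything the real runtime needs.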

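The reconciliation options discussed above could look roughly like this. The `Reconcile` type and the `shallowMerge`, `deepMerge` and `lastWins` helpers are hypothetical names for illustration only. Note that concatenating arrays in the deep merge is an assumption on my part: the right behaviour for arrays is still an open question.

```typescript
type State = Record<string, unknown>;

// A reconcile function takes all upstream states and returns a single state.
type Reconcile = (...states: State[]) => State;

const isPlainObject = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Shallow strategy: assign top-level keys first-to-last, so later states win.
const shallowMerge: Reconcile = (...states) => Object.assign({}, ...states);

// Deep strategy for two values: objects are merged key by key, arrays are
// concatenated (an assumption), and for anything else the later value wins.
function deepMergeTwo(a: unknown, b: unknown): unknown {
  if (Array.isArray(a) && Array.isArray(b)) return [...a, ...b];
  if (isPlainObject(a) && isPlainObject(b)) {
    const out: Record<string, unknown> = { ...a };
    for (const [key, value] of Object.entries(b)) {
      out[key] = key in out ? deepMergeTwo(out[key], value) : value;
    }
    return out;
  }
  return b;
}

const deepMerge: Reconcile = (...states) =>
  states.reduce<State>((acc, s) => deepMergeTwo(acc, s) as State, {});

// A user-supplied reconcile function could do anything, e.g. keep only the last state.
const lastWins: Reconcile = (...states) => states[states.length - 1] ?? {};

const s1: State = { data: { patients: [1], source: "a" }, references: ["x"] };
const s2: State = { data: { patients: [2] }, references: ["y"] };

const shallow = shallowMerge(s1, s2);
// shallow is { data: { patients: [2] }, references: ["y"] }

const deep = deepMerge(s1, s2);
// deep is { data: { patients: [1, 2], source: "a" }, references: ["x", "y"] }
```

The shallow result drops `source` because the whole `data` object from the second state replaces the first, which is the kind of surprise a deep strategy or a custom reconcile function would let users avoid.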

Runtime: Allow a step to execute only when all upstream steps have completed #850
