Commit 91a641b (parent e51e049)

Add "ADR-002: Workflow Orchestration"

Signed-off-by: nscuro <nscuro@protonmail.com>

2 files changed: +291 −0

architecture/decisions/002-workflow-orchestration.md (290 additions & 0 deletions):
| Status   | Date       | Author(s)                            |
|:---------|:-----------|:-------------------------------------|
| Proposed | 2025-01-16 | [@nscuro](https://github.com/nscuro) |

## Context

By dropping the Kafka dependency ([ADR-001]), we are now missing a way to enqueue
and distribute work for asynchronous execution.

Dependency-Track v4 used an in-memory queue, which is neither reliable (queued messages are lost on restart)
nor cluster-friendly (the queue is not shared by multiple instances).

As part of [ADR-001], we explored options such as introducing a different, more lightweight message
broker. Another alternative was the introduction of an in-memory data grid (IMDG).
We decided against these options and opted to leverage the existing Postgres infrastructure instead.

We also expressed having grown unhappy with the choreography-style architecture,
as it complicates observability and is hard to reason about. A solution that is more akin to an
orchestrator is desirable.

### Requirements

To better understand our requirements, we explain them based on workflows we're dealing with today.
One of the core workflows in Dependency-Track is processing uploaded BOMs:

```mermaid
---
title: BOM Processing Workflow
---
flowchart LR
A@{ shape: circle, label: "Start" }
B["BOM Ingestion"]
C@{ shape: fork, label: "Fork" }
D["Vulnerability Analysis"]
E["Repository Metadata Analysis"]
F@{ shape: fork, label: "Join" }
G["Policy Evaluation"]
H["Metrics Update"]
I@{ shape: dbl-circ, label: "Stop" }
A --> B
B --> C
C --> D
C --> E
D --> F
E --> F
F --> G
G --> H
H --> I
```

> Workflow simplified for brevity. Failure compensations and further fork-join patterns are omitted.
> In reality, *Vulnerability Analysis* would split into more steps, one for each enabled analyzer.

Each step is either I/O intensive or relies on external systems. It is not practical to execute
all steps synchronously. Depending on the size of the BOM and system load, an execution of this
workflow can take anywhere from a few milliseconds to multiple minutes.
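The fork-join shape of this workflow can be sketched with plain threads. This is an illustrative Python sketch, not the eventual implementation; all step functions are hypothetical placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical step implementations; the real steps are I/O-bound calls
# to vulnerability scanners and remote repositories.
def vulnerability_analysis(project): return f"vulns({project})"
def repository_metadata_analysis(project): return f"repo-meta({project})"
def policy_evaluation(project, results): return f"policy({project})"
def metrics_update(project): return f"metrics({project})"

def bom_processing(project):
    # Fork: run both analyses concurrently, then join on their results.
    with ThreadPoolExecutor(max_workers=2) as pool:
        vuln = pool.submit(vulnerability_analysis, project)
        meta = pool.submit(repository_metadata_analysis, project)
        results = [vuln.result(), meta.result()]  # join
    policy_evaluation(project, results)
    return metrics_update(project)
```

The join happens implicitly when both futures resolve; a durable orchestrator would additionally persist progress between steps so a crash does not lose the fork state.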
However, BOM uploads are not the only way in which a project analysis may be triggered:

* All projects are re-analyzed on a recurring basis, at least daily.
* Users can manually request a re-analysis of specific projects.

This means that project analysis should ideally be its own, reusable workflow:

```mermaid
---
title: Project Analysis Workflow
---
flowchart LR
A@{ shape: circle, label: "Start" }
C@{ shape: fork, label: "Fork" }
D["Vulnerability Analysis"]
E["Repository Metadata Analysis"]
F@{ shape: fork, label: "Join" }
G["Policy Evaluation"]
H["Metrics Update"]
I@{ shape: dbl-circ, label: "Stop" }
A --> C
C --> D
C --> E
D --> F
E --> F
F --> G
G --> H
H --> I
```

It could then be launched individually, or reused as follows:

```mermaid
---
title: Goal BOM Processing Workflow
---
flowchart LR
A@{ shape: circle, label: "Start" }
B["BOM Ingestion"]
C@{ shape: subproc, label: "Project Analysis Workflow" }
D@{ shape: dbl-circ, label: "Stop" }
A --> B
B --> C
C --> D
```

#### Concurrency control

To prevent race conditions and redundant work, an instance of the *Project Analysis* workflow for any given
project should prevent other instances of *Project Analysis* for the same project from executing concurrently.
This should hold even when *Project Analysis* runs as part of another workflow, e.g. *BOM Upload Processing*.
As such, **workflow instances must be treated as dedicated units of work, rather than a simple chain of tasks**.
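Treating instances as units of work means mutual exclusion applies to the whole instance, keyed by the thing it operates on. A minimal Python sketch of the intended semantics (names are hypothetical; in the actual solution this would more likely map to a uniqueness guarantee or advisory lock in Postgres):

```python
import threading

class WorkflowRegistry:
    """Tracks running workflow instances by (workflow, concurrency key)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()

    def try_start(self, workflow, key):
        # Reject a new instance if one with the same (workflow, key) is
        # already running -- regardless of which parent workflow launched it.
        with self._lock:
            if (workflow, key) in self._running:
                return False
            self._running.add((workflow, key))
            return True

    def finish(self, workflow, key):
        with self._lock:
            self._running.discard((workflow, key))
```

Keying on `(workflow, project)` gives exactly the behavior described above: a *Project Analysis* embedded in *BOM Upload Processing* competes with standalone *Project Analysis* instances for the same project.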
#### Fairness

Since workflows can be triggered as part of a client interaction (e.g. a BOM upload or an explicit request), *and* on a schedule,
it is critical that the former take precedence over the latter. The system performing a scheduled analysis of all projects
in the portfolio should not prevent clients from receiving feedback in a timely manner. There must be a mechanism
to prioritize work.

We expect to run multiple types of workflows on the system. Many instances of workflow `A` being scheduled for
execution should not cause execution of instances of workflow `B` to be delayed. Work queues must be isolated.
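Both properties, prioritization and queue isolation, can be illustrated together (illustrative Python, not the actual implementation; in practice the queues and priorities would live in Postgres tables):

```python
import heapq
import itertools

# Lower number = higher priority; user-triggered work outranks scheduled work.
PRIORITY_INTERACTIVE = 0
PRIORITY_SCHEDULED = 10

class WorkQueues:
    """One priority queue per workflow type, so a large backlog of
    workflow A never delays instances of workflow B."""

    def __init__(self):
        self._seq = itertools.count()  # FIFO tie-breaker within a priority
        self._queues = {}

    def enqueue(self, workflow, task, priority):
        heap = self._queues.setdefault(workflow, [])
        heapq.heappush(heap, (priority, next(self._seq), task))

    def dequeue(self, workflow):
        heap = self._queues.get(workflow)
        if not heap:
            return None
        return heapq.heappop(heap)[2]
```

An interactive task enqueued *after* thousands of scheduled tasks is still dequeued first, while a separate workflow's queue is untouched.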
#### Performance

Throughput is more important than latency for Dependency-Track. This is particularly true
as we aim to raise the ceiling of how many projects we can support in a portfolio.

Due to tasks being I/O-heavy, the speed of scheduling and queueing tasks is in itself not anticipated to be a
limiting factor of the system. If the execution of a single task takes longer than a second, it is acceptable
for dequeueing of `N >= 1` tasks to take up to a few hundred milliseconds. Given proper indexing, autovacuum
configuration, and fast disks, Postgres should support dequeue queries *well below* 100ms, even for very large queues.

Since we have at least one workflow that will be scheduled for each project in the portfolio, at least daily,
the system must be able to handle large queue depths. Assuming a portfolio of 100k projects, a queue depth of
at least 100k tasks must be supported with little to no performance degradation as the bare minimum.

We chose to rely on Postgres, a relational, [MVCC]-based database. The workflow orchestration solution
should avoid usage patterns that are suboptimal for Postgres (e.g. lots of small transactions),
and make use of features that allow Postgres to operate more efficiently (e.g. batching, advisory locks,
`FOR UPDATE SKIP LOCKED`, partitioned tables).
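The batched-dequeue pattern hinted at here can be sketched as follows. SQLite stands in for Postgres purely so the example is self-contained; on Postgres the `SELECT` would additionally end in `FOR UPDATE SKIP LOCKED`, so concurrent workers skip rows already claimed by others instead of blocking. Schema and names are illustrative, not the actual design:

```python
import sqlite3

# Illustrative queue table; the real solution targets Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE task_queue (
        id INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'PENDING'
    )
""")
conn.executemany(
    "INSERT INTO task_queue (payload) VALUES (?)",
    [(f"task-{i}",) for i in range(10)],
)

def dequeue_batch(conn, limit):
    # Claim a whole batch in one transaction instead of one tiny
    # transaction per task, which suits Postgres' MVCC model better.
    rows = conn.execute(
        "SELECT id, payload FROM task_queue WHERE status = 'PENDING' "
        "ORDER BY id LIMIT ?", (limit,),
    ).fetchall()
    conn.executemany(
        "UPDATE task_queue SET status = 'RUNNING' WHERE id = ?",
        [(row[0],) for row in rows],
    )
    conn.commit()
    return [row[1] for row in rows]
```

Each poll thus costs one round trip regardless of `N`, keeping dequeue overhead amortized across the batch.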
It is expected that the solution can optionally be pointed to a Postgres instance separate from the main
application. This may be necessary for larger deployments, as more tasks are enqueued and more workers
poll for tasks.

#### Resiliency

We frequently interact with external systems which can experience temporary downtime,
or enforce rate limiting, causing RPCs to fail or be rejected. In the majority of cases,
retries are sufficient to recover from such failures.

Individual workflow steps must be retryable. It should be possible to define a backoff strategy,
as well as a limit after which no more retries will be attempted. A workflow step being retried
should not block the execution of steps belonging to other workflow instances.
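A capped exponential backoff with jitter is a common way to realize such a retry policy. A minimal sketch; the parameter values are illustrative, not decided:

```python
import random

def backoff_delay(attempt, base=1.0, cap=300.0, jitter=True):
    """Exponential backoff with a cap: 1s, 2s, 4s, ... up to 5 minutes."""
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # Full jitter spreads retries out to avoid thundering herds
        # against a recovering external system.
        delay = random.uniform(0, delay)
    return delay

def should_retry(attempt, max_attempts=6):
    # Once the retry budget is exhausted the step is marked failed,
    # without blocking steps of other workflow instances.
    return attempt < max_attempts
```

A retried step would simply be re-enqueued with a visibility time of `now + backoff_delay(attempt)`, so workers remain free to pick up other work in the meantime.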
In case of node failures, it is expected that uncompleted work is not lost,
and will be picked up by the same node upon restart, or by other nodes in the cluster.

#### Scheduling

We have many tasks that need to run repeatedly on a schedule. The currently implemented solution is based on
a combination of in-memory timers and database-backed locks to prevent duplicate executions.

The orchestration system should allow us to schedule recurring workflow executions, e.g. using cron expressions.

Ideally, the system would allow for schedules to be adjusted at runtime, instead of requiring a restart.
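As a sketch of runtime-adjustable scheduling, consider a deliberately simplified "daily at HH:MM" schedule standing in for a full cron expression (illustrative Python; a real implementation would parse cron syntax and persist schedules in the database):

```python
from datetime import datetime, timedelta

class DailySchedule:
    """Minimal stand-in for a cron expression: run once per day at HH:MM.
    The time can be swapped at runtime without restarting the process."""

    def __init__(self, hour, minute):
        self.hour, self.minute = hour, minute

    def update(self, hour, minute):
        # Runtime adjustment: the scheduler's next poll simply
        # computes against the new values.
        self.hour, self.minute = hour, minute

    def next_run(self, now):
        candidate = now.replace(hour=self.hour, minute=self.minute,
                                second=0, microsecond=0)
        if candidate <= now:
            candidate += timedelta(days=1)
        return candidate
```

Because the next run time is derived on every poll rather than baked into an in-memory timer, updating the schedule takes effect immediately.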
#### Observability

Based on interactions with the community, it is clear that not everyone is able or willing to operate infrastructure
for centralized log monitoring and alerting. Further, it is very common for users, even administrative ones,
to *not* have direct access to application logs.

We thus cannot rely on logging and metrics instrumentation to ensure observability. We need a solution that has
observability built in, and allows us to expose it on the application layer.

Additionally, records of workflow instances should be retained for a specified amount of time.
This allows retrospective investigation of failures. The retention duration should be configurable.

### Possible solutions
#### A: Adopt a job scheduling framework

Multiple open source frameworks for job scheduling using relational databases exist,
among them [db-scheduler], [JobRunr] and [Quartz]. While they focus on execution of individual
jobs, they usually ship with basic workflow capabilities as well.

Quartz jobs can be chained to form workflows, using [`JobChainingJobListener`](https://www.quartz-scheduler.org/api/2.3.0/org/quartz/listeners/JobChainingJobListener.html).
This functionality is self-described as a "poor man's workflow".

[JobRunr] has a [similar concept](https://www.jobrunr.io/en/documentation/pro/job-chaining/),
but it's only available in the commercial Pro version. As an open source project, we cannot use
commercial tooling.

[db-scheduler] appears to be the most capable option, while being entirely free, with no commercial offering.
It, too, supports job chaining to form workflows of multiple steps.

Job chaining is an implementation of the [routing slip pattern]. This model is akin to what Dependency-Track v4
relied on, and we found it to be both unreliable and limiting at times. Since there is no orchestration layer
in this setup, failure compensations are harder to implement.

Fork-join / [scatter-gather] patterns are not supported by this approach. We need this capability to run certain
workflow steps concurrently, e.g. *Vulnerability Analysis* and *Repository Metadata Analysis*. Additional state-keeping
and coordination would be necessary to achieve the desired behavior with job chaining.

Also, none of these options allow job chains to be treated as logical units of work,
which complicates concurrency control, fairness, and monitoring.
#### B: Adopt an embedded workflow engine

Per the [awesome-workflow-engines] list, available Java-based embedded workflow engines
that could cater to our requirements include [COPPER] and [nFlow].

At the time of writing, the [COPPER] repository has not seen any activity for over six months.
The documentation is scarce, but [COPPER] appears to achieve durability by serializing the stack
frames of Java code, which leads to many restrictions when it comes to modifying workflows.

[nFlow] requires the Spring Framework, which we do not use and do not plan to adopt.
#### C: Adopt an external workflow engine

Of all the external engines we encountered, [Temporal] is a perfect match. It supports everything we need,
allows us to write workflows as code, and is open source. The same is true for [Cadence], the spiritual predecessor
of [Temporal].

But systems like these are built to scale, and to scale in a way that can support
entire enterprises in running their workflows on shared infrastructure. This is great,
but comes with a lot of additional complexity that we cannot justify introducing.
We just need something that supports our own use cases, without additional operational burden.
#### D: Build our own

[Temporal] is a spiritual successor to Microsoft's [Durable Task Framework] (DTFx). DTFx is not an external service,
but an embeddable library. Unfortunately, DTFx is written in C#, so we cannot use it.

Microsoft has further invested in a Go port of DTFx: [durabletask-go], which can be used as a Go library,
but also as an external service (i.e. a sidecar). It powers [Dapr Workflow] in this configuration.
[durabletask-go] can be used with various storage backends, among them Postgres. A Java SDK for the
[durabletask-go] sidecar is available with [durabletask-java]. We don't want to require an external service though,
even if it's *just* a sidecar.

There are also some features that the DTFx implementations do not offer, for example scheduling, prioritization,
and concurrency control. But the foundation for everything else we want is there. And judging by [durabletask-go]'s
code, implementing a DTFx-like engine is doable without a lot of complexity. Various patterns enabled by the DTFx model
can be seen in the [Dapr Workflow Patterns] documentation.

In short, the plan for this solution is as follows:

* Build a Java-based, embeddable port of DTFx, roughly based on the logic of [durabletask-go].
* Omit abstractions for supporting multiple storage backends, and focus entirely on Postgres instead.
* Omit features that we do not need, to keep the porting effort low.
* Build the features we need on top of this foundation.
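The core mechanism behind DTFx-style engines is deterministic replay: orchestrator code is re-executed from the start after every suspension or crash, and already-completed steps are fed from a persisted event history instead of running again. A heavily simplified sketch of that idea (illustrative Python; the real engine would persist history in Postgres and suspend on pending activities):

```python
class Context:
    """Replays recorded activity results; only unrecorded activities
    actually execute. This is the core trick of durable execution."""

    def __init__(self, history):
        self.history = history  # list of (activity_name, result)
        self.cursor = 0
        self.calls = []  # activities actually executed in this run

    def call_activity(self, name, fn, *args):
        if self.cursor < len(self.history):
            # Completed in a previous execution: replay the result.
            recorded_name, result = self.history[self.cursor]
            assert recorded_name == name, "non-deterministic orchestrator"
            self.cursor += 1
            return result
        # First time this step is reached: execute and record it.
        result = fn(*args)
        self.history.append((name, result))
        self.cursor += 1
        self.calls.append(name)
        return result

def orchestrator(ctx, bom):
    # Deterministic workflow code; may be re-executed many times.
    project = ctx.call_activity("ingest", lambda b: f"project-of-{b}", bom)
    vulns = ctx.call_activity("analyze", lambda p: f"vulns-of-{p}", project)
    return vulns
```

Because the orchestrator is replayed, it must be deterministic; the persisted history acts as the single source of truth for progress, which is what makes uncompleted work survive node failures.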
This approach allows us to fulfill our current requirements, but also to adapt more quickly to new ones in the future.
We completely avoid introducing additional infrastructure or services, keeping operational complexity low.
The solution is also not entirely bespoke, since we can lean on the design decisions of the mature and proven DTFx ecosystem.

## Decision

We will follow option *D*.

## Consequences

A DTFx port as outlined in [D: Build our own](#d-build-our-own) needs to be developed.

This work has already been started to evaluate the feasibility of the approach.
The current state of the code is available here:

* [`DependencyTrack/hyades-apiserver@workflow-v2`](https://github.com/DependencyTrack/hyades-apiserver/tree/workflow-v2/src/main/java/org/dependencytrack/workflow)
* [`DependencyTrack/hyades-frontend@workflow-v2`](https://github.com/DependencyTrack/hyades-frontend/tree/workflow-v2/src/views/workflowRuns)

A corresponding design document that describes our implementation will follow.

[ADR-001]: ./001-drop-kafka-dependency.md
[awesome-workflow-engines]: https://github.com/meirwah/awesome-workflow-engines
[Cadence]: https://cadenceworkflow.io/
[COPPER]: https://copper-engine.org/
[Dapr Workflow]: https://docs.dapr.io/developing-applications/building-blocks/workflow/workflow-overview/
[Dapr Workflow Patterns]: https://docs.dapr.io/developing-applications/building-blocks/workflow/workflow-patterns/
[db-scheduler]: https://github.com/kagkarlsson/db-scheduler
[Durable Task Framework]: https://github.com/Azure/durabletask
[durabletask-go]: https://github.com/microsoft/durabletask-go
[durabletask-java]: https://github.com/microsoft/durabletask-java
[JobRunr]: https://www.jobrunr.io/en/
[JWorkflow]: https://github.com/danielgerlag/jworkflow
[MVCC]: https://www.postgresql.org/docs/current/mvcc.html
[nFlow]: https://nflow.io/
[Quartz]: https://www.quartz-scheduler.org/
[routing slip pattern]: https://www.enterpriseintegrationpatterns.com/patterns/messaging/RoutingTable.html
[scatter-gather]: https://www.enterpriseintegrationpatterns.com/patterns/messaging/BroadcastAggregate.html
[Temporal]: https://temporal.io/

mkdocs.yml (1 addition & 0 deletions):

@@ -90,6 +90,7 @@ nav:
 - Decisions:
   - Overview: architecture/decisions/000-index.md
   - "ADR-001: Drop Kafka Dependency": architecture/decisions/001-drop-kafka-dependency.md
+  - "ADR-002: Workflow Orchestration": architecture/decisions/002-workflow-orchestration.md
 - Design:
   - Workflow State Tracking: architecture/design/workflow-state-tracking.md
 - Operations:
