| Status   | Date       | Author(s)                            |
|:---------|:-----------|:-------------------------------------|
| Proposed | 2025-01-16 | [@nscuro](https://github.com/nscuro) |

## Context

By dropping the Kafka dependency ([ADR-001]), we now lack a way to enqueue
and distribute work for asynchronous execution.

Dependency-Track v4 used an in-memory queue, which is neither reliable (queued messages are lost on restart)
nor cluster-friendly (the queue is not shared across multiple instances).

As part of [ADR-001], we explored options such as introducing a different, more lightweight message
broker. Another alternative was the introduction of an in-memory data grid (IMDG).
We decided against these options and opted to leverage the existing Postgres infrastructure instead.

We also expressed having grown unhappy with the choreography-style architecture,
as it complicates observability and is hard to grasp. A solution that is more akin to an
orchestrator is desirable.

### Requirements

To better understand our requirements, we describe them in terms of the workflows Dependency-Track deals with today.
One of the core workflows in Dependency-Track is processing uploaded BOMs:

```mermaid
---
title: BOM Processing Workflow
---
flowchart LR
    A@{ shape: circle, label: "Start" }
    B["BOM Ingestion"]
    C@{ shape: fork, label: "Fork" }
    D["Vulnerability Analysis"]
    E["Repository Metadata Analysis"]
    F@{ shape: fork, label: "Join" }
    G["Policy Evaluation"]
    H["Metrics Update"]
    I@{ shape: dbl-circ, label: "Stop" }
    A --> B
    B --> C
    C --> D
    C --> E
    D --> F
    E --> F
    F --> G
    G --> H
    H --> I
```

> Workflow simplified for brevity; failure compensations and further fork-join patterns are omitted.
> In reality, *Vulnerability Analysis* would split into more steps, one for each enabled analyzer.

Each step is either I/O-intensive or relies on external systems. It is not practical to execute
all steps synchronously. Depending on the size of the BOM and system load, an execution of this
workflow can take anywhere from a few milliseconds to multiple minutes.

However, BOM uploads are not the only way in which a project analysis may be triggered:

* All projects are re-analyzed on a recurring basis, at least daily.
* Users can manually request a re-analysis of specific projects.

This means that project analysis should ideally be its own, reusable workflow:

```mermaid
---
title: Project Analysis Workflow
---
flowchart LR
    A@{ shape: circle, label: "Start" }
    C@{ shape: fork, label: "Fork" }
    D["Vulnerability Analysis"]
    E["Repository Metadata Analysis"]
    F@{ shape: fork, label: "Join" }
    G["Policy Evaluation"]
    H["Metrics Update"]
    I@{ shape: dbl-circ, label: "Stop" }
    A --> C
    C --> D
    C --> E
    D --> F
    E --> F
    F --> G
    G --> H
    H --> I
```

It could then be launched individually, or reused as follows:

```mermaid
---
title: Goal BOM Processing Workflow
---
flowchart LR
    A@{ shape: circle, label: "Start" }
    B["BOM Ingestion"]
    C@{ shape: subproc, label: "Project Analysis Workflow" }
    D@{ shape: dbl-circ, label: "Stop" }
    A --> B
    B --> C
    C --> D
```

#### Concurrency control

To prevent race conditions and redundant work, an instance of the *Project Analysis* workflow for any given
project should prevent other instances of *Project Analysis* for the same project from executing concurrently.
This should be true even if *Project Analysis* runs as part of another workflow, e.g. *BOM Processing*.
As such, **workflow instances must be treated as dedicated units of work, rather than a simple chain of tasks**.

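With Postgres as the backing store, one plausible way to enforce this mutual exclusion is a
transaction-scoped advisory lock derived from the project's identity. The following is only a
sketch; the lock key scheme and the `:project_uuid` parameter are hypothetical:

```sql
BEGIN;

-- Try to acquire an advisory lock scoped to this project. "hashtext" maps the
-- arbitrary-length key into the integer key space that advisory locks operate on.
SELECT pg_try_advisory_xact_lock(hashtext('project-analysis:' || :project_uuid));

-- If the call returned true, this transaction owns the project and may claim
-- workflow tasks for it. If it returned false, another workflow instance is
-- already analyzing the project, and this run can back off or be re-queued.

COMMIT; -- the lock is released automatically when the transaction ends
```
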
#### Fairness

Since workflows can be triggered as part of a client interaction (e.g. a BOM upload or an explicit request), *and*
on schedule, it is critical that the former take precedence over the latter. The system performing a scheduled
analysis of all projects in the portfolio should not prevent clients from receiving feedback in a timely manner.
There must be a mechanism to prioritize work.

We expect to run multiple types of workflows on the system. Many instances of workflow `A` being scheduled for
execution should not cause execution of instances of workflow `B` to be delayed. Work queues must be isolated.

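Both properties could be realized in the dequeue query itself. As a sketch (the `task` table and
its columns are hypothetical), a worker could claim a batch of work like this:

```sql
SELECT id, payload
  FROM task
 WHERE queue = 'vuln-analysis'       -- isolation: each workflow type polls its own queue
   AND visible_from <= NOW()
 ORDER BY priority DESC, created_at  -- fairness: higher-priority work first, then FIFO
 LIMIT 50                            -- batching: amortize round trips per poll
   FOR UPDATE SKIP LOCKED;           -- rows already claimed by other workers are skipped
```

Scheduled triggers would enqueue their tasks with a lower `priority` than client-initiated ones,
so a backlog of portfolio-wide analysis never starves interactive requests.
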
#### Performance

Throughput is more important than latency for Dependency-Track. This is particularly true
as we aim to raise the ceiling of how many projects we can support in a portfolio.

Due to tasks being I/O-heavy, the speed of scheduling and queueing tasks is not anticipated to be a
limiting factor of the system. If the execution of a single task takes longer than a second, it's acceptable
for dequeueing of `N >= 1` tasks to take up to a few hundred milliseconds. Given proper indexing, autovacuum
configuration, and fast disks, Postgres should support dequeue queries *well below* 100ms, even for very large queues.

Since we have at least one workflow that will be scheduled for each project in the portfolio, at least daily,
the system must be able to handle large queue depths. Assuming a portfolio of 100k projects, a queue depth of
at least 100k tasks must be supported with little to no performance degradation as the bare minimum.

We chose to rely on Postgres, a relational, [MVCC]-based database. The workflow orchestration solution
should avoid usage patterns that are suboptimal for Postgres (e.g. lots of small transactions),
and make use of features that allow Postgres to operate more efficiently (e.g. batching, advisory locks,
`FOR UPDATE SKIP LOCKED`, partitioned tables).

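For illustration, a task table partitioned by status would let finished work be discarded via cheap
partition truncation rather than `DELETE`s that generate vacuum debt. The schema below is a sketch,
not the actual design:

```sql
CREATE TABLE task (
  id         BIGSERIAL,
  queue      TEXT NOT NULL,
  status     TEXT NOT NULL,
  priority   INT NOT NULL DEFAULT 0,
  payload    JSONB,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY LIST (status);

-- Active and finished tasks live in separate partitions; rows move between
-- them on status change, and the finished partition can be truncated cheaply.
CREATE TABLE task_active   PARTITION OF task FOR VALUES IN ('PENDING', 'RUNNING');
CREATE TABLE task_finished PARTITION OF task FOR VALUES IN ('COMPLETED', 'FAILED');
```
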
It is expected that the solution can optionally be pointed to a Postgres instance separate from the main
application. This may be necessary for larger deployments, as more tasks are enqueued and more workers
poll for tasks.

#### Resiliency

We frequently interact with external systems, which can experience temporary downtime
or enforce rate limiting, causing RPCs to fail or be rejected. In the majority of cases,
retries are sufficient to recover from such failures.

Individual workflow steps must be retryable. It should be possible to define a backoff strategy,
as well as a limit after which no more retries will be attempted. A workflow step being retried
should not block the execution of steps belonging to other workflow instances.

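The retry requirements above could be captured in a small policy object. A minimal sketch with
capped exponential backoff follows; class and method names are illustrative, not the eventual API:

```java
import java.time.Duration;

/**
 * Sketch of a retry policy with capped exponential backoff.
 * All names and defaults are illustrative, not the eventual API.
 */
public final class RetryPolicy {

    private final Duration initialDelay;
    private final double multiplier;
    private final Duration maxDelay;
    private final int maxAttempts;

    public RetryPolicy(Duration initialDelay, double multiplier, Duration maxDelay, int maxAttempts) {
        this.initialDelay = initialDelay;
        this.multiplier = multiplier;
        this.maxDelay = maxDelay;
        this.maxAttempts = maxAttempts;
    }

    /** Whether another attempt is permitted after {@code failedAttempts} failures. */
    public boolean shouldRetry(int failedAttempts) {
        return failedAttempts < maxAttempts;
    }

    /** Delay to wait before the given (1-based) retry attempt, capped at maxDelay. */
    public Duration delayBefore(int attempt) {
        double delayMillis = initialDelay.toMillis() * Math.pow(multiplier, attempt - 1);
        return delayMillis >= maxDelay.toMillis() ? maxDelay : Duration.ofMillis((long) delayMillis);
    }
}
```

A production version would likely add jitter, so that many steps failing at the same time (e.g. due
to a rate-limited upstream) do not all retry in lockstep.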
In case of node failures, it is expected that uncompleted work is not lost,
and will be picked up either by the same node upon restart, or by other nodes in the cluster.

#### Scheduling

We have many tasks that need to run repeatedly on a schedule. The currently implemented solution is based on
a combination of in-memory timers and database-backed locks to prevent duplicate executions.

The orchestration system should allow us to schedule recurring workflow executions, e.g. using cron expressions.

Ideally, the system would allow schedules to be adjusted at runtime, instead of requiring a restart.

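The JDK has no built-in cron parser, so the sketch below illustrates only the core contract —
deriving the next execution time from the previous one, with the schedule adjustable at runtime.
The class name and the fixed-rate simplification are hypothetical:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Sketch of a recurring schedule whose interval can be adjusted at runtime.
 * A real implementation would likely evaluate cron expressions instead;
 * a fixed rate keeps this illustration dependency-free.
 */
public final class FixedRateSchedule {

    // AtomicReference so a running scheduler observes adjustments immediately.
    private final AtomicReference<Duration> interval;

    public FixedRateSchedule(Duration interval) {
        this.interval = new AtomicReference<>(interval);
    }

    /** Adjusts the schedule without requiring a restart. */
    public void setInterval(Duration newInterval) {
        interval.set(newInterval);
    }

    /** Derives the next execution time from the previous one. */
    public Instant nextExecution(Instant lastExecution) {
        return lastExecution.plus(interval.get());
    }
}
```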
#### Observability

Based on interactions with the community, it is clear that not everyone is able or willing to operate infrastructure
for centralized log monitoring and alerting. Further, it is very common for users, even administrative ones,
to *not* have direct access to application logs.

We thus cannot rely on logging and metrics instrumentation alone to ensure observability. We need a solution that has
observability built in, and that allows us to expose it on the application layer.

Relatedly, records of workflow instances should be retained for a specified amount of time.
This allows retrospective investigation of failures. The retention duration should be configurable.

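Retention could be enforced by a periodic cleanup job. As a sketch (table, columns, and the
`:retention` parameter are hypothetical):

```sql
-- Purge records of finished workflow runs that fell out of the retention window.
DELETE FROM workflow_run
 WHERE status IN ('COMPLETED', 'FAILED')
   AND completed_at < NOW() - CAST(:retention AS INTERVAL);
```
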
### Possible solutions

#### A: Adopt a job scheduling framework

Multiple open source frameworks for job scheduling using relational databases exist,
among them [db-scheduler], [JobRunr], and [Quartz]. While they focus on the execution of individual
jobs, they usually ship with basic workflow capabilities as well.

Quartz jobs can be chained to form workflows, using [`JobChainingJobListener`](https://www.quartz-scheduler.org/api/2.3.0/org/quartz/listeners/JobChainingJobListener.html).
This functionality is self-described as a "poor man's workflow".

[JobRunr] has a [similar concept](https://www.jobrunr.io/en/documentation/pro/job-chaining/),
but it's only available in the commercial Pro version. As an open source project, we cannot use
commercial tooling.

[db-scheduler] appears to be the most capable option, while being entirely free, with no commercial offering.
It, too, supports job chaining to form workflows of multiple steps.

Job chaining is an implementation of the [routing slip pattern]. This model is akin to what Dependency-Track v4
relied on, and we found it to be both unreliable and limiting at times. Since there is no orchestration layer
in this setup, failure compensations are harder to implement.

Fork-join / [scatter-gather] patterns are not supported by this approach. We need this capability to run certain
workflow steps concurrently, e.g. *Vulnerability Analysis* and *Repository Metadata Analysis*. Additional state-keeping
and coordination would be necessary to achieve the desired behavior with job chaining.

Also, none of these options allow job chains to be treated as logical units of work,
which complicates concurrency control, fairness, and monitoring.

#### B: Adopt an embedded workflow engine

Per the [awesome-workflow-engines] list, available Java-based embedded workflow engines
that could cater to our requirements include [COPPER] and [nFlow].

At the time of writing, the [COPPER] repository has not seen any activity for over six months.
The documentation is scarce, but [COPPER] appears to achieve durability by serializing the stack
frames of Java code, which leads to many restrictions when it comes to modifying workflows.

[nFlow] requires the Spring Framework, which we do not use and do not plan to adopt.

#### C: Adopt an external workflow engine

Out of all the external engines we encountered, [Temporal] is a perfect match. It supports everything we need,
allows us to write workflows as code, and is open source. The same is true for [Cadence], the spiritual predecessor
of [Temporal].

But systems like these are built to scale, and to scale in a way that can support
entire enterprises in running their workflows on shared infrastructure. This is great,
but it comes with a lot of additional complexity that we cannot justify introducing.
We just need something that supports our own use cases, without additional operational burden.

#### D: Build our own

[Temporal] is a spiritual successor to Microsoft's [Durable Task Framework] (DTFx). DTFx is not an external service,
but an embeddable library. Unfortunately, DTFx is written in C#, so we cannot use it.

Microsoft has further invested in a Go port of DTFx: [durabletask-go], which can be used as a Go library,
but also as an external service (i.e. a sidecar). It powers [Dapr Workflow] in this configuration.
[durabletask-go] can be used with various storage backends, among them Postgres. A Java SDK for the
[durabletask-go] sidecar is available with [durabletask-java]. We don't want to require an external service though,
even if it's *just* a sidecar.

There are also some features that the DTFx implementations do not offer, for example scheduling, prioritization,
and concurrency control. But the foundation for everything else we want is there. And judging by [durabletask-go]'s
code, implementing a DTFx-like engine is doable without a lot of complexity. Various patterns enabled by the DTFx model
can be seen in the [Dapr Workflow Patterns] documentation.

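To make the fork-join requirement concrete: in the DTFx programming model, orchestrations are ordinary
code that awaits activities. The sketch below imitates the shape of such an orchestrator with plain
`CompletableFuture`; it illustrates the control flow only, is not the DTFx API, and its activity
implementations are stand-ins:

```java
import java.util.concurrent.CompletableFuture;

/**
 * Illustration of the fork-join control flow of the "Project Analysis" workflow,
 * expressed with plain CompletableFuture. A durable engine would persist and
 * replay an event history instead of running purely in memory.
 */
public final class ProjectAnalysisSketch {

    public static String analyze(String project) {
        // Fork: run both analyses concurrently.
        CompletableFuture<String> vulnAnalysis =
                CompletableFuture.supplyAsync(() -> vulnerabilityAnalysis(project));
        CompletableFuture<String> metaAnalysis =
                CompletableFuture.supplyAsync(() -> repoMetadataAnalysis(project));

        // Join: wait for both results, then run the sequential tail of the workflow.
        return vulnAnalysis
                .thenCombine(metaAnalysis, ProjectAnalysisSketch::policyEvaluation)
                .thenApply(ProjectAnalysisSketch::metricsUpdate)
                .join();
    }

    // Stand-in activities; the real ones would perform I/O and RPCs.
    private static String vulnerabilityAnalysis(String project) { return "vulns(" + project + ")"; }
    private static String repoMetadataAnalysis(String project)  { return "repoMeta(" + project + ")"; }
    private static String policyEvaluation(String v, String m)  { return "policy(" + v + ", " + m + ")"; }
    private static String metricsUpdate(String p)               { return "metrics(" + p + ")"; }
}
```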
In short, the plan for this solution is as follows:

* Build a Java-based, embeddable port of DTFx, roughly based on the logic of [durabletask-go].
* Omit abstractions to support multiple storage backends; focus entirely on Postgres instead.
* Omit features that we do not need, to keep the porting effort low.
* Build the features we need on top of this foundation.

This approach allows us to fulfill our current requirements, but also to adapt more quickly to new ones in the future.
We avoid introducing additional infrastructure or services entirely, keeping operational complexity low.
The solution is also not entirely bespoke, since we can lean on design decisions of the mature and proven DTFx ecosystem.

## Decision

We will follow option *D*.

## Consequences

A DTFx port as outlined in [D: Build our own](#d-build-our-own) needs to be developed.

Work on this has already started, in order to evaluate the feasibility of the approach.
The current state of the code is available here:

* [`DependencyTrack/hyades-apiserver@workflow-v2`](https://github.com/DependencyTrack/hyades-apiserver/tree/workflow-v2/src/main/java/org/dependencytrack/workflow)
* [`DependencyTrack/hyades-frontend@workflow-v2`](https://github.com/DependencyTrack/hyades-frontend/tree/workflow-v2/src/views/workflowRuns)

A corresponding design document that describes our implementation will follow.

[ADR-001]: ./001-drop-kafka-dependency.md
[awesome-workflow-engines]: https://github.com/meirwah/awesome-workflow-engines
[Cadence]: https://cadenceworkflow.io/
[COPPER]: https://copper-engine.org/
[Dapr Workflow]: https://docs.dapr.io/developing-applications/building-blocks/workflow/workflow-overview/
[Dapr Workflow Patterns]: https://docs.dapr.io/developing-applications/building-blocks/workflow/workflow-patterns/
[db-scheduler]: https://github.com/kagkarlsson/db-scheduler
[Durable Task Framework]: https://github.com/Azure/durabletask
[durabletask-go]: https://github.com/microsoft/durabletask-go
[durabletask-java]: https://github.com/microsoft/durabletask-java
[JobRunr]: https://www.jobrunr.io/en/
[JWorkflow]: https://github.com/danielgerlag/jworkflow
[MVCC]: https://www.postgresql.org/docs/current/mvcc.html
[nFlow]: https://nflow.io/
[Quartz]: https://www.quartz-scheduler.org/
[routing slip pattern]: https://www.enterpriseintegrationpatterns.com/patterns/messaging/RoutingTable.html
[scatter-gather]: https://www.enterpriseintegrationpatterns.com/patterns/messaging/BroadcastAggregate.html
[Temporal]: https://temporal.io/