Skip to content

Commit 8d5c86f

Browse files
v4: simplified release concurrency system and status changes (#2284)
* WIP * Make release concurrency system extremely simple, everything just releases all the time * update the deadlock detection to use the new lockedQueueReleaseConcurrencyOnWaitpoint column * WIP new release concurrency system * Remove releaseConcurrency and releaseConcurrencyOnWaitpoint Also removed deadlock detection, and added environment burst concurrency * Added new DEQUEUED status Cleaned up the API run statuses, including now detecting new clients and not breaking older clients by adding an API version header to all requests * Introduce the new "current dequeued concurrency set" * Remove QUEUED_EXECUTING because we no longer "eagerly" release before checkpointing * Remove waitpoint test for QUEUED_EXECUTING * Add isWaiting * Add changeset * Use createdAt for ordering realtime runs instead of number * Clarify the envCurrentDequeuedKey usage * mock the db.server file to fix the tests * Updated changset "EXECUTED" -> "EXECUTING" --------- Co-authored-by: Matt Aitken <[email protected]>
1 parent a9b0b4f commit 8d5c86f

File tree

98 files changed

+1665
-6754
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

98 files changed

+1665
-6754
lines changed

.changeset/little-birds-appear.md

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
---
2+
"@trigger.dev/sdk": patch
3+
---
4+
5+
Removes the `releaseConcurrencyOnWaitpoint` option on queues and the `releaseConcurrency` option on various wait functions. Replaced with the following default behavior:
6+
7+
- Concurrency is never released when a run is first blocked via a waitpoint, at either the env or queue level.
8+
- Concurrency is always released when a run is checkpointed and shutdown, at both the env and queue level.
9+
10+
Additionally, environment concurrency limits now have a new "Burst Factor", defaulting to 2.0x. The "Burst Factor" allows the environment-wide concurrency limit to be higher than any individual queue's concurrency limit. For example, if you have an environment concurrency limit of 100, and a Burst Factor of 2.0x, then you can execute up to 200 runs concurrently, but any one task/queue can still only execute 100 runs concurrently.
11+
12+
We've done some work cleaning up the run statuses. The new statuses are:
13+
14+
- `PENDING_VERSION`: Task is waiting for a version update because it cannot execute without additional information (task, queue, etc.)
15+
- `QUEUED`: Task is waiting to be executed by a worker
16+
- `DEQUEUED`: Task has been dequeued and is being sent to a worker to start executing.
17+
- `EXECUTING`: Task is currently being executed by a worker
18+
- `WAITING`: Task has been paused by the system, and will be resumed by the system
19+
- `COMPLETED`: Task has been completed successfully
20+
- `CANCELED`: Task has been canceled by the user
21+
- `FAILED`: Task has failed to complete, due to an error in the system
22+
- `CRASHED`: Task has crashed and won't be retried, most likely the worker ran out of resources, e.g. memory or storage
23+
- `SYSTEM_FAILURE`: Task has failed to complete, due to an error in the system
24+
- `DELAYED`: Task has been scheduled to run at a specific time
25+
- `EXPIRED`: Task has expired and won't be executed
26+
- `TIMED_OUT`: Task has reached it's maxDuration and has been stopped
27+
28+
We've removed the following statuses:
29+
30+
- `WAITING_FOR_DEPLOY`: This is no longer used, and is replaced by `PENDING_VERSION`
31+
- `FROZEN`: This is no longer used, and is replaced by `WAITING`
32+
- `INTERRUPTED`: This is no longer used
33+
- `REATTEMPTING`: This is no longer used, and is replaced by `EXECUTING`
34+
35+
We've also added "boolean" helpers to runs returned via the API and from Realtime:
36+
37+
- `isQueued`: Returns true when the status is `QUEUED`, `PENDING_VERSION`, or `DELAYED`
38+
- `isExecuting`: Returns true when the status is `EXECUTING`, `DEQUEUED`. These count against your concurrency limits.
39+
- `isWaiting`: Returns true when the status is `WAITING`. These do not count against your concurrency limits.
40+
- `isCompleted`: Returns true when the status is any of the completed statuses.
41+
- `isCanceled`: Returns true when the status is `CANCELED`
42+
- `isFailed`: Returns true when the status is any of the failed statuses.
43+
- `isSuccess`: Returns true when the status is `COMPLETED`
44+
45+
This change adds the ability to easily detect which runs are being counted against your concurrency limit by filtering for both `EXECUTING` or `DEQUEUED`.

apps/webapp/app/api/versions.ts

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,57 @@
1+
import {
2+
API_VERSION_HEADER_NAME,
3+
API_VERSION as CORE_API_VERSION,
4+
} from "@trigger.dev/core/v3/serverOnly";
5+
import { z } from "zod";
6+
7+
export const CURRENT_API_VERSION = CORE_API_VERSION;
8+
9+
export const NON_SPECIFIC_API_VERSION = "none";
10+
11+
export type API_VERSIONS = typeof CURRENT_API_VERSION | typeof NON_SPECIFIC_API_VERSION;
12+
13+
export function getApiVersion(request: Request): API_VERSIONS {
14+
const apiVersion = request.headers.get(API_VERSION_HEADER_NAME);
15+
16+
if (apiVersion === CURRENT_API_VERSION) {
17+
return apiVersion;
18+
}
19+
20+
return NON_SPECIFIC_API_VERSION;
21+
}
22+
23+
// This has been copied from the core package to allow us to use these types in the webapp
24+
export const RunStatusUnspecifiedApiVersion = z.enum([
25+
/// Task is waiting for a version update because it cannot execute without additional information (task, queue, etc.). Replaces WAITING_FOR_DEPLOY
26+
"PENDING_VERSION",
27+
/// Task hasn't been deployed yet but is waiting to be executed
28+
"WAITING_FOR_DEPLOY",
29+
/// Task is waiting to be executed by a worker
30+
"QUEUED",
31+
/// Task is currently being executed by a worker
32+
"EXECUTING",
33+
/// Task has failed and is waiting to be retried
34+
"REATTEMPTING",
35+
/// Task has been paused by the system, and will be resumed by the system
36+
"FROZEN",
37+
/// Task has been completed successfully
38+
"COMPLETED",
39+
/// Task has been canceled by the user
40+
"CANCELED",
41+
/// Task has been completed with errors
42+
"FAILED",
43+
/// Task has crashed and won't be retried, most likely the worker ran out of resources, e.g. memory or storage
44+
"CRASHED",
45+
/// Task was interrupted during execution, mostly this happens in development environments
46+
"INTERRUPTED",
47+
/// Task has failed to complete, due to an error in the system
48+
"SYSTEM_FAILURE",
49+
/// Task has been scheduled to run at a specific time
50+
"DELAYED",
51+
/// Task has expired and won't be executed
52+
"EXPIRED",
53+
/// Task has reached it's maxDuration and has been stopped
54+
"TIMED_OUT",
55+
]);
56+
57+
export type RunStatusUnspecifiedApiVersion = z.infer<typeof RunStatusUnspecifiedApiVersion>;

apps/webapp/app/components/runs/v3/RunFilters.tsx

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -10,10 +10,11 @@ import {
1010
import { Form, useFetcher } from "@remix-run/react";
1111
import { IconToggleLeft } from "@tabler/icons-react";
1212
import type { BulkActionType, TaskRunStatus, TaskTriggerSource } from "@trigger.dev/database";
13-
import { ListChecks, ListFilterIcon } from "lucide-react";
13+
import { ListFilterIcon } from "lucide-react";
1414
import { matchSorter } from "match-sorter";
1515
import { type ReactNode, useCallback, useEffect, useMemo, useState } from "react";
1616
import { z } from "zod";
17+
import { ListCheckedIcon } from "~/assets/icons/ListCheckedIcon";
1718
import { StatusIcon } from "~/assets/icons/StatusIcon";
1819
import { TaskIcon } from "~/assets/icons/TaskIcon";
1920
import { AppliedFilter } from "~/components/primitives/AppliedFilter";
@@ -55,8 +56,6 @@ import {
5556
TaskRunStatusCombo,
5657
} from "./TaskRunStatus";
5758
import { TaskTriggerSourceIcon } from "./TaskTriggerSource";
58-
import { ListCheckedIcon } from "~/assets/icons/ListCheckedIcon";
59-
import { cn } from "~/utils/cn";
6059

6160
export const RunStatus = z.enum(allTaskRunStatuses);
6261

apps/webapp/app/components/runs/v3/TaskRunStatus.tsx

Lines changed: 12 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -24,12 +24,13 @@ export const allTaskRunStatuses = [
2424
"WAITING_FOR_DEPLOY",
2525
"PENDING_VERSION",
2626
"PENDING",
27+
"DEQUEUED",
2728
"EXECUTING",
2829
"RETRYING_AFTER_FAILURE",
2930
"WAITING_TO_RESUME",
3031
"COMPLETED_SUCCESSFULLY",
31-
"CANCELED",
3232
"COMPLETED_WITH_ERRORS",
33+
"CANCELED",
3334
"TIMED_OUT",
3435
"CRASHED",
3536
"PAUSED",
@@ -42,16 +43,15 @@ export const filterableTaskRunStatuses = [
4243
"PENDING_VERSION",
4344
"DELAYED",
4445
"PENDING",
45-
"WAITING_TO_RESUME",
46+
"DEQUEUED",
4647
"EXECUTING",
47-
"RETRYING_AFTER_FAILURE",
48+
"WAITING_TO_RESUME",
4849
"COMPLETED_SUCCESSFULLY",
49-
"CANCELED",
5050
"COMPLETED_WITH_ERRORS",
5151
"TIMED_OUT",
5252
"CRASHED",
53-
"INTERRUPTED",
5453
"SYSTEM_FAILURE",
54+
"CANCELED",
5555
"EXPIRED",
5656
] as const satisfies Readonly<Array<TaskRunStatus>>;
5757

@@ -60,6 +60,7 @@ const taskRunStatusDescriptions: Record<TaskRunStatus, string> = {
6060
PENDING: "Task is waiting to be executed.",
6161
PENDING_VERSION: "Run cannot execute until a version includes the task and queue.",
6262
WAITING_FOR_DEPLOY: "Run cannot execute until a version includes the task and queue.",
63+
DEQUEUED: "Task has been dequeued from the queue but is not yet executing.",
6364
EXECUTING: "Task is currently being executed.",
6465
RETRYING_AFTER_FAILURE: "Task is being reattempted after a failure.",
6566
WAITING_TO_RESUME: `You have used a "wait" function. When the wait is complete, the task will resume execution.`,
@@ -82,6 +83,7 @@ export const QUEUED_STATUSES = [
8283
] satisfies TaskRunStatus[];
8384

8485
export const RUNNING_STATUSES = [
86+
"DEQUEUED",
8587
"EXECUTING",
8688
"RETRYING_AFTER_FAILURE",
8789
"WAITING_TO_RESUME",
@@ -164,6 +166,8 @@ export function TaskRunStatusIcon({
164166
case "PENDING_VERSION":
165167
case "WAITING_FOR_DEPLOY":
166168
return <RectangleStackIcon className={cn(runStatusClassNameColor(status), className)} />;
169+
case "DEQUEUED":
170+
return <RectangleStackIcon className={cn(runStatusClassNameColor(status), className)} />;
167171
case "EXECUTING":
168172
return <Spinner className={cn(runStatusClassNameColor(status), className)} />;
169173
case "WAITING_TO_RESUME":
@@ -205,6 +209,7 @@ export function runStatusClassNameColor(status: TaskRunStatus): string {
205209
return "text-amber-500";
206210
case "EXECUTING":
207211
case "RETRYING_AFTER_FAILURE":
212+
case "DEQUEUED":
208213
return "text-pending";
209214
case "WAITING_TO_RESUME":
210215
return "text-charcoal-500";
@@ -240,6 +245,8 @@ export function runStatusTitle(status: TaskRunStatus): string {
240245
case "PENDING_VERSION":
241246
case "WAITING_FOR_DEPLOY":
242247
return "Pending version";
248+
case "DEQUEUED":
249+
return "Dequeued";
243250
case "EXECUTING":
244251
return "Executing";
245252
case "WAITING_TO_RESUME":

apps/webapp/app/database-types.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,7 @@ export const TaskRunStatus = {
3030
PENDING: "PENDING",
3131
PENDING_VERSION: "PENDING_VERSION",
3232
WAITING_FOR_DEPLOY: "WAITING_FOR_DEPLOY",
33+
DEQUEUED: "DEQUEUED",
3334
EXECUTING: "EXECUTING",
3435
WAITING_TO_RESUME: "WAITING_TO_RESUME",
3536
RETRYING_AFTER_FAILURE: "RETRYING_AFTER_FAILURE",

apps/webapp/app/env.server.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -200,6 +200,7 @@ const EnvironmentSchema = z.object({
200200
PUBSUB_REDIS_CLUSTER_MODE_ENABLED: z.string().default("0"),
201201

202202
DEFAULT_ENV_EXECUTION_CONCURRENCY_LIMIT: z.coerce.number().int().default(100),
203+
DEFAULT_ENV_EXECUTION_CONCURRENCY_BURST_FACTOR: z.coerce.number().default(1.0),
203204
DEFAULT_ORG_EXECUTION_CONCURRENCY_LIMIT: z.coerce.number().int().default(300),
204205
DEFAULT_DEV_ENV_EXECUTION_ATTEMPTS: z.coerce.number().int().positive().default(1),
205206

apps/webapp/app/models/taskRun.server.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -129,6 +129,7 @@ export function batchTaskRunItemStatusForRunStatus(
129129
case TaskRunStatus.WAITING_FOR_DEPLOY:
130130
case TaskRunStatus.WAITING_TO_RESUME:
131131
case TaskRunStatus.RETRYING_AFTER_FAILURE:
132+
case TaskRunStatus.DEQUEUED:
132133
case TaskRunStatus.EXECUTING:
133134
case TaskRunStatus.PAUSED:
134135
case TaskRunStatus.DELAYED:

0 commit comments

Comments
 (0)