Commit e97704d
Run Engine 2.0 (WIP) (#1575)
* bump worker version
* Suggested glossary for the RunEngine, TBC
* Removed BatchTaskRun changes from this branch, they were done in main
* Set the BatchTaskRun status to completed when all runs are completed
* When dequeuing respect passed in maxResources
* Ported over the new run props: idempotencyKeyExpiresAt, versions, oneTimeUseToken, maxDurationInSeconds
* Didn’t hit save… the new props when triggering tasks passed through
* Idempotency expiration + waitpoint edge case
* WIP on creating checkpoint, parking for now
* fix worker routes
* upgrade webapp node types to support generic event emitter
* separate event bus handler singleton and run failure alerts
* duration waits
* fix execution snapshot debug spans
* task waits
* fix event bus types
* temporary fix for react hook run handle type
* disable run notifications for now
* convert any typecasts to expect errors to more easily fix later
* fix webapp types after node types upgrade
* updateEnvConcurrencyLimits across marqs and the runqueue
* Pass proper values into the run engine
* RunQueue settings and removed unused rebalancing workers
* Remove rebalancing prop
* Tidied more things up
* Update/remove queue limits for MARQS and RunQueue
* taskQueue/concurrencyLimit changes ported back into the RunEngine
* Reworked completing waitpoints to improve performance and reduce race conditions
* Improved test robustness
* Down to a single run lock only when a run is totally unblocked and ready to continue
* warm starts, worker notifications, wait fixes
* Fix for Run Engine poll interval env var
* Expect the waitpoint to be completed quickly
* If a run is locked then it’s too late to expire it
* Added VALKEY_ env vars and plugged them into the run engine
* Extracted and updated the guard queue function so it can be used when batching
* Added logging and universal concurrency changes to trigger task v1
* Added notes back in
* Bump @trigger.dev/worker to 3.3.7
* reportInvocationUsage for the runAttemptStarted event
* improve execution snapshot span debug span start times
* Unfriendly IDs
* update lockfile
* Created a shared determineEngineVersion function
* disable unfinished commands
* save new cli config to different location, misc fixes
* add basic engine version check via current deploy
* new run engine will default to node 22 runtime
* block some actions for projects on previous run engine
* fix worker group tests
* fix triggerAndWait test
* one typescript version to rule them all
* redlock type patch
* fix type issues caused by ts-reset
* improve cleanup scripts
* add missing socket.io dep
* fix run notification handler type
* fix worker group test again
* generate prisma client for e2e tests
* remove worker group tests for now
* prevent image pull rate limits during unit tests
* increase timeout for queue concurrency limit test
* generate prisma client for preview release
* same node types everywhere
* Updated engine readme, removed legacy system notes
* use default machine preset from platform package
* worker instances plural in schema
* disable pnpm update notifications
* return worker group details from connect call
* add workers admin route
* fix heartbeat route return type
* move deployment labels to core apps
* refactor run controller env schema
* Add firstAttemptStartedAt to TaskRun
* RunEngine 2.0 batch trigger support (#1581)
* Make it clear when BatchTriggerV2Service is used
* Copy of BatchTriggerV2Service
* WIP batch triggering
* Allow blocking a run with multiple waitpoints at once. Made it atomic
* Removed unused param
* New batch service
* Pass through the parentRunId and resumeParentOnCompletion
* Use the new batch service, and correct trigger task version
* Force V1 engine if using BatchTriggerV2Service, we’ve already done the check at this point
* Removed the $transaction and early exit if nothing changed
* Added a simple batch task to the hello world reference catalog
* Fix for batch waits not working
* Added parentRunId in a couple more places
* Removed waitForBatch log
* Added another parentRunId
* Expanded the example to include all the different triggers
* More changes to blocking to support continuing after idempotent completed runs
* Fix for the wrong type when blocking a run
* remove @Map
* optimise worker auth query
* add engine version header to core api client requests
* remove unique constraint for default group id
* consolidate migrations
* the first managed worker becomes the global default
* Debug events off by default, added an admin toggle to show them
* worker group name can't be an empty string
* add exec helper to core
* move machine resources to core
* add pre-dequeue callback to determine max resources
* optionally skip dequeue
* bump worker package
* move worker to core
* fix ReadableStream type error
* fix another type issue
* update a few more tsconfigs
* add metadata changes introduced in #1563
* Run Engine 2.0 trigger idempotency (#1613)
* Return isCached from the trigger API endpoint
* Fix for the wrong type when blocking a run
* Render the idempotent run in the inspector
* Event repository for idempotency
* Debug events off by default, added an admin toggle to show them
* triggerAndWait idempotency span
* Some improvements to the reference idempotency task
* Removed the cached tracing from the SDK
* Server-side creating cached span
* Improved idempotency test task
* Create cached task spans in a better way
* Idempotency span support inc batch trigger
* Simplified how the spans are done, using more of the existing code
* Improved the idempotency test task
* Added Waitpoint Batch type, add to TaskRunWaitpoint with order
* Pass batch ids through to the run engine when triggering
* Added batchIndex
* Better batch support in the run engine
* Added settings to batch trigger service, before major overhaul
* Allow the longer run/batch ids in the filters
* Changed how batching works, includes breaking changes in CLI
* Removed batch idempotency because it gets put on the runs instead
* Added `runs` to the batch.retrieve call/API
* Set firstAttemptStartedAt when creating the first attempt
* Do nothing when receiving a BATCH waitpoint
* Some fixes in the new batch trigger service… mostly just passing missing optional params through
* Tweaked the idempotency test task for more situations
* Only block with a batch if it’s a batchTriggerAndWait… 🤦♂️
* Added another case to the idempotency test task: multiple of the same idempotencyKey in a single batch
* Support for the same run multiple times in the same batch
* Small tweaks
* Make sure to complete batches, even if they’re not andWait ones
* Export RunDuplicateIdempotencyKeyError from the run engine
* Latest lockfile
* Trigger with a machine (old run engine)
* RE2, allow setting machine when triggering
* Fix for new glob patterns
* add max run count to dequeue from version route
* add worker instance name env var and header
* queue consumer pre skip callback
* poll for more runs after final execution errors
* fix dequeue search param schema
* add shortcut to debug switch
* expose run engine timeouts as env vars
* make warm start durations configurable
* add optional status to json reply helper
* fix preSkip hook, add debug logs
* BLOCKED_BY_WAITPOINTS -> SUSPENDED
* exit controller when run suspended
* check if already replied before http reply
* run controller will wait for next run after the current one is suspended
* cancel run button shortcut
* minimal event repository environment type
* fix update metadata call
* run suspension and misc fixes wip
* change debug shortcut to shift + D
* Started work on the Dev supervisor
* Formatting
* Fix for bad imports
* Before rebuilding SSE
* Presence updating from the CLI working via SSE
* add worker notification debug logs
* send run:stop when exiting run phase
* skip current snapshot poll on worker notification
* add more logs and route to submit run debug logs
* add worker and runner ids to snapshots
* improve run notification debug logs
* add workload debug log route
* misc run controller fixes and refactor
* prevent parallel execution of critical functions
* update bun to 1.2.1
* WIP with dev dequeuing
* Method to convert friendlyIds to non-friendly, do nothing with actual ids
* Set the engine on BackgroundWorker, lazily upgrade projects to engine V2
* Runs with ttls were getting immediately expired… oops.
* Pass the Waiting for deploy reason through, so we have it on the execution snapshots
* Fixed the logic for getting the right background worker for a run
* Use the correct ID when dequeuing…
* determineEngineVersion is now fully functional
* Rate limiter ignores the dev endpoints
* Retrieving a batch gives you the runIds
* Set a unique version for the RE2 BatchTaskRun
* add provisional changeset
* The start of dev run execution is working
* First dev run working
* Moved the dev run controller closer to what Nick did with the managed one
* export exec output type
* Heartbeat fix: don’t heartbeat if _isHeartbeating == false
* Dev runs get notifications, some dev bug fixes
* Improved logging for dequeuing
* We need to dequeue runs from the latest version too, for triggerAndWait
* Ported Eric’s validateWorkerManifest with nicer errors
* When flattening an idempotency key if part is undefined, return undefined
* Dev logging fixes
* Remove sigterm listener
* Deprecating workers. Don’t specify a BackgroundWorker when dequeuing an environment
* Deleted some old files. Renamed “managed” to “deploy”
* When a build finishes, always copy the build dir (otherwise the first one gets trampled on by the 2nd)
* Dev master queues should work differently
* Deleting old workers
* Added debounce function to core
* Improvement to canceling
* WIP on debounce canceling on socket disconnection
* Added environment data to execution snapshots
* Dev runs that have stalled get “Canceled” with a reason explaining why
* Show CLI message when a connection to the platform is lost/restored
* Fix TriggerTask after merge
* Add trigger task v2 max attempts, replace some findUniques
* Port the new queue logic to the run engine
* More fixes post-merge
* We weren’t setting a `retryConfig` up for the tests… it’s now required
* Start the Redis worker inside the Run Engine… 🤦♂️
* Trying to make the testcontainers more reliable
* Added keyPrefix: "engine:"
* Badly placed bracket in trigger task
* Better Redis namespacing
* Fix for expired run not getting removed from the queue
* Don’t create a redis client in the testcontainers, return the redisOptions instead
* Cleanup redis client in the run lock tests
* Fix for the RunQueue not supporting keyPrefix
* Updated more of the RunQueue scripts rebalancing
* Trying to make Redis more robust in the tests…
* Improved test resiliency more
* Fix for delays (checkpoint check)
* Increase the timeout slightly to fix ttl test
* Added priority support when triggering
* More wip trying to make test containers more reliable
* batchTriggerAndWait test is still failing… some wip to try fix it
* Fixed redis tests now we’re not providing a client
* Separate Redis clients for the run engine worker/queue/runlock
* Made the wait for duration test more resilient
* Added idempotencyKeyExpiresAt to Waitpoints
* Waitpoint timeouts and idempotency expiry
* Use finishWaitpoint, removed extra worker job
* Added waitpoint idempotency tests
* Creating resume tokens is working
* Some improvements to the resume tokens
* Moved resumeTokens to just be wait functions 🥳
* Delete old RuntimeManagers
* Wait for token is working
* Better test for the wait tokens
* Improved the test task some more
* Hide the accessories in the span inspector
* WIP on waitpoint inspector
* WIP on complete waitpoint form
* Span overview panel can be changed based on the entity type
* Improved the waitpoint display
* WIP on completing waitpoint form
* Use the existing CodeBlock for the tip
* Style improvements
* Complete waitpoint
* All waitpoint sidebar variants
* Waits now use a pause icon
* Duration waits use the API to create/block with a waitpoint, not the runtime
* Fix for engine.blockRunWithWaitpoint required org id
* Removed old wait code from the run controllers/task run process
* Form action for skipping a datetime waitpoint
* Move testDockerCheckpoint to a separate core package export (it can’t be bundled on the client)
* Fix for glitchy hourglass animation
* Completed waitpoints display better
* Increase Redis maxRetriesPerRequest to 20 (default)
* Completing and skipping waitpoints is working
* Remove the database prisma dev command, since we need to use create only now. Updated docs
* Added skip timeout, reworked the UI
* Tweaked spacing
* Added payload limit to waitpoint token completion from dashboard
* Test idempotency works on wait.for and wait.until
* Moved the worker-actions to /engine/ from /api/
* Moved dev engine endpoints to /engine/ from /api/
* Separate /engine/ rate limiter
* Added parallel wait prevention, it’s working for duration waits but not well for triggerAndWait yet
* WIP post-merge conflicts
* Set taskEventStore column in the new engine
* Remove duplicate keys
* Post-merge fixes
* Fix for span merge layout
* Use executedAt instead of firstAttemptStartedAt
---------
Co-authored-by: Matt Aitken <[email protected]>
1 parent c519a5a commit e97704d
File tree
307 files changed
+31829 -5638 lines changed
- .changeset
- .configs
- .github/workflows
- .vscode
- apps
- coordinator
- src
- docker-provider
- src
- kubernetes-provider
- proxy
- webapp
- app
- assets/icons
- components
- code
- primitives
- runs/v3
- models
- presenters/v3
- routes
- _app.orgs.$organizationSlug.projects.v3.$projectParam.deployments.$deploymentParam
- _app.orgs.$organizationSlug.projects.v3.$projectParam.deployments
- _app.orgs.$organizationSlug.projects.v3.$projectParam.runs.$runParam
- _app.orgs.$organizationSlug.projects.v3.$projectParam.test.tasks.$taskParam
- resources.orgs.$organizationSlug.projects.$projectParam.waitpoints.$waitpointFriendlyId.complete
- resources.orgs.$organizationSlug.projects.v3.$projectParam.runs.$runParam.spans.$spanParam
- services
- routeBuilders
- utils
- v3
- marqs
- models
- services
- worker
- prisma
- test
- internal-packages
- database
- prisma
- migrations
- 20250103152909_add_run_engine_v2
- 20250106172943_added_span_id_to_complete_to_task_run_waitpoint
- 20250109131442_added_batch_and_index_to_task_run_waitpoint_and_task_run_execution_snapshot
- 20250109173506_waitpoint_added_batch_type
- 20250109175955_waitpoint_added_completed_by_batch_id_index
- 20250114153223_task_run_waitpoint_unique_constraint_added_batch_index
- 20250116115746_rename_blocked_by_waitpoints_to_suspended
- 20250128160520_add_runner_id_to_execution_snapshots
- 20250130173941_background_worker_added_engine_version_column
- 20250207104914_added_environment_and_environment_type_to_task_run_execution_snapshot
- 20250219140441_waitpoint_added_idempotency_key_expires_at
- 20250304184614_remove_task_run_first_attempt_started_at_column
- emails
- otlp-importer
- redis-worker/src
- run-engine
- src
- engine
- db
- tests
- run-queue
- testcontainers
- src
- zod-worker
- packages
- build
- cli-v3
- src
- build
- cli
- commands
- workers
- deploy
- dev
- entryPoints
- executions
- utilities
- core
- src
- v3
- apiClient
- apps
- build
- checkpoints
- dev
- prod
- runEngineWorker
- supervisor
- workload
- runtime
- schemas
- types
- utils
- workers
- react-hooks
- rsc
- trigger-sdk
- src/v3
- references
- bun-catalog
- src/trigger
- hello-world
- src/trigger
- init-shell
- nextjs-realtime
- v3-catalog