Support a separate set of Sisyphus worker nodes per workflow in order to:

* enable app-specific worker GCE VM parameters (RAM, disk, CPUs, GPUs, TPUs, ACLs, ...),
* prevent large workflows from starving small ones,
* enable staging servers to be independent,
* allow log filtering by run,
* save time pulling Docker images (locality),
* prevent cascading problems like repeated-retry-on-failure from spreading between workflows.

Simplest approach:

- [ ] A separate RabbitMQ task queue per workflow.
  - [ ] Eventually delete the workflow's task queue and remove its resources from Gaia memory.
- [ ] In the workflow builder, launch the workers with the workflow name as metadata, and use that to find the task queue.
  - [ ] Do the same when resuming a workflow. The usability of this still needs improving (approach TBD).
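
A minimal sketch of the naming side of this, assuming the queue name is derived from the workflow name and passed to workers as metadata. The prefix `sisyphus-tasks`, the metadata keys `workflow` and `queue`, and both function names are hypothetical; nothing here reflects the actual Sisyphus or Gaia code.

```python
import re

# Hypothetical prefix; the real queue naming scheme is not specified in this issue.
QUEUE_PREFIX = "sisyphus-tasks"


def task_queue_name(workflow_name: str) -> str:
    """Derive a per-workflow task queue name from the workflow name.

    Lowercases the name and collapses every run of characters outside
    [a-z0-9] into a single hyphen. Distinct workflow names can therefore
    map to the same queue (e.g. "A B" and "a-b"), so a real implementation
    would need to enforce unique workflow names or use a different mapping.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", workflow_name.lower()).strip("-")
    if not slug:
        raise ValueError("workflow name has no usable characters")
    return f"{QUEUE_PREFIX}.{slug}"


def worker_metadata(workflow_name: str) -> dict[str, str]:
    """Build the metadata a worker VM would be launched with (assumed keys)."""
    return {
        "workflow": workflow_name,
        "queue": task_queue_name(workflow_name),
    }
```

The workflow builder would call `worker_metadata(name)` both when first launching workers and when resuming, so that a resumed workflow lands on the same queue without the user having to restate it.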

Smarter:

- [ ] Auto-launch and shut down workers.
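
One possible shape for the auto-launch decision, as a sketch only. It assumes the number of pending tasks in a workflow's queue is available from somewhere, and the `tasks_per_worker` and `max_workers` parameters are hypothetical tuning knobs, not existing settings.

```python
import math


def desired_worker_count(pending_tasks: int,
                         tasks_per_worker: int = 4,
                         max_workers: int = 10) -> int:
    """Return how many workers a workflow should have for its current backlog.

    Zero pending tasks means all of the workflow's workers can be shut down;
    otherwise scale with the backlog, capped at max_workers.
    """
    if tasks_per_worker <= 0 or max_workers < 0:
        raise ValueError("tasks_per_worker must be > 0 and max_workers >= 0")
    if pending_tasks <= 0:
        return 0
    return min(max_workers, math.ceil(pending_tasks / tasks_per_worker))
```

A real policy would probably also want an idle grace period before shutting workers down, so that a brief gap between workflow steps does not trigger a teardown and relaunch.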

Support a separate set of Sisyphus worker nodes per workflow in order to: #12

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Support a separate set of Sisyphus worker nodes per workflow in order to: #12

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions