* enable app-specific worker GCE VM parameters (RAM, disk, CPUs, GPUs, TPUs, ACLs, ...),
* prevent large workflows from starving small ones,
* enable staging servers to be independent,
* allow log filtering by run,
* save time pulling Docker images (locality),
* prevent cascading problems like repeated-retry-on-failure from spreading between workflows.

Simplest approach:
- [ ] A separate RabbitMQ task queue per workflow.
- [ ] Eventually delete the workflow's task queue and remove its resources from Gaia memory.
- [ ] In the workflow builder, launch the workers with the workflow name as metadata, and use that to find the task queue.
- [ ] Ditto when resuming a workflow. Improve the usability somehow.

Smarter:
- [ ] Auto-launch and shut down workers.
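
A rough sketch of how the "simplest approach" items could fit together, assuming the queue name is derived deterministically from the workflow name so that the workflow builder and the workers agree on it without extra coordination. The `workflow-tasks.` prefix, the metadata keys, and all function names here are hypothetical, not existing Gaia or worker code. `channel` is assumed to be something like a `pika` channel, which provides `queue_declare` and `queue_delete`.

```python
import re

QUEUE_PREFIX = "workflow-tasks."  # hypothetical naming convention


def task_queue_name(workflow_name: str) -> str:
    """Map a workflow name to the name of its own task queue."""
    # Replace anything outside a conservative character set with "-".
    safe = re.sub(r"[^A-Za-z0-9_.-]", "-", workflow_name)
    if not safe:
        raise ValueError("workflow name must not be empty")
    return QUEUE_PREFIX + safe


def worker_metadata(workflow_name: str) -> dict:
    """Metadata to attach to a worker VM at launch (key names are assumptions)."""
    return {
        "workflow": workflow_name,
        "task-queue": task_queue_name(workflow_name),
    }


def declare_workflow_queue(channel, workflow_name: str) -> str:
    """Create the workflow's queue if it does not exist yet and return its name."""
    name = task_queue_name(workflow_name)
    channel.queue_declare(queue=name, durable=True)
    return name


def delete_workflow_queue(channel, workflow_name: str) -> None:
    """Remove the workflow's queue once the workflow is finished."""
    channel.queue_delete(queue=task_queue_name(workflow_name))
```

A worker would read the `workflow` metadata value at startup and call `task_queue_name` on it to find the queue to consume from. Resuming a workflow would go through the same path, since the name is a pure function of the workflow name.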