changing the code for launching workers (and shutting them down?) #29

@1fish2

Here's the incremental plan I have in mind (open to change). Meanwhile, we can use the current code (albeit inconsistent between wcEcoli and the Gaia Python client) unless/until there are other clients besides wcEcoli.

  1. I'll add a requested-worker-count property to Workflow properties (wcEcoli#755). The workflow builder's client and user are in a good position to decide how many workers to allocate.
  2. We add Gaia code to be able to launch workers via the GCE API. This will be better than pushing so hard on shell scripts. It'll need some parameters sent from the client that are currently in wcEcoli's runscripts/cloud/launch-workers.sh, some parameters that Gaia can get from gcloud (I further configured it on gaia-prime), and some added to its config file. This is easy.
  3. As an optional interim step, add a Gaia endpoint to launch workers, change the Gaia Python client to use it, call that from the workflow builder, and drop both shell scripts. Or skip this step.
  4. Make the Gaia server in charge of when to launch workers, which is whenever it starts or resumes running a workflow. With the requested-worker-count it doesn't have to decide how many.
    • The main advantage of this step is resuming a workflow without the user having to know to launch workers.
    • This step might require changing the way workers shut down, or making Gaia monitor them, because the timeouts won't fit every situation. E.g. if most workers time out while waiting for one worker to finish a long task, the workflow might need more workers afterwards.
  5. Later, we make the Gaia server able to decide how many workers to launch, perhaps with more hints from the workflow.
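
To make steps 1 and 4 concrete, here is a rough Python sketch of the decision Gaia could make whenever it starts or resumes a workflow: read the requested worker count from the workflow properties and work out which workers are missing. Everything in it is an assumption for illustration, not existing Gaia or wcEcoli code: the `requested-worker-count` key spelling, the `<workflow>-worker-<i>` naming scheme, the fallback of one worker, and the function name. The actual launch through the GCE API is left out.

```python
from typing import Dict, List, Set

# Assumed fallback when the workflow doesn't state a count (hypothetical).
DEFAULT_WORKER_COUNT = 1


def workers_to_launch(workflow_name: str,
                      properties: Dict[str, object],
                      running: Set[str]) -> List[str]:
    """Return the names of workers to launch when a workflow starts or resumes.

    `properties` holds the workflow properties, including the hypothetical
    'requested-worker-count' key. `running` is the set of worker names that
    are already up, so resuming a workflow only launches the missing ones.
    """
    requested = int(properties.get('requested-worker-count',
                                   DEFAULT_WORKER_COUNT))
    # Hypothetical naming scheme: <workflow>-worker-<index>.
    wanted = [f'{workflow_name}-worker-{i}' for i in range(requested)]
    return [name for name in wanted if name not in running]


if __name__ == '__main__':
    # Resuming a workflow that asked for 3 workers while 1 is still running.
    print(workers_to_launch('sim', {'requested-worker-count': 3},
                            {'sim-worker-1'}))
```

Because the function only looks at the requested count and the currently running workers, the same call could also cover the case in the sub-bullet above, where workers have timed out and the workflow needs more of them afterwards.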

Labels: enhancement (New feature or request)
