On demand aws master #944

@YuryHrytsuk

How to switch off / on aws master

Read instructions in https://git.speag.com/oSparc/osparc-ops-deployment-configuration/-/merge_requests/1381

Proposal

Stopping

  • enable redis maintenance logout
    • fail if the redis key is already set (add a boolean flag to override this behavior) -- overwrite by default
    • end date of maintenance? Which value to choose? -- 2h after the start of the shutdown
  • wait, with a timeout, until sidecars are shut down
    • if some fail, stop the shutdown procedure -- continue but report + force terminate
    • have a boolean flag meaning "continue even if dy-sidecars cannot terminate gracefully ~~ lose data" -- this is the default
    • report on non-terminated dy-sidecars (have means to keep this information)
  • wait until autoscaled EC2 instances are terminated
    • if some fail, stop the procedure unless a special flag allows hanging autoscaled EC2 instances --> retry and continue
    • report on non-terminated autoscaled EC2 instances (have means to keep this information)
  • remove all swarm stacks
  • stop static machines
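
The first two stopping steps could be sketched roughly as below. This is only a sketch under assumptions, not the actual implementation: the key name `maintenance`, the plain dict standing in for redis, and the `list_running` callback are hypothetical. Only the defaults (overwrite the key, end date 2h after the start, continue but report on timeout) come from the proposal.

```python
import time
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List

MAINTENANCE_KEY = "maintenance"  # hypothetical key name
MAINTENANCE_DURATION = timedelta(hours=2)  # "2h after the start of the shutdown"


def enable_maintenance(
    store: Dict[str, str], start: datetime, overwrite: bool = True
) -> str:
    """Set the maintenance key; `store` is a plain dict standing in for redis."""
    if MAINTENANCE_KEY in store and not overwrite:
        raise RuntimeError("maintenance key is already set")
    end = start + MAINTENANCE_DURATION
    store[MAINTENANCE_KEY] = end.isoformat()
    return store[MAINTENANCE_KEY]


def wait_until_gone(
    list_running: Callable[[], List[str]],
    timeout_s: float,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Poll until nothing is running or the timeout expires.

    Returns the names still running at the end (empty list on success),
    so the caller can report them and decide whether to force terminate.
    """
    deadline = clock() + timeout_s
    remaining = list_running()
    while remaining and clock() < deadline:
        sleep(poll_s)
        remaining = list_running()
    return remaining
```

The list returned by `wait_until_gone` is what a caller would persist to satisfy "report on non-terminated dy-sidecars"; the same helper could be reused for autoscaled EC2 instances by passing a different `list_running`.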

Starting

  • start machines
  • deploy Swarm Stacks --> wait for autodeployer or manually trigger CI
  • remove maintenance key

Extra:

  • disable all e2e tests when deployment is off

Nuances:

  • What happens to the AWS NLB?
  • Data on hanging sidecars is going to be lost (--> no shutdown if a sidecar is hanging?)
    • Will it? If the EC2 instance is only stopped, no data loss should occur

Extra requirements

All procedures

  • Can be run manually

Stopping use cases

  • can be run by backenders on demand
  • idempotent

Starting use cases

  • can be done by backenders on demand
  • automatically scheduled after automatic stopping
  • idempotent
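
One way to get the idempotency required of both procedures is to pair every step with a check of the current state and skip steps that are already done. The sketch below illustrates this for the starting sequence; the `Step` type, `run_steps`, and the in-memory `state` dict are hypothetical stand-ins, not existing tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    is_done: Callable[[], bool]  # checks the current state
    apply: Callable[[], None]    # moves towards the desired state


def run_steps(steps: List[Step]) -> List[str]:
    """Run steps in order, skipping those already in the desired state.

    Returns the names of the steps that were actually applied.
    """
    applied = []
    for step in steps:
        if step.is_done():
            continue  # already in the desired state: nothing to do
        step.apply()
        applied.append(step.name)
    return applied


# In-memory stand-in for the real deployment state (assumption).
state = {"machines": "stopped", "stacks": False, "maintenance_key": True}

start_steps = [
    Step("start machines",
         lambda: state["machines"] == "running",
         lambda: state.update(machines="running")),
    Step("deploy swarm stacks",
         lambda: state["stacks"],
         lambda: state.update(stacks=True)),
    Step("remove maintenance key",
         lambda: not state["maintenance_key"],
         lambda: state.update(maintenance_key=False)),
]
```

Running `run_steps(start_steps)` a second time applies nothing, which is the property that makes it safe to re-run the procedure manually after a scheduled run or a partial failure.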

Possible use cases

  • shutdown procedure has started but deployment is already under maintenance (via redis and / or maintenance pages stack)
  • shutdown procedure is completed but developers suddenly need aws master
  • there is a special week when aws master needs to be up all the time
  • shutdown procedure has started while deployment is down (e.g. hardware issues)
  • shutdown procedure fails to complete
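
The "special week" case could be handled by a pre-flight check that the scheduled stop consults before doing anything. The following is a small sketch under assumptions: the `KeepUpWindow` type and the idea of keeping a list of date ranges are hypothetical, and where such windows would be stored is left open.

```python
from datetime import date
from typing import List, Tuple

KeepUpWindow = Tuple[date, date]  # inclusive (first_day, last_day)


def scheduled_stop_allowed(today: date, keep_up: List[KeepUpWindow]) -> bool:
    """Return False if `today` falls inside any window where aws master must stay up."""
    return not any(first <= today <= last for first, last in keep_up)
```

A manual, on-demand stop would presumably bypass this check, since it only guards the automatic schedule.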
