Recovery steps documentation #177

@aravindavk

  • When a Manager node goes down: no management operations are possible. Mounted Volumes continue to work, but no new mounts are possible.

    • Temporary failure: wait until the Manager node comes back online.
    • Permanent failure (notify/update the Manager URL on all Storage nodes)
      • Set up a new node with the same or a different IP/hostname and restore the Config data from the backup, OR
      • Promote an existing Storage node to Manager and restore the Config data from the backup.
  • When a Storage node goes down

    • Temporary failure: no action is needed; everything recovers once the node is back online.
    • Permanent failure
      • Set up a new node with the same IP/hostname and run the node re-add command to add the node back to the Pool, OR
      • Set up a new node with a different IP/hostname, then run the node re-add command with the flag --new-name=NEW_HOSTNAME.
  • Create a new Token for Manager-to-Node and Node-to-Manager communication (key rotation):

      kadalu node new-token PROD/server1.example.com
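
The two permanent-failure cases for a Storage node above differ only in whether the replacement keeps the old hostname. A minimal dry-run sketch of that choice follows. It only prints the command an operator would run and never invokes kadalu. The `kadalu node re-add POOL/NODE` syntax is an assumption modeled on the `new-token` form in this issue (the re-add command is proposed here, not confirmed to exist), and `readd_cmd` and `server9.example.com` are hypothetical names used for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the proposed re-add command; it does not call kadalu.
# ASSUMPTION: "kadalu node re-add POOL/NODE" mirrors the
# "kadalu node new-token POOL/NODE" form from this issue. The command is
# a proposal in this issue, not a documented CLI.
readd_cmd() {
  old_node="$1"      # pool-qualified name of the failed node
  new_host="${2:-}"  # optional: hostname of the replacement, if different
  if [ -z "$old_node" ]; then
    echo "usage: readd_cmd POOL/NODE [NEW_HOSTNAME]" >&2
    return 1
  fi
  if [ -n "$new_host" ]; then
    # Replacement node has a different IP/hostname.
    echo "kadalu node re-add $old_node --new-name=$new_host"
  else
    # Replacement node reuses the old IP/hostname.
    echo "kadalu node re-add $old_node"
  fi
}

# Same hostname as the failed node.
readd_cmd PROD/server1.example.com
# Different hostname (hypothetical replacement host).
readd_cmd PROD/server1.example.com server9.example.com
```

Printing rather than executing keeps the sketch safe to run while the final command name and flags are still undecided.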
    

Identify the code changes required, and update the documentation once they are implemented.
