
Example Orchestration Plan

Pre-Cutover Activities

  1. Start a root shell:

    sudo -i
  2. Change to orchestration directory:

    cd /root/orchestration
  3. Check out the master branch:

    git checkout master
  4. Get latest updates:

    git pull
    ./update.sh
  5. Reset logging and tracking for orchestration:

    rm -rf logs/ && rm -rf .meta && rm -rf /tmp/mudra && mkdir /tmp/mudra
  6. Set the MUDRA_ENVIRONMENT environment variable to the name of the environment being migrated:

    export MUDRA_ENVIRONMENT=<migration environment>

Orchestration Cutover Execution

Phase 0

Pre-flight checks: databases, topics, and services

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype Kafka --preflight --maxworkers 10
./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype Database --preflight --skipnodes data-key-service-db --maxworkers 10
./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype App --preflight --maxworkers 10

Check preflight results

List nodes that failed preflight:

grep "failed preflight" logs/mudra.log
ls -la logs/failed_preflight

Tail all App node logs that failed preflight:

for app in logs/failed_preflight/App_*; do tail logs/node_logs/App/${app#logs/failed_preflight/App_}-preflight.log; done

Tail all Kafka node logs that failed preflight:

for kafka in logs/failed_preflight/Kafka_*; do tail logs/node_logs/Kafka/${kafka#logs/failed_preflight/Kafka_}-check.log; done

Tail all Database node logs that failed preflight:

for database in logs/failed_preflight/Database_*; do tail logs/node_logs/Database/${database#logs/failed_preflight/Database_}-preflight.log; done
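The three loops above can be collapsed into one summary listing, assuming the marker files in logs/failed_preflight/ follow the Type_nodename pattern used above (e.g. App_some-service):

```shell
# Print one "type<TAB>node" line per failed-preflight marker file.
for f in logs/failed_preflight/*_*; do
    [ -e "$f" ] || continue   # glob did not match: nothing failed
    name=${f##*/}             # strip directory, e.g. App_some-service
    printf '%s\t%s\n' "${name%%_*}" "${name#*_}"
done | sort
```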

Scale Down All services on GCP

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype App --action scaletargetdown --maxworkers 10 --force

Phase 1

Show the list of phase 1 services, topics, databases

./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="1" type=App --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="1" type=Database --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="1" type=Kafka --maxworkers 10

Execute Phase 1 Orchestration

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --phase 1 --maxworkers 10
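While a phase runs, progress can be followed from a second terminal by tailing the same mudra.log used in the preflight checks; the error pattern here is an assumption about the log wording and may need adjusting:

```shell
# Follow the orchestration log and surface lines that look like failures.
tail -f logs/mudra.log | grep --line-buffered -i -E 'error|failed'
```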

Validate Phase 1

Check SQL task states (from the pipenv shell):

./interface.sh --datafiles orchestration_datafiles --environment ${MUDRA_ENVIRONMENT} database status --phase 1 && column -t -s"," /tmp/mudra/rds/db_status.csv
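To surface only problem rows, the status CSV can be filtered before formatting; the assumption here is that successful rows contain the word "success", which should be checked against the state values interface.sh actually writes:

```shell
# Keep the header row plus any row that does not mention success.
awk -F',' 'NR == 1 || tolower($0) !~ /success/' /tmp/mudra/rds/db_status.csv \
    | column -t -s','
```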

Phase 2

Show the list of phase 2 services, topics, databases

./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="2" type=App --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="2" type=Database --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="2" type=Kafka --maxworkers 10

Execute Phase 2 Orchestration

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --phase 2 --maxworkers 10

Account for special cases like assets and inventory

Check service migration status (check app logs)

Re-run phase 2 service migration for failed services

Cut over phase 2 databases - this applies only to databases associated with phase 2 services; special cases (assets and inventory) may apply

This happens automatically at the end of Phase 2

Check phase 2 db migration status (check database logs)

To check the task states of the SQL migration:

./interface.sh --datafiles orchestration_datafiles --environment ${MUDRA_ENVIRONMENT} database status --phase 2 && column -t -s"," /tmp/mudra/rds/db_status.csv

Re-run db migration for failed cutover database

Cut over DNS for phase 2 services - this is expected to be a manual step on cut-over day

Phase 3

Show the list of phase 3 services, topics, databases

./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="3" type=App --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="3" type=Database --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="3" type=Kafka --maxworkers 10

Execute Phase 3 Orchestration

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --phase 3 --maxworkers 10

Swap & Cut over phase 3 databases

This happens automatically at the end of Phase 3

Check phase 3 db migration status

Cut over DNS for phase 3 services

Phase 4

Show the list of phase 4 services, topics, databases

./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="4" type=App --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="4" type=Database --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="4" type=Kafka --maxworkers 10

Execute Phase 4 Orchestration

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --phase 4 --maxworkers 10

Swap & Cut over phase 4 databases

This happens automatically at the end of Phase 4

Check phase 4 db migration status

Cut over DNS for phase 4 services

Phase 5

Show the list of phase 5 services, topics, databases, buckets

./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="5" type=App --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="5" type=Database --maxworkers 10
./orchestrate.sh --datafiles orchestration_datafiles --gettree phase="5" type=Kafka --maxworkers 10

Execute Phase 5 Orchestration

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --phase 5 --maxworkers 10

Swap & Cut over phase 5 databases

This happens automatically at the end of Phase 5

Check db migration status

Final DNS cutover

Clean up databases (cutover)

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype Database --action cleanup --maxworkers 10 --force

Rollback Activities

Scale down all services on GCP

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype App --action scaletargetdown --maxworkers 10 --force

Scale AWS back to original

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype App --action rollbacksource --maxworkers 10 --force

Un-swap databases

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype Database --action unswap --maxworkers 10 --force

Clean up databases (rollback)

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype Database --action cleanup --maxworkers 10 --force

Scale all services on GCP to targets

./orchestrate.sh --environment ${MUDRA_ENVIRONMENT} --datafiles orchestration_datafiles --nodetype App --action scaletarget --maxworkers 10 --force