Name	Name	Last commit message	Last commit date
parent directory ..
aws-hypervisor	aws-hypervisor
openshift-clusters	openshift-clusters
Makefile	Makefile
README.md	README.md

Deploy

This directory contains deployment tools and scripts for setting up EC2 instances and OpenShift clusters for development purposes.

Prerequisites

AWS CLI

If you are using the EC2 hypervisor option you will need to have the AWS CLI configured and the AWS_PROFILE environment variable set.

For getting and configuring the CLI: https://docs.aws.amazon.com/cli/

You can check if you have the AWS CLI properly configured by running:

$ aws configure list
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile            openshift-dev              env    ['AWS_PROFILE', 'AWS_DEFAULT_PROFILE']
access_key     ****************4SU3 shared-credentials-file    
secret_key     ****************z0DF shared-credentials-file    
    region                us-east-2      config-file    ~/.aws/config

Dependencies

The following programs must be present in your local environment:

make
aws
jq
rsync
golang
ansible

Extra dependencies

For automatic Redfish Pacemaker configuration on 4.19, you also need:

Python3 kubernetes library (https://pypi.org/project/kubernetes/)

Additionally, if you're using Mac OS, you might not have timeout, so you might also need to install coreutils, for example via brew: brew install coreutils

Configuration

Before deployment, configure your environment by setting up the aws-hypervisor/instance.env file. Copy aws-hypervisor/instance.env.template to aws-hypervisor/instance.env and set all variables to valid values for your user.

Available Commands

To see all available commands, run:

$ make help

Quick Start

# Create, initialize, and update inventory for a new EC2 instance
$ make deploy

This will create the instance, initialize it, and update the inventory in one command, placing you in a login shell for the EC2 instance.

Recommended Instance Reuse Workflow

For quickly reusing an existing instance with a fresh cluster deployment:

# Force stop the instance (bypasses cluster checks)
$ make force-stop

# Start the instance
$ make start  

# Deploy a fresh cluster
$ make redeploy-cluster

This sequence is the fastest way to reset and reuse an instance when cluster preservation is not needed.

Basic Instance Operations

# Create new EC2 instance
$ make create

# Initialize deployed instance  
$ make init

# Update inventory.ini with current instance IP
$ make inventory

# SSH into the EC2 instance
$ make ssh

# Get instance info
$ make info

# Start a stopped instance
$ make start

# Stop a running instance (with cluster management options)
$ make stop

# Force stop instance immediately (no cluster checks)
$ make force-stop

# Completely destroy the instance
$ make destroy

OpenShift Cluster Management

When running OpenShift clusters on the instance (using dev-scripts), you have several options for managing cluster lifecycle:

Quick deployment commands:

make fencing-ipi, make fencing-agent, make arbiter-ipi, make arbiter-agent, make fencing-kcli, make arbiter-kcli provide non-interactive deployment for specific topologies
These commands automatically call the underlying setup.yml playbook with the appropriate configuration
Useful for automation and when you know exactly which topology you want to deploy

Option 1: Force Stop and Redeploy (Recommended for instance reuse)

# Force stop the instance immediately (no prompts, cluster will be lost)
$ make force-stop

# Start the instance
$ make start

# Redeploy the cluster from scratch
$ make redeploy-cluster

This is the recommended workflow for quickly reusing an instance when you don't need to preserve cluster state.

Alternative forcible stop method:

# Using interactive stop with forcible option
$ make stop
# Choose option 1 when prompted for forcible stop

Option 2: Redeploy Cluster (Clean and Rebuild)

# Redeploy the cluster (clean existing and rebuild)
$ make redeploy-cluster

This option:

Automatically cleans up the existing cluster
Supports interactive mode selection (arbiter or fencing)
Preserves the original installation method (IPI or Agent) from the previous deployment
Intelligently detects cluster topology changes
For same topology: Uses make redeploy (fast, preserves cached data)
For topology changes: Uses make realclean + full installation (slower but clean)
Integrates with Ansible playbooks for orchestration

Installation method preservation: The redeploy command reads the installation method from the cluster state file (aws-hypervisor/instance-data/cluster-vm-state.json) and uses the same method for redeployment. If you originally deployed with make deploy fencing-agent, redeploy will use make agent. If you deployed with make deploy fencing-ipi, it will use make all (IPI).

To manually override the installation method during redeploy, you can edit the state file before running redeploy:

# Check current method
jq '.installation_method' deploy/aws-hypervisor/instance-data/cluster-vm-state.json

# Change to AGENT (if needed)
jq '.installation_method = "AGENT"' deploy/aws-hypervisor/instance-data/cluster-vm-state.json > /tmp/state.json && \
mv /tmp/state.json deploy/aws-hypervisor/instance-data/cluster-vm-state.json

When to use redeploy:

When you want to refresh the cluster with the latest changes
When the cluster is in an inconsistent state
For testing deployment changes
When switching between cluster modes (arbiter ↔ fencing)

Option 3: Delete Cluster and Clean Server

# Delete the cluster and clean the server
$ make clean

# Stop the instance
$ make stop

# When restarted, you'll need to redeploy the cluster from scratch
$ make start
# Quick deployment over clean server
$ make fencing-ipi    # Deploy fencing topology (IPI method)
$ make fencing-agent  # Deploy fencing topology (Agent method) (WIP Experimental)
$ make fencing-kcli   # Deploy fencing topology (kcli method)
$ make arbiter-ipi    # Deploy arbiter topology (IPI method)
$ make arbiter-agent  # Deploy arbiter topology (Agent method)
$ make arbiter-kcli   # Deploy arbiter topology (kcli method)

Option 4: Graceful Cluster Shutdown/Startup (Not recommended due to speed and consistency)

# Gracefully shutdown the cluster VMs before stopping the instance
$ make shutdown-cluster

# Stop the instance (cluster VMs are preserved in shutdown state)
$ make stop

# Start the instance again
$ make start

# Start up the cluster VMs and proxy container
$ make startup-cluster

Cluster Management Details

Important: "Clean" operations delete the cluster completely. All cluster data, configurations, and workloads will be permanently lost.

Clean Options

# Standard clean: Remove cluster while preserving cached data for faster redeployment
$ make clean

# Full clean: Complete cleanup including all cached data (slower but thorough)
$ make full-clean

When to Use:

make clean: Standard cluster cleanup while preserving cached data for faster subsequent deployments
make full-clean: Complete cleanup when you want to start completely fresh or troubleshoot deployment issues

When to Use Each Method:

Redeploy: For changing configurations, updating cluster deployment, switching cluster modes
Delete and Clean: For planned maintenance, manual control over cleanup
Force Stop: For instance reuse with cluster reinstallation, when cluster is corrupted, or when cluster preservation is not needed
Shutdown/Startup: For cluster state preservation

Interactive Stop Script

When running make stop on an instance with a running OpenShift cluster, you'll be presented with options:

Shutdown the cluster VMs: Runs make shutdown-cluster first (recommended)
Delete cluster and clean server: Runs Ansible cleanup playbook
Continue with forcible stop: Stops instance immediately (cluster lost)

The script automatically detects:

OpenShift dev-scripts installations
Running cluster VMs
Cluster state and provides appropriate options

Instance Recovery Options

After restarting an instance with make start, you'll see guidance for:

If you previously shutdown your cluster:

# Start up the existing cluster
$ make startup-cluster

If you need to create or redeploy a cluster:

# Option 1: Automated redeploy with mode selection
$ make redeploy-cluster

# Option 2: Manual clean and setup approach
$ make clean
$ make arbiter-ipi #(for example)

Cluster Utilities

# Build resource-agents RPM and patch all cluster nodes
$ make patch-nodes

# Collect cluster logs from all VMs
$ make get-tnf-logs

Troubleshooting Cluster Management

If cluster startup fails:

# Check cluster status manually
$ make ssh
$ cd ~/openshift-metal3/dev-scripts
$ oc --kubeconfig=ocp/<cluster-name>/auth/kubeconfig get nodes

# If cluster is unrecoverable, clean and redeploy
$ make redeploy-cluster

If VMs don't start properly:

# Check VM states manually
$ make ssh
$ sudo virsh list --all
$ sudo virsh domstate <vm-name>

# Manually start VMs if needed
$ sudo virsh start <vm-name>

If proxy container issues occur:

# Check proxy container status
$ make ssh
$ podman ps --filter name=external-squid

# Restart proxy if needed
$ podman restart external-squid

Advanced Features

Cluster State Tracking:

State saved in aws-hypervisor/instance-data/cluster-vm-state.json
Tracks deployment topology (arbiter/fencing) and installation method (IPI/Agent)
Preserves installation method across redeploys
Detects configuration changes for intelligent cleanup

VM Infrastructure Management:

Automatic detection of VM configuration changes
Safe cleanup when switching between cluster types
Preservation of VM infrastructure when possible

Proxy Container Management:

Automatic proxy container lifecycle management
Integration with cluster startup/shutdown workflows
Status checking and recovery capabilities

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Deploy

Prerequisites

AWS CLI

Dependencies

Extra dependencies

Configuration

Available Commands

Quick Start

Recommended Instance Reuse Workflow

Basic Instance Operations

OpenShift Cluster Management

Option 1: Force Stop and Redeploy (Recommended for instance reuse)

Option 2: Redeploy Cluster (Clean and Rebuild)

Option 3: Delete Cluster and Clean Server

Option 4: Graceful Cluster Shutdown/Startup (Not recommended due to speed and consistency)

Cluster Management Details

Clean Options

Interactive Stop Script

Instance Recovery Options

Cluster Utilities

Troubleshooting Cluster Management

Advanced Features

FilesExpand file tree

deploy

Directory actions

More options

Directory actions

More options

Latest commit

History

deploy

Folders and files

parent directory

README.md

Deploy

Prerequisites

AWS CLI

Dependencies

Extra dependencies

Configuration

Available Commands

Quick Start

Recommended Instance Reuse Workflow

Basic Instance Operations

OpenShift Cluster Management

Option 1: Force Stop and Redeploy (Recommended for instance reuse)

Option 2: Redeploy Cluster (Clean and Rebuild)

Option 3: Delete Cluster and Clean Server

Option 4: Graceful Cluster Shutdown/Startup (Not recommended due to speed and consistency)

Cluster Management Details

Clean Options

Interactive Stop Script

Instance Recovery Options

Cluster Utilities

Troubleshooting Cluster Management

Advanced Features