From 754845c5501d63ad8cf382164ee0408ccbfefaf9 Mon Sep 17 00:00:00 2001
From: Alex Krzos
Date: Tue, 13 Jan 2026 08:56:49 -0500
Subject: [PATCH 1/4] Add CLAUDE.md documentation for Claude Code integration
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Provides comprehensive development guidance for Claude Code when working with the Jetlag codebase, including:
- Essential commands for environment setup and cluster deployment
- Clear workflows for Red Hat Labs (Scale/Performance) and IBMcloud environments
- Coverage of all cluster types: MNO, SNO, VMNO, and hybrid deployments
- Architecture overview and critical configuration variables
- Environment-specific considerations and deployment patterns

🤖 Generated with [Claude Code](https://claude.ai/code)

Assisted-by: Claude (claude-sonnet-4@20250514)
---
 CLAUDE.md | 169 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 169 insertions(+)
 create mode 100644 CLAUDE.md

diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 00000000..99fd6c09
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,169 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Overview
+
+Jetlag is an OpenShift cluster deployment tool that uses Ansible automation to deploy Multi Node OpenShift (MNO) and Single Node OpenShift (SNO) clusters via the Assisted Installer. It supports Red Hat performance labs, Scale Labs, and IBMcloud environments.
+
+## Essential Commands
+
+### Environment Setup
+```bash
+# Bootstrap ansible virtual environment (run from repo root)
+source bootstrap.sh
+
+# Red Hat Labs (Scale Lab/Performance Lab)
+# Copy and edit configuration file
+cp ansible/vars/all.sample.yml ansible/vars/all.yml
+# Edit all.yml with your lab configuration (lab, lab_cloud, cluster_type, etc.)
+
+# Create inventory file for your lab environment
+ansible-playbook ansible/create-inventory.yml
+# Setup bastion host (replace cloud99.local with your inventory file)
+ansible-playbook -i ansible/inventory/cloud99.local ansible/setup-bastion.yml
+
+# IBMcloud
+# Copy and edit configuration file
+cp ansible/vars/ibmcloud.sample.yml ansible/vars/ibmcloud.yml
+# Edit ibmcloud.yml with your IBMcloud configuration (cluster_type, worker_node_count, etc.)
+
+# Create inventory file from IBMcloud CLI data
+ansible-playbook ansible/ibmcloud-create-inventory.yml
+# Setup bastion host for IBMcloud
+ansible-playbook -i ansible/inventory/ibmcloud.local ansible/ibmcloud-setup-bastion.yml
+```
+
+### Cluster Deployment
+```bash
+# Red Hat Labs (Scale Lab/Performance Lab)
+# Deploy Multi Node OpenShift cluster
+ansible-playbook -i ansible/inventory/cloud99.local ansible/mno-deploy.yml
+
+# Deploy Single Node OpenShift clusters
+ansible-playbook -i ansible/inventory/cloud99.local ansible/sno-deploy.yml
+
+# Deploy Virtual Multi Node OpenShift (VMNO) - requires hypervisor setup first
+ansible-playbook -i ansible/inventory/cloud99.local ansible/hv-setup.yml
+ansible-playbook -i ansible/inventory/cloud99.local ansible/hv-vm-create.yml
+ansible-playbook -i ansible/inventory/cloud99.local ansible/mno-deploy.yml
+
+# IBMcloud
+# Deploy Multi Node OpenShift on IBMcloud
+ansible-playbook -i ansible/inventory/ibmcloud.local ansible/ibmcloud-mno-deploy.yml
+
+# Deploy Single Node OpenShift on IBMcloud
+ansible-playbook -i ansible/inventory/ibmcloud.local ansible/ibmcloud-sno-deploy.yml
+```
+
+### Cluster Management
+```bash
+# Scale out MNO cluster
+ansible-playbook ansible/ocp-scale-out.yml
+
+# Setup hypervisor nodes for VMs
+ansible-playbook ansible/hv-setup.yml
+
+# Create VMs on hypervisor nodes
+ansible-playbook ansible/hv-vm-create.yml
+
+# Delete VMs from hypervisor nodes
+ansible-playbook ansible/hv-vm-delete.yml
+
+# Replace VMs on hypervisor nodes (delete + recreate)
+ansible-playbook ansible/hv-vm-replace.yml
+
+# Sync OpenShift releases
+ansible-playbook ansible/sync-ocp-release.yml
+```
+
+## Project Architecture
+
+### Key Configuration Files
+- `ansible/vars/all.yml` - Main configuration for Red Hat labs (copy from `ansible/vars/all.sample.yml`)
+- `ansible/vars/ibmcloud.yml` - IBMcloud-specific configuration (copy from `ansible/vars/ibmcloud.sample.yml`)
+- `pull-secret.txt` - OpenShift pull secret (place in repo root)
+- `ansible/inventory/$CLOUDNAME.local` - Generated inventory file for your lab
+
+### Critical Variables
+- `lab`: Environment type (`performancelab`, `scalelab`, or `ibmcloud`)
+- `lab_cloud`: Specific cloud allocation (e.g., `cloud42`)
+- `cluster_type`: Either `mno`, `sno`, or `vmno`
+- `worker_node_count`: Number of bare metal worker nodes for MNO clusters
+- `hybrid_worker_count`: Number of virtual worker nodes for hybrid MNO clusters (requires hypervisor setup)
+- `ocp_build`: OpenShift build type (`ga`, `dev`, or `ci`)
+- `ocp_version`: OpenShift version (e.g., `latest-4.20`)
+
+### Ansible Role Structure
+Jetlag uses a modular Ansible role architecture:
+
+- **Bastion roles**: `bastion-*` roles configure the bastion host with services like Assisted Installer, DNS, registry
+- **Installation roles**: `install-cluster`, `sno-post-cluster-install` handle cluster deployment
+- **Hypervisor roles**: `hv-*` roles manage VM infrastructure on hypervisor nodes
+- **Utility roles**: `boot-iso`, `sync-*` roles provide supporting functionality
+
+### Cluster Types
+- **MNO (Multi Node OpenShift)**: 3 control-plane nodes + configurable bare metal worker nodes
+- **SNO (Single Node OpenShift)**: Single node clusters, one per available machine
+- **VMNO (Virtual Multi Node OpenShift)**: MNO cluster using VMs instead of bare metal (Jetlag-specific term)
+- **Hybrid MNO**: MNO cluster with both bare metal and virtual worker nodes
+- **Hypervisor nodes**: Unused machines become VM hosts for additional clusters or hybrid workers
+
+#### Virtual and Hybrid Cluster Details
+- **VMNO clusters** allow multi-node deployment with fewer physical machines (minimum: 1 bastion + 1-2 hypervisors)
+- **Hybrid clusters** combine bare metal workers (`worker_node_count`) with virtual workers (`hybrid_worker_count`)
+- Virtual workers are created as VMs on hypervisor nodes and added to the worker inventory section
+- VM placement distributed across available hypervisors based on hardware-specific VM count configurations
+
+### Lab Environment Support
+- **Performance Lab**: Dell r750, 740xd hardware
+- **Scale Lab**: Various Dell models (r750, r660, r650, r640, r630, fc640), Supermicro systems
+- **IBMcloud**: Supermicro E5-2620, Lenovo SR630 bare metal
+
+## Development Workflow
+
+### Standard MNO/SNO Deployment (Red Hat Labs)
+1. Edit `ansible/vars/all.yml` with your lab configuration
+2. Run `ansible-playbook ansible/create-inventory.yml` to generate inventory
+3. Run `ansible-playbook -i ansible/inventory/cloud99.local ansible/setup-bastion.yml` to configure bastion host
+4. Run deployment playbook (`ansible/mno-deploy.yml` or `ansible/sno-deploy.yml`)
+5. Access clusters using kubeconfig files in `/root/mno/` or `/root/sno/`
+
+### IBMcloud MNO/SNO Deployment
+1. Edit `ansible/vars/ibmcloud.yml` with your IBMcloud configuration
+2. Run `ansible-playbook ansible/ibmcloud-create-inventory.yml` to generate `ansible/inventory/ibmcloud.local` from IBMcloud CLI data
+3. Run `ansible-playbook -i ansible/inventory/ibmcloud.local ansible/ibmcloud-setup-bastion.yml` to configure bastion host
+4. Run deployment playbook (`ansible-playbook -i ansible/inventory/ibmcloud.local ansible/ibmcloud-mno-deploy.yml` or `ansible/ibmcloud-sno-deploy.yml`)
+5. Access clusters using kubeconfig files in `/root/mno/` or `/root/sno/`
+
+### VMNO Deployment (Red Hat Labs Only)
+1. Edit `ansible/vars/all.yml` with `cluster_type: vmno` and VM-specific settings
+2. Edit `ansible/vars/hv.yml` for hypervisor configuration
+3. Run `ansible-playbook ansible/create-inventory.yml` to generate inventory with VM entries
+4. Run `ansible-playbook -i ansible/inventory/cloud99.local ansible/setup-bastion.yml` to configure bastion host
+5. Run `ansible-playbook -i ansible/inventory/cloud99.local ansible/hv-setup.yml` to configure hypervisor nodes
+6. Run `ansible-playbook -i ansible/inventory/cloud99.local ansible/hv-vm-create.yml` to create VMs
+7. Run `ansible-playbook -i ansible/inventory/cloud99.local ansible/mno-deploy.yml` to deploy cluster to VMs
+8. Access cluster using kubeconfig in `/root/vmno/`
+
+### Hybrid Cluster Deployment (Red Hat Labs Only)
+1. Configure both `worker_node_count` (bare metal) and `hybrid_worker_count` (VMs) in `ansible/vars/all.yml`
+2. Ensure hypervisor nodes are available in allocation
+3. Follow standard Red Hat Labs MNO workflow - hybrid workers automatically added to inventory
+
+## Special Considerations
+
+- Inventory files are generated, not manually created (except for "Bring Your Own Lab" scenarios)
+- Bastion machine is always the first machine in allocation and hosts Assisted Installer
+- Unused machines in MNO deployments become hypervisor nodes
+- SNO deployments create one cluster per available machine after bastion
+- Public VLAN support available for routable environments (`public_vlan: true`)
+- Disconnected/air-gapped deployments supported with registry mirroring
+
+### Virtual and Hybrid Cluster Considerations
+- **Hardware Requirements**: VMNO requires additional CPU/memory capacity for VM overhead
+- **VM Management**: Use `hv-vm-delete.yml` or `hv-vm-replace.yml` between VMNO deployments to avoid conflicts
+- **Resource Planning**: Configure `hw_vm_counts` per hardware type to optimize VM distribution across hypervisors
+- **Disk Configuration**: VMs can span multiple disks on hypervisors (e.g., default disk + nvme for higher VM counts)
+- **Network Configuration**: VMs use libvirt networking with static IP assignment from controlplane network range
+- **Scale Lab/Performance Lab Only**: VMNO and hybrid deployments only supported in Scale Lab and Performance Lab environments
\ No newline at end of file

From 5e1fa4ce2cab6dd69bcd51012db6382d21f8c5af Mon Sep 17 00:00:00 2001
From: Alex Krzos
Date: Tue, 13 Jan 2026 13:56:45 -0500
Subject: [PATCH 2/4] Fix cluster types section - move hypervisor nodes to implementation details

Hypervisor nodes are not a cluster type but rather a node type used to host VMs for virtual and hybrid clusters. Moved this information to the Virtual and Hybrid Cluster Details section where it belongs.

Assisted-by: Claude (claude-sonnet-4@20250514)
---
 CLAUDE.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 99fd6c09..b5fbd4fe 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -107,11 +107,11 @@ Jetlag uses a modular Ansible role architecture:
 - **SNO (Single Node OpenShift)**: Single node clusters, one per available machine
 - **VMNO (Virtual Multi Node OpenShift)**: MNO cluster using VMs instead of bare metal (Jetlag-specific term)
 - **Hybrid MNO**: MNO cluster with both bare metal and virtual worker nodes
-- **Hypervisor nodes**: Unused machines become VM hosts for additional clusters or hybrid workers
 
 #### Virtual and Hybrid Cluster Details
 - **VMNO clusters** allow multi-node deployment with fewer physical machines (minimum: 1 bastion + 1-2 hypervisors)
 - **Hybrid clusters** combine bare metal workers (`worker_node_count`) with virtual workers (`hybrid_worker_count`)
+- **Hypervisor nodes**: Unused machines become VM hosts for additional clusters or hybrid workers
 - Virtual workers are created as VMs on hypervisor nodes and added to the worker inventory section
 - VM placement distributed across available hypervisors based on hardware-specific VM count configurations
 

From 1678b4bf6fca57f71d75d9ebce701781a408ab7b Mon Sep 17 00:00:00 2001
From: Alex Krzos
Date: Tue, 13 Jan 2026 14:12:41 -0500
Subject: [PATCH 3/4] Add comprehensive troubleshooting and tips section

Adds new section referencing existing documentation resources:
- Links to docs/troubleshooting.md for common deployment issues
- References docs/tips-and-vars.md for advanced configuration
- Provides quick reference for most common issues (focuses on DNS, not DHCP)
- Guides Claude Code instances to consult appropriate documentation

This will help future Claude instances quickly locate and use the extensive existing troubleshooting knowledge in the codebase.

Assisted-by: Claude (claude-sonnet-4@20250514)
---
 CLAUDE.md | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index b5fbd4fe..554d7a44 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -166,4 +166,36 @@ Jetlag uses a modular Ansible role architecture:
 - **Resource Planning**: Configure `hw_vm_counts` per hardware type to optimize VM distribution across hypervisors
 - **Disk Configuration**: VMs can span multiple disks on hypervisors (e.g., default disk + nvme for higher VM counts)
 - **Network Configuration**: VMs use libvirt networking with static IP assignment from controlplane network range
-- **Scale Lab/Performance Lab Only**: VMNO and hybrid deployments only supported in Scale Lab and Performance Lab environments
\ No newline at end of file
+- **Scale Lab/Performance Lab Only**: VMNO and hybrid deployments only supported in Scale Lab and Performance Lab environments
+
+## Troubleshooting and Tips
+
+When encountering issues with Jetlag deployments, consult these comprehensive documentation resources:
+
+### Primary Troubleshooting Resources
+- **[docs/troubleshooting.md](docs/troubleshooting.md)**: Comprehensive troubleshooting guide covering:
+  - Common deployment issues and solutions
+  - Hardware-specific problems (Dell, Supermicro)
+  - Bastion configuration and recovery procedures
+  - BMC/iDRAC reset procedures
+  - Virtual media and discovery issues
+
+- **[docs/tips-and-vars.md](docs/tips-and-vars.md)**: Advanced configuration guidance including:
+  - Network interface configuration and overrides
+  - Install disk configuration options
+  - OCP version management
+  - NVMe disk configuration for install and etcd
+  - Post-deployment tasks and optimizations
+  - Bastion registry management
+
+### Common Issues to Check First
+1. **Network Configuration**: Verify `bastion_lab_interface` and `bastion_controlplane_interface` match your hardware
+2. **BMC Access**: Ensure BMC credentials and network connectivity are correct
+3. **DNS Services**: Check bastion DNS services are running and configured correctly
+4. **Disk Selection**: Verify install disk paths and available storage
+5. **Resource Limits**: Ensure sufficient CPU/memory for VM deployments (VMNO/hybrid)
+
+### When to Consult Documentation
+- Before troubleshooting deployment failures, read the relevant sections in `troubleshooting.md`
+- For advanced configuration needs, reference the specific sections in `tips-and-vars.md`
+- When working with specific hardware vendors, check the hardware-specific troubleshooting sections
\ No newline at end of file

From 03884689395fe1a1c4893737b1accf1b888d604f Mon Sep 17 00:00:00 2001
From: Alex Krzos
Date: Tue, 13 Jan 2026 14:22:48 -0500
Subject: [PATCH 4/4] Add Claude Code command for Jetlag PR reviews

Adds .claude/commands/jetlag-review.md providing a comprehensive framework for reviewing Jetlag pull requests with:
- GitHub CLI integration for fetching and checking out PRs
- Comprehensive Ansible-specific review criteria
- Focus on idempotency, module selection, and best practices
- YAML formatting and structure validation
- Security and performance considerations

This will standardize PR review quality and ensure consistent evaluation of changes to the Jetlag codebase.
Assisted-by: Claude (claude-sonnet-4@20250514)
---
 .claude/commands/jetlag-review.md | 42 +++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100644 .claude/commands/jetlag-review.md

diff --git a/.claude/commands/jetlag-review.md b/.claude/commands/jetlag-review.md
new file mode 100644
index 00000000..d4783c49
--- /dev/null
+++ b/.claude/commands/jetlag-review.md
@@ -0,0 +1,42 @@
+---
+description: Fetch and review a GitHub PR
+---
+
+You are tasked with reviewing a GitHub Pull Request. Follow these steps:
+
+1. **Fetch PR details**: Use `gh pr view $ARGUMENTS` to get PR information (title, description, author, files changed)
+
+2. **Checkout the PR**: Use `gh pr checkout $ARGUMENTS` to check out the PR branch locally
+
+3. **Analyze the changes**:
+   - Use `gh pr diff $ARGUMENTS` to see the full diff
+   - Read the modified files to understand the context
+   - Pay attention to the PR description and any linked issues
+
+4. **Provide a comprehensive review** covering:
+   - **Summary**: Brief overview of what the PR does
+   - **Code Quality**: Architecture, patterns, readability, maintainability
+   - **Potential Issues**: Bugs, edge cases, security concerns, performance issues
+   - **Testing**: Are tests adequate? Are there missing test cases?
+   - **Documentation**: Is documentation updated if needed?
+   - **Ansible-Specific Checks**:
+     - **Idempotency**: Tasks should be idempotent (can be run multiple times safely)
+     - **Module Selection**: Use of appropriate Ansible modules (avoid shell/command when native modules exist)
+     - **Variable Naming**: Follow consistent naming conventions, proper scoping (group_vars, host_vars, role defaults)
+     - **Task Naming**: All tasks have clear, descriptive names
+     - **YAML Formatting**: Proper YAML syntax, consistent indentation, use of multi-line strings where appropriate
+     - **Handlers**: Proper use of handlers for service restarts and notify/listen patterns
+     - **Jinja2 Templates**: Correct usage of filters, tests, and variable references
+     - **Error Handling**: Use of failed_when, changed_when, ignore_errors appropriately
+     - **Tags**: Meaningful tags for task organization and selective execution
+     - **Secrets Management**: No plain-text passwords, proper use of ansible-vault if applicable
+     - **Conditionals**: Proper use of when clauses, check for undefined variables
+     - **Loops**: Efficient use of loop, with_items, etc.
+     - **Role Structure**: Follows standard role directory structure if roles are modified
+     - **Deprecations**: No use of deprecated Ansible features or modules
+     - **Performance**: Consideration of serial, async, poll for long-running tasks
+   - **Suggestions**: Specific, actionable improvements with code examples where helpful
+
+5. **Format your review** in a clear, structured markdown format that's easy to read
+
+Be thorough but constructive. Focus on meaningful feedback that helps improve the code.
\ No newline at end of file
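
Note on the Red Hat Labs workflows documented in CLAUDE.md (patch 1/4): the MNO, SNO, and VMNO step lists differ only in whether the hypervisor playbooks run and which deploy playbook comes last. A minimal sketch of that ordering follows, assuming a hypothetical helper named `jetlag_plan` that is not part of Jetlag. It only prints the commands in order and does not run them, and it does not cover the manual edits to `ansible/vars/all.yml` or `ansible/vars/hv.yml`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Jetlag): print, in order, the playbook
# commands that CLAUDE.md lists for a Red Hat Labs deployment.
# Usage: jetlag_plan <cloud-name> <mno|sno|vmno>
jetlag_plan() {
  local cloud="$1"
  local cluster_type="$2"
  local inv="ansible/inventory/${cloud}.local"

  case "$cluster_type" in
    mno|sno|vmno) ;;
    *) echo "unsupported cluster_type: ${cluster_type}" >&2; return 1 ;;
  esac

  # Inventory generation runs without -i; later steps use the generated file.
  echo "ansible-playbook ansible/create-inventory.yml"
  echo "ansible-playbook -i ${inv} ansible/setup-bastion.yml"

  if [ "$cluster_type" = "vmno" ]; then
    # VMNO needs hypervisors configured and VMs created before deploying.
    echo "ansible-playbook -i ${inv} ansible/hv-setup.yml"
    echo "ansible-playbook -i ${inv} ansible/hv-vm-create.yml"
  fi

  if [ "$cluster_type" = "sno" ]; then
    echo "ansible-playbook -i ${inv} ansible/sno-deploy.yml"
  else
    # MNO and VMNO both deploy with mno-deploy.yml.
    echo "ansible-playbook -i ${inv} ansible/mno-deploy.yml"
  fi
}
```

For example, `jetlag_plan cloud99 vmno` prints the five playbook commands from the VMNO workflow (steps 3 through 7), which can be reviewed before running them by hand.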