diff --git a/plugins/openshift/README.md b/plugins/openshift/README.md index b97a168..8570920 100644 --- a/plugins/openshift/README.md +++ b/plugins/openshift/README.md @@ -4,6 +4,18 @@ OpenShift development utilities and workflow helpers for Claude Code. ## Commands +### `/openshift:install-vsphere` + +Install OpenShift on vSphere with an automated workflow designed for IBM Cloud Classic infrastructure. + +Features: +- Auto-installs and configures `govc` for vSphere discovery +- Interactive dropdown selection for datacenters, clusters, datastores, and networks +- Automated VIP selection using Route53 queries and ping verification +- Auto-creates Route53 DNS records (api, api-int, *.apps) +- Generates install-config.yaml from gathered information +- Guides through complete installation process + ### `/openshift:new-e2e-test` Write and validate new OpenShift E2E tests using the Ginkgo framework. diff --git a/plugins/openshift/commands/install-vsphere.md b/plugins/openshift/commands/install-vsphere.md new file mode 100644 index 0000000..99dadf9 --- /dev/null +++ b/plugins/openshift/commands/install-vsphere.md @@ -0,0 +1,730 @@ +--- +description: Install OpenShift on vSphere with automated workflow +argument-hint: [openshift-version] +--- + +## Name +openshift:install-vsphere + +## Synopsis +``` +/openshift:install-vsphere [openshift-version] +``` + +## Description +The `openshift:install-vsphere` command automates the complete workflow for installing OpenShift on VMware vSphere using the IPI (Installer-Provisioned Infrastructure) method. This command is specifically designed for IBM Cloud Classic vSphere environments with pre-configured VLANs and Route53 DNS management. + +This command is designed to streamline the installation process by: +- Interactively gathering all required vSphere connection details and credentials +- Guiding through the IBM Cloud Classic network configuration workflow +- Assisting with VIP selection and Route53 DNS record creation +- Validating vSphere prerequisites (permissions, resources, network configuration) +- Generating a customized install-config.yaml file +- Downloading the appropriate openshift-install binary +- Executing the installation and monitoring progress +- Providing troubleshooting guidance if issues occur + +The installation uses the IPI method, where the OpenShift installer automatically provisions the virtual machines and configures the infrastructure on vSphere. + +**Environment-Specific Details:** +- Works with IBM Cloud Classic infrastructure with vSphere +- Supports mixed vCenter versions (7.x and 8.x) +- Pre-configured VLANs and subnets associated with vSphere port groups +- Requires manual Route53 DNS A record creation for api, api-int, and *.apps wildcard +- VIP selection via Route53 lookup and ping verification + +## Implementation + +### Phase 1: Prerequisites Check + +1. **Check for required tools**: + + a. **jq**: Required for JSON parsing + ```bash + if ! which jq &>/dev/null; then + echo "jq not found. Please install jq:" + echo " macOS: brew install jq" + echo " Linux: sudo apt-get install jq (or yum install jq)" + exit 1 + fi + ``` + + b. **go**: Required for building vsphere-helper + ```bash + if ! which go &>/dev/null; then + echo "Go not found. Please install Go 1.23+:" + echo " https://golang.org/doc/install" + exit 1 + fi + ``` + + c. **openshift-install**: Auto-download based on selected version + - Will be downloaded in Phase 1, step 2 after version is determined + + d. 
**oc/kubectl**: Optional but recommended + - Check if present, inform user they can be installed later + + **Note on govc:** NOT required - we use `vsphere-helper` (Go binary with govmomi) instead. + - govc is only used as fallback if vsphere-helper fails + - Will be auto-installed if needed via `plugins/openshift/scripts/install-govc.sh` + +2. **Determine OpenShift version**: + - If `$1` (openshift-version) is provided, use that version + - If not provided, fetch available versions from the mirror and ask the user: + ```bash + # Fetch available "latest-*" versions from the mirror + AVAILABLE_VERSIONS=$(curl -sL "https://mirror.openshift.com/pub/openshift-v4/clients/ocp/" | \ + grep -oE 'href="latest-[^"]*/"' | \ + sed 's/href="latest-//g' | \ + sed 's/\/"//g' | \ + sort -V -r | \ + head -5) + + echo "Available OpenShift versions:" + echo "$AVAILABLE_VERSIONS" + ``` + - Present the versions (e.g., "4.20", "4.19", "4.18", "4.17") to the user using AskUserQuestion + - Once version is selected (e.g., "4.20"), download the installer: + ```bash + # Use the reusable installer download script + VERSION="{selected-version}" # e.g., "4.20" + bash plugins/openshift/scripts/download-openshift-installer.sh "$VERSION" . + ``` + - Script location: `plugins/openshift/scripts/download-openshift-installer.sh` + - Auto-detects OS (macOS/Linux) and architecture + - Supports version formats: `4.20`, `latest-4.20`, `stable-4.20` + - Downloads and extracts to current directory + - Makes binary executable automatically + +### Phase 2: Gather vSphere Configuration + +**ALWAYS USE:** `vsphere-discovery` skill for vSphere infrastructure discovery. + +**Why vsphere-helper (NOT govc):** +- ✅ Correct vSphere inventory paths (native govmomi library, no text parsing) +- ✅ Structured JSON output for easy parsing +- ✅ 5x faster performance (persistent session vs multiple govc calls) +- ✅ Detailed error messages +- ✅ Built-in retry logic and connection handling + +**Fallback:** Only use govc if vsphere-helper build fails (auto-install via `plugins/openshift/scripts/install-govc.sh`) + +**Skill location:** `plugins/openshift/skills/vsphere-discovery/` + +**Implementation:** See SKILL.md for detailed usage instructions. + +**CRITICAL SECURITY REQUIREMENT:** + +⚠️ **CREDENTIAL HANDLING POLICY** ⚠️ + +When implementing this command, you MUST follow these security rules: + +1. **NEVER display passwords, tokens, or credentials in:** + - Command output + - Bash commands + - Log files + - Error messages + - Any terminal output + +2. **Credential Collection Process:** + - Prompt user interactively for: vCenter server, username, password + - Store values in variables (never echo/display them) + - Create environment file: `.work/.vcenter-env` + - Set file permissions: `chmod 600 .work/.vcenter-env` + - Source the file for all subsequent commands + +3. **Environment File Format:** + ```bash + # .work/.vcenter-env (chmod 600) + export GOVC_URL="https://${VCENTER_SERVER}/sdk" + export GOVC_USERNAME="${VCENTER_USERNAME}" + export GOVC_PASSWORD="${VCENTER_PASSWORD}" + export GOVC_INSECURE=true + ``` + +4. **Using Credentials:** + ```bash + # Source environment file before using govc + source .work/.vcenter-env + + # Commands reference env vars (never display values) + govc about + govc ls / + ``` + +5. **Reference:** See `SECURITY.md` for complete security policy + +**END SECURITY REQUIREMENT** + +1. 
**vCenter Connection Details and Credential Setup**: + + **Step 1: Collect Credentials from User** + - Prompt for vCenter server URL (e.g., "vcenter.ci.ibmc.devcluster.openshift.com") + - Prompt for vCenter username (e.g., "user@vsphere.local") + - Prompt for vCenter password (**NEVER display or log this value**) + - Prompt for certificate validation preference (true/false for GOVC_INSECURE) + + **Step 2: Create Secure Environment File** + ```bash + # Create .work directory if it doesn't exist + mkdir -p .work + + # Create environment file with collected credentials + # CRITICAL: Use variables, NEVER hardcode or echo credential values + cat > .work/.vcenter-env < {api-vip} + - `api-int.{cluster-name}.{base-domain}` -> {api-vip} + - `*.apps.{cluster-name}.{base-domain}` -> {ingress-vip} + - Re-verify DNS resolution using `dig`: + ```bash + dig +short api.{cluster-name}.{base-domain} + dig +short api-int.{cluster-name}.{base-domain} + dig +short randomtest.apps.{cluster-name}.{base-domain} + ``` + - All queries should return the expected VIP addresses before proceeding + +3. **Certificate Validation**: + - If vCenter uses self-signed certificates, may need to set up certificate trust + - Provide guidance on certificate handling if needed + +### Phase 5: Execute Installation + +1. **Run the installer**: + ```bash + ./openshift-install create cluster --dir=.work/openshift-vsphere-install/{cluster-name} --log-level=info + ``` + +2. **Monitor installation progress**: + - The installation typically takes 30-45 minutes + - Display progress updates to the user + - Watch for common errors (VIP conflicts, network issues, insufficient resources) + +3. **Handle installation output**: + - Save installation logs to `.work/openshift-vsphere-install/{cluster-name}/.openshift_install.log` + - Monitor for ERROR or FATAL messages + - If errors occur, parse logs and provide troubleshooting guidance + +### Phase 6: Post-Installation + +1. **Display cluster credentials**: + - Extract kubeadmin password from `.work/openshift-vsphere-install/{cluster-name}/auth/kubeadmin-password` + - Show cluster console URL: `https://console-openshift-console.apps.{cluster-name}.{base-domain}` + - Show API endpoint: `https://api.{cluster-name}.{base-domain}:6443` + +2. **Set up kubeconfig**: + ```bash + export KUBECONFIG=.work/openshift-vsphere-install/{cluster-name}/auth/kubeconfig + ``` + +3. **Verify cluster health**: + ```bash + oc get nodes + oc get co # Check cluster operators + oc get clusterversion + ``` + +4. **Provide next steps**: + - How to access the web console + - How to add additional users + - Where to find documentation for post-install configuration + - How to delete the cluster (using `openshift-install destroy cluster`) + +### Error Handling + +Common installation failures and resolutions: + +1. **VIP already in use**: + - Verify VIPs are not assigned to other devices + - Check DHCP range doesn't overlap with VIPs + - Suggest using `ping` to test VIP availability + +2. **Insufficient permissions**: + - Verify vCenter user has all required privileges + - Reference: https://docs.openshift.com/container-platform/latest/installing/installing_vsphere/installing-vsphere-installer-provisioned.html#installation-vsphere-installer-infra-requirements_installing-vsphere-installer-provisioned + +3. **DNS resolution failures**: + - Verify DNS records are created and propagated + - Test with `dig api.{cluster-name}.{base-domain}` + - Test with `dig test.apps.{cluster-name}.{base-domain}` + +4. 
**Network connectivity issues**: + - Verify vSphere network allows required traffic + - Check firewall rules + - Ensure DHCP is available on the network (for bootstrap) + +5. **Insufficient resources**: + - Check available CPU, memory, and storage in vSphere cluster + - May need to reduce worker count or VM sizes + +6. **Certificate errors**: + - If using self-signed certs, may need to add to trust store + - Consider using `--skip-tls-verify` for testing (not recommended for production) + +If installation fails, provide: +- Relevant error messages from logs +- Specific troubleshooting steps based on the error +- Links to relevant documentation +- Option to retry with different configuration + +## Return Value +- **Working directory**: `.work/openshift-vsphere-install/{cluster-name}/` +- **Kubeconfig**: `.work/openshift-vsphere-install/{cluster-name}/auth/kubeconfig` +- **Admin credentials**: `.work/openshift-vsphere-install/{cluster-name}/auth/kubeadmin-password` +- **Installation logs**: `.work/openshift-vsphere-install/{cluster-name}/.openshift_install.log` +- **Cluster information**: API endpoint, console URL, and access credentials displayed in terminal + +## Examples + +1. **Install latest stable OpenShift version**: + ``` + /openshift:install-vsphere + ``` + The command will interactively prompt for all required configuration. + +2. **Install specific OpenShift version**: + ``` + /openshift:install-vsphere 4.15.0 + ``` + Installs OpenShift 4.15.0 specifically. + +3. **Install using stable channel**: + ``` + /openshift:install-vsphere stable-4.16 + ``` + Installs the latest stable release from the 4.16 channel. + +## Arguments +- $1: (Optional) OpenShift version to install. Can be a specific version (e.g., "4.15.0") or a channel (e.g., "stable-4.16", "fast-4.16"). If not provided, defaults to "stable" which installs the latest stable release. 
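+
+An illustrative sketch of the kind of install-config.yaml this command generates (every value below is a placeholder; the real file is assembled from the details gathered interactively in the earlier phases):
+
+```yaml
+# Sketch only - all values are stand-ins for interactively gathered data
+apiVersion: v1
+baseDomain: example.com
+metadata:
+  name: mycluster
+platform:
+  vsphere:
+    vCenter: vcenter.example.com
+    username: user@vsphere.local
+    password: "(sourced from .work/.vcenter-env, never printed)"
+    datacenter: DC1
+    defaultDatastore: datastore1
+    cluster: Cluster1
+    network: vlan1234-portgroup
+    apiVIP: 10.0.0.100
+    ingressVIP: 10.0.0.101
+pullSecret: '(from console.redhat.com/openshift/install/pull-secret)'
+sshKey: '(public SSH key for node access)'
+```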
+ +## Notes + +**IBM Cloud Classic Specific:** +- Port group must be obtained from infrastructure team before starting +- Route53 DNS records (api, api-int, *.apps) must be created manually before VIP selection +- Use Route53 queries and ping to determine available IP addresses for VIPs +- VIPs must be within the subnet CIDR associated with the port group +- Mixed vCenter versions (7.x and 8.x) are supported + +**General Installation:** +- Installation typically takes 30-45 minutes to complete +- Ensure DNS records are created and verified before starting installation +- Keep the install-config.yaml.backup file for reference +- The installer will create a bootstrap VM that is automatically deleted after installation +- Default VM sizes: Control plane (4 vCPU, 16GB RAM, 120GB disk), Worker (2 vCPU, 8GB RAM, 120GB disk) +- Minimum cluster: 3 control plane + 2 worker nodes (can scale workers to 0 for minimal install) +- The .work directory contents should not be committed to git (already in .gitignore) + +**Performance Optimization for VPN/Slow Connections:** +- Pre-uploading the RHCOS OVA template (Phase 2, step 8) significantly speeds up installation over VPN +- Without a template: installer uploads ~1GB OVA during installation (can take 30-60 minutes over VPN) +- With a template: installer clones the existing VM template (typically 2-5 minutes) +- The template can be reused for multiple cluster installations with the same OpenShift version +- Template files are cached in `.work/openshift-vsphere-install/ova-cache/` for reuse + +## References +- Official vSphere IPI Installation Guide: https://docs.openshift.com/container-platform/latest/installing/installing_vsphere/installing-vsphere-installer-provisioned.html +- vSphere Prerequisites: https://docs.openshift.com/container-platform/latest/installing/installing_vsphere/installing-vsphere-installer-provisioned.html#installation-vsphere-installer-infra-requirements_installing-vsphere-installer-provisioned +- Download OpenShift installer: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/ +- Get Pull Secret: https://console.redhat.com/openshift/install/pull-secret +- RHCOS OVA metadata: https://github.com/openshift/installer/tree/main/data/data/coreos (per-version branches) diff --git a/plugins/openshift/scripts/README.md b/plugins/openshift/scripts/README.md new file mode 100644 index 0000000..110cb49 --- /dev/null +++ b/plugins/openshift/scripts/README.md @@ -0,0 +1,103 @@ +# OpenShift Plugin Scripts + +This directory contains reusable utility scripts for the OpenShift plugin commands. + +## Available Scripts + +### install-govc.sh + +Automatically downloads and installs the latest version of `govc` (VMware vSphere CLI) for the current platform. 
+
+**Usage:**
+```bash
+bash plugins/openshift/scripts/install-govc.sh
+```
+
+**Features:**
+- Auto-detects OS (Linux/macOS) and architecture (x86_64/arm64)
+- Downloads the latest release from GitHub
+- Installs to a user-local directory (`~/.local/bin` or `~/bin`) when possible
+- Falls back to system-wide installation (`/usr/local/bin`) if needed
+- Verifies installation and reports version
+
+**Used by:**
+- `/openshift:install-vsphere` - For vSphere infrastructure auto-discovery
+- `/openshift:create-cluster` - For cluster creation workflows
+
+**Requirements:**
+- `curl` - For downloading files
+- `jq` - For parsing GitHub API responses
+- `tar` - For extracting archives
+- `sudo` - Only if installing to `/usr/local/bin`
+
+---
+
+### install-vcenter-certs.sh
+
+Downloads and installs vCenter SSL certificates to the system trust store.
+
+**Usage:**
+```bash
+bash plugins/openshift/scripts/install-vcenter-certs.sh <vcenter-server>
+```
+
+**Example:**
+```bash
+bash plugins/openshift/scripts/install-vcenter-certs.sh vcenter.example.com
+```
+
+**Features:**
+- Downloads the certificate bundle from vCenter
+- Validates ZIP archive integrity
+- Installs certificates to the OS-specific trust store:
+  - **macOS:** System Keychain
+  - **Linux:** `/etc/pki/ca-trust/source/anchors/` + `update-ca-trust` (RHEL/Fedora) or `/usr/local/share/ca-certificates/` + `update-ca-certificates` (Debian/Ubuntu)
+- Automatic cleanup of temporary files
+- Detailed error messages and troubleshooting guidance
+
+**Used by:**
+- `/openshift:install-vsphere` - For secure govc communication with vCenter
+
+**Requirements:**
+- `curl` - For downloading certificates
+- `unzip` - For extracting certificate bundle
+- `sudo` - Required for installing certificates to system trust store
+
+---
+
+### download-openshift-installer.sh
+
+Downloads the `openshift-install` binary for the current platform.
+
+**Usage:**
+```bash
+bash plugins/openshift/scripts/download-openshift-installer.sh <version> [output-directory]
+```
+
+**Examples:**
+```bash
+# Download latest 4.20.x to current directory
+bash plugins/openshift/scripts/download-openshift-installer.sh 4.20

+# Download to specific directory
+bash plugins/openshift/scripts/download-openshift-installer.sh 4.20 /usr/local/bin
+
+# Use explicit channel
+bash plugins/openshift/scripts/download-openshift-installer.sh stable-4.19
+```
+
+**Features:**
+- Auto-detects OS (Linux/macOS) and architecture (x86_64/arm64)
+- Supports version formats: `4.20`, `latest-4.20`, `stable-4.20`, `fast-4.20`
+- Downloads from official OpenShift mirror
+- Validates download and extraction
+- Makes binary executable automatically
+- Shows version after successful download
+
+**Used by:**
+- `/openshift:install-vsphere` - For cluster installation
+- `/openshift:create-cluster` - For cluster provisioning
+
+**Requirements:**
+- `curl` - For downloading installer
+- `tar` - For extracting archive
diff --git a/plugins/openshift/scripts/download-openshift-installer.sh b/plugins/openshift/scripts/download-openshift-installer.sh
new file mode 100755
index 0000000..777ddf3
--- /dev/null
+++ b/plugins/openshift/scripts/download-openshift-installer.sh
@@ -0,0 +1,123 @@
+#!/usr/bin/env bash
+# download-openshift-installer.sh - Download openshift-install binary
+# Works on both macOS and Linux with automatic OS/architecture detection
+
+set -euo pipefail
+
+# Function to show usage
+usage() {
+    echo "Usage: $0 <version> [output-directory]"
+    echo ""
+    echo "Downloads openshift-install binary for the current platform"
+    echo ""
+    echo "Arguments:"
+    echo "  version           OpenShift version (e.g., '4.20', '4.19', 'stable-4.20', 'latest-4.20')"
+    echo "  output-directory  Optional: Directory to extract binary to (default: current directory)"
+    echo ""
+    echo "Examples:"
+    echo "  $0 4.20                  # Downloads latest 4.20.x release"
+    echo "  $0 stable-4.19           # Downloads latest stable 4.19.x release"
+    echo "  $0 4.20 /usr/local/bin   # Downloads and extracts to /usr/local/bin"
+    exit 1
+}
+
+# Check for required argument
+if [ $# -lt 1 ]; then
+    usage
+fi
+
+VERSION="$1"
+OUTPUT_DIR="${2:-.}"  # Default to current directory
+
+# Normalize version format
+# If version is just "4.20", prepend "latest-"
+if [[ "$VERSION" =~ ^[0-9]+\.[0-9]+$ ]]; then
+    VERSION="latest-${VERSION}"
+fi
+
+# Leave exact releases (e.g., "4.15.0") untouched; otherwise ensure a
+# channel prefix, defaulting to "latest-"
+if [[ ! "$VERSION" =~ ^(latest-|stable-|fast-|candidate-) ]] && [[ ! "$VERSION" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
+    VERSION="latest-${VERSION}"
+fi
+
+echo "OpenShift version: $VERSION"
+
+# Detect OS
+OS=$(uname -s)
+echo "Detected OS: $OS"
+
+# Detect architecture
+ARCH=$(uname -m)
+echo "Detected Architecture: $ARCH"
+
+# Construct binary filename based on OS and architecture
+if [ "$OS" = "Darwin" ]; then
+    # macOS
+    if [ "$ARCH" = "arm64" ]; then
+        BINARY="openshift-install-mac-arm64.tar.gz"
+    else
+        BINARY="openshift-install-mac.tar.gz"
+    fi
+elif [ "$OS" = "Linux" ]; then
+    # Linux
+    if [ "$ARCH" = "aarch64" ] || [ "$ARCH" = "arm64" ]; then
+        BINARY="openshift-install-linux-arm64.tar.gz"
+    else
+        BINARY="openshift-install-linux.tar.gz"
+    fi
+else
+    echo "Error: Unsupported OS: $OS (only Darwin/macOS and Linux are supported)"
+    exit 1
+fi
+
+echo "Binary to download: $BINARY"
+
+# Construct download URL
+DOWNLOAD_URL="https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${VERSION}/${BINARY}"
+echo "Download URL: $DOWNLOAD_URL"
+
+# Create output directory if it doesn't exist
+if [ ! 
-d "$OUTPUT_DIR" ]; then + echo "Creating output directory: $OUTPUT_DIR" + mkdir -p "$OUTPUT_DIR" +fi + +# Download and extract +echo "Downloading openshift-install..." +TEMP_DIR=$(mktemp -d) +trap "rm -rf $TEMP_DIR" EXIT + +if ! curl -f -L "$DOWNLOAD_URL" -o "$TEMP_DIR/installer.tar.gz"; then + echo "Error: Failed to download openshift-install from $DOWNLOAD_URL" + echo "" + echo "Possible issues:" + echo " - Version '$VERSION' may not exist" + echo " - Network connectivity issues" + echo " - Binary '$BINARY' may not be available for this version" + echo "" + echo "Try checking available versions at:" + echo " https://mirror.openshift.com/pub/openshift-v4/clients/ocp/" + exit 1 +fi + +echo "Extracting to $OUTPUT_DIR..." +tar xzf "$TEMP_DIR/installer.tar.gz" -C "$OUTPUT_DIR" + +# Verify extraction +if [ ! -f "$OUTPUT_DIR/openshift-install" ]; then + echo "Error: openshift-install binary not found after extraction" + exit 1 +fi + +# Make executable +chmod +x "$OUTPUT_DIR/openshift-install" + +echo "✓ openshift-install downloaded successfully" +echo "Location: $OUTPUT_DIR/openshift-install" + +# Show version +if "$OUTPUT_DIR/openshift-install" version 2>/dev/null | head -1; then + true +else + echo "Note: Run '$OUTPUT_DIR/openshift-install version' to verify" +fi diff --git a/plugins/openshift/scripts/install-govc.sh b/plugins/openshift/scripts/install-govc.sh new file mode 100755 index 0000000..59ab3f3 --- /dev/null +++ b/plugins/openshift/scripts/install-govc.sh @@ -0,0 +1,85 @@ +#!/usr/bin/env bash +# install-govc.sh - Download and install govc for macOS or Linux +# Works on both macOS and Linux with automatic OS/architecture detection + +set -euo pipefail + +# Detect OS +OS=$(uname -s) +echo "Detected OS: $OS" + +# Detect architecture +ARCH=$(uname -m) +echo "Detected Architecture: $ARCH" + +# Validate OS is supported +if [[ "$OS" != "Linux" && "$OS" != "Darwin" ]]; then + echo "Error: Unsupported OS: $OS (only Linux and Darwin/macOS are supported)" + exit 1 +fi + +# Construct asset pattern for GitHub release +# GitHub uses capitalized OS names: Linux, Darwin +ASSET_PATTERN="govc_${OS}_${ARCH}.tar.gz" +echo "Looking for asset: $ASSET_PATTERN" + +# Fetch latest release info from GitHub API +echo "Fetching latest govc release from GitHub..." +RELEASE_JSON=$(curl -s https://api.github.com/repos/vmware/govmomi/releases/latest) + +# Extract version +VERSION=$(echo "$RELEASE_JSON" | jq -r '.tag_name') +echo "Latest version: $VERSION" + +# Find matching asset download URL +DOWNLOAD_URL=$(echo "$RELEASE_JSON" | jq -r ".assets[] | select(.name == \"$ASSET_PATTERN\") | .browser_download_url") + +if [[ -z "$DOWNLOAD_URL" ]]; then + echo "Error: Could not find asset matching pattern: $ASSET_PATTERN" + echo "Available assets:" + echo "$RELEASE_JSON" | jq -r '.assets[].name' | grep "^govc_" + exit 1 +fi + +echo "Download URL: $DOWNLOAD_URL" + +# Download and extract to /tmp +echo "Downloading govc..." +TEMP_DIR=$(mktemp -d) +trap "rm -rf $TEMP_DIR" EXIT + +curl -L "$DOWNLOAD_URL" -o "$TEMP_DIR/govc.tar.gz" +echo "Extracting..." +tar xzf "$TEMP_DIR/govc.tar.gz" -C "$TEMP_DIR" +chmod +x "$TEMP_DIR/govc" + +# Determine installation location +# Prefer user-local directories, fall back to system-wide +if [[ -d "$HOME/.local/bin" ]]; then + INSTALL_DIR="$HOME/.local/bin" +elif [[ -d "$HOME/bin" ]]; then + INSTALL_DIR="$HOME/bin" +else + INSTALL_DIR="/usr/local/bin" + echo "Note: Installing to $INSTALL_DIR (may require sudo)" +fi + +# Install govc +echo "Installing govc to $INSTALL_DIR..." 
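+# Only /usr/local/bin requires elevated privileges; the user-local
+# directories above can be written without sudo.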
+if [[ "$INSTALL_DIR" == "/usr/local/bin" ]]; then
+    sudo mv "$TEMP_DIR/govc" "$INSTALL_DIR/govc"
+else
+    mv "$TEMP_DIR/govc" "$INSTALL_DIR/govc"
+fi
+
+echo "✓ govc installed successfully to $INSTALL_DIR/govc"
+
+# Verify installation
+if command -v govc &>/dev/null; then
+    echo "✓ govc is in PATH"
+    govc version
+else
+    echo "⚠ govc is not in PATH. You may need to add $INSTALL_DIR to your PATH"
+    echo "  Add this to your ~/.bashrc or ~/.zshrc:"
+    echo "  export PATH=\"$INSTALL_DIR:\$PATH\""
+fi
diff --git a/plugins/openshift/scripts/install-vcenter-certs.sh b/plugins/openshift/scripts/install-vcenter-certs.sh
new file mode 100755
index 0000000..e13a501
--- /dev/null
+++ b/plugins/openshift/scripts/install-vcenter-certs.sh
@@ -0,0 +1,126 @@
+#!/usr/bin/env bash
+# install-vcenter-certs.sh - Download and install vCenter certificates
+# Works on both macOS and Linux
+
+set -euo pipefail
+
+# Check for required argument
+if [ $# -lt 1 ]; then
+    echo "Usage: $0 <vcenter-server>"
+    echo "Example: $0 vcenter.example.com"
+    exit 1
+fi
+
+VCENTER_SERVER="$1"
+CERT_URL="https://${VCENTER_SERVER}/certs/download.zip"
+
+echo "Downloading vCenter certificates from: $CERT_URL"
+
+# Create temporary directory
+TEMP_DIR=$(mktemp -d)
+trap "rm -rf $TEMP_DIR" EXIT
+
+# Download certificates
+if ! curl -sk "$CERT_URL" -o "$TEMP_DIR/vcenter-certs.zip"; then
+    echo "Error: Failed to download certificates from $CERT_URL"
+    echo "Please verify the vCenter server address is correct and accessible"
+    exit 1
+fi
+
+# Verify we got a valid zip file
+if ! file "$TEMP_DIR/vcenter-certs.zip" | grep -q "Zip archive"; then
+    echo "Error: Downloaded file is not a valid ZIP archive"
+    echo "The vCenter server may be unreachable or the certificates endpoint is unavailable"
+    exit 1
+fi
+
+# Extract certificates
+echo "Extracting certificates..."
+unzip -q -o "$TEMP_DIR/vcenter-certs.zip" -d "$TEMP_DIR/vcenter-certs"
+
+# Check if extraction was successful
+if [ ! -d "$TEMP_DIR/vcenter-certs/certs" ]; then
+    echo "Error: Certificate extraction failed - certs directory not found"
+    exit 1
+fi
+
+# Detect OS-specific certificate directory
+OS_TYPE=$(uname -s)
+if [ "$OS_TYPE" = "Darwin" ]; then
+    CERT_SOURCE_DIR="$TEMP_DIR/vcenter-certs/certs/mac"
+    CERT_PATTERN="*.0"
+elif [ "$OS_TYPE" = "Linux" ]; then
+    CERT_SOURCE_DIR="$TEMP_DIR/vcenter-certs/certs/lin"
+    CERT_PATTERN="*.0"
+else
+    CERT_SOURCE_DIR="$TEMP_DIR/vcenter-certs/certs/win"
+    CERT_PATTERN="*.0.crt"
+fi
+
+# Count certificates (only .0 files, not .r0 CRL files)
+CERT_COUNT=$(find "$CERT_SOURCE_DIR" -name "$CERT_PATTERN" -not -name "*.r0" | wc -l)
+if [ "$CERT_COUNT" -eq 0 ]; then
+    echo "Error: No certificate files found in $CERT_SOURCE_DIR"
+    exit 1
+fi
+
+echo "Found $CERT_COUNT certificate(s) to install"
+
+# Install certificates to system trust store
+OS=$(uname -s)
+
+if [ "$OS" = "Darwin" ]; then
+    # macOS
+    echo "Installing certificates to macOS System Keychain (requires sudo)..."
+    for cert in "$CERT_SOURCE_DIR"/*.0; do
+        [ -e "$cert" ] || continue  # Skip if no matches
+        [ "$(basename "$cert")" != "*.r0" ] || continue  # Skip CRL files
+        CERT_NAME=$(basename "$cert")
+        echo "  Installing: $CERT_NAME"
+        if ! 
sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain "$cert"; then + echo " Warning: Failed to install $CERT_NAME" + fi + done + echo "✓ Certificates installed to macOS System Keychain" + +elif [ "$OS" = "Linux" ]; then + # Linux - detect RHEL/Fedora vs Debian/Ubuntu + echo "Installing certificates to Linux CA trust store (requires sudo)..." + + if [ -d "/etc/pki/ca-trust/source/anchors" ]; then + # RHEL/Fedora/CentOS + echo "Detected RHEL/Fedora system" + for cert in "$CERT_SOURCE_DIR"/*.0; do + [ -e "$cert" ] || continue # Skip if no matches + [ "$(basename "$cert")" != "*.r0" ] || continue # Skip CRL files + CERT_NAME=$(basename "$cert" .0) + sudo cp "$cert" "/etc/pki/ca-trust/source/anchors/${CERT_NAME}.crt" + echo " Installed: ${CERT_NAME}.crt" + done + sudo update-ca-trust extract + echo "✓ Certificates installed to RHEL/Fedora CA trust store" + elif [ -d "/usr/local/share/ca-certificates" ]; then + # Debian/Ubuntu + echo "Detected Debian/Ubuntu system" + for cert in "$CERT_SOURCE_DIR"/*.0; do + [ -e "$cert" ] || continue # Skip if no matches + [ "$(basename "$cert")" != "*.r0" ] || continue # Skip CRL files + CERT_NAME=$(basename "$cert" .0) + sudo cp "$cert" "/usr/local/share/ca-certificates/${CERT_NAME}.crt" + echo " Installed: ${CERT_NAME}.crt" + done + sudo update-ca-certificates + echo "✓ Certificates installed to Debian/Ubuntu CA trust store" + else + echo "Error: Could not find CA trust store directory" + echo "Tried: /etc/pki/ca-trust/source/anchors (RHEL/Fedora)" + echo " /usr/local/share/ca-certificates (Debian/Ubuntu)" + exit 1 + fi + +else + echo "Error: Unsupported OS: $OS (only Darwin/macOS and Linux are supported)" + exit 1 +fi + +echo "✓ vCenter certificates installed successfully" diff --git a/plugins/openshift/scripts/list-openshift-versions.sh b/plugins/openshift/scripts/list-openshift-versions.sh new file mode 100755 index 0000000..a3029f0 --- /dev/null +++ b/plugins/openshift/scripts/list-openshift-versions.sh @@ -0,0 +1,82 @@ +#!/usr/bin/env bash +# list-openshift-versions.sh - List available OpenShift versions from mirror +# Usage: ./list-openshift-versions.sh [--count N] [--format json|text] + +set -euo pipefail + +# Default values +COUNT=5 +FORMAT="text" +MIRROR_URL="https://mirror.openshift.com/pub/openshift-v4/clients/ocp/" + +# Parse arguments +while [[ $# -gt 0 ]]; do + case "$1" in + --count) + COUNT="$2" + shift 2 + ;; + --format) + FORMAT="$2" + shift 2 + ;; + --help|-h) + cat <&2 + echo "Use --help for usage information" >&2 + exit 1 + ;; + esac +done + +# Fetch versions from mirror +VERSIONS=$(curl -sL "$MIRROR_URL" | \ + grep -oE 'href="latest-[^"]*/"' | \ + sed 's/href="latest-//g' | \ + sed 's/\/"//g' | \ + sort -V -r | \ + head -n "$COUNT") + +# Check if any versions were found +if [ -z "$VERSIONS" ]; then + echo "Error: Failed to fetch OpenShift versions from mirror" >&2 + exit 1 +fi + +# Output in requested format +if [ "$FORMAT" = "json" ]; then + # Convert to JSON array + echo "$VERSIONS" | jq -R -s -c 'split("\n") | map(select(length > 0))' +else + # Text output (one per line) + echo "$VERSIONS" +fi diff --git a/plugins/openshift/scripts/setup-vcenter-env.sh b/plugins/openshift/scripts/setup-vcenter-env.sh new file mode 100755 index 0000000..6c6a26e --- /dev/null +++ b/plugins/openshift/scripts/setup-vcenter-env.sh @@ -0,0 +1,70 @@ +#!/usr/bin/env bash +# setup-vcenter-env.sh - Securely prompt for vCenter credentials and create environment file +# Usage: source $(bash setup-vcenter-env.sh) + +set -euo 
pipefail
+
+# Output file for environment variables
+ENV_FILE=".work/.vcenter-env"
+
+echo "=== vCenter Connection Setup ==="
+echo ""
+
+# Prompt for vCenter server
+read -p "vCenter server URL (e.g., vcenter.example.com): " VCENTER_SERVER
+if [ -z "$VCENTER_SERVER" ]; then
+    echo "Error: vCenter server URL cannot be empty"
+    exit 1
+fi
+
+# Prompt for username
+read -p "vCenter username (e.g., user@vsphere.local): " VCENTER_USERNAME
+if [ -z "$VCENTER_USERNAME" ]; then
+    echo "Error: Username cannot be empty"
+    exit 1
+fi
+
+# Prompt for password (securely - no echo)
+read -s -p "vCenter password: " VCENTER_PASSWORD
+echo ""
+if [ -z "$VCENTER_PASSWORD" ]; then
+    echo "Error: Password cannot be empty"
+    exit 1
+fi
+
+# Prompt for insecure mode (skip certificate validation)
+read -p "Skip certificate validation? (true/false, default: true): " VCENTER_INSECURE
+VCENTER_INSECURE=${VCENTER_INSECURE:-true}
+
+# Create .work directory if it doesn't exist
+mkdir -p .work
+
+# Create environment file
+cat > "$ENV_FILE" < [options]
+```
+
+**Options:**
+- `--zone-id <zone-id>` - Route53 hosted zone ID for DNS checking
+- `--max-candidates <n>` - Max IPs to return (default: 10)
+- `--skip-first <n>` - Skip first N IPs (default: 10)
+- `--skip-last <n>` - Skip last N IPs (default: 10)
+- `--max-workers <n>` - Parallel workers (default: 20)
+- `--verbose` - Show progress
+- `--pretty` - Pretty-print JSON
+
+**Examples:**
+```bash
+# Basic scan
+python3 scan-available-ips.py 10.0.0.0/24
+
+# Scan with Route53 integration
+python3 scan-available-ips.py 10.0.0.0/24 --zone-id Z1234567890ABC --verbose
+
+# Scan with custom range
+python3 scan-available-ips.py 172.16.0.0/16 --skip-first 100 --max-candidates 20
+```
+
+**Output:**
+```json
+[
+  {
+    "ip": "10.0.0.100",
+    "available": true,
+    "ping_response": false,
+    "in_route53": false,
+    "route53_record": null
+  }
+]
+```
+
+---
+
+### manage-dns.sh
+
+Manage DNS records for OpenShift clusters (Route53 or manual).
+
+**Commands:**
+
+#### get-zone-id
+Get Route53 hosted zone ID for a domain.
+
+```bash
+manage-dns.sh get-zone-id --domain example.com
+```
+
+#### create-route53
+Create Route53 DNS A records for cluster endpoints.
+
+```bash
+manage-dns.sh create-route53 \
+  --cluster-name <name> \
+  --base-domain <domain> \
+  --api-vip <ip> \
+  --ingress-vip <ip> \
+  [--zone-id <zone-id>] \
+  [--ttl <seconds>]
+```
+
+**Example:**
+```bash
+bash manage-dns.sh create-route53 \
+  --cluster-name prod \
+  --base-domain example.com \
+  --api-vip 10.0.0.100 \
+  --ingress-vip 10.0.0.101
+```
+
+**Creates:**
+- `api.prod.example.com` → 10.0.0.100
+- `api-int.prod.example.com` → 10.0.0.100
+- `*.apps.prod.example.com` → 10.0.0.101
+
+#### verify
+Verify DNS records resolve to expected IPs.
+
+```bash
+manage-dns.sh verify \
+  --cluster-name <name> \
+  --base-domain <domain> \
+  --api-vip <ip> \
+  --ingress-vip <ip> \
+  [--timeout <seconds>]
+```
+
+**Example:**
+```bash
+bash manage-dns.sh verify \
+  --cluster-name prod \
+  --base-domain example.com \
+  --api-vip 10.0.0.100 \
+  --ingress-vip 10.0.0.101 \
+  --timeout 60
+```
+
+## Requirements
+
+### Python 3
+- Pre-installed on most systems
+- Used for subnet scanning
+
+### dig (DNS lookup)
+- **Linux**: `sudo yum install bind-utils` or `sudo apt-get install dnsutils`
+- **macOS**: Pre-installed
+
+### AWS CLI (Optional)
+- Only required for Route53 automation
+- Install: https://aws.amazon.com/cli/
+- Configure: `aws configure`
+
+## How It Works
+
+### Subnet Scanning
+
+1. **Parse CIDR**: Convert CIDR to IP range
+2. **Parallel Ping**: Ping IPs concurrently (default: 20 workers)
+3. 
**Route53 Check**: Query Route53 for existing A records (optional) +4. **Filter Available**: Return IPs that don't respond to ping and aren't in Route53 +5. **Output JSON**: Structured data for easy parsing + +**Performance:** +- /24 subnet (~254 IPs): ~10-15 seconds +- /23 subnet (~510 IPs): ~20-30 seconds +- Stops after finding enough candidates + +### DNS Management + +**Route53 Mode:** +1. Auto-detect or use provided hosted zone ID +2. Create UPSERT change batch for 3 records +3. Apply changes via AWS API +4. Return zone ID + +**Manual Mode:** +1. Display DNS records user needs to create +2. Wait for user confirmation +3. Proceed to verification + +**Verification:** +1. Query DNS using `dig` command +2. Compare resolved IP to expected IP +3. Retry every 5 seconds (up to timeout) +4. Report success or failure + +## Use Cases + +### 1. Fully Automated (Route53) + +Best for AWS-based workflows with Route53 access. + +```bash +# 1. Scan subnet +IPS=$(python3 scan-available-ips.py 10.0.0.0/24 --verbose) + +# 2. User selects from available IPs +# (via UI or AskUserQuestion tool) + +# 3. Create DNS automatically +bash manage-dns.sh create-route53 \ + --cluster-name mycluster \ + --base-domain example.com \ + --api-vip 10.0.0.100 \ + --ingress-vip 10.0.0.101 + +# 4. Verify +bash manage-dns.sh verify \ + --cluster-name mycluster \ + --base-domain example.com \ + --api-vip 10.0.0.100 \ + --ingress-vip 10.0.0.101 +``` + +### 2. Manual DNS + +Best for non-AWS environments or when Route53 access is not available. + +```bash +# 1. Scan subnet +IPS=$(python3 scan-available-ips.py 10.0.0.0/24 --verbose) + +# 2. User selects VIPs + +# 3. Display instructions +echo "Create these DNS records:" +echo " api.cluster.example.com → 10.0.0.100" +echo " api-int.cluster.example.com → 10.0.0.100" +echo " *.apps.cluster.example.com → 10.0.0.101" + +# 4. User creates records manually + +# 5. Verify once created +bash manage-dns.sh verify \ + --cluster-name cluster \ + --base-domain example.com \ + --api-vip 10.0.0.100 \ + --ingress-vip 10.0.0.101 \ + --timeout 300 +``` + +### 3. 
Pre-existing DNS + +When DNS records already exist: + +```bash +# Just verify existing DNS +bash manage-dns.sh verify \ + --cluster-name existing \ + --base-domain example.com \ + --api-vip 10.0.0.100 \ + --ingress-vip 10.0.0.101 +``` + +## Environment Variables + +| Variable | Description | Required | +|----------|-------------|----------| +| `AWS_PROFILE` | AWS profile to use | No | +| `AWS_REGION` | AWS region | No | + +## Examples + +### Example 1: Basic Workflow + +```bash +# Scan and select VIPs +python3 scan-available-ips.py 10.0.0.0/24 --pretty + +# Create DNS +bash manage-dns.sh create-route53 \ + --cluster-name dev \ + --base-domain test.com \ + --api-vip 10.0.0.50 \ + --ingress-vip 10.0.0.51 + +# Verify +bash manage-dns.sh verify \ + --cluster-name dev \ + --base-domain test.com \ + --api-vip 10.0.0.50 \ + --ingress-vip 10.0.0.51 +``` + +### Example 2: Large Subnet + +```bash +# Scan /16 subnet with more workers +python3 scan-available-ips.py 172.16.0.0/16 \ + --max-workers 50 \ + --max-candidates 20 \ + --skip-first 100 \ + --verbose +``` + +### Example 3: Custom TTL + +```bash +# Development cluster with low TTL +bash manage-dns.sh create-route53 \ + --cluster-name dev \ + --base-domain dev.example.com \ + --api-vip 10.0.0.100 \ + --ingress-vip 10.0.0.101 \ + --ttl 60 +``` + +## Troubleshooting + +### No Available IPs Found + +**Problem**: Scanner finds 0 available IPs + +**Solutions:** +- Subnet may be fully allocated +- Try different CIDR +- Reduce `--skip-first` and `--skip-last` +- Check firewall isn't blocking ping + +### AWS Credentials Not Configured + +**Problem**: `Error: AWS credentials not configured` + +**Solution:** +```bash +aws configure +# Enter access key, secret key, region +``` + +### DNS Not Resolving + +**Problem**: Verification fails after timeout + +**Solutions:** +- Wait longer: `--timeout 300` +- Check DNS provider for propagation status +- Verify records were created correctly +- Flush local DNS cache + +### Permission Denied + +**Problem**: Route53 permission error + +**Solution:** +- Verify IAM user has `route53:ChangeResourceRecordSets` permission +- Check AWS credentials are for correct account + +## Integration + +### With `/openshift:install-vsphere` + +This skill is used in Phase 2, Step 5 (Network Configuration and VIP Selection). 
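+
+As a sketch, the scanner's JSON output can be reduced to VIP candidates with `jq` (the subnet and the first-two selection are illustrative; the command normally presents the choices via AskUserQuestion):
+
+```bash
+# Hypothetical glue: take the first two available IPs as API/Ingress VIP candidates
+CANDIDATES=$(python3 scan-available-ips.py 10.0.0.0/24)
+API_VIP=$(echo "$CANDIDATES" | jq -r '[.[] | select(.available)][0].ip')
+INGRESS_VIP=$(echo "$CANDIDATES" | jq -r '[.[] | select(.available)][1].ip')
+echo "API VIP: $API_VIP, Ingress VIP: $INGRESS_VIP"
+```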
+ +### With install-config.yaml + +Use the VIPs in your OpenShift install configuration: + +```yaml +platform: + vsphere: + apiVIP: 10.0.0.100 # From this skill + ingressVIP: 10.0.0.101 # From this skill + +networking: + machineNetwork: + - cidr: 10.0.0.0/24 # Scanned subnet +``` + +## Performance + +| Operation | Time | +|-----------|------| +| Scan /24 subnet | 10-15 seconds | +| Scan /16 subnet | 40-60 seconds | +| Route53 creation | 2-5 seconds | +| Route53 propagation | 5-30 seconds | +| DNS verification | <5 seconds | + +## Files + +``` +network-vip-configurator/ +├── scan-available-ips.py # Subnet scanner (Python) +├── manage-dns.sh # DNS management (Bash) +├── SKILL.md # AI skill instructions +└── README.md # This file +``` + +## Related + +- **Skills**: `plugins/openshift/skills/vsphere-discovery` - vSphere infrastructure discovery +- **Skills**: `plugins/openshift/skills/rhcos-template-manager` - RHCOS template management +- **Command**: `/openshift:install-vsphere` - Uses this skill for VIP configuration + +## Benefits + +✅ **Automated** - No manual IP hunting +✅ **Validated** - Ensures IPs are truly available +✅ **Fast** - Parallel scanning for quick results +✅ **Flexible** - Route53 or manual DNS +✅ **Reliable** - Verifies DNS before installation +✅ **Safe** - Prevents VIP conflicts diff --git a/plugins/openshift/skills/network-vip-configurator/SKILL.md b/plugins/openshift/skills/network-vip-configurator/SKILL.md new file mode 100644 index 0000000..4f9e6f8 --- /dev/null +++ b/plugins/openshift/skills/network-vip-configurator/SKILL.md @@ -0,0 +1,573 @@ +--- +name: Network VIP Configurator +description: Configure network VIPs and DNS records for OpenShift vSphere installations with automated subnet scanning and DNS verification +--- + +# Network VIP Configurator Skill + +This skill manages network VIP (Virtual IP) configuration and DNS record setup for OpenShift vSphere installations. It handles subnet scanning, VIP selection, DNS record creation (Route53 or manual), and DNS verification. + +## When to Use This Skill + +Use this skill when you need to: +- Find available IP addresses in a subnet for API and Ingress VIPs +- Configure DNS records for OpenShift cluster endpoints +- Verify DNS resolution before installation +- Automate VIP selection from subnet CIDR +- Support both Route53 (automated) and manual DNS workflows + +**Why use this skill?** +- **Automation**: Scans subnet to find available IPs automatically +- **Validation**: Verifies IPs are not in use (ping + Route53 check) +- **DNS Management**: Creates and verifies DNS records +- **Flexibility**: Supports Route53 or manual DNS setup +- **Error Prevention**: Reduces VIP conflicts and DNS issues + +This skill is used by: +- `/openshift:install-vsphere` - For network and VIP configuration (Phase 2, step 5) +- `/openshift:create-cluster` - For automated cluster provisioning + +## Prerequisites + +Before starting, ensure these tools are available: + +1. **Python 3** + - Check if available: `which python3` + - Required for subnet scanning + - Usually pre-installed on Linux and macOS + +2. **dig (DNS lookup)** + - Check if available: `which dig` + - Required for DNS verification + - Install if missing: + - Linux: `sudo yum install bind-utils` or `sudo apt-get install dnsutils` + - macOS: Pre-installed + +3. **AWS CLI** (Optional - for Route53 integration) + - Check if available: `which aws` + - Only required if using Route53 for DNS + - Skip if using manual DNS setup + - Install: https://aws.amazon.com/cli/ + +4. 
**Network Access** + - Ability to ping IPs in the subnet + - Access to Route53 (if using automated DNS) + +## Input Format + +The user will provide: + +1. **Subnet CIDR** - e.g., "10.0.0.0/24", "172.16.10.0/24" +2. **Cluster name** - e.g., "mycluster" +3. **Base domain** - e.g., "example.com", "devcluster.openshift.com" +4. **DNS mode**: + - `route53`: Automated DNS record creation using AWS Route53 + - `manual`: User creates DNS records manually, we verify +5. **Route53 hosted zone ID** (if using Route53 mode, optional - can be auto-detected) + +## Output Format + +Return a structured result containing VIP and DNS information: + +```json +{ + "api_vip": "10.0.0.100", + "ingress_vip": "10.0.0.101", + "dns_mode": "route53", + "dns_verified": true, + "dns_records": [ + "api.mycluster.example.com → 10.0.0.100", + "api-int.mycluster.example.com → 10.0.0.100", + "*.apps.mycluster.example.com → 10.0.0.101" + ], + "zone_id": "Z1234567890ABC" +} +``` + +## Implementation Steps + +### Step 1: Determine DNS Mode + +Ask the user which DNS mode they want to use: + +``` +How would you like to configure DNS for the cluster? + +1. Route53 (Automated) - Automatically create DNS records in AWS Route53 +2. Manual - You will create DNS records manually, and we'll verify them + +Recommended: Route53 if you have AWS access +``` + +**Set DNS_MODE based on user's choice:** +- `route53` - We will create DNS records automatically +- `manual` - User creates DNS records, we verify + +### Step 2: Get Subnet CIDR + +The subnet CIDR should be obtained from the user or from vSphere network discovery. This is the machine network CIDR associated with the port group. + +Example: "10.0.0.0/24" or "172.16.10.0/24" + +### Step 3: Scan Subnet for Available IPs + +Use the Python scanner to find available IP addresses: + +```bash +# Scan subnet for available IPs +# The script will: +# - Ping each IP to check if it responds +# - Check Route53 for existing A records (if zone ID provided) +# - Return list of available IPs + +AVAILABLE_IPS=$(python3 plugins/openshift/skills/network-vip-configurator/scan-available-ips.py \ + "${SUBNET_CIDR}" \ + --max-candidates 10 \ + --skip-first 10 \ + --skip-last 10 \ + --verbose) + +# Parse JSON to get list of available IPs +echo "$AVAILABLE_IPS" | jq -r '.[].ip' +``` + +**With Route53 Integration (optional):** +If Route53 zone ID is known, include it to check for existing DNS records: + +```bash +ZONE_ID=$(bash plugins/openshift/skills/network-vip-configurator/manage-dns.sh get-zone-id --domain "${BASE_DOMAIN}") + +AVAILABLE_IPS=$(python3 plugins/openshift/skills/network-vip-configurator/scan-available-ips.py \ + "${SUBNET_CIDR}" \ + --zone-id "${ZONE_ID}" \ + --max-candidates 10 \ + --verbose) +``` + +**Error Handling:** +- If no available IPs found: Suggest expanding CIDR or checking network +- If scan fails: Check network connectivity and permissions + +### Step 4: Present Available IPs to User + +Parse the JSON and present IPs in a user-friendly way: + +```bash +# Parse available IPs +AVAILABLE_IPS_LIST=$(echo "$AVAILABLE_IPS" | jq -r '.[].ip') + +# Count available IPs +AVAILABLE_COUNT=$(echo "$AVAILABLE_IPS" | jq 'length') + +if [ "$AVAILABLE_COUNT" -lt 2 ]; then + echo "Error: Need at least 2 available IPs (one for API, one for Ingress)" + echo "Found only $AVAILABLE_COUNT available IP(s)" + exit 1 +fi + +echo "Found $AVAILABLE_COUNT available IP addresses in subnet $SUBNET_CIDR:" +echo "$AVAILABLE_IPS_LIST" +``` + +Use `AskUserQuestion` tool to present dropdowns for: +1. 
**API VIP** - Select from available IPs +2. **Ingress VIP** - Select from remaining available IPs (different from API VIP) + +**Important:** API VIP and Ingress VIP must be different IPs! + +### Step 5: Configure DNS Records + +Based on DNS_MODE, either create Route53 records or guide user to create manual DNS. + +#### Option A: Route53 Mode (Automated) + +```bash +# Create Route53 DNS records +# The script will: +# - Auto-detect or use provided hosted zone ID +# - Create/update A records for api, api-int, *.apps +# - Return zone ID for reference + +ZONE_ID=$(bash plugins/openshift/skills/network-vip-configurator/manage-dns.sh create-route53 \ + --cluster-name "${CLUSTER_NAME}" \ + --base-domain "${BASE_DOMAIN}" \ + --api-vip "${API_VIP}" \ + --ingress-vip "${INGRESS_VIP}") + +echo "DNS records created in Route53 zone: $ZONE_ID" +``` + +**What this creates:** +- `api.${CLUSTER_NAME}.${BASE_DOMAIN}` → API VIP +- `api-int.${CLUSTER_NAME}.${BASE_DOMAIN}` → API VIP +- `*.apps.${CLUSTER_NAME}.${BASE_DOMAIN}` → Ingress VIP + +**Error Handling:** +- AWS credentials not configured: Guide user to run `aws configure` +- Zone not found: List available zones and ask user to select +- Permission denied: Verify IAM permissions for Route53 + +#### Option B: Manual Mode + +Guide the user to create DNS records manually: + +``` +Please create the following DNS A records in your DNS provider: + + api.${CLUSTER_NAME}.${BASE_DOMAIN} → ${API_VIP} + api-int.${CLUSTER_NAME}.${BASE_DOMAIN} → ${API_VIP} + *.apps.${CLUSTER_NAME}.${BASE_DOMAIN} → ${INGRESS_VIP} + +These records are required for the OpenShift installer to function correctly. + +Press ENTER when you have created the DNS records and they have propagated... +``` + +Wait for user confirmation before proceeding to verification. + +### Step 6: Verify DNS Records + +Always verify DNS records resolve correctly, regardless of DNS mode: + +```bash +# Verify DNS records using dig +# The script will: +# - Query each DNS record (api, api-int, *.apps) +# - Verify they resolve to expected VIPs +# - Wait for DNS propagation (up to timeout) +# - Return success if all records verified + +bash plugins/openshift/skills/network-vip-configurator/manage-dns.sh verify \ + --cluster-name "${CLUSTER_NAME}" \ + --base-domain "${BASE_DOMAIN}" \ + --api-vip "${API_VIP}" \ + --ingress-vip "${INGRESS_VIP}" \ + --timeout 60 + +if [ $? -eq 0 ]; then + echo "✓ All DNS records verified successfully" +else + echo "✗ DNS verification failed" + echo "Please check DNS records and wait for propagation" + exit 1 +fi +``` + +**What is verified:** +1. `api.${CLUSTER_NAME}.${BASE_DOMAIN}` resolves to API VIP +2. `api-int.${CLUSTER_NAME}.${BASE_DOMAIN}` resolves to API VIP +3. `test.apps.${CLUSTER_NAME}.${BASE_DOMAIN}` resolves to Ingress VIP (wildcard test) + +**Timeout and Retry:** +- Default timeout: 60 seconds +- Checks every 5 seconds +- Displays progress to user +- If timeout reached, suggest waiting longer or checking DNS provider + +**Error Handling:** +- DNS not resolving: Wait longer or check DNS provider +- Resolves to wrong IP: Verify DNS records are correct +- Wildcard not working: Check DNS provider supports wildcard records + +### Step 7: Return Results + +Compile all information and return as structured data: + +```bash +# Create result JSON +cat > /tmp/network-vip-result.json < [options] + +Manage DNS records for OpenShift cluster VIPs. 
+ +Commands: + create-route53 Create Route53 DNS records (api, api-int, *.apps) + verify Verify DNS records resolve correctly + get-zone-id Get Route53 hosted zone ID for domain + +Options for 'create-route53': + --cluster-name Cluster name (required) + --base-domain Base domain (required) + --api-vip API VIP address (required) + --ingress-vip Ingress VIP address (required) + --zone-id Route53 hosted zone ID (auto-detected if not specified) + --ttl DNS TTL in seconds (default: 300) + +Options for 'verify': + --cluster-name Cluster name (required) + --base-domain Base domain (required) + --api-vip Expected API VIP address (required) + --ingress-vip Expected Ingress VIP address (required) + --timeout Verification timeout (default: 60) + +Options for 'get-zone-id': + --domain Base domain (required) + +Environment variables: + AWS_PROFILE AWS profile to use (optional) + AWS_REGION AWS region (optional) + +Examples: + # Get hosted zone ID + $0 get-zone-id --domain example.com + + # Create Route53 DNS records + $0 create-route53 \\ + --cluster-name mycluster \\ + --base-domain example.com \\ + --api-vip 10.0.0.100 \\ + --ingress-vip 10.0.0.101 + + # Verify DNS records + $0 verify \\ + --cluster-name mycluster \\ + --base-domain example.com \\ + --api-vip 10.0.0.100 \\ + --ingress-vip 10.0.0.101 +EOF +} + +# Logging functions +log_info() { + echo -e "${GREEN}[INFO]${NC} $*" +} + +log_warn() { + echo -e "${YELLOW}[WARN]${NC} $*" >&2 +} + +log_error() { + echo -e "${RED}[ERROR]${NC} $*" >&2 +} + +log_debug() { + echo -e "${BLUE}[DEBUG]${NC} $*" >&2 +} + +# Check if AWS CLI is available +check_aws_cli() { + if ! command -v aws &>/dev/null; then + log_error "AWS CLI is not installed" + log_error "Install it from: https://aws.amazon.com/cli/" + exit 1 + fi + + # Test AWS credentials + if ! aws sts get-caller-identity &>/dev/null; then + log_error "AWS credentials not configured" + log_error "Run: aws configure" + exit 1 + fi +} + +# Get Route53 hosted zone ID for domain +get_zone_id() { + local domain="$1" + + log_info "Looking up hosted zone for domain: $domain" + + # Query Route53 for hosted zone + local zone_id + zone_id=$(aws route53 list-hosted-zones \ + --query "HostedZones[?Name=='${domain}.'].Id" \ + --output text | cut -d'/' -f3) + + if [ -z "$zone_id" ]; then + log_error "No hosted zone found for domain: $domain" + log_info "Available zones:" + aws route53 list-hosted-zones \ + --query "HostedZones[].{Name:Name,ID:Id}" \ + --output table + exit 1 + fi + + echo "$zone_id" +} + +# Create Route53 DNS records +create_route53_records() { + local cluster_name="$1" + local base_domain="$2" + local api_vip="$3" + local ingress_vip="$4" + local zone_id="${5:-}" + local ttl="${6:-300}" + + log_info "Creating Route53 DNS records..." + log_info "Cluster: $cluster_name.$base_domain" + log_info "API VIP: $api_vip" + log_info "Ingress VIP: $ingress_vip" + + # Get zone ID if not provided + if [ -z "$zone_id" ]; then + zone_id=$(get_zone_id "$base_domain") + log_info "Detected hosted zone ID: $zone_id" + fi + + # Create change batch JSON + local change_batch_file="/tmp/route53-changes-$$.json" + trap "rm -f $change_batch_file" EXIT + + cat > "$change_batch_file" <&2 + + # Apply changes + log_info "Applying DNS changes to Route53..." 
+ local change_id + change_id=$(aws route53 change-resource-record-sets \ + --hosted-zone-id "$zone_id" \ + --change-batch "file://$change_batch_file" \ + --query 'ChangeInfo.Id' \ + --output text) + + log_info "✓ DNS records created" + log_info "Change ID: $change_id" + echo "" + log_info "Created records:" + log_info " api.${cluster_name}.${base_domain} → ${api_vip}" + log_info " api-int.${cluster_name}.${base_domain} → ${api_vip}" + log_info " *.apps.${cluster_name}.${base_domain} → ${ingress_vip}" + + # Return zone ID for later use + echo "$zone_id" +} + +# Verify DNS records resolve correctly +verify_dns_records() { + local cluster_name="$1" + local base_domain="$2" + local expected_api_vip="$3" + local expected_ingress_vip="$4" + local timeout="${5:-60}" + + log_info "Verifying DNS records..." + + local api_record="api.${cluster_name}.${base_domain}" + local api_int_record="api-int.${cluster_name}.${base_domain}" + local apps_record="test.apps.${cluster_name}.${base_domain}" + + local all_verified=true + local start_time=$(date +%s) + local end_time=$((start_time + timeout)) + + # Wait for DNS propagation with timeout + while [ $(date +%s) -lt $end_time ]; do + all_verified=true + + # Check api record + log_info "Checking $api_record..." + local api_resolved + api_resolved=$(dig +short "$api_record" | tail -n1) + + if [ "$api_resolved" = "$expected_api_vip" ]; then + log_info " ✓ $api_record → $api_resolved" + else + log_warn " ✗ $api_record → $api_resolved (expected: $expected_api_vip)" + all_verified=false + fi + + # Check api-int record + log_info "Checking $api_int_record..." + local api_int_resolved + api_int_resolved=$(dig +short "$api_int_record" | tail -n1) + + if [ "$api_int_resolved" = "$expected_api_vip" ]; then + log_info " ✓ $api_int_record → $api_int_resolved" + else + log_warn " ✗ $api_int_record → $api_int_resolved (expected: $expected_api_vip)" + all_verified=false + fi + + # Check *.apps wildcard + log_info "Checking $apps_record..." + local apps_resolved + apps_resolved=$(dig +short "$apps_record" | tail -n1) + + if [ "$apps_resolved" = "$expected_ingress_vip" ]; then + log_info " ✓ $apps_record → $apps_resolved" + else + log_warn " ✗ $apps_record → $apps_resolved (expected: $expected_ingress_vip)" + all_verified=false + fi + + if [ "$all_verified" = true ]; then + log_info "✓ All DNS records verified successfully" + return 0 + fi + + # Wait before retrying + local remaining=$((end_time - $(date +%s))) + if [ $remaining -gt 0 ]; then + log_info "DNS not fully propagated, waiting 5 seconds... 
($remaining seconds remaining)" + sleep 5 + fi + done + + # Timeout reached + if [ "$all_verified" = false ]; then + log_error "DNS verification failed after ${timeout}s timeout" + log_error "Some records did not resolve correctly" + return 1 + fi + + return 0 +} + +# Main command dispatcher +main() { + if [ $# -lt 1 ]; then + usage + exit 1 + fi + + local command="$1" + shift + + case "$command" in + get-zone-id) + local domain="" + + while [ $# -gt 0 ]; do + case "$1" in + --domain) + domain="$2" + shift 2 + ;; + *) + log_error "Unknown option: $1" + exit 1 + ;; + esac + done + + if [ -z "$domain" ]; then + log_error "Missing required option: --domain" + exit 1 + fi + + check_aws_cli + get_zone_id "$domain" + ;; + + create-route53) + local cluster_name="" + local base_domain="" + local api_vip="" + local ingress_vip="" + local zone_id="" + local ttl="300" + + while [ $# -gt 0 ]; do + case "$1" in + --cluster-name) + cluster_name="$2" + shift 2 + ;; + --base-domain) + base_domain="$2" + shift 2 + ;; + --api-vip) + api_vip="$2" + shift 2 + ;; + --ingress-vip) + ingress_vip="$2" + shift 2 + ;; + --zone-id) + zone_id="$2" + shift 2 + ;; + --ttl) + ttl="$2" + shift 2 + ;; + *) + log_error "Unknown option: $1" + exit 1 + ;; + esac + done + + if [ -z "$cluster_name" ] || [ -z "$base_domain" ] || [ -z "$api_vip" ] || [ -z "$ingress_vip" ]; then + log_error "Missing required options: --cluster-name, --base-domain, --api-vip, --ingress-vip" + exit 1 + fi + + check_aws_cli + create_route53_records "$cluster_name" "$base_domain" "$api_vip" "$ingress_vip" "$zone_id" "$ttl" + ;; + + verify) + local cluster_name="" + local base_domain="" + local api_vip="" + local ingress_vip="" + local timeout="60" + + while [ $# -gt 0 ]; do + case "$1" in + --cluster-name) + cluster_name="$2" + shift 2 + ;; + --base-domain) + base_domain="$2" + shift 2 + ;; + --api-vip) + api_vip="$2" + shift 2 + ;; + --ingress-vip) + ingress_vip="$2" + shift 2 + ;; + --timeout) + timeout="$2" + shift 2 + ;; + *) + log_error "Unknown option: $1" + exit 1 + ;; + esac + done + + if [ -z "$cluster_name" ] || [ -z "$base_domain" ] || [ -z "$api_vip" ] || [ -z "$ingress_vip" ]; then + log_error "Missing required options: --cluster-name, --base-domain, --api-vip, --ingress-vip" + exit 1 + fi + + if ! command -v dig &>/dev/null; then + log_error "dig command not found (install bind-utils or dnsutils)" + exit 1 + fi + + verify_dns_records "$cluster_name" "$base_domain" "$api_vip" "$ingress_vip" "$timeout" + ;; + + help|--help|-h) + usage + exit 0 + ;; + + *) + log_error "Unknown command: $command" + usage + exit 1 + ;; + esac +} + +main "$@" diff --git a/plugins/openshift/skills/network-vip-configurator/scan-available-ips.py b/plugins/openshift/skills/network-vip-configurator/scan-available-ips.py new file mode 100755 index 0000000..d8d5a5f --- /dev/null +++ b/plugins/openshift/skills/network-vip-configurator/scan-available-ips.py @@ -0,0 +1,306 @@ +#!/usr/bin/env python3 +""" +scan-available-ips.py - Scan subnet for available IP addresses for VIPs + +This script scans a subnet CIDR to find available IP addresses by: +1. Pinging IPs to check if they respond +2. Optionally checking Route53 for existing A records +3. 
Returning a list of available IPs suitable for API and Ingress VIPs +""" + +import argparse +import ipaddress +import json +import subprocess +import sys +import concurrent.futures + + +def parse_cidr(cidr): + """Parse CIDR notation to get network object""" + try: + return ipaddress.ip_network(cidr, strict=False) + except ValueError as e: + raise ValueError(f"Invalid CIDR notation '{cidr}': {e}") + + +def ping_ip(ip, timeout=1, count=2): + """ + Ping an IP address to check if it's in use + + Returns: + bool: True if IP responds to ping, False otherwise + """ + try: + # Use -W for timeout on Linux, -t on macOS + cmd = ['ping', '-c', str(count), '-W', str(timeout), str(ip)] + result = subprocess.run( + cmd, + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, + timeout=timeout * count + 1 + ) + return result.returncode == 0 + except subprocess.TimeoutExpired: + return False + except Exception: + return False + + +def check_route53_record(ip, hosted_zone_id=None): + """ + Check if IP exists in Route53 A records + + Args: + ip: IP address to check + hosted_zone_id: Route53 hosted zone ID (optional) + + Returns: + tuple: (exists: bool, record_name: str or None) + """ + if not hosted_zone_id: + # Skip Route53 check if no zone ID provided + return False, None + + try: + # Query Route53 for A records matching this IP + cmd = [ + 'aws', 'route53', 'list-resource-record-sets', + '--hosted-zone-id', hosted_zone_id, + '--query', f"ResourceRecordSets[?Type=='A' && ResourceRecords[0].Value=='{ip}'].Name", + '--output', 'text' + ] + + result = subprocess.run( + cmd, + stdout=subprocess.PIPE, + stderr=subprocess.DEVNULL, + text=True, + timeout=10 + ) + + if result.returncode == 0 and result.stdout.strip(): + return True, result.stdout.strip() + + return False, None + + except subprocess.TimeoutExpired: + print(f"Warning: Route53 query timeout for {ip}", file=sys.stderr) + return False, None + except Exception as e: + print(f"Warning: Route53 query failed for {ip}: {e}", file=sys.stderr) + return False, None + + +def scan_ip(ip, hosted_zone_id=None, verbose=False): + """ + Scan a single IP to determine if it's available + + Returns: + dict: { + 'ip': str, + 'available': bool, + 'ping_response': bool, + 'in_route53': bool, + 'route53_record': str or None + } + """ + ip_str = str(ip) + + if verbose: + print(f"Scanning {ip_str}...", file=sys.stderr) + + # Check ping + ping_response = ping_ip(ip_str) + + # Check Route53 + in_route53, route53_record = check_route53_record(ip_str, hosted_zone_id) + + # IP is available if it doesn't respond to ping AND is not in Route53 + available = not ping_response and not in_route53 + + return { + 'ip': ip_str, + 'available': available, + 'ping_response': ping_response, + 'in_route53': in_route53, + 'route53_record': route53_record + } + + +def scan_subnet(cidr, hosted_zone_id=None, max_candidates=10, skip_first=10, skip_last=10, max_workers=20, verbose=False): + """ + Scan subnet CIDR for available IP addresses + + Args: + cidr: Subnet CIDR (e.g., "10.0.0.0/24") + hosted_zone_id: Route53 hosted zone ID (optional) + max_candidates: Maximum number of available IPs to return + skip_first: Skip first N IPs in subnet (network, gateway, etc.) + skip_last: Skip last N IPs in subnet (broadcast, etc.) 
+ max_workers: Maximum parallel workers for scanning + verbose: Print progress to stderr + + Returns: + list: List of available IP dictionaries + """ + network = parse_cidr(cidr) + + # Get list of IPs to scan + all_ips = list(network.hosts()) + + if len(all_ips) <= skip_first + skip_last: + raise ValueError(f"Subnet too small: only {len(all_ips)} usable IPs") + + # Skip first and last N IPs + ips_to_scan = all_ips[skip_first:-skip_last] if skip_last > 0 else all_ips[skip_first:] + + if verbose: + print(f"Scanning {len(ips_to_scan)} IPs in subnet {cidr}...", file=sys.stderr) + print(f"Skipped first {skip_first} and last {skip_last} IPs", file=sys.stderr) + + # Scan IPs in parallel + available_ips = [] + + with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor: + futures = { + executor.submit(scan_ip, ip, hosted_zone_id, verbose): ip + for ip in ips_to_scan + } + + for future in concurrent.futures.as_completed(futures): + result = future.result() + + if result['available']: + available_ips.append(result) + + if verbose: + print(f"✓ Found available IP: {result['ip']}", file=sys.stderr) + + # Stop when we have enough candidates + if len(available_ips) >= max_candidates: + # Cancel remaining futures + for f in futures: + f.cancel() + break + + # Sort by IP address + available_ips.sort(key=lambda x: ipaddress.ip_address(x['ip'])) + + return available_ips[:max_candidates] + + +def main(): + parser = argparse.ArgumentParser( + description='Scan subnet for available IP addresses for OpenShift VIPs', + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +Examples: + # Scan subnet for available IPs + %(prog)s 10.0.0.0/24 + + # Scan with Route53 integration + %(prog)s 10.0.0.0/24 --zone-id Z1234567890ABC + + # Get more candidates + %(prog)s 10.0.0.0/24 --max-candidates 20 + + # Verbose output + %(prog)s 10.0.0.0/24 --verbose + +Output format (JSON): + [ + { + "ip": "10.0.0.100", + "available": true, + "ping_response": false, + "in_route53": false, + "route53_record": null + }, + ... 
+ ] + """ + ) + + parser.add_argument( + 'cidr', + help='Subnet CIDR to scan (e.g., 10.0.0.0/24)' + ) + + parser.add_argument( + '--zone-id', + help='Route53 hosted zone ID for DNS checking (optional)' + ) + + parser.add_argument( + '--max-candidates', + type=int, + default=10, + help='Maximum number of available IPs to return (default: 10)' + ) + + parser.add_argument( + '--skip-first', + type=int, + default=10, + help='Skip first N IPs in subnet (default: 10)' + ) + + parser.add_argument( + '--skip-last', + type=int, + default=10, + help='Skip last N IPs in subnet (default: 10)' + ) + + parser.add_argument( + '--max-workers', + type=int, + default=20, + help='Maximum parallel workers (default: 20)' + ) + + parser.add_argument( + '--verbose', '-v', + action='store_true', + help='Print progress to stderr' + ) + + parser.add_argument( + '--pretty', + action='store_true', + help='Pretty-print JSON output' + ) + + args = parser.parse_args() + + try: + # Scan subnet + available_ips = scan_subnet( + args.cidr, + hosted_zone_id=args.zone_id, + max_candidates=args.max_candidates, + skip_first=args.skip_first, + skip_last=args.skip_last, + max_workers=args.max_workers, + verbose=args.verbose + ) + + if args.verbose: + print(f"\n✓ Found {len(available_ips)} available IP(s)", file=sys.stderr) + + # Output JSON + if args.pretty: + print(json.dumps(available_ips, indent=2)) + else: + print(json.dumps(available_ips)) + + return 0 + + except Exception as e: + print(f"Error: {e}", file=sys.stderr) + return 1 + + +if __name__ == '__main__': + sys.exit(main()) diff --git a/plugins/openshift/skills/rhcos-template-manager/README.md b/plugins/openshift/skills/rhcos-template-manager/README.md new file mode 100644 index 0000000..ca6a771 --- /dev/null +++ b/plugins/openshift/skills/rhcos-template-manager/README.md @@ -0,0 +1,431 @@ +--- +name: RHCOS Template Manager +description: Download and manage RHCOS OVA templates for OpenShift vSphere installations +--- + +# RHCOS Template Manager + +Download, cache, and upload RHCOS (Red Hat CoreOS) OVA templates to vSphere for faster OpenShift installations. + +## Why Use Templates? + +Installing OpenShift without a pre-uploaded template requires the installer to upload a ~1GB OVA during installation: +- **Over VPN**: 30-60 minutes per installation +- **On fast network**: 5-10 minutes per installation + +With a pre-uploaded template: +- **First cluster**: 10-30 minutes (one-time upload) +- **Subsequent clusters**: 2-5 minutes (template clone) + +**Total savings**: 25-55 minutes per cluster after the first one! + +## Quick Start + +### Install a Template + +```bash +# Set up vSphere connection +export VSPHERE_SERVER="vcenter.example.com" +export VSPHERE_USERNAME="administrator@vsphere.local" +export VSPHERE_PASSWORD="your-password" + +# Download and upload RHCOS template for OpenShift 4.20 +bash plugins/openshift/skills/rhcos-template-manager/manage-rhcos-template.sh install 4.20 \ + --datacenter DC1 \ + --datastore /DC1/datastore/datastore1 \ + --cluster /DC1/host/Cluster1 +``` + +### Use in install-config.yaml + +```yaml +platform: + vsphere: + failureDomains: + - name: us-east-1 + topology: + datacenter: DC1 + computeCluster: /DC1/host/Cluster1 + datastore: /DC1/datastore/datastore1 + template: /DC1/vm/rhcos-420.94.202501071309-0-template # ← Template path +``` + +## Commands + +### install - Download and Upload in One Step + +Download RHCOS OVA and upload to vSphere as a template. 
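+
+The template path is printed as the last line of output, which makes the
+command easy to script. Because the script's `[INFO]` log lines currently
+share stdout with that path, capturing the last output line is the robust
+approach (a sketch, reusing the values from the example below):
+
+```bash
+# Capture just the template path; log lines also arrive on stdout
+TEMPLATE_PATH=$(bash manage-rhcos-template.sh install 4.20 \
+  --datacenter DC1 \
+  --datastore /DC1/datastore/datastore1 \
+  --cluster /DC1/host/Cluster1 | tail -n1)
+
+echo "template: $TEMPLATE_PATH"  # value for install-config.yaml
+```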
+
+**Usage:**
+```bash
+manage-rhcos-template.sh install <openshift-version> \
+  --datacenter <datacenter> \
+  --datastore <datastore-path> \
+  --cluster <cluster-path>
+```
+
+**Example:**
+```bash
+bash manage-rhcos-template.sh install 4.20 \
+  --datacenter DC1 \
+  --datastore /DC1/datastore/datastore1 \
+  --cluster /DC1/host/Cluster1
+```
+
+**Options:**
+- `--template-name <name>` - Custom template name (default: auto-generated)
+- `--use-govc` - Force use of govc instead of vsphere-helper
+
+---
+
+### download - Download OVA Only
+
+Download RHCOS OVA to local cache without uploading to vSphere.
+
+**Usage:**
+```bash
+manage-rhcos-template.sh download <openshift-version>
+```
+
+**Example:**
+```bash
+bash manage-rhcos-template.sh download 4.20
+# Output: .work/openshift-vsphere-install/ova-cache/rhcos-vmware.x86_64.ova
+```
+
+---
+
+### upload - Upload Existing OVA
+
+Upload a previously downloaded OVA to vSphere.
+
+**Usage:**
+```bash
+manage-rhcos-template.sh upload <ova-path> \
+  --datacenter <datacenter> \
+  --datastore <datastore-path> \
+  --cluster <cluster-path>
+```
+
+**Example:**
+```bash
+bash manage-rhcos-template.sh upload /path/to/rhcos.ova \
+  --datacenter DC1 \
+  --datastore /DC1/datastore/ds1 \
+  --cluster /DC1/host/Cluster1
+```
+
+---
+
+### list - List Cached OVAs
+
+Show all cached OVA files and their sizes.
+
+**Usage:**
+```bash
+manage-rhcos-template.sh list
+```
+
+**Example Output:**
+```
+Cached OVA files in .work/openshift-vsphere-install/ova-cache:
+
+  rhcos-4.20.ova (1.2G) - RHCOS 420.94.202501071309-0
+  rhcos-4.19.ova (1.1G) - RHCOS 419.92.202412345678-0
+
+Total cache size: 2.3G
+```
+
+---
+
+### clean - Remove Cached OVAs
+
+Remove all cached OVA files to free up disk space.
+
+**Usage:**
+```bash
+manage-rhcos-template.sh clean
+```
+
+## Environment Variables
+
+| Variable | Description | Required | Default |
+|----------|-------------|----------|---------|
+| `VSPHERE_SERVER` | vCenter server hostname | Yes | - |
+| `VSPHERE_USERNAME` | vCenter username | Yes | - |
+| `VSPHERE_PASSWORD` | vCenter password | Yes | - |
+| `VSPHERE_INSECURE` | Skip SSL verification | No | `false` |
+| `CACHE_DIR` | OVA cache directory | No | `.work/openshift-vsphere-install/ova-cache` |
+
+## How It Works
+
+### Download Process
+
+1. **Fetch Metadata**: Query OpenShift installer repository for RHCOS version and OVA URL
+2. **Check Cache**: Look for existing OVA in cache directory
+3. **Download**: Download OVA from official mirror (if not cached)
+4. **Verify**: Validate SHA256 checksum
+5. **Cache**: Store OVA for reuse
+
+### Upload Process
+
+1. **Check Existence**: Verify if template already exists in vSphere
+2. **Import OVA**: Upload OVA to vSphere as a VM using `govc`
+3. **Verify**: Confirm template was created successfully
+4. 
**Return Path**: Output template path for use in install-config.yaml + +### Caching Behavior + +- OVAs are cached in `.work/openshift-vsphere-install/ova-cache/` +- Checksums are verified before reuse +- Corrupted files are automatically re-downloaded +- Cache persists across installations +- Use `clean` command to remove cached files + +## Version Mapping + +The skill automatically maps OpenShift versions to RHCOS versions: + +| OpenShift Version | RHCOS Branch | OVA Source | +|-------------------|--------------|------------| +| 4.20.x | release-4.20 | `openshift/installer` GitHub | +| 4.19.x | release-4.19 | `openshift/installer` GitHub | +| 4.18.x | release-4.18 | `openshift/installer` GitHub | + +## Prerequisites + +### Required Tools + +- **Python 3** - For metadata fetching +- **curl** - For downloading OVAs +- **govc** - For uploading to vSphere + +Install govc if missing: +```bash +bash plugins/openshift/scripts/install-govc.sh +``` + +### vSphere Requirements + +- vCenter 7.x or 8.x +- User with permissions to: + - Import OVAs + - Create VMs + - Access datastores + +### Network Access + +- Internet connection to download OVAs +- Access to vCenter server +- GitHub access for metadata + +## SSL Certificates + +For secure connections (recommended), install vCenter certificates: + +```bash +bash plugins/openshift/scripts/install-vcenter-certs.sh vcenter.example.com +``` + +Or use insecure connection (not recommended): +```bash +export VSPHERE_INSECURE=true +``` + +## Examples + +### Example 1: Basic Installation + +```bash +# Set credentials +export VSPHERE_SERVER="vcenter.example.com" +export VSPHERE_USERNAME="admin@vsphere.local" +export VSPHERE_PASSWORD="password" + +# Install template +bash manage-rhcos-template.sh install 4.20 \ + --datacenter DC1 \ + --datastore /DC1/datastore/ds1 \ + --cluster /DC1/host/Cluster1 + +# Output: +# [INFO] Fetching RHCOS metadata for OpenShift 4.20... +# [INFO] RHCOS version: 420.94.202501071309-0 +# [INFO] Downloading RHCOS OVA (this may take several minutes)... +# [INFO] ✓ Download complete +# [INFO] ✓ Checksum verified +# [INFO] Uploading OVA to vSphere (this may take 10-30 minutes)... 
+# [INFO] ✓ OVA import complete +# [INFO] ✓ Template created successfully: /DC1/vm/rhcos-420.94.202501071309-0-template +# +# /DC1/vm/rhcos-420.94.202501071309-0-template +``` + +### Example 2: Download Then Upload + +```bash +# Download OVA (can be done offline or before vSphere access) +OVA_PATH=$(bash manage-rhcos-template.sh download 4.20) +echo "Downloaded: $OVA_PATH" + +# Later, upload to vSphere +bash manage-rhcos-template.sh upload "$OVA_PATH" \ + --datacenter DC1 \ + --datastore /DC1/datastore/ds1 \ + --cluster /DC1/host/Cluster1 +``` + +### Example 3: Custom Template Name + +```bash +bash manage-rhcos-template.sh install 4.20 \ + --datacenter DC1 \ + --datastore /DC1/datastore/ds1 \ + --cluster /DC1/host/Cluster1 \ + --template-name my-custom-rhcos-template +``` + +### Example 4: Cache Management + +```bash +# List cached OVAs +bash manage-rhcos-template.sh list + +# Clean cache to free space +bash manage-rhcos-template.sh clean +``` + +## Troubleshooting + +### Download Fails + +**Problem**: `Error: failed to fetch RHCOS metadata` + +**Solution**: +- Check internet connection +- Verify OpenShift version exists +- Check GitHub access + +--- + +### Upload Fails + +**Problem**: `Error: failed to connect to vSphere` + +**Solution**: +- Verify vCenter credentials +- Install certificates or use `VSPHERE_INSECURE=true` +- Check network access to vCenter + +--- + +### Template Already Exists + +**Behavior**: Script detects existing template and skips upload + +**Output**: +``` +[INFO] Template already exists: rhcos-420.94.202501071309-0-template +[INFO] Skipping upload +``` + +This is normal and saves time. + +--- + +### Checksum Mismatch + +**Problem**: `Error: Checksum verification failed!` + +**Solution**: Script automatically re-downloads. If persistent: +- Check disk space +- Verify network stability +- Report issue if mirror is corrupted + +--- + +### Permission Denied + +**Problem**: `Error: insufficient privileges` + +**Solution**: +- Verify user has permission to import OVAs +- Check datastore access +- Verify resource pool permissions + +## Performance + +### Network Impact + +| Connection | Download Time | Upload Time | Total | +|------------|---------------|-------------|-------| +| Fast (100+ Mbps) | 1-3 min | 5-10 min | 6-13 min | +| VPN (~10 Mbps) | 10-20 min | 15-30 min | 25-50 min | +| Slow (<5 Mbps) | 20-40 min | 30-60 min | 50-100 min | + +### Disk Space + +- Each OVA: ~1-1.5 GB +- Recommended: 10+ GB free space for cache + +### Time Savings + +| Scenario | Without Template | With Template | Savings | +|----------|------------------|---------------|---------| +| First cluster | 30-60 min | 25-50 min | 5-10 min | +| Second cluster | 30-60 min | 2-5 min | 25-55 min | +| Ten clusters | 300-600 min | 50-75 min | 250-525 min | + +## Integration + +### With `/openshift:install-vsphere` + +This skill is automatically invoked by the install-vsphere command when the user chooses to pre-upload the template. + +### With Custom Workflows + +```bash +#!/bin/bash +# Custom cluster creation script + +# 1. Install template once +TEMPLATE=$(bash manage-rhcos-template.sh install 4.20 \ + --datacenter DC1 \ + --datastore /DC1/datastore/ds1 \ + --cluster /DC1/host/Cluster1) + +# 2. 
Create multiple clusters using the same template +for cluster in cluster1 cluster2 cluster3; do + # Generate install-config.yaml with template path + cat > install-config.yaml </dev/null | head -1) + +if [ -n "$EXISTING_TEMPLATE" ]; then + echo "✓ Found existing template: $EXISTING_TEMPLATE" + TEMPLATE_PATH="$EXISTING_TEMPLATE" + echo "Using existing template instead of uploading" +else + echo "No existing template found for RHCOS version $RHCOS_VERSION" + TEMPLATE_PATH="" +fi +``` + +**What this checks:** +1. Searches for VMs or templates with the RHCOS version in the name +2. Searches in the entire datacenter (both vm and template folders) +3. If found, uses the existing template path +4. If not found, template path is empty (skip upload or proceed to Step 7) + +**Why check first:** +- Avoids duplicate uploads (saves 10-30 minutes) +- Reuses existing templates from previous installations +- Prevents wasting datastore space + +### Step 7: Upload OVA to vSphere (If Not Exists) + +If no existing template was found in Step 6, upload the OVA: + +```bash +if [ -z "$TEMPLATE_PATH" ]; then + echo "No existing template found. Uploading OVA to vSphere..." + + # Upload OVA to vSphere + # This will: + # - Import OVA as a VM + # - Verify template creation + # - Return template path + + TEMPLATE_PATH=$(bash plugins/openshift/skills/rhcos-template-manager/manage-rhcos-template.sh upload \ + "$OVA_PATH" \ + --datacenter "$DATACENTER_NAME" \ + --datastore "$DATASTORE_PATH" \ + --cluster "$CLUSTER_PATH") + + echo "Template created at: $TEMPLATE_PATH" +else + echo "Skipping upload - using existing template" +fi +``` + +**What happens:** +1. Uses `govc import.ova` to upload +2. Shows progress during upload (can take 10-30 minutes over VPN) +3. Verifies template was created successfully +4. 
Returns full template path for use in install-config.yaml + +**Template Naming:** +- Auto-generated based on RHCOS version: `rhcos-{version}-template` +- Example: `rhcos-420.94.202501071309-0-template` +- Can be customized with `--template-name` option if needed + +**Error Handling:** +- Connection errors: Verify vSphere credentials and network +- Permission errors: User may not have rights to import OVAs +- Storage errors: Datastore may be full or inaccessible +- Timeout errors: Upload may timeout over very slow connections (increase govc timeout if needed) + +### Step 8: Update install-config.yaml with Template Path + +If a template path was found or created, add it to the install-config.yaml: + +```bash +if [ -n "$TEMPLATE_PATH" ]; then + echo "Template path to use in install-config.yaml: $TEMPLATE_PATH" + + # The template path should be added to the failureDomains topology section: + # platform: + # vsphere: + # failureDomains: + # - topology: + # template: $TEMPLATE_PATH +else + echo "No template available - installer will upload OVA during installation" +fi +``` + +**Important:** +- If TEMPLATE_PATH is set, include the `template:` field in install-config.yaml +- If TEMPLATE_PATH is empty, omit the `template:` field entirely +- The installer will automatically upload the OVA if no template is specified + +### Step 9: Return Results + +Compile all information and return as structured data: + +```bash +# Create result JSON +cat > /tmp/rhcos-template-result.json <` +- Or use insecure connection: `export VSPHERE_INSECURE=true` + +**Permission Denied** +``` +Error: insufficient privileges to perform operation +``` + +**Solution:** +- Verify user has permission to: + - Import OVAs + - Create VMs in the datacenter + - Access the datastore + - Use the resource pool + +**Datastore Full** +``` +Error: no space left on device +``` + +**Solution:** +- Free up space on the datastore +- Choose a different datastore with more capacity +- Delete old templates/VMs + +**Upload Timeout** +``` +Error: context deadline exceeded +``` + +**Solution:** +- Upload may be taking too long over slow connection +- Increase govc timeout: `export GOVC_OPERATION_TIMEOUT=3600` (1 hour) +- Consider uploading OVA manually first, then skip this step + +## Performance Optimization + +### With Template (Recommended) +``` +OVA download: 5-30 minutes (one-time, cached) +OVA upload: 10-30 minutes (one-time) +Installation: 2-5 minutes (template clone) + +Total first cluster: 15-60 minutes +Total subsequent: 2-5 minutes +``` + +### Without Template +``` +Installation: 30-60 minutes (OVA upload per cluster) + +Total per cluster: 30-60 minutes +``` + +**Savings:** 25-55 minutes per cluster after the first one! + +## Integration with install-vsphere + +The template path returned by this skill should be used in the install-config.yaml: + +```yaml +platform: + vsphere: + failureDomains: + - name: us-east-1 + topology: + template: /DC1/vm/rhcos-420.94.202501071309-0-template # ← Use template path here +``` + +If the template field is omitted, the installer will upload the OVA during installation. + +## Example Workflow + +```bash +# 1. Set up vSphere connection +export VSPHERE_SERVER="vcenter.example.com" +export VSPHERE_USERNAME="administrator@vsphere.local" +export VSPHERE_PASSWORD="mypassword" +export VSPHERE_INSECURE="false" + +# 2. 
Download and upload in one step +TEMPLATE_PATH=$(bash plugins/openshift/skills/rhcos-template-manager/manage-rhcos-template.sh install 4.20 \ + --datacenter DC1 \ + --datastore /DC1/datastore/datastore1 \ + --cluster /DC1/host/Cluster1) + +# 3. Template is ready for use +echo "Template: $TEMPLATE_PATH" +# Output: Template: /DC1/vm/rhcos-420.94.202501071309-0-template + +# 4. Use in install-config.yaml +# template: /DC1/vm/rhcos-420.94.202501071309-0-template +``` + +## Notes + +- **Caching**: OVA files are cached in `.work/openshift-vsphere-install/ova-cache/` and reused +- **Security**: Never log vCenter passwords +- **Compatibility**: Works with vCenter 7.x and 8.x +- **Reusability**: One template can be used for multiple cluster installations +- **Cleanup**: Use `manage-rhcos-template.sh clean` to remove cached OVAs when no longer needed +- **Disk Space**: Each OVA is approximately 1-1.5GB + +## Benefits + +1. **Speed**: 5-10x faster installation after first cluster +2. **Bandwidth**: Save bandwidth on repeated installations +3. **Reliability**: Cached OVAs reduce dependency on external mirrors during installation +4. **Flexibility**: Can pre-stage templates before cluster creation +5. **Reusability**: One template serves multiple clusters diff --git a/plugins/openshift/skills/rhcos-template-manager/check-and-configure-template.sh b/plugins/openshift/skills/rhcos-template-manager/check-and-configure-template.sh new file mode 100644 index 0000000..6630c7d --- /dev/null +++ b/plugins/openshift/skills/rhcos-template-manager/check-and-configure-template.sh @@ -0,0 +1,108 @@ +#!/bin/bash +# check-and-configure-template.sh - Check vSphere for existing RHCOS templates and configure for installation +# +# This script implements the workflow for: +# 1. Fetching RHCOS metadata from the OpenShift installer repo (branch-based) +# 2. Extracting the RHCOS version from the OVA filename +# 3. Checking vSphere for existing templates with that version +# 4. Returning the template path if found, or indicating upload is needed +# +# Usage: +# bash check-and-configure-template.sh +# +# Example: +# bash check-and-configure-template.sh 4.20 cidatacenter +# +# Output: +# - RHCOS version +# - Template path (if exists) or empty (if needs upload) +# - Exit code 0 if template found, 1 if not found + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +# Colors for output +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +NC='\033[0m' # No Color + +# Check arguments +if [ $# -lt 2 ]; then + echo "Usage: $0 " + echo "Example: $0 4.20 cidatacenter" + exit 1 +fi + +OPENSHIFT_VERSION="$1" +DATACENTER="$2" + +# Ensure vCenter credentials are sourced +if [ -z "${GOVC_URL:-}" ]; then + echo "Error: vCenter credentials not set. Please source .work/.vcenter-env first." + exit 1 +fi + +echo "=== RHCOS Template Discovery ===" +echo "" + +# Step 1: Fetch RHCOS metadata from installer repo +# The metadata is stored in the installer repo on a per-version branch: +# - OpenShift 4.20 → branch release-4.20 +# - OpenShift 4.19 → branch release-4.19 +# This ensures we get the correct RHCOS version for the OpenShift release +echo "Fetching RHCOS metadata for OpenShift $OPENSHIFT_VERSION..." +METADATA=$(python3 "$SCRIPT_DIR/fetch-rhcos-metadata.py" "$OPENSHIFT_VERSION" 2>/dev/null) + +if [ $? 
-ne 0 ]; then + echo "Error: Failed to fetch RHCOS metadata for version $OPENSHIFT_VERSION" + exit 1 +fi + +# Step 2: Extract RHCOS version from OVA filename +# The OVA URL contains the RHCOS version in the filename +# Example: rhcos-9.6.20251015-1-vmware.x86_64.ova → version is 9.6.20251015-1 +OVA_URL=$(echo "$METADATA" | python3 -c "import sys, json; print(json.load(sys.stdin)['url'])") +OVA_FILENAME=$(basename "$OVA_URL") +RHCOS_VERSION=$(echo "$OVA_FILENAME" | sed 's/rhcos-//; s/-vmware.x86_64.ova//') + +echo -e "${GREEN}✓${NC} RHCOS Version: $RHCOS_VERSION" +echo " OVA Filename: $OVA_FILENAME" +echo "" + +# Step 3: Check vSphere for existing templates with this version +# Search for any VM or template that contains the RHCOS version in its name +# This avoids re-uploading templates that already exist from previous installations +echo "Checking vSphere for existing templates in datacenter: $DATACENTER" +echo "Searching for templates with version: $RHCOS_VERSION" + +EXISTING_TEMPLATE=$(govc find "/$DATACENTER" -type m -name "*${RHCOS_VERSION}*" 2>/dev/null | head -1 || true) + +# Step 4: Return results +echo "" +if [ -n "$EXISTING_TEMPLATE" ]; then + echo -e "${GREEN}✓ Found existing template:${NC} $EXISTING_TEMPLATE" + echo "" + echo "This template can be used in install-config.yaml:" + echo " platform:" + echo " vsphere:" + echo " failureDomains:" + echo " - topology:" + echo " template: $EXISTING_TEMPLATE" + echo "" + echo "$EXISTING_TEMPLATE" + exit 0 +else + echo -e "${YELLOW}⚠ No existing template found${NC}" + echo "" + echo "You can either:" + echo "1. Upload the OVA manually using:" + echo " bash $SCRIPT_DIR/manage-rhcos-template.sh install $OPENSHIFT_VERSION \\" + echo " --datacenter $DATACENTER \\" + echo " --datastore /path/to/datastore \\" + echo " --cluster /path/to/cluster" + echo "" + echo "2. Skip template and let installer upload OVA (slower)" + echo "" + exit 1 +fi diff --git a/plugins/openshift/skills/rhcos-template-manager/fetch-rhcos-metadata.py b/plugins/openshift/skills/rhcos-template-manager/fetch-rhcos-metadata.py new file mode 100755 index 0000000..6f1297c --- /dev/null +++ b/plugins/openshift/skills/rhcos-template-manager/fetch-rhcos-metadata.py @@ -0,0 +1,148 @@ +#!/usr/bin/env python3 +""" +fetch-rhcos-metadata.py - Fetch RHCOS metadata from OpenShift installer repository + +This script fetches the rhcos.json metadata file from the openshift/installer +GitHub repository for a specific OpenShift version and extracts the OVA download URL. 
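+
+The OVA URL and checksum are read from the metadata's
+architectures.x86_64.artifacts.vmware.formats.ova.disk entry;
+see extract_ova_info() below for the exact traversal.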
+""" + +import argparse +import json +import sys +import urllib.request +import urllib.error + + +def parse_version(version): + """Parse OpenShift version to extract major.minor""" + # Remove 'latest-' or 'stable-' prefix if present + version = version.replace('latest-', '').replace('stable-', '').replace('fast-', '').replace('candidate-', '') + + # Extract major.minor from version like "4.20.1" -> "4.20" + parts = version.split('.') + if len(parts) >= 2: + return f"{parts[0]}.{parts[1]}" + + return version + + +def fetch_rhcos_metadata(version): + """ + Fetch RHCOS metadata from openshift/installer GitHub repository + + Args: + version: OpenShift version (e.g., "4.20", "4.19", "latest-4.20") + + Returns: + dict: RHCOS metadata + + Raises: + Exception: If metadata cannot be fetched + """ + major_minor = parse_version(version) + branch = f"release-{major_minor}" + url = f"https://raw.githubusercontent.com/openshift/installer/refs/heads/{branch}/data/data/coreos/rhcos.json" + + print(f"Fetching RHCOS metadata for OpenShift {major_minor}", file=sys.stderr) + print(f"URL: {url}", file=sys.stderr) + + try: + with urllib.request.urlopen(url) as response: + data = response.read() + return json.loads(data) + except urllib.error.HTTPError as e: + if e.code == 404: + raise Exception(f"RHCOS metadata not found for version {major_minor}. Branch '{branch}' may not exist yet.") + raise Exception(f"HTTP error {e.code}: {e.reason}") + except urllib.error.URLError as e: + raise Exception(f"Network error: {e.reason}") + except json.JSONDecodeError as e: + raise Exception(f"Invalid JSON in RHCOS metadata: {e}") + + +def extract_ova_info(metadata): + """ + Extract OVA download URL and related information from RHCOS metadata + + Args: + metadata: RHCOS metadata dictionary + + Returns: + dict: OVA information including URL, SHA256, version + """ + try: + # Navigate to the OVA information (x86_64 architecture) + vmware_artifacts = metadata['architectures']['x86_64']['artifacts']['vmware'] + ova_format = vmware_artifacts['formats']['ova'] + disk_info = ova_format['disk'] + + return { + 'url': disk_info['location'], + 'sha256': disk_info.get('sha256', ''), + 'uncompressed_sha256': disk_info.get('uncompressed-sha256', ''), + 'rhcos_version': metadata.get('oscontainer', {}).get('version', 'unknown'), + 'openshift_version': metadata.get('buildid', 'unknown') + } + except KeyError as e: + raise Exception(f"Missing key in RHCOS metadata: {e}") + + +def main(): + parser = argparse.ArgumentParser( + description='Fetch RHCOS OVA metadata from openshift/installer repository', + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +Examples: + # Fetch metadata for OpenShift 4.20 + %(prog)s 4.20 + + # Fetch metadata for OpenShift 4.19 (works with various formats) + %(prog)s latest-4.19 + %(prog)s stable-4.19 + %(prog)s 4.19.0 + +Output format (JSON): + { + "url": "https://...", + "sha256": "...", + "rhcos_version": "420.94.202501071309-0", + "openshift_version": "..." 
+  }
+    """
+    )
+
+    parser.add_argument(
+        'version',
+        help='OpenShift version (e.g., 4.20, latest-4.19, stable-4.18)'
+    )
+
+    parser.add_argument(
+        '--pretty',
+        action='store_true',
+        help='Pretty-print JSON output'
+    )
+
+    args = parser.parse_args()
+
+    try:
+        # Fetch metadata
+        metadata = fetch_rhcos_metadata(args.version)
+
+        # Extract OVA info
+        ova_info = extract_ova_info(metadata)
+
+        # Output JSON
+        if args.pretty:
+            print(json.dumps(ova_info, indent=2))
+        else:
+            print(json.dumps(ova_info))
+
+        return 0
+
+    except Exception as e:
+        print(f"Error: {e}", file=sys.stderr)
+        return 1
+
+
+if __name__ == '__main__':
+    sys.exit(main())
diff --git a/plugins/openshift/skills/rhcos-template-manager/manage-rhcos-template.sh b/plugins/openshift/skills/rhcos-template-manager/manage-rhcos-template.sh
new file mode 100755
index 0000000..7afb620
--- /dev/null
+++ b/plugins/openshift/skills/rhcos-template-manager/manage-rhcos-template.sh
@@ -0,0 +1,499 @@
+#!/usr/bin/env bash
+# manage-rhcos-template.sh - Download and upload RHCOS OVA template to vSphere
+# This script handles the complete workflow for RHCOS template management
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+CACHE_DIR="${CACHE_DIR:-.work/openshift-vsphere-install/ova-cache}"
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+NC='\033[0m' # No Color
+
+# Usage function
+usage() {
+    cat <<EOF
+Usage: $0 <command> [options]
+
+Manage RHCOS OVA templates for OpenShift vSphere installations.
+
+Commands:
+  download <openshift-version>   Download RHCOS OVA for OpenShift version
+  upload <ova-path>              Upload OVA to vSphere as template
+  install <openshift-version>    Download and upload in one step
+  list                           List cached OVA files
+  clean                          Remove cached OVA files
+
+Options for 'upload' and 'install':
+  --datacenter <name>       vSphere datacenter (required)
+  --datastore <path>        vSphere datastore path (required)
+  --cluster <path>          vSphere cluster path (required)
+  --template-name <name>    Template name (auto-generated if not specified)
+  --use-govc                Use govc instead of vsphere-helper
+
+Environment variables:
+  VSPHERE_SERVER      vCenter server (e.g., vcenter.example.com)
+  VSPHERE_USERNAME    vCenter username
+  VSPHERE_PASSWORD    vCenter password
+  VSPHERE_INSECURE    Skip SSL verification (default: false)
+  CACHE_DIR           OVA cache directory (default: .work/openshift-vsphere-install/ova-cache)
+
+Examples:
+  # Download OVA for OpenShift 4.20
+  $0 download 4.20
+
+  # Upload OVA to vSphere
+  $0 upload rhcos-4.20.ova --datacenter DC1 --datastore /DC1/datastore/ds1 --cluster /DC1/host/Cluster1
+
+  # Download and upload in one step
+  $0 install 4.20 --datacenter DC1 --datastore /DC1/datastore/ds1 --cluster /DC1/host/Cluster1
+
+  # List cached OVAs
+  $0 list
+
+  # Clean cache
+  $0 clean
+EOF
+}
+
+# Logging functions
+log_info() {
+    echo -e "${GREEN}[INFO]${NC} $*"
+}
+
+log_warn() {
+    echo -e "${YELLOW}[WARN]${NC} $*" >&2
+}
+
+log_error() {
+    echo -e "${RED}[ERROR]${NC} $*" >&2
+}
+
+# Check if required tools are available
+check_prerequisites() {
+    local required_tools=("python3" "curl")
+
+    for tool in "${required_tools[@]}"; do
+        if ! command -v "$tool" &>/dev/null; then
+            log_error "$tool is required but not installed"
+            exit 1
+        fi
+    done
+}
+
+# Download RHCOS OVA
+download_ova() {
+    local version="$1"
+
+    log_info "Fetching RHCOS metadata for OpenShift $version..."
+
+    # Fetch metadata using Python script
+    local metadata
+    if ! 
metadata=$(python3 "$SCRIPT_DIR/fetch-rhcos-metadata.py" "$version"); then + log_error "Failed to fetch RHCOS metadata" + exit 1 + fi + + # Parse JSON + local ova_url + local rhcos_version + local sha256 + ova_url=$(echo "$metadata" | python3 -c "import sys, json; print(json.load(sys.stdin)['url'])") + rhcos_version=$(echo "$metadata" | python3 -c "import sys, json; print(json.load(sys.stdin)['rhcos_version'])") + sha256=$(echo "$metadata" | python3 -c "import sys, json; print(json.load(sys.stdin)['sha256'])") + + log_info "RHCOS version: $rhcos_version" + log_info "OVA URL: $ova_url" + + # Extract filename from URL + local ova_filename + ova_filename=$(basename "$ova_url") + + # Create cache directory + mkdir -p "$CACHE_DIR" + + local ova_path="$CACHE_DIR/$ova_filename" + + # Check if already downloaded + if [ -f "$ova_path" ]; then + log_info "OVA already cached: $ova_path" + + # Verify checksum if available + if [ -n "$sha256" ]; then + log_info "Verifying SHA256 checksum..." + if command -v sha256sum &>/dev/null; then + local computed_sha + computed_sha=$(sha256sum "$ova_path" | awk '{print $1}') + if [ "$computed_sha" = "$sha256" ]; then + log_info "✓ Checksum verified" + else + log_warn "Checksum mismatch! Re-downloading..." + rm -f "$ova_path" + fi + fi + fi + fi + + # Download if not cached or checksum failed + if [ ! -f "$ova_path" ]; then + log_info "Downloading RHCOS OVA (this may take several minutes)..." + log_info "Downloading to: $ova_path" + + if ! curl -L --fail --progress-bar "$ova_url" -o "$ova_path"; then + log_error "Download failed" + rm -f "$ova_path" + exit 1 + fi + + log_info "✓ Download complete" + + # Verify checksum after download + if [ -n "$sha256" ] && command -v sha256sum &>/dev/null; then + log_info "Verifying SHA256 checksum..." + local computed_sha + computed_sha=$(sha256sum "$ova_path" | awk '{print $1}') + if [ "$computed_sha" = "$sha256" ]; then + log_info "✓ Checksum verified" + else + log_error "Checksum verification failed!" + log_error "Expected: $sha256" + log_error "Got: $computed_sha" + exit 1 + fi + fi + fi + + # Output the OVA path and RHCOS version for use by caller + echo "$ova_path" + echo "$rhcos_version" > "$CACHE_DIR/.rhcos_version_${ova_filename}" +} + +# Upload OVA to vSphere +upload_ova() { + local ova_path="$1" + local datacenter="$2" + local datastore="$3" + local cluster="$4" + local template_name="${5:-}" + local use_govc="${6:-false}" + + # Verify OVA file exists + if [ ! 
-f "$ova_path" ]; then + log_error "OVA file not found: $ova_path" + exit 1 + fi + + # Generate template name if not provided + if [ -z "$template_name" ]; then + local ova_filename + ova_filename=$(basename "$ova_path" .ova) + + # Try to get RHCOS version from cache + local version_file="$CACHE_DIR/.rhcos_version_$(basename "$ova_path")" + if [ -f "$version_file" ]; then + local rhcos_version + rhcos_version=$(cat "$version_file") + template_name="rhcos-${rhcos_version}-template" + else + template_name="${ova_filename}-template" + fi + fi + + log_info "Template name: $template_name" + + # Check if vsphere-helper is available (preferred) + local use_vsphere_helper=false + if [ "$use_govc" = "false" ]; then + if command -v vsphere-helper &>/dev/null || [ -f "plugins/openshift/skills/vsphere-discovery/vsphere-helper" ]; then + use_vsphere_helper=true + log_info "Using vsphere-helper for upload" + else + log_info "vsphere-helper not found, falling back to govc" + fi + fi + + # Check vSphere connection environment variables + if [ -z "${VSPHERE_SERVER:-}" ] || [ -z "${VSPHERE_USERNAME:-}" ] || [ -z "${VSPHERE_PASSWORD:-}" ]; then + log_error "vSphere connection environment variables not set" + log_error "Required: VSPHERE_SERVER, VSPHERE_USERNAME, VSPHERE_PASSWORD" + exit 1 + fi + + # Extract datastore name from path + local datastore_name + datastore_name=$(basename "$datastore") + + # Check if template already exists + log_info "Checking if template already exists..." + + if [ "$use_vsphere_helper" = "true" ]; then + # TODO: Add template check to vsphere-helper + log_info "Skipping existence check (vsphere-helper doesn't support template queries yet)" + else + # Use govc to check + export GOVC_URL="https://${VSPHERE_SERVER}/sdk" + export GOVC_USERNAME="$VSPHERE_USERNAME" + export GOVC_PASSWORD="$VSPHERE_PASSWORD" + export GOVC_INSECURE="${VSPHERE_INSECURE:-false}" + + if govc vm.info "/${datacenter}/vm/${template_name}" &>/dev/null; then + log_info "Template already exists: $template_name" + log_info "Skipping upload" + return 0 + fi + fi + + # Upload OVA + log_info "Uploading OVA to vSphere (this may take 10-30 minutes)..." + + if [ "$use_vsphere_helper" = "true" ]; then + # TODO: Extend vsphere-helper to support OVA import + log_error "vsphere-helper OVA import not yet implemented, falling back to govc" + use_vsphere_helper=false + fi + + # Use govc for upload + export GOVC_URL="https://${VSPHERE_SERVER}/sdk" + export GOVC_USERNAME="$VSPHERE_USERNAME" + export GOVC_PASSWORD="$VSPHERE_PASSWORD" + export GOVC_INSECURE="${VSPHERE_INSECURE:-false}" + + if ! command -v govc &>/dev/null; then + log_error "govc is required for OVA upload but not installed" + log_info "Install using: bash plugins/openshift/scripts/install-govc.sh" + exit 1 + fi + + log_info "Importing OVA as VM..." + if ! govc import.ova \ + -dc="$datacenter" \ + -ds="$datastore_name" \ + -pool="${cluster}/Resources" \ + -name="$template_name" \ + "$ova_path"; then + log_error "OVA import failed" + exit 1 + fi + + log_info "✓ OVA import complete" + + # Verify template was created + log_info "Verifying template..." + if govc vm.info "/${datacenter}/vm/${template_name}" &>/dev/null; then + log_info "✓ Template created successfully: /${datacenter}/vm/${template_name}" + else + log_error "Template verification failed" + exit 1 + fi + + echo "/${datacenter}/vm/${template_name}" +} + +# List cached OVA files +list_cached() { + if [ ! 
-d "$CACHE_DIR" ] || [ -z "$(ls -A "$CACHE_DIR" 2>/dev/null)" ]; then + log_info "No cached OVA files found" + return 0 + fi + + log_info "Cached OVA files in $CACHE_DIR:" + echo "" + + local total_size=0 + for ova in "$CACHE_DIR"/*.ova; do + if [ -f "$ova" ]; then + local size + size=$(du -h "$ova" | cut -f1) + local name + name=$(basename "$ova") + + # Try to get RHCOS version + local version_file="$CACHE_DIR/.rhcos_version_$(basename "$ova")" + if [ -f "$version_file" ]; then + local rhcos_version + rhcos_version=$(cat "$version_file") + echo " $name ($size) - RHCOS $rhcos_version" + else + echo " $name ($size)" + fi + + total_size=$((total_size + $(stat -c%s "$ova" 2>/dev/null || stat -f%z "$ova"))) + fi + done + + echo "" + log_info "Total cache size: $(numfmt --to=iec-i --suffix=B $total_size 2>/dev/null || echo "$total_size bytes")" +} + +# Clean cached OVA files +clean_cache() { + if [ ! -d "$CACHE_DIR" ] || [ -z "$(ls -A "$CACHE_DIR" 2>/dev/null)" ]; then + log_info "Cache is already empty" + return 0 + fi + + read -p "Remove all cached OVA files from $CACHE_DIR? [y/N]: " -n 1 -r + echo + if [[ $REPLY =~ ^[Yy]$ ]]; then + rm -rf "$CACHE_DIR" + log_info "✓ Cache cleaned" + else + log_info "Cancelled" + fi +} + +# Main command dispatcher +main() { + if [ $# -lt 1 ]; then + usage + exit 1 + fi + + check_prerequisites + + local command="$1" + shift + + case "$command" in + download) + if [ $# -lt 1 ]; then + log_error "Usage: $0 download " + exit 1 + fi + download_ova "$1" + ;; + + upload) + if [ $# -lt 1 ]; then + log_error "Usage: $0 upload --datacenter --datastore --cluster " + exit 1 + fi + + local ova_path="$1" + shift + + local datacenter="" + local datastore="" + local cluster="" + local template_name="" + local use_govc="false" + + while [ $# -gt 0 ]; do + case "$1" in + --datacenter) + datacenter="$2" + shift 2 + ;; + --datastore) + datastore="$2" + shift 2 + ;; + --cluster) + cluster="$2" + shift 2 + ;; + --template-name) + template_name="$2" + shift 2 + ;; + --use-govc) + use_govc="true" + shift + ;; + *) + log_error "Unknown option: $1" + exit 1 + ;; + esac + done + + if [ -z "$datacenter" ] || [ -z "$datastore" ] || [ -z "$cluster" ]; then + log_error "Missing required options: --datacenter, --datastore, --cluster" + exit 1 + fi + + upload_ova "$ova_path" "$datacenter" "$datastore" "$cluster" "$template_name" "$use_govc" + ;; + + install) + if [ $# -lt 1 ]; then + log_error "Usage: $0 install --datacenter --datastore --cluster " + exit 1 + fi + + local version="$1" + shift + + local datacenter="" + local datastore="" + local cluster="" + local template_name="" + local use_govc="false" + + while [ $# -gt 0 ]; do + case "$1" in + --datacenter) + datacenter="$2" + shift 2 + ;; + --datastore) + datastore="$2" + shift 2 + ;; + --cluster) + cluster="$2" + shift 2 + ;; + --template-name) + template_name="$2" + shift 2 + ;; + --use-govc) + use_govc="true" + shift + ;; + *) + log_error "Unknown option: $1" + exit 1 + ;; + esac + done + + if [ -z "$datacenter" ] || [ -z "$datastore" ] || [ -z "$cluster" ]; then + log_error "Missing required options: --datacenter, --datastore, --cluster" + exit 1 + fi + + # Download + local ova_path + ova_path=$(download_ova "$version") + + # Upload + upload_ova "$ova_path" "$datacenter" "$datastore" "$cluster" "$template_name" "$use_govc" + ;; + + list) + list_cached + ;; + + clean) + clean_cache + ;; + + help|--help|-h) + usage + exit 0 + ;; + + *) + log_error "Unknown command: $command" + usage + exit 1 + ;; + esac +} + +main "$@" diff --git 
a/plugins/openshift/skills/vsphere-discovery/.gitignore b/plugins/openshift/skills/vsphere-discovery/.gitignore new file mode 100644 index 0000000..b12ebbf --- /dev/null +++ b/plugins/openshift/skills/vsphere-discovery/.gitignore @@ -0,0 +1,18 @@ +# Binaries +vsphere-helper +vsphere-helper-* + +# Go build artifacts +*.exe +*.dll +*.so +*.dylib + +# Test binaries +*.test + +# Go coverage +*.out + +# Dependency directories +vendor/ diff --git a/plugins/openshift/skills/vsphere-discovery/Makefile b/plugins/openshift/skills/vsphere-discovery/Makefile new file mode 100644 index 0000000..e072aeb --- /dev/null +++ b/plugins/openshift/skills/vsphere-discovery/Makefile @@ -0,0 +1,133 @@ +.PHONY: build clean install test help + +# Binary name +BINARY_NAME=vsphere-helper + +# Go parameters +GOCMD=go +GOBUILD=$(GOCMD) build +GOCLEAN=$(GOCMD) clean +GOTEST=$(GOCMD) test +GOGET=$(GOCMD) get +GOMOD=$(GOCMD) mod + +# Detect OS and architecture +UNAME_S := $(shell uname -s) +UNAME_M := $(shell uname -m) + +# Default target architecture based on current system +ifeq ($(UNAME_S),Darwin) + OS=darwin +else ifeq ($(UNAME_S),Linux) + OS=linux +else + OS=unsupported +endif + +ifeq ($(UNAME_M),x86_64) + ARCH=amd64 +else ifeq ($(UNAME_M),arm64) + ARCH=arm64 +else ifeq ($(UNAME_M),aarch64) + ARCH=arm64 +else + ARCH=unsupported +endif + +# Build flags +LDFLAGS=-ldflags "-s -w" + +# Default target +all: build + +# Download dependencies +deps: + @echo "Downloading dependencies..." + $(GOMOD) download + $(GOMOD) tidy + +# Build for current platform +build: deps + @echo "Building $(BINARY_NAME) for $(OS)/$(ARCH)..." + CGO_ENABLED=0 GOOS=$(OS) GOARCH=$(ARCH) $(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME) . + @echo "✓ Build complete: $(BINARY_NAME)" + +# Build for Linux (amd64) +build-linux: + @echo "Building $(BINARY_NAME) for linux/amd64..." + CGO_ENABLED=0 GOOS=linux GOARCH=amd64 $(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME)-linux-amd64 . + @echo "✓ Build complete: $(BINARY_NAME)-linux-amd64" + +# Build for Linux (arm64) +build-linux-arm64: + @echo "Building $(BINARY_NAME) for linux/arm64..." + CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME)-linux-arm64 . + @echo "✓ Build complete: $(BINARY_NAME)-linux-arm64" + +# Build for macOS (amd64) +build-darwin: + @echo "Building $(BINARY_NAME) for darwin/amd64..." + CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME)-darwin-amd64 . + @echo "✓ Build complete: $(BINARY_NAME)-darwin-amd64" + +# Build for macOS (arm64/M1/M2) +build-darwin-arm64: + @echo "Building $(BINARY_NAME) for darwin/arm64..." + CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(GOBUILD) $(LDFLAGS) -o $(BINARY_NAME)-darwin-arm64 . + @echo "✓ Build complete: $(BINARY_NAME)-darwin-arm64" + +# Build all platforms +build-all: build-linux build-linux-arm64 build-darwin build-darwin-arm64 + @echo "✓ All builds complete" + @ls -lh $(BINARY_NAME)-* + +# Install to user's local bin +install: build + @echo "Installing $(BINARY_NAME)..." + @if [ -d "$$HOME/.local/bin" ]; then \ + cp $(BINARY_NAME) $$HOME/.local/bin/$(BINARY_NAME); \ + echo "✓ Installed to $$HOME/.local/bin/$(BINARY_NAME)"; \ + elif [ -d "$$HOME/bin" ]; then \ + cp $(BINARY_NAME) $$HOME/bin/$(BINARY_NAME); \ + echo "✓ Installed to $$HOME/bin/$(BINARY_NAME)"; \ + else \ + echo "Error: Neither ~/.local/bin nor ~/bin exists"; \ + echo "Create one with: mkdir -p ~/.local/bin"; \ + exit 1; \ + fi + +# Run tests +test: + $(GOTEST) -v ./... + +# Clean build artifacts +clean: + @echo "Cleaning..." 
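+	@# remove the host-platform binary plus every cross-compiled variant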
+ $(GOCLEAN) + rm -f $(BINARY_NAME) + rm -f $(BINARY_NAME)-linux-amd64 + rm -f $(BINARY_NAME)-linux-arm64 + rm -f $(BINARY_NAME)-darwin-amd64 + rm -f $(BINARY_NAME)-darwin-arm64 + @echo "✓ Clean complete" + +# Show help +help: + @echo "vsphere-helper Makefile" + @echo "" + @echo "Targets:" + @echo " make - Build for current platform" + @echo " make build - Build for current platform" + @echo " make build-all - Build for all platforms" + @echo " make install - Build and install to ~/.local/bin or ~/bin" + @echo " make clean - Remove build artifacts" + @echo " make test - Run tests" + @echo " make deps - Download dependencies" + @echo "" + @echo "Platform-specific builds:" + @echo " make build-linux - Build for Linux amd64" + @echo " make build-linux-arm64 - Build for Linux arm64" + @echo " make build-darwin - Build for macOS amd64" + @echo " make build-darwin-arm64 - Build for macOS arm64 (M1/M2)" + @echo "" + @echo "Current platform: $(OS)/$(ARCH)" diff --git a/plugins/openshift/skills/vsphere-discovery/README.md b/plugins/openshift/skills/vsphere-discovery/README.md new file mode 100644 index 0000000..d1faf51 --- /dev/null +++ b/plugins/openshift/skills/vsphere-discovery/README.md @@ -0,0 +1,333 @@ +# vSphere Discovery Skill + +Auto-discover vSphere infrastructure components (datacenters, clusters, datastores, networks) with correct path handling for OpenShift installations. + +## Overview + +This skill provides automated vSphere infrastructure discovery using: +- **vsphere-helper** binary (preferred) - Go binary using govmomi library for accurate path handling +- **govc** CLI (fallback) - VMware's official CLI tool + +The skill is used by `/openshift:install-vsphere` and other OpenShift cluster provisioning commands to gather vSphere infrastructure details. + +## Quick Start + +### Prerequisites + +1. **vCenter Access** - URL, username, and password +2. 
**Go 1.23+** (only if building from source) + +### Building the Binary + +```bash +cd plugins/openshift/skills/vsphere-discovery + +# Build for your current platform +make build + +# Or install to ~/.local/bin +make install + +# Or build for all platforms +make build-all +``` + +### Using vsphere-helper + +```bash +# Set up environment +export VSPHERE_SERVER="vcenter.example.com" +export VSPHERE_USERNAME="administrator@vsphere.local" +export VSPHERE_PASSWORD="your-password" +export VSPHERE_INSECURE="false" # true to skip SSL verification + +# List all datacenters +vsphere-helper list-datacenters + +# List clusters in a datacenter +vsphere-helper list-clusters --datacenter DC1 + +# List datastores with capacity info +vsphere-helper list-datastores --datacenter DC1 + +# List networks +vsphere-helper list-networks --datacenter DC1 +``` + +## Features + +### Accurate Path Handling + +Unlike text-based govc parsing, vsphere-helper uses govmomi library directly to ensure paths match exactly what OpenShift expects: + +- **Datacenter**: Name only, no leading slash (e.g., `DC1`) +- **Cluster**: Full path required (e.g., `/DC1/host/Cluster1`) +- **Datastore**: Full path required (e.g., `/DC1/datastore/datastore1`) +- **Network**: Name only, no path prefix (e.g., `ci-vlan-981`) + +### Structured JSON Output + +All commands return well-formatted JSON for easy parsing: + +```json +[ + { + "name": "datastore1", + "path": "/DC1/datastore/datastore1", + "freeSpace": 537698893824, + "capacity": 1073741824000, + "type": "VMFS" + } +] +``` + +### Capacity Information + +Datastore listings include free space and capacity in bytes for informed decision-making. + +## Commands + +### list-datacenters + +Lists all datacenters in vCenter. + +**Example:** +```bash +vsphere-helper list-datacenters +``` + +**Output:** +```json +[ + { + "name": "DC1", + "path": "/DC1" + }, + { + "name": "vcenter-110-dc01", + "path": "/vcenter-110-dc01" + } +] +``` + +### list-clusters + +Lists all clusters in a datacenter. + +**Usage:** +```bash +vsphere-helper list-clusters --datacenter +``` + +**Example:** +```bash +vsphere-helper list-clusters --datacenter DC1 +``` + +**Output:** +```json +[ + { + "name": "Cluster1", + "path": "/DC1/host/Cluster1" + } +] +``` + +### list-datastores + +Lists all datastores in a datacenter with capacity information. + +**Usage:** +```bash +vsphere-helper list-datastores --datacenter +``` + +**Example:** +```bash +vsphere-helper list-datastores --datacenter DC1 +``` + +**Output:** +```json +[ + { + "name": "datastore1", + "path": "/DC1/datastore/datastore1", + "freeSpace": 537698893824, + "capacity": 1073741824000, + "type": "VMFS" + }, + { + "name": "vcenter-110-cl01-ds-vsan01", + "path": "/vcenter-110-dc01/datastore/vcenter-110-cl01-ds-vsan01", + "freeSpace": 2199023255552, + "capacity": 4398046511104, + "type": "vsan" + } +] +``` + +### list-networks + +Lists all networks in a datacenter. 
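+
+Note that in the sample output below the `name` field carries the full
+inventory path for networks, while install-config.yaml expects the bare
+port-group name (see the path-handling rules above). A small jq step
+recovers it (a sketch, assuming the JSON shape shown in the output below):
+
+```bash
+# Reduce network paths to the bare port-group names
+vsphere-helper list-networks --datacenter DC1 \
+  | jq -r '.[].path | split("/") | last'
+# ci-vlan-981
+# VM Network
+```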
+ +**Usage:** +```bash +vsphere-helper list-networks --datacenter +``` + +**Example:** +```bash +vsphere-helper list-networks --datacenter DC1 +``` + +**Output:** +```json +[ + { + "name": "/DC1/network/ci-vlan-981", + "path": "/DC1/network/ci-vlan-981", + "type": "DistributedVirtualPortgroup" + }, + { + "name": "/DC1/network/VM Network", + "path": "/DC1/network/VM Network", + "type": "Network" + } +] +``` + +## Environment Variables + +| Variable | Description | Required | Default | +|----------|-------------|----------|---------| +| `VSPHERE_SERVER` | vCenter server hostname | Yes | - | +| `VSPHERE_USERNAME` | vCenter username | Yes | - | +| `VSPHERE_PASSWORD` | vCenter password | Yes | - | +| `VSPHERE_INSECURE` | Skip SSL verification | No | `false` | + +## SSL Certificates + +For secure connections (recommended), install vCenter SSL certificates: + +```bash +bash plugins/openshift/scripts/install-vcenter-certs.sh vcenter.example.com +``` + +This installs certificates to: +- **macOS**: System Keychain +- **Linux**: `/usr/local/share/ca-certificates/` + +Alternatively, use `VSPHERE_INSECURE=true` to skip SSL verification (not recommended for production). + +## Building + +### Requirements + +- Go 1.23 or later +- make +- Internet connection (to download dependencies) + +### Build Targets + +```bash +# Build for current platform +make build + +# Install to ~/.local/bin or ~/bin +make install + +# Build for specific platforms +make build-linux # Linux amd64 +make build-linux-arm64 # Linux arm64 +make build-darwin # macOS amd64 +make build-darwin-arm64 # macOS arm64 (M1/M2) + +# Build for all platforms +make build-all + +# Clean build artifacts +make clean + +# Show help +make help +``` + +### Manual Build + +```bash +# Download dependencies +go mod download + +# Build +CGO_ENABLED=0 go build -ldflags "-s -w" -o vsphere-helper . +``` + +## Error Handling + +### Certificate Errors + +``` +Error: x509: certificate signed by unknown authority +``` + +**Solution**: Install vCenter certificates or use `VSPHERE_INSECURE=true` + +### Authentication Errors + +``` +Error: Cannot complete login due to an incorrect user name or password +``` + +**Solution**: Verify username, password, and ensure account is not locked + +### Resource Not Found + +``` +Error: failed to find datacenter 'DC1': datacenter 'DC1' not found +``` + +**Solution**: List available resources and verify exact names + +## Performance + +vsphere-helper is significantly faster than govc for multiple queries: + +| Operation | vsphere-helper | govc CLI | +|-----------|---------------|----------| +| Single query | ~100ms | ~500ms | +| 4 queries | ~400ms | ~2000ms | +| Session reuse | ✅ Yes | ❌ No | + +**Why?** vsphere-helper maintains a single vSphere session across all operations, while govc creates a new session for each command. + +## Contributing + +### Project Structure + +``` +vsphere-discovery/ +├── main.go # CLI implementation +├── go.mod # Go module definition +├── Makefile # Build automation +├── SKILL.md # AI skill instructions +└── README.md # This file +``` + +### Adding New Commands + +1. Add command function to `main.go` +2. Add command case to `main()` switch +3. Update SKILL.md with usage instructions +4. Update this README + +## License + +Part of the [ai-helpers](https://github.com/openshift-eng/ai-helpers) project. 
+ +## Related + +- **Scripts**: `plugins/openshift/scripts/install-govc.sh` - Install govc CLI +- **Scripts**: `plugins/openshift/scripts/install-vcenter-certs.sh` - Install vCenter certificates +- **Command**: `/openshift:install-vsphere` - Uses this skill for infrastructure discovery diff --git a/plugins/openshift/skills/vsphere-discovery/SKILL.md b/plugins/openshift/skills/vsphere-discovery/SKILL.md new file mode 100644 index 0000000..1c907ca --- /dev/null +++ b/plugins/openshift/skills/vsphere-discovery/SKILL.md @@ -0,0 +1,467 @@ +--- +name: vSphere Discovery +description: Auto-discover vSphere infrastructure (datacenters, clusters, datastores, networks) using govmomi with correct path handling +--- + +# vSphere Discovery Skill + +This skill discovers vSphere infrastructure components using either the `vsphere-helper` binary (govmomi-based, preferred) or `govc` CLI (fallback) and presents them to users for interactive selection. + +## When to Use This Skill + +Use this skill when you need to: +- Auto-discover vSphere infrastructure components +- Get correct vSphere inventory paths for install-config.yaml +- List datacenters, clusters, datastores, or networks +- Present interactive dropdowns for user selection +- Handle vCenter authentication and certificate setup + +This skill is used by: +- `/openshift:install-vsphere` - For gathering vSphere infrastructure details +- `/openshift:create-cluster` - For cluster provisioning workflows + +## Prerequisites + +Before starting, ensure these tools are available: + +1. **vsphere-helper binary (Preferred)** + - Check if available: `which vsphere-helper` + - If not available, check if it exists in skill directory: `ls plugins/openshift/skills/vsphere-discovery/vsphere-helper` + - If source exists but binary doesn't, offer to build it: + ```bash + cd plugins/openshift/skills/vsphere-discovery + make build + # Or for user installation: + make install + ``` + - **Why prefer vsphere-helper?** + - Uses govmomi library directly → correct path handling + - Returns structured JSON → easy parsing + - Better error messages + - Faster than spawning govc subprocesses + +2. **govc CLI (Fallback)** + - Check if available: `which govc` + - If not available, install using: `bash plugins/openshift/scripts/install-govc.sh` + - Used when vsphere-helper binary is not available + +3. **vCenter Certificates** + - Required for secure connection (VSPHERE_INSECURE=false) + - Install using: `bash plugins/openshift/scripts/install-vcenter-certs.sh ` + - Optional if using insecure connection (not recommended for production) + +## Input Format + +The user will provide: +1. **vCenter Server URL** - e.g., "vcenter.ci.ibmc.devcluster.openshift.com" or "vcenter.example.com" +2. **vCenter Username** - e.g., "administrator@vsphere.local" +3. **vCenter Password** - Handle securely, never log or display + +Optional: +4. **Datacenter name** - If already known, skip datacenter discovery +5. 
**Insecure connection** - Whether to skip SSL verification (default: false) + +## Output Format + +Return a structured result containing selected infrastructure components: + +```json +{ + "datacenter": "DC1", + "datacenter_path": "/DC1", + "cluster": "Cluster1", + "cluster_path": "/DC1/host/Cluster1", + "datastore": "datastore1", + "datastore_path": "/DC1/datastore/datastore1", + "network": "ci-vlan-981", + "network_path": "/DC1/network/ci-vlan-981" +} +``` + +**Path Handling Rules (CRITICAL for install-config.yaml):** +- **datacenter**: Name only, no leading slash (e.g., "DC1") +- **cluster_path**: Full path required (e.g., "/DC1/host/Cluster1") +- **datastore_path**: Full path required (e.g., "/DC1/datastore/datastore1") +- **network**: Name only, no path prefix (e.g., "ci-vlan-981") + +## Implementation Steps + +### Step 1: Choose Discovery Tool + +Determine which tool to use: + +```bash +# Check for vsphere-helper binary (preferred) +if which vsphere-helper &>/dev/null; then + echo "Using vsphere-helper (govmomi-based)" + USE_VSPHERE_HELPER=true +elif [ -f "plugins/openshift/skills/vsphere-discovery/vsphere-helper" ]; then + echo "Using vsphere-helper from skill directory" + USE_VSPHERE_HELPER=true + VSPHERE_HELPER_PATH="plugins/openshift/skills/vsphere-discovery/vsphere-helper" +else + echo "vsphere-helper not found, checking for govc..." + if which govc &>/dev/null; then + echo "Using govc CLI (fallback)" + USE_VSPHERE_HELPER=false + else + echo "Neither vsphere-helper nor govc found. Installing govc..." + bash plugins/openshift/scripts/install-govc.sh + USE_VSPHERE_HELPER=false + fi +fi +``` + +### Step 2: Install vCenter Certificates (Optional but Recommended) + +```bash +# Prompt user if they want to install vCenter certificates +# This enables secure SSL connections (VSPHERE_INSECURE=false) + +read -p "Install vCenter SSL certificates for secure connection? (recommended) [Y/n]: " response +if [[ "$response" =~ ^([yY]|)$ ]]; then + bash plugins/openshift/scripts/install-vcenter-certs.sh "$VCENTER_SERVER" + VSPHERE_INSECURE=false +else + echo "Skipping certificate installation. Using insecure connection." 
+  VSPHERE_INSECURE=true
+fi
+```
+
+### Step 3: Set up vSphere Connection Environment
+
+```bash
+# Set environment variables for vsphere-helper
+export VSPHERE_SERVER="$VCENTER_SERVER"
+export VSPHERE_USERNAME="$VCENTER_USERNAME"
+export VSPHERE_PASSWORD="$VCENTER_PASSWORD"
+export VSPHERE_INSECURE="$VSPHERE_INSECURE"  # "true" or "false"
+
+# For govc (if using fallback)
+export GOVC_URL="https://${VCENTER_SERVER}/sdk"
+export GOVC_USERNAME="$VCENTER_USERNAME"
+export GOVC_PASSWORD="$VCENTER_PASSWORD"
+export GOVC_INSECURE="$VSPHERE_INSECURE"
+```
+
+### Step 4: Discover Datacenters
+
+**Using vsphere-helper (preferred):**
+```bash
+# List all datacenters
+DATACENTERS_JSON=$(vsphere-helper list-datacenters)
+
+# Parse JSON to get datacenter names and paths
+echo "$DATACENTERS_JSON" | jq -r '.[] | "\(.name) (\(.path))"'
+
+# Example output:
+# DC1 (/DC1)
+# DC2 (/DC2)
+# vcenter-110-dc01 (/vcenter-110-dc01)
+```
+
+**Using govc (fallback):**
+```bash
+# List all datacenters
+govc ls /
+
+# Example output:
+# /DC1
+# /DC2
+# /vcenter-110-dc01
+```
+
+**Present to user:**
+- Use `AskUserQuestion` tool to present a dropdown of available datacenters
+- Store the user's selection
+
+**Path extraction:**
+```bash
+# From vsphere-helper JSON
+DATACENTER_NAME=$(echo "$DATACENTERS_JSON" | jq -r ".[] | select(.path == \"$USER_SELECTION\") | .name")
+
+# From govc output
+DATACENTER_NAME=$(echo "$USER_SELECTION" | sed 's|^/||')  # Remove leading slash
+```
+
+**IMPORTANT:** For install-config.yaml, use the datacenter NAME without a leading slash (e.g., "DC1", not "/DC1")
+
+### Step 5: Discover Clusters
+
+**Using vsphere-helper (preferred):**
+```bash
+# List clusters in the selected datacenter
+CLUSTERS_JSON=$(vsphere-helper list-clusters --datacenter "$DATACENTER_NAME")
+
+# Parse JSON
+echo "$CLUSTERS_JSON" | jq -r '.[] | "\(.name) (\(.path))"'
+
+# Example output:
+# Cluster1 (/DC1/host/Cluster1)
+# vcenter-110-cl01 (/vcenter-110-dc01/host/vcenter-110-cl01)
+```
+
+**Using govc (fallback):**
+```bash
+# List clusters
+govc ls "/${DATACENTER_NAME}/host"
+
+# Example output:
+# /DC1/host/Cluster1
+# /DC1/host/Cluster2
+```
+
+**Present to user:**
+- Use `AskUserQuestion` tool with a dropdown
+- Display the cluster name and full path for clarity
+
+**Path extraction:**
+```bash
+# From vsphere-helper JSON
+CLUSTER_PATH=$(echo "$CLUSTERS_JSON" | jq -r ".[] | select(.name == \"$USER_SELECTION\") | .path")
+
+# From govc output
+CLUSTER_PATH="$USER_SELECTION"  # Already has full path
+```
+
+**IMPORTANT:** For install-config.yaml, use the FULL cluster path (e.g., "/DC1/host/Cluster1")
+
+### Step 6: Discover Datastores
+
+**Using vsphere-helper (preferred):**
+```bash
+# List datastores with capacity information
+DATASTORES_JSON=$(vsphere-helper list-datastores --datacenter "$DATACENTER_NAME")
+
+# Parse and format with capacity info
+echo "$DATASTORES_JSON" | jq -r '.[] | "\(.name): \(.freeSpace / 1024 / 1024 / 1024 | floor)GB free / \(.capacity / 1024 / 1024 / 1024 | floor)GB total (\(.type))"'
+
+# Example output:
+# datastore1: 500GB free / 1000GB total (VMFS)
+# vcenter-110-cl01-ds-vsan01: 2048GB free / 4096GB total (vsan)
+```
+
+**Using govc (fallback):**
+```bash
+# List datastores
+govc ls "/${DATACENTER_NAME}/datastore"
+
+# Get capacity for each datastore
+for ds in $(govc ls "/${DATACENTER_NAME}/datastore"); do
+  ds_name=$(basename "$ds")
+  # Query once and read sizes from the summary (DatastoreInfo has no capacity
+  # field). Recent govc releases emit lowerCamelCase JSON keys; on older govc
+  # the keys are capitalized (e.g. .Datastores[0].Summary.FreeSpace).
+  info=$(govc datastore.info -json "$ds")
+  free=$(echo "$info" | jq -r '.datastores[0].summary.freeSpace')
+  capacity=$(echo "$info" | jq -r '.datastores[0].summary.capacity')
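+  # Convert the raw byte counts to GiB with shell integer arithmetic
+  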
free_gb=$((free / 1024 / 1024 / 1024))
+  capacity_gb=$((capacity / 1024 / 1024 / 1024))
+  echo "$ds_name: ${free_gb}GB free / ${capacity_gb}GB total"
+done
+```
+
+**Present to user:**
+- Use `AskUserQuestion` tool with a dropdown
+- Show the datastore name with capacity information for better decision-making
+
+**Path extraction:**
+```bash
+# From vsphere-helper JSON
+DATASTORE_PATH=$(echo "$DATASTORES_JSON" | jq -r ".[] | select(.name == \"$USER_SELECTION\") | .path")
+
+# From govc output
+DATASTORE_PATH="/${DATACENTER_NAME}/datastore/${USER_SELECTION}"
+```
+
+**IMPORTANT:** For install-config.yaml, use the FULL datastore path (e.g., "/DC1/datastore/datastore1")
+
+### Step 7: Discover Networks
+
+**Using vsphere-helper (preferred):**
+```bash
+# List networks
+NETWORKS_JSON=$(vsphere-helper list-networks --datacenter "$DATACENTER_NAME")
+
+# Parse JSON (the helper reports the full inventory path in both "name" and
+# "path"; trim to the trailing component for display)
+echo "$NETWORKS_JSON" | jq -r '.[] | "\(.path | split("/") | last) (\(.type))"'
+
+# Example output:
+# ci-vlan-981 (DistributedVirtualPortgroup)
+# VM Network (Network)
+```
+
+**Using govc (fallback):**
+```bash
+# List networks
+govc ls "/${DATACENTER_NAME}/network"
+
+# Example output:
+# /DC1/network/ci-vlan-981
+# /DC1/network/VM Network
+```
+
+**Present to user:**
+- Use `AskUserQuestion` tool with a dropdown
+- Show the network name and type
+- Note: The user should select the IBM Cloud Classic VLAN-associated port group if applicable
+
+**Path extraction:**
+```bash
+# From vsphere-helper JSON (take the base name of the selected inventory path)
+NETWORK_NAME=$(basename "$(echo "$NETWORKS_JSON" | jq -r ".[] | select(.path == \"$USER_SELECTION\") | .path")")
+
+# From govc output
+NETWORK_NAME=$(basename "$USER_SELECTION")
+```
+
+**IMPORTANT:** For install-config.yaml, use the network NAME only, without a path prefix (e.g., "ci-vlan-981", not "/DC1/network/ci-vlan-981")
+
+### Step 8: Return Results
+
+Compile all discovered information and return it as structured data. These values map directly onto the `platform.vsphere` stanza of install-config.yaml (see the sketch after the comparison table below):
+
+```bash
+# Create result JSON
+cat > /tmp/vsphere-discovery-result.json <<EOF
+{
+  "datacenter": "$DATACENTER_NAME",
+  "cluster_path": "$CLUSTER_PATH",
+  "datastore_path": "$DATASTORE_PATH",
+  "network": "$NETWORK_NAME"
+}
+EOF
+```
+
+## Error Handling
+
+### Certificate Errors
+
+**If SSL certificate verification fails:**
+```
+Error: x509: certificate signed by unknown authority
+```
+
+**Solution:**
+- Install vCenter certificates: `bash plugins/openshift/scripts/install-vcenter-certs.sh <vcenter-server>`
+- Or use insecure connection (not recommended): `export VSPHERE_INSECURE=true`
+
+### Authentication Errors
+
+**If authentication fails:**
+```
+Error: failed to connect to vSphere: ServerFaultCode: Cannot complete login due to an incorrect user name or password
+```
+
+**Solution:**
+- Verify vCenter username and password
+- Ensure the user has correct permissions
+- Check if the account is locked
+
+### Discovery Errors
+
+**If a datacenter/cluster/datastore is not found:**
+```
+Error: failed to find datacenter 'DC1': datacenter 'DC1' not found
+```
+
+**Solution:**
+- List available resources to verify exact names
+- Check the user has permission to view the resource
+- Verify the vCenter server is correct
+
+### Tool Not Available
+
+**If neither vsphere-helper nor govc is available:**
+- Offer to install govc: `bash plugins/openshift/scripts/install-govc.sh`
+- Or offer to build vsphere-helper: `cd plugins/openshift/skills/vsphere-discovery && make install`
+
+## Benefits of vsphere-helper vs govc
+
+| Feature | vsphere-helper (govmomi) | govc CLI |
+|---------|-------------------------|----------|
+| **Path Accuracy** | ✅ Native govmomi paths | ⚠️ Manual string parsing |
+| **Output Format** | ✅ Structured JSON | ⚠️ Text parsing required |
+| **Performance** | ✅ Single binary call | ⚠️ Multiple subprocess spawns |
+| **Error Messages** | ✅ Detailed Go errors | ⚠️ Generic CLI errors |
+| **Type Safety** | ✅ Strongly typed | ❌ Strings only |
+| **Session Management** | ✅ Efficient connection | ⚠️ Login per command |
+
+**Recommendation:** Always prefer vsphere-helper when available.
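+
+For reference, here is how the discovered values slot into install-config.yaml. This is an abridged sketch of the `platform.vsphere` stanza (field names per recent OpenShift releases; the failure-domain name, region, zone, and server values are illustrative placeholders — consult the installer documentation for the full schema):
+
+```yaml
+platform:
+  vsphere:
+    failureDomains:
+    - name: generated-failure-domain      # placeholder
+      region: generated-region            # placeholder
+      zone: generated-zone                # placeholder
+      server: vcenter.example.com
+      topology:
+        datacenter: DC1                       # name only, no leading slash
+        computeCluster: /DC1/host/Cluster1    # full inventory path
+        datastore: /DC1/datastore/datastore1  # full inventory path
+        networks:
+        - ci-vlan-981                         # name only, no path prefix
+    vcenters:
+    - server: vcenter.example.com
+      user: administrator@vsphere.local
+      datacenters:
+      - DC1
+```
+
+**Fallback:**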
Gracefully fall back to govc when needed. + +## Building vsphere-helper + +If the binary is not pre-built, guide the user to build it: + +```bash +cd plugins/openshift/skills/vsphere-discovery + +# Build for current platform +make build + +# Or install to ~/.local/bin +make install + +# Or build for all platforms +make build-all +``` + +Requirements: +- Go 1.23 or later +- Internet connection (to download govmomi dependency) + +The Makefile handles cross-compilation for: +- Linux: amd64, arm64 +- macOS: amd64, arm64 (M1/M2) + +## Example Workflow + +```bash +# 1. Set up environment +export VSPHERE_SERVER="vcenter.example.com" +export VSPHERE_USERNAME="administrator@vsphere.local" +export VSPHERE_PASSWORD="mypassword" +export VSPHERE_INSECURE="false" + +# 2. Install certificates (recommended) +bash plugins/openshift/scripts/install-vcenter-certs.sh "$VSPHERE_SERVER" + +# 3. Discover datacenters +vsphere-helper list-datacenters +# User selects: "DC1" + +# 4. Discover clusters +vsphere-helper list-clusters --datacenter DC1 +# User selects: "/DC1/host/Cluster1" + +# 5. Discover datastores +vsphere-helper list-datastores --datacenter DC1 +# User selects: "/DC1/datastore/datastore1" (500GB free) + +# 6. Discover networks +vsphere-helper list-networks --datacenter DC1 +# User selects: "ci-vlan-981" + +# 7. Return result JSON +{ + "datacenter": "DC1", + "cluster_path": "/DC1/host/Cluster1", + "datastore_path": "/DC1/datastore/datastore1", + "network": "ci-vlan-981" +} +``` + +## Notes + +- **Security:** Never log or display the vCenter password +- **Cleanup:** Unset environment variables after use if they contain sensitive data +- **Compatibility:** Supports vCenter 7.x and 8.x +- **Performance:** vsphere-helper is significantly faster for multiple queries (single session vs multiple govc calls) +- **Path Correctness:** Using govmomi ensures paths match exactly what OpenShift installer expects diff --git a/plugins/openshift/skills/vsphere-discovery/go.mod b/plugins/openshift/skills/vsphere-discovery/go.mod new file mode 100644 index 0000000..e4967f7 --- /dev/null +++ b/plugins/openshift/skills/vsphere-discovery/go.mod @@ -0,0 +1,9 @@ +module github.com/openshift-eng/ai-helpers/plugins/openshift/skills/vsphere-discovery + +go 1.23.0 + +toolchain go1.24.9 + +require github.com/vmware/govmomi v0.52.0 + +require github.com/google/uuid v1.6.0 // indirect diff --git a/plugins/openshift/skills/vsphere-discovery/go.sum b/plugins/openshift/skills/vsphere-discovery/go.sum new file mode 100644 index 0000000..014fd31 --- /dev/null +++ b/plugins/openshift/skills/vsphere-discovery/go.sum @@ -0,0 +1,14 @@ +github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= +github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= +github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8= +github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU= +github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0= +github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= +github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= +github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= +github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA= +github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY= +github.com/vmware/govmomi v0.52.0 
h1:JyxQ1IQdllrY7PJbv2am9mRsv3p9xWlIQ66bv+XnyLw= +github.com/vmware/govmomi v0.52.0/go.mod h1:Yuc9xjznU3BH0rr6g7MNS1QGvxnJlE1vOvTJ7Lx7dqI= +gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= +gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= diff --git a/plugins/openshift/skills/vsphere-discovery/main.go b/plugins/openshift/skills/vsphere-discovery/main.go new file mode 100644 index 0000000..819efca --- /dev/null +++ b/plugins/openshift/skills/vsphere-discovery/main.go @@ -0,0 +1,434 @@ +package main + +import ( + "context" + "encoding/json" + "flag" + "fmt" + "net/url" + "os" + "strings" + + "github.com/vmware/govmomi" + "github.com/vmware/govmomi/find" + "github.com/vmware/govmomi/object" + "github.com/vmware/govmomi/property" + "github.com/vmware/govmomi/vim25/mo" + "github.com/vmware/govmomi/vim25/types" +) + +const version = "0.1.0" + +// Client wrapper for vSphere connection +type Client struct { + client *govmomi.Client + finder *find.Finder +} + +// Connect to vSphere +func connect(ctx context.Context, server, username, password string, insecure bool) (*Client, error) { + u, err := url.Parse(fmt.Sprintf("https://%s/sdk", server)) + if err != nil { + return nil, fmt.Errorf("failed to parse URL: %w", err) + } + + u.User = url.UserPassword(username, password) + + client, err := govmomi.NewClient(ctx, u, insecure) + if err != nil { + return nil, fmt.Errorf("failed to connect to vSphere: %w", err) + } + + finder := find.NewFinder(client.Client, true) + + return &Client{ + client: client, + finder: finder, + }, nil +} + +// Datacenter represents a vSphere datacenter +type Datacenter struct { + Name string `json:"name"` + Path string `json:"path"` +} + +// Cluster represents a vSphere cluster +type Cluster struct { + Name string `json:"name"` + Path string `json:"path"` +} + +// Datastore represents a vSphere datastore +type Datastore struct { + Name string `json:"name"` + Path string `json:"path"` + FreeSpace int64 `json:"freeSpace"` + Capacity int64 `json:"capacity"` + Type string `json:"type"` +} + +// Network represents a vSphere network +type Network struct { + Name string `json:"name"` + Path string `json:"path"` + Type string `json:"type"` +} + +// List all datacenters +func (c *Client) listDatacenters(ctx context.Context) ([]Datacenter, error) { + dcs, err := c.finder.DatacenterList(ctx, "*") + if err != nil { + return nil, fmt.Errorf("failed to list datacenters: %w", err) + } + + result := make([]Datacenter, len(dcs)) + for i, dc := range dcs { + result[i] = Datacenter{ + Name: dc.Name(), + Path: dc.InventoryPath, + } + } + + return result, nil +} + +// List clusters in a datacenter +func (c *Client) listClusters(ctx context.Context, datacenter string) ([]Cluster, error) { + dc, err := c.finder.Datacenter(ctx, datacenter) + if err != nil { + return nil, fmt.Errorf("failed to find datacenter '%s': %w", datacenter, err) + } + + c.finder.SetDatacenter(dc) + + clusters, err := c.finder.ClusterComputeResourceList(ctx, "*") + if err != nil { + return nil, fmt.Errorf("failed to list clusters: %w", err) + } + + result := make([]Cluster, len(clusters)) + for i, cluster := range clusters { + result[i] = Cluster{ + Name: cluster.Name(), + Path: cluster.InventoryPath, + } + } + + return result, nil +} + +// List datastores in a datacenter +func (c *Client) listDatastores(ctx context.Context, datacenter string) ([]Datastore, error) { + dc, err := c.finder.Datacenter(ctx, datacenter) + if err != nil { + return nil, fmt.Errorf("failed to find 
datacenter '%s': %w", datacenter, err)
+	}
+
+	c.finder.SetDatacenter(dc)
+
+	datastores, err := c.finder.DatastoreList(ctx, "*")
+	if err != nil {
+		return nil, fmt.Errorf("failed to list datastores: %w", err)
+	}
+
+	// Fetch datastore properties
+	var dss []mo.Datastore
+	pc := property.DefaultCollector(c.client.Client)
+	refs := make([]types.ManagedObjectReference, len(datastores))
+	for i, ds := range datastores {
+		refs[i] = ds.Reference()
+	}
+
+	err = pc.Retrieve(ctx, refs, []string{"name", "summary"}, &dss)
+	if err != nil {
+		return nil, fmt.Errorf("failed to retrieve datastore properties: %w", err)
+	}
+
+	// Index inventory paths by managed object reference: Retrieve does not
+	// guarantee that results come back in the same order as the refs slice.
+	pathByRef := make(map[types.ManagedObjectReference]string, len(datastores))
+	for _, ds := range datastores {
+		pathByRef[ds.Reference()] = ds.InventoryPath
+	}
+
+	result := make([]Datastore, len(dss))
+	for i, ds := range dss {
+		result[i] = Datastore{
+			Name:      ds.Name,
+			Path:      pathByRef[ds.Reference()],
+			FreeSpace: ds.Summary.FreeSpace,
+			Capacity:  ds.Summary.Capacity,
+			Type:      ds.Summary.Type,
+		}
+	}
+
+	return result, nil
+}
+
+// List networks in a datacenter
+func (c *Client) listNetworks(ctx context.Context, datacenter string) ([]Network, error) {
+	dc, err := c.finder.Datacenter(ctx, datacenter)
+	if err != nil {
+		return nil, fmt.Errorf("failed to find datacenter '%s': %w", datacenter, err)
+	}
+
+	c.finder.SetDatacenter(dc)
+
+	networks, err := c.finder.NetworkList(ctx, "*")
+	if err != nil {
+		return nil, fmt.Errorf("failed to list networks: %w", err)
+	}
+
+	result := make([]Network, 0, len(networks))
+	for _, net := range networks {
+		var netType string
+		switch net.(type) {
+		case *object.Network:
+			netType = "Network"
+		case *object.DistributedVirtualPortgroup:
+			netType = "DistributedVirtualPortgroup"
+		case *object.OpaqueNetwork:
+			netType = "OpaqueNetwork"
+		default:
+			netType = "Unknown"
+		}
+
+		// NetworkReference exposes no short name without an extra property
+		// lookup, so both fields carry the full inventory path; callers take
+		// the path's base name when they need the short name.
+		result = append(result, Network{
+			Name: net.GetInventoryPath(),
+			Path: net.GetInventoryPath(),
+			Type: netType,
+		})
+	}
+
+	return result, nil
+}
+
+func main() {
+	if len(os.Args) < 2 {
+		printUsage()
+		os.Exit(1)
+	}
+
+	command := os.Args[1]
+
+	switch command {
+	case "version", "--version", "-v":
+		fmt.Printf("vsphere-helper version %s\n", version)
+		os.Exit(0)
+	case "help", "--help", "-h":
+		printUsage()
+		os.Exit(0)
+	case "list-datacenters":
+		listDatacentersCmd()
+	case "list-clusters":
+		listClustersCmd()
+	case "list-datastores":
+		listDatastoresCmd()
+	case "list-networks":
+		listNetworksCmd()
+	default:
+		fmt.Fprintf(os.Stderr, "Unknown command: %s\n\n", command)
+		printUsage()
+		os.Exit(1)
+	}
+}
+
+func printUsage() {
+	fmt.Println("vsphere-helper - vSphere discovery tool using govmomi")
+	fmt.Println()
+	fmt.Println("Usage:")
+	fmt.Println("  vsphere-helper <command> [flags]")
+	fmt.Println()
+	fmt.Println("Commands:")
+	fmt.Println("  list-datacenters    List all datacenters")
+	fmt.Println("  list-clusters       List clusters in a datacenter")
+	fmt.Println("  list-datastores     List datastores in a datacenter")
+	fmt.Println("  list-networks       List networks in a datacenter")
+	fmt.Println("  version             Show version")
+	fmt.Println("  help                Show this help")
+	fmt.Println()
+	fmt.Println("Authentication:")
+	fmt.Println("  All commands require vSphere connection via environment variables:")
+	fmt.Println("    VSPHERE_SERVER   - vCenter server (e.g., vcenter.example.com)")
+	fmt.Println("    VSPHERE_USERNAME - vCenter username")
+	fmt.Println("    VSPHERE_PASSWORD - vCenter password")
+	fmt.Println("    VSPHERE_INSECURE - Skip SSL verification (default: false)")
+	fmt.Println()
+	fmt.Println("Examples:")
+	fmt.Println("  # List all datacenters")
+	fmt.Println("  export VSPHERE_SERVER=vcenter.example.com")
+	fmt.Println("  export 
VSPHERE_USERNAME=administrator@vsphere.local") + fmt.Println(" export VSPHERE_PASSWORD=password") + fmt.Println(" vsphere-helper list-datacenters") + fmt.Println() + fmt.Println(" # List clusters in a datacenter") + fmt.Println(" vsphere-helper list-clusters --datacenter DC1") + fmt.Println() + fmt.Println(" # List datastores with capacity info") + fmt.Println(" vsphere-helper list-datastores --datacenter DC1") +} + +func getEnvConfig() (server, username, password string, insecure bool, err error) { + server = os.Getenv("VSPHERE_SERVER") + username = os.Getenv("VSPHERE_USERNAME") + password = os.Getenv("VSPHERE_PASSWORD") + insecureStr := os.Getenv("VSPHERE_INSECURE") + + if server == "" { + return "", "", "", false, fmt.Errorf("VSPHERE_SERVER environment variable not set") + } + if username == "" { + return "", "", "", false, fmt.Errorf("VSPHERE_USERNAME environment variable not set") + } + if password == "" { + return "", "", "", false, fmt.Errorf("VSPHERE_PASSWORD environment variable not set") + } + + insecure = strings.ToLower(insecureStr) == "true" || insecureStr == "1" + + return server, username, password, insecure, nil +} + +func listDatacentersCmd() { + server, username, password, insecure, err := getEnvConfig() + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + + ctx := context.Background() + client, err := connect(ctx, server, username, password, insecure) + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + defer client.client.Logout(ctx) + + dcs, err := client.listDatacenters(ctx) + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + + output, err := json.MarshalIndent(dcs, "", " ") + if err != nil { + fmt.Fprintf(os.Stderr, "Error marshaling JSON: %v\n", err) + os.Exit(1) + } + + fmt.Println(string(output)) +} + +func listClustersCmd() { + fs := flag.NewFlagSet("list-clusters", flag.ExitOnError) + datacenter := fs.String("datacenter", "", "Datacenter name (required)") + fs.Parse(os.Args[2:]) + + if *datacenter == "" { + fmt.Fprintf(os.Stderr, "Error: --datacenter flag is required\n") + fs.Usage() + os.Exit(1) + } + + server, username, password, insecure, err := getEnvConfig() + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + + ctx := context.Background() + client, err := connect(ctx, server, username, password, insecure) + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + defer client.client.Logout(ctx) + + clusters, err := client.listClusters(ctx, *datacenter) + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + + output, err := json.MarshalIndent(clusters, "", " ") + if err != nil { + fmt.Fprintf(os.Stderr, "Error marshaling JSON: %v\n", err) + os.Exit(1) + } + + fmt.Println(string(output)) +} + +func listDatastoresCmd() { + fs := flag.NewFlagSet("list-datastores", flag.ExitOnError) + datacenter := fs.String("datacenter", "", "Datacenter name (required)") + fs.Parse(os.Args[2:]) + + if *datacenter == "" { + fmt.Fprintf(os.Stderr, "Error: --datacenter flag is required\n") + fs.Usage() + os.Exit(1) + } + + server, username, password, insecure, err := getEnvConfig() + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + + ctx := context.Background() + client, err := connect(ctx, server, username, password, insecure) + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + defer client.client.Logout(ctx) + + datastores, err := 
client.listDatastores(ctx, *datacenter) + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + + output, err := json.MarshalIndent(datastores, "", " ") + if err != nil { + fmt.Fprintf(os.Stderr, "Error marshaling JSON: %v\n", err) + os.Exit(1) + } + + fmt.Println(string(output)) +} + +func listNetworksCmd() { + fs := flag.NewFlagSet("list-networks", flag.ExitOnError) + datacenter := fs.String("datacenter", "", "Datacenter name (required)") + fs.Parse(os.Args[2:]) + + if *datacenter == "" { + fmt.Fprintf(os.Stderr, "Error: --datacenter flag is required\n") + fs.Usage() + os.Exit(1) + } + + server, username, password, insecure, err := getEnvConfig() + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + + ctx := context.Background() + client, err := connect(ctx, server, username, password, insecure) + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + defer client.client.Logout(ctx) + + networks, err := client.listNetworks(ctx, *datacenter) + if err != nil { + fmt.Fprintf(os.Stderr, "Error: %v\n", err) + os.Exit(1) + } + + output, err := json.MarshalIndent(networks, "", " ") + if err != nil { + fmt.Fprintf(os.Stderr, "Error marshaling JSON: %v\n", err) + os.Exit(1) + } + + fmt.Println(string(output)) +}
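+
+// listFoldersCmd is a hypothetical sketch of the "Adding New Commands" recipe
+// from the README: write a command function like this one, register a
+// "list-folders" case in main()'s switch, then document it in SKILL.md and
+// the README. It is not wired into the switch above.
+func listFoldersCmd() {
+	fs := flag.NewFlagSet("list-folders", flag.ExitOnError)
+	datacenter := fs.String("datacenter", "", "Datacenter name (required)")
+	fs.Parse(os.Args[2:])
+
+	if *datacenter == "" {
+		fmt.Fprintf(os.Stderr, "Error: --datacenter flag is required\n")
+		fs.Usage()
+		os.Exit(1)
+	}
+
+	server, username, password, insecure, err := getEnvConfig()
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
+		os.Exit(1)
+	}
+
+	ctx := context.Background()
+	client, err := connect(ctx, server, username, password, insecure)
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
+		os.Exit(1)
+	}
+	defer client.client.Logout(ctx)
+
+	dc, err := client.finder.Datacenter(ctx, *datacenter)
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
+		os.Exit(1)
+	}
+	client.finder.SetDatacenter(dc)
+
+	// FolderList walks the datacenter's vm/host/network/datastore folders.
+	folders, err := client.finder.FolderList(ctx, "*")
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
+		os.Exit(1)
+	}
+
+	type folder struct {
+		Name string `json:"name"`
+		Path string `json:"path"`
+	}
+	result := make([]folder, len(folders))
+	for i, f := range folders {
+		result[i] = folder{Name: f.Name(), Path: f.InventoryPath}
+	}
+
+	output, err := json.MarshalIndent(result, "", "  ")
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "Error marshaling JSON: %v\n", err)
+		os.Exit(1)
+	}
+
+	fmt.Println(string(output))
+}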