@nvanthao nvanthao commented May 20, 2025

Add CMX VM Support for Embedded Cluster Testing

Overview

This PR adds support for creating and managing CMX VMs through the Replicated CLI for wg-easy Embedded Cluster testing. This provides an alternative to GCP VMs for testing embedded cluster deployments.

Changes

  • Added new CMX VM configuration variables to Taskfile.yaml:
    • VM name, distribution, version, instance type, disk size, and TTL settings
    • SSH configuration for VM access
  • Added three new tasks:
    • cmx-vm-create: Creates a CMX VM instance with configurable parameters
    • cmx-vm-delete: Cleans up CMX VM instances
    • cmx-vm-embedded-cluster-setup: Sets up a Replicated embedded cluster on the CMX VM
  • Updated docs/task-reference.md with:
    • New task descriptions and usage information
    • Documentation for CMX VM-specific parameters

Testing

The new tasks have been tested with the following workflow:

  1. Create CMX VM using task cmx-vm-create
  2. Set up embedded cluster using task cmx-vm-embedded-cluster-setup
  3. Verify cluster deployment
  4. Clean up using task cmx-vm-delete

Usage Example

# Create a CMX VM
task cmx-vm-create

# Set up embedded cluster
task cmx-vm-embedded-cluster-setup CMX_VM_USER=github-handle

# Clean up
task cmx-vm-delete

Required Parameters

  • CMX_VM_USER: GitHub handle for SSH access
  • CMX_VM_PUBLIC_KEY: Path to SSH public key (defaults to ~/.ssh/id_ed25519)

Optional Parameters

  • CMX_VM_NAME: Custom VM name (defaults to {USER}-cmx-vm)
  • CMX_VM_DISTRIBUTION: VM distribution (defaults to "ubuntu")
  • CMX_VM_VERSION: VM version (defaults to "24.04")
  • CMX_VM_INSTANCE_TYPE: VM instance type (defaults to "r1.medium")
  • CMX_VM_DISK_SIZE: VM disk size in GB (defaults to "100")
  • CMX_VM_TTL: VM time-to-live (defaults to "1h")
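
The defaults listed above behave like ordinary fallback values. As a rough illustration only (this is not the Taskfile from this PR, and the function name `cmx_vm_settings` is hypothetical), the same resolution can be sketched with shell parameter fallbacks; the real tasks resolve these values with Task's `default` template function.

```shell
# Hypothetical helper (not part of the PR): print the effective CMX VM
# settings, falling back to the defaults documented above when a variable
# is unset or empty.
cmx_vm_settings() (
  echo "name=${CMX_VM_NAME:-${USER}-cmx-vm}"
  echo "distribution=${CMX_VM_DISTRIBUTION:-ubuntu}"
  echo "version=${CMX_VM_VERSION:-24.04}"
  echo "instance_type=${CMX_VM_INSTANCE_TYPE:-r1.medium}"
  echo "disk_size=${CMX_VM_DISK_SIZE:-100}"
  echo "ttl=${CMX_VM_TTL:-1h}"
)

# With nothing overridden, every line shows the documented default.
cmx_vm_settings
```

With Task itself, overrides are passed in the same KEY=value form used in the usage example, for instance `task cmx-vm-create CMX_VM_TTL=2h`.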

@nvanthao nvanthao self-assigned this May 20, 2025

@chris-sanders chris-sanders left a comment

A few comments in-line

CMX_VM_DISK_SIZE: '{{.CMX_VM_DISK_SIZE | default "100"}}'
CMX_VM_TTL: '{{.CMX_VM_TTL | default "1h"}}'
CMX_VM_USER: '{{.CMX_VM_USER}}'
CMX_VM_PUBLIC_KEY: '{{.CMX_VM_PUBLIC_KEY | default (env "HOME" | printf "%s/.ssh/id_ed25519")}}'
Member

Let's not guess at a default here; just make this a required setting. Or actually, do we need to know this at all? As long as you have your key loaded in your ssh-agent, we don't need to set this at all, do we?

Ideally, do not set a default for this and do not require it, but allow people to specify it if they want. That way anyone whose agent is already configured just works out of the box, and anyone who doesn't know how to configure an agent can specify their ssh_public_key and have it integrated into the ssh command.
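
One way to read this suggestion, as a minimal sketch rather than the code in this PR: build the ssh command without `-i` and append it only when a key path is supplied. The function name `ssh_base_cmd` is hypothetical; the ssh options are the ones already used in the task.

```shell
# Hypothetical helper (not part of the PR): build the base ssh command.
# -i is added only when a key path is given; otherwise ssh relies on the
# agent and its own default identities.
ssh_base_cmd() (
  key_path="${1:-}"
  cmd="ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no"
  if [ -n "$key_path" ]; then
    cmd="$cmd -i $key_path"
  fi
  echo "$cmd"
)

ssh_base_cmd ""                       # agent only, no -i
ssh_base_cmd "$HOME/.ssh/id_ed25519"  # explicit key path
```

This keeps the variable optional: an empty value changes nothing for people with a working agent.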

Member Author

Gotcha, this has been done.

silent: false
vars:
CHANNEL: '{{.CHANNEL | default "Unstable"}}'
AUTH_TOKEN: '{{.AUTH_TOKEN | default "2usDXzovcJNcpn54yS5tFQVNvCq"}}'
Member

I don't know if this is a real token, but you need to remove it and rotate it for this customer. We shouldn't be putting any default tokens like this into the Taskfile. Isn't this just the license ID, and don't we already have a variable for that?

Member Author

I copied the token from the variable declaration in the existing embedded-cluster-setup task. I've also updated both tasks to use the license ID.

echo "SSH into the VM and download the app binary..."
SSH_BASE_CMD="ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i {{.CMX_VM_PUBLIC_KEY}}"
VM_SSH_CMD=$(replicated vm ls --output=json | jq -r ".[] | select(.name == \"{{.CMX_VM_NAME}}\") | \"$SSH_BASE_CMD -p \(.direct_ssh_port) nvanthao@\(.direct_ssh_endpoint)\"")
Member

nvanthao

You have your own user name hard-coded.
Didn't we add a CLI command to the Replicated CLI so you don't have to do the jq here?

@hedge-sparrow isn't there a better way to do an SSH now?

Member Author

I've replaced my username with the Taskfile var {{.CMX_VM_USER}}.
I originally planned to use replicated vm ssh-endpoint <vm-name>, but the output requires further parsing.

Current command output:

ssh://<username>@<ip>:<port>
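
For reference, the extra parsing is small if the output really has the `ssh://<username>@<ip>:<port>` shape shown above. The following is only a sketch under that assumption (IPv4 address or hostname, no IPv6 brackets); the function name `ssh_args_from_endpoint` is hypothetical and is not part of the Replicated CLI or this PR.

```shell
# Hypothetical helper (not part of the PR): turn "ssh://user@host:port"
# into ssh arguments using only POSIX parameter expansion.
ssh_args_from_endpoint() (
  endpoint="${1#ssh://}"      # drop the scheme          -> user@host:port
  user="${endpoint%%@*}"      # text before the first @  -> user
  hostport="${endpoint#*@}"   # text after the first @   -> host:port
  host="${hostport%:*}"       # text before the last :   -> host
  port="${hostport##*:}"      # text after the last :    -> port
  echo "-p $port $user@$host"
)

ssh_args_from_endpoint "ssh://alice@203.0.113.10:2222"
# prints: -p 2222 alice@203.0.113.10
```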

SSH_BASE_CMD="ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i {{.CMX_VM_PUBLIC_KEY}}"
VM_SSH_CMD=$(replicated vm ls --output=json | jq -r ".[] | select(.name == \"{{.CMX_VM_NAME}}\") | \"$SSH_BASE_CMD -p \(.direct_ssh_port) nvanthao@\(.direct_ssh_endpoint)\"")
echo "SSH command: $VM_SSH_CMD"
Member

Why do you echo an SSH command if it has a license / authorization ID in it? I would skip this echo if it's going to expose license information, because the echo can't be hidden by setting silent: true.

Member Author

This has been done.

Comment on lines 608 to 609
echo "Installing {{.APP_NAME}}..."
sudo ./{{.APP_NAME}} install --license license.yaml --admin-console-password {{.ADMIN_CONSOLE_PASSWORD}} --yes
Member

The GCP 'setup' task doesn't actually run the install; it just makes the binary available to run.
I like the idea of this doing the whole install for you so that you can just hit the KOTS UI and do your testing. But with that, maybe:

  1. Provide an argument to skip the install in the event someone wants to run/demo the install portion. It can default to automatically installing, like you have here.
  2. The task name is very long. If it's doing everything, maybe you can just call it cmx-vm-install. I can't think of anything else we would be installing on a CMX VM other than the embedded cluster, so I think this is probably sufficient?

Member Author

Great idea. I've updated the task name and added a new SKIP_INSTALL var.
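
A skip switch of this kind can be sketched as a plain shell conditional. This is an illustration under assumptions, not the merged task: the function name `maybe_install` is hypothetical, and treating only the literal value "true" as "skip" is an assumption about how SKIP_INSTALL is interpreted.

```shell
# Hypothetical helper (not part of the PR): decide whether to run the
# install step. Only the literal value "true" skips it (an assumption);
# anything else, including no argument, installs.
maybe_install() (
  skip_install="${1:-false}"
  if [ "$skip_install" = "true" ]; then
    echo "Skipping install; the binary is downloaded and ready to run"
  else
    echo "Running install"
    # The real task would run the installer here, i.e. the
    # "sudo ./{{.APP_NAME}} install ..." line quoted in the review.
  fi
)

maybe_install         # default: installs
maybe_install true    # opt out, e.g. to demo the install by hand
```

Defaulting to "install" matches the reviewer's suggestion that the task keeps doing the full setup unless told otherwise.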

| `embedded-cluster-setup` | Sets up a Replicated embedded cluster on the GCP VM | Stage 7: Embedded Testing |
| `cmx-vm-create` | Creates a CMX VM instance using Replicated CLI | Stage 7: Embedded Testing |
| `cmx-vm-delete` | Deletes a CMX VM instance | Stage 7: Cleanup |
| `cmx-vm-embedded-cluster-setup` | Sets up a Replicated embedded cluster on the CMX VM | Stage 7: Embedded Testing |
Member

See above on a more concise name for this last one.

Member Author

done

@chris-sanders chris-sanders marked this pull request as ready for review May 20, 2025 19:51
@nvanthao nvanthao requested a review from chris-sanders May 21, 2025 00:12
cmx-vm-create:
  desc: Create a CMX VM instance using Replicated CLI
  run: once
  silent: false
Member

Any reason for not silencing this task?

cmx-vm-delete:
  desc: Delete a CMX VM instance
  silent: false
Member

Maybe silence this?

cmx-vm-install:
  desc: Download and install the app as Embedded Cluster on CMX VM
  silent: false
Member

Would be good to silence this. Logs from the command are quite noisy by themselves.

if [ -n "{{.CMX_VM_PUBLIC_KEY}}" ]; then
  SSH_BASE_CMD="$SSH_BASE_CMD -i {{.CMX_VM_PUBLIC_KEY}}"
fi
VM_SSH_CMD=$(replicated vm ls --output=json | jq -r ".[] | select(.name == \"{{.CMX_VM_NAME}}\") | \"$SSH_BASE_CMD -p \(.direct_ssh_port) {{.CMX_VM_USER}}@\(.direct_ssh_endpoint)\"")
Member

For future travellers: using replicated vm ssh-endpoint does not yet work if your Replicated token is a service token. Prefer this vm ls | jq ... one-liner.

Member

Oh, that's interesting. Is that a known limitation? Is there a bug shortcut for that with CMX?

Member Author

I tagged @chris-sanders in the Slack thread discussing this.

@nvanthao nvanthao force-pushed the gerard/122893/cmx-vm branch from b44e1e8 to 9e54e97 Compare May 21, 2025 21:21
@nvanthao nvanthao requested a review from banjoh May 21, 2025 21:28

@chris-sanders chris-sanders left a comment

All of my concerns were addressed

@nvanthao nvanthao merged commit 0d98776 into main May 23, 2025
1 check passed
@nvanthao nvanthao deleted the gerard/122893/cmx-vm branch May 23, 2025 00:45