
docs: add documentation for provision and validate commands #618

Closed
ArangoGutierrez wants to merge 3 commits into NVIDIA:main from ArangoGutierrez:feat/docs-new-commands

Conversation

@ArangoGutierrez
Collaborator

Summary

Add documentation for the new provision and validate CLI commands.

Changes

New Files

| File | Lines | Description |
|------|-------|-------------|
| docs/commands/provision.md | 105 | Provision command docs |
| docs/commands/validate.md | 108 | Validate command docs |

Updated Files

| File | Changes |
|------|---------|
| docs/commands/README.md | Added new commands to list |

Documentation Content

provision.md

  • Instance mode usage
  • SSH mode usage
  • All flags documented
  • Examples for common scenarios
  • Related commands

validate.md

  • Validation checks explained
  • Exit codes documented
  • Error messages
  • Troubleshooting guide
  • Examples

Test plan

  • Markdown lint check
  • Links verified

Adds a new `holodeck validate -f env.yaml` command that checks:
- Environment file is valid YAML
- Required fields are present (provider, keyName, region, etc.)
- SSH private/public keys are readable
- AWS credentials are configured (for AWS provider)
- Component dependencies (runtime required for toolkit/k8s)

Refs: NVIDIA#563
Task: 1/2
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

Adds a new `holodeck provision` command with two modes:

1. Instance mode: Provision/re-provision an existing instance by ID
   `holodeck provision abc123`

2. SSH mode: Provision a remote host directly without an instance
   `holodeck provision --ssh --host 1.2.3.4 --key ~/.ssh/id_rsa -f env.yaml`

Features:
- Re-runs idempotent provisioning scripts safely
- Supports kubeconfig download with -k flag
- Works with both single-node and multinode clusters

Refs: NVIDIA#563
Task: 2/2
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
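
The two modes described in the commit message above could be told apart along these lines. This is a sketch under assumptions: `provisionOpts` and `selectMode` are hypothetical names, and the rule that SSH mode requires all of `--host`, `--key`, and `-f` is inferred from the example invocation rather than confirmed by the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// provisionOpts mirrors the flags named in the commit message
// (--ssh, --host, --key, -f). The struct itself is hypothetical.
type provisionOpts struct {
	SSH     bool
	Host    string
	KeyPath string
	EnvFile string
}

// selectMode returns "ssh" or "instance" based on the flags and positional
// arguments, or an error when the combination is inconsistent.
func selectMode(opts provisionOpts, args []string) (string, error) {
	if opts.SSH {
		// Assumption: SSH mode needs a host, a key, and an environment file.
		if opts.Host == "" || opts.KeyPath == "" || opts.EnvFile == "" {
			return "", errors.New("ssh mode requires --host, --key, and -f")
		}
		if len(args) != 0 {
			return "", errors.New("ssh mode does not take an instance ID")
		}
		return "ssh", nil
	}
	// Instance mode takes exactly one positional argument: the instance ID.
	if len(args) != 1 {
		return "", errors.New("instance mode requires exactly one instance ID")
	}
	return "instance", nil
}

func main() {
	mode, err := selectMode(provisionOpts{}, []string{"abc123"})
	fmt.Println(mode, err)

	mode, err = selectMode(provisionOpts{SSH: true, Host: "1.2.3.4"}, nil)
	fmt.Println(mode, err)
}
```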
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Copilot AI review requested due to automatic review settings February 4, 2026 20:30
@coveralls

Pull Request Test Coverage Report for Build 21687275481

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 44.937%

Totals
Change from base Build 21674359722: 0.0%
Covered Lines: 2006
Relevant Lines: 4464

💛 - Coveralls


Copilot AI left a comment


Pull request overview

This PR adds documentation for two new CLI commands: provision and validate. The provision command allows re-provisioning of existing instances with idempotent templates, supporting both instance mode and SSH mode. The validate command performs pre-flight checks on environment files to catch configuration errors before instance creation.

Changes:

  • Added comprehensive documentation for provision and validate commands with usage examples, flags, and error messages
  • Updated command README to include the new commands
  • Added command implementations for provision and validate
  • Modified main.go to register new commands (but includes unrelated command imports)

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 12 comments.

Show a summary per file
| File | Description |
|------|-------------|
| docs/commands/provision.md | Complete documentation for provision command including instance mode, SSH mode, examples, and error messages |
| docs/commands/validate.md | Comprehensive documentation for validate command covering validation checks, exit codes, and sample output |
| docs/commands/README.md | Updated command list to include provision and validate with usage examples |
| cmd/cli/provision/provision.go | Implementation of provision command supporting both instance and SSH modes |
| cmd/cli/validate/validate.go | Implementation of validate command with comprehensive validation checks |
| cmd/cli/main.go | Registered new commands but also includes imports for non-existent commands (describe, get, scp, ssh, update) |

"github.com/NVIDIA/holodeck/cmd/cli/scp"
"github.com/NVIDIA/holodeck/cmd/cli/ssh"
"github.com/NVIDIA/holodeck/cmd/cli/status"
"github.com/NVIDIA/holodeck/cmd/cli/update"
Copilot AI Feb 4, 2026

This imports an "update" command that doesn't exist in the cmd/cli directory. This will cause a compilation error. Only the provision and validate commands are added in this PR according to the description.

Suggested change
"github.com/NVIDIA/holodeck/cmd/cli/update"

list.NewCommand(log),
oscmd.NewCommand(log),
provision.NewCommand(log),
scp.NewCommand(log),
Copilot AI Feb 4, 2026

This registers an "scp" command that doesn't exist. Remove this line since only provision and validate commands are added in this PR.

Comment on lines +311 to +320
// getKubeconfigPath returns the path to save kubeconfig
func getKubeconfigPath(instanceID string) string {
	homeDir, err := os.UserHomeDir()
	if err != nil {
		return fmt.Sprintf("kubeconfig-%s", instanceID)
	}
	kubeDir := filepath.Join(homeDir, ".kube")
	_ = os.MkdirAll(kubeDir, 0755)
	return filepath.Join(kubeDir, fmt.Sprintf("config-%s", instanceID))
}
Copilot AI Feb 4, 2026

The getKubeconfigPath function is defined but never used in the code. Consider removing it if it's not needed, or use it to generate kubeconfig paths consistently.

Comment on lines 25 to 126
@@ -63,9 +70,22 @@ Examples:
# List all environments
holodeck list

# List environments in JSON format
holodeck list -o json

# Get status of a specific environment
holodeck status <instance-id>

# SSH into an instance
holodeck ssh <instance-id>

# Run a command on an instance
holodeck ssh <instance-id> -- nvidia-smi

# Copy files to/from an instance
holodeck scp ./local-file.txt <instance-id>:/remote/path/
holodeck scp <instance-id>:/remote/file.log ./local/

# Delete an environment
holodeck delete <instance-id>

@@ -93,10 +113,17 @@ Examples:
cleanup.NewCommand(log),
create.NewCommand(log),
delete.NewCommand(log),
describe.NewCommand(log),
dryrun.NewCommand(log),
get.NewCommand(log),
list.NewCommand(log),
oscmd.NewCommand(log),
provision.NewCommand(log),
scp.NewCommand(log),
ssh.NewCommand(log),
status.NewCommand(log),
update.NewCommand(log),
validate.NewCommand(log),
Copilot AI Feb 4, 2026

The PR description states that only provision and validate commands are being added, but this diff includes imports and registrations for describe, get, scp, ssh, and update commands as well. These additional commands don't exist in the codebase and will cause compilation failures. Either these commands should be added in this PR, or these lines should be removed to match the PR description.

"github.com/NVIDIA/holodeck/cmd/cli/delete"
"github.com/NVIDIA/holodeck/cmd/cli/describe"
"github.com/NVIDIA/holodeck/cmd/cli/dryrun"
"github.com/NVIDIA/holodeck/cmd/cli/get"
Copilot AI Feb 4, 2026

This imports a "get" command that doesn't exist in the cmd/cli directory. This will cause a compilation error. Only the provision and validate commands are added in this PR according to the description.

Suggested change
"github.com/NVIDIA/holodeck/cmd/cli/get"

oscmd.NewCommand(log),
provision.NewCommand(log),
scp.NewCommand(log),
ssh.NewCommand(log),
Copilot AI Feb 4, 2026

This registers an "ssh" command that doesn't exist. Remove this line since only provision and validate commands are added in this PR.

scp.NewCommand(log),
ssh.NewCommand(log),
status.NewCommand(log),
update.NewCommand(log),
Copilot AI Feb 4, 2026

This registers an "update" command that doesn't exist. Remove this line since only provision and validate commands are added in this PR.

"github.com/NVIDIA/holodeck/cmd/cli/cleanup"
"github.com/NVIDIA/holodeck/cmd/cli/create"
"github.com/NVIDIA/holodeck/cmd/cli/delete"
"github.com/NVIDIA/holodeck/cmd/cli/describe"
Copilot AI Feb 4, 2026

This imports a "describe" command that doesn't exist in the cmd/cli directory. This will cause a compilation error. Only the provision and validate commands are added in this PR according to the description.

Suggested change
"github.com/NVIDIA/holodeck/cmd/cli/describe"

"github.com/NVIDIA/holodeck/cmd/cli/list"
oscmd "github.com/NVIDIA/holodeck/cmd/cli/os"
"github.com/NVIDIA/holodeck/cmd/cli/provision"
"github.com/NVIDIA/holodeck/cmd/cli/scp"
Copilot AI Feb 4, 2026

This imports an "scp" command that doesn't exist in the cmd/cli directory. This will cause a compilation error. Only the provision and validate commands are added in this PR according to the description.

oscmd "github.com/NVIDIA/holodeck/cmd/cli/os"
"github.com/NVIDIA/holodeck/cmd/cli/provision"
"github.com/NVIDIA/holodeck/cmd/cli/scp"
"github.com/NVIDIA/holodeck/cmd/cli/ssh"
Copilot AI Feb 4, 2026

This imports an "ssh" command that doesn't exist in the cmd/cli directory. This will cause a compilation error. Only the provision and validate commands are added in this PR according to the description.

Suggested change
"github.com/NVIDIA/holodeck/cmd/cli/ssh"

@ArangoGutierrez
Collaborator Author

Closing: this PR documents the provision and validate commands, which are not on main. PR #621 merged the CLI CRUD operations without these commands.

This PR can be reopened if/when provision and validate commands are added in a future PR.

3 participants