fix(pkg): use %w for proper error wrapping#609
Conversation
Pull Request Test Coverage Report for Build 21754753844Details
💛 - Coveralls |
There was a problem hiding this comment.
Pull request overview
Updates error wrapping across pkg/ to preserve error chains (switching fmt.Errorf(...: %v, err) to %w) and also introduces new CLI surface area (new validate/provision commands and additional command wiring in cmd/cli/main.go).
Changes:
- Replace
%vwith%winfmt.Errorfcalls acrosspkg/for proper error wrapping. - Add new CLI commands:
holodeck validateandholodeck provision. - Expand CLI command registration/help text in
cmd/cli/main.go(including new, currently-missing command packages).
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/utils/kubeconfig.go | Wrap SSH/session/file transfer errors with %w for chain preservation. |
| pkg/utils/ip.go | Wrap HTTP request/response errors with %w. |
| pkg/provisioner/templates/nv-driver.go | Wrap template execution errors with %w. |
| pkg/provisioner/templates/kubernetes.go | Wrap template execution and file/working-dir errors with %w. |
| pkg/provisioner/templates/docker.go | Wrap template execution errors with %w. |
| pkg/provisioner/templates/crio.go | Wrap template execution errors with %w. |
| pkg/provisioner/templates/containerd.go | Wrap template execution errors with %w. |
| pkg/provisioner/templates/container-toolkit.go | Wrap template execution errors with %w. |
| pkg/provisioner/provisioner.go | Wrap provisioning/SSH/SFTP/template errors with %w. |
| pkg/provider/aws/dryrun.go | Wrap image-check errors with %w. |
| pkg/provider/aws/delete.go | Wrap delete/condition-update/AWS API errors with %w. |
| pkg/provider/aws/create.go | Wrap AWS resource creation/tagging/waiter errors with %w. |
| pkg/provider/aws/cluster.go | Wrap cluster-creation workflow errors with %w (including goroutine error propagation). |
| cmd/cli/validate/validate.go | New validate command for env file, required fields, SSH keys, AWS creds, component checks. |
| cmd/cli/provision/provision.go | New provision command for instance-based or direct SSH provisioning (+ kubeconfig download). |
| cmd/cli/main.go | Registers additional CLI commands and updates help/examples. |
cmd/cli/validate/validate.go
Outdated
|
|
||
| env, err := jyaml.UnmarshalFromFile[v1alpha1.Environment](m.envFile) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("invalid YAML: %v", err) |
There was a problem hiding this comment.
validateEnvFile uses fmt.Errorf("invalid YAML: %v", err), which drops the wrapped error for errors.Is/As. Use %w when returning the underlying parse error.
| return nil, fmt.Errorf("invalid YAML: %v", err) | |
| return nil, fmt.Errorf("invalid YAML: %w", err) |
cmd/cli/validate/validate.go
Outdated
| func (m *command) run() error { | ||
| results := make([]ValidationResult, 0) | ||
| hasErrors := false | ||
| hasWarnings := false | ||
|
|
||
| // 1. Validate environment file exists and is valid YAML | ||
| env, err := m.validateEnvFile() | ||
| if err != nil { | ||
| results = append(results, ValidationResult{ | ||
| Check: "Environment file", | ||
| Passed: false, | ||
| Message: err.Error(), | ||
| }) | ||
| hasErrors = true | ||
| m.printResults(results) | ||
| return fmt.Errorf("validation failed") | ||
| } | ||
| results = append(results, ValidationResult{ | ||
| Check: "Environment file", | ||
| Passed: true, | ||
| Message: "Valid YAML structure", | ||
| }) | ||
|
|
||
| // 2. Validate required fields | ||
| fieldResults := m.validateRequiredFields(env) | ||
| for _, r := range fieldResults { | ||
| results = append(results, r) | ||
| if !r.Passed { | ||
| hasErrors = true | ||
| } | ||
| } | ||
|
|
||
| // 3. Validate SSH keys | ||
| keyResults := m.validateSSHKeys(env) | ||
| for _, r := range keyResults { | ||
| results = append(results, r) | ||
| if !r.Passed { | ||
| hasErrors = true | ||
| } | ||
| } | ||
|
|
||
| // 4. Validate AWS credentials (if AWS provider) | ||
| if env.Spec.Provider == v1alpha1.ProviderAWS { | ||
| awsResult := m.validateAWSCredentials() | ||
| results = append(results, awsResult) | ||
| if !awsResult.Passed { | ||
| if strings.Contains(awsResult.Message, "warning") { | ||
| hasWarnings = true | ||
| } else { | ||
| hasErrors = true | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // 5. Validate component configuration | ||
| compResults := m.validateComponents(env) | ||
| for _, r := range compResults { | ||
| results = append(results, r) | ||
| if !r.Passed { | ||
| hasWarnings = true | ||
| } | ||
| } | ||
|
|
||
| // Print results | ||
| m.printResults(results) | ||
|
|
||
| // Determine exit status | ||
| if hasErrors { | ||
| return fmt.Errorf("validation failed with errors") | ||
| } | ||
| if hasWarnings && m.strict { | ||
| return fmt.Errorf("validation failed with warnings (strict mode)") | ||
| } | ||
|
|
||
| m.log.Info("\n✅ Validation passed") | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Most cmd/cli/* subcommands have unit tests (e.g., cmd/cli/delete/delete_test.go, cmd/cli/dryrun/dryrun_test.go). This new command introduces substantial logic but adds no tests; please add at least basic table-driven tests for required-field validation and key/AWS credential checks (mockable paths).
cmd/cli/provision/provision.go
Outdated
| Action: func(c *cli.Context) error { | ||
| if m.sshMode { | ||
| return m.runSSHMode() | ||
| } | ||
|
|
||
| if c.NArg() != 1 { | ||
| return fmt.Errorf("instance ID is required (or use --ssh mode)") | ||
| } | ||
| return m.runInstanceMode(c.Args().Get(0)) | ||
| }, | ||
| } | ||
|
|
||
| return &provisionCmd | ||
| } | ||
|
|
||
| func (m *command) runInstanceMode(instanceID string) error { | ||
| // Get instance details | ||
| manager := instances.NewManager(m.log, m.cachePath) | ||
| instance, err := manager.GetInstance(instanceID) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to get instance: %v", err) | ||
| } | ||
|
|
||
| // Load environment | ||
| env, err := jyaml.UnmarshalFromFile[v1alpha1.Environment](instance.CacheFile) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to read environment: %v", err) | ||
| } | ||
|
|
||
| m.log.Info("Provisioning instance %s...", instanceID) | ||
|
|
||
| // Run provisioning based on instance type | ||
| if env.Spec.Cluster != nil && env.Status.Cluster != nil && len(env.Status.Cluster.Nodes) > 0 { | ||
| if err := m.runClusterProvision(&env); err != nil { | ||
| return err | ||
| } | ||
| } else { | ||
| if err := m.runSingleNodeProvision(&env); err != nil { | ||
| return err | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Most cmd/cli/* subcommands have unit tests, but this new command adds significant branching behavior (instance vs SSH mode, cluster vs single-node) without tests. Add unit tests for argument/flag validation and host URL selection logic to prevent regressions.
| var err error | ||
| key, err := os.ReadFile(keyPath) // nolint:gosec | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to read key file: %v", err) | ||
| return nil, fmt.Errorf("failed to read key file: %w", err) | ||
| } |
There was a problem hiding this comment.
connectOrDie reads keyPath directly with os.ReadFile, so values like ~/.ssh/id_rsa will fail (no shell expansion). Several CLI paths/examples use ~, and cmd/cli/validate even expands it, so validation may pass but provisioning can still fail. Consider expanding ~ (and/or making the path absolute) before reading the key.
cmd/cli/main.go
Outdated
| "github.com/NVIDIA/holodeck/cmd/cli/cleanup" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/create" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/delete" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/describe" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/dryrun" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/get" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/list" | ||
| oscmd "github.com/NVIDIA/holodeck/cmd/cli/os" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/provision" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/scp" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/ssh" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/status" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/update" | ||
| "github.com/NVIDIA/holodeck/cmd/cli/validate" |
There was a problem hiding this comment.
cmd/cli/main.go imports and registers subcommands (describe/get/scp/ssh/update) that do not exist under cmd/cli/ in this PR/repo, which will break compilation due to missing packages. Either add the missing command packages or remove these imports/command registrations until they’re implemented.
| c.Commands = []*cli.Command{ | ||
| cleanup.NewCommand(log), | ||
| create.NewCommand(log), | ||
| delete.NewCommand(log), | ||
| describe.NewCommand(log), |
There was a problem hiding this comment.
PR title/description indicate a scoped change in pkg/ to improve error wrapping, but this file also adds multiple new CLI commands to the top-level app. If these CLI additions are intentional, the PR metadata should be updated; otherwise, the cmd/ changes should be split out to keep this PR focused.
cmd/cli/provision/provision.go
Outdated
| } | ||
| } | ||
|
|
||
| // Update provisioned status |
There was a problem hiding this comment.
env.Labels can be nil when unmarshaled; writing to it directly can panic. Initialize the map before setting instances.InstanceProvisionedLabelKey.
| // Update provisioned status | |
| // Update provisioned status | |
| if env.Labels == nil { | |
| env.Labels = make(map[string]string) | |
| } |
cmd/cli/provision/provision.go
Outdated
| instance, err := manager.GetInstance(instanceID) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to get instance: %v", err) | ||
| } | ||
|
|
||
| // Load environment | ||
| env, err := jyaml.UnmarshalFromFile[v1alpha1.Environment](instance.CacheFile) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to read environment: %v", err) | ||
| } |
There was a problem hiding this comment.
This file wraps underlying errors with fmt.Errorf(... %v, err) in multiple places (e.g., reading instance/env). That loses the error chain and is inconsistent with the stated goal of this PR; use %w where returning a wrapped error so callers can use errors.Is/As.
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
f914e04 to
446725d
Compare
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Summary
Replace all
fmt.Errorf("...: %v", err)withfmt.Errorf("...: %w", err)in thepkg/directory for proper error chain preservation.Motivation
Using
%vto format errors breaks the error chain, making it impossible for callers to useerrors.Is()orerrors.As()to inspect wrapped errors. This makes debugging difficult and prevents proper error handling patterns.Changes
111 instances fixed across 13 files:
pkg/provider/awscluster.gopkg/provider/awscreate.gopkg/provider/awsdelete.gopkg/provider/awsdryrun.gopkg/provisionerprovisioner.gopkg/provisioner/templatescontainer-toolkit.gopkg/provisioner/templatescontainerd.gopkg/provisioner/templatescrio.gopkg/provisioner/templatesdocker.gopkg/provisioner/templateskubernetes.gopkg/provisioner/templatesnv-driver.gopkg/utilsip.gopkg/utilskubeconfig.goWhat Changed
Not Changed
[]error) -%wonly works with single errorsfmt.Errorfcontexts (loggers, test helpers)cmd/orvendor/directoriesTest plan
go build ./pkg/...- compiles successfullygo test ./pkg/...- verify no regressions