fix(instances): return errors from GenerateInstanceID, validate ID format#648
Conversation
…rmat GenerateInstanceID silently returned empty string on rand failure, producing a cache file named '.yaml'. Change to return (string, error). Also validate instance IDs match hex format in GetInstanceCacheFile to prevent path traversal. Audit findings NVIDIA#8 (MEDIUM), NVIDIA#35 (LOW). Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Pull Request Test Coverage Report for Build 21961744234Details
💛 - Coveralls |
There was a problem hiding this comment.
Pull request overview
This PR hardens instance ID handling in Holodeck by making instance ID generation/reporting fail loudly on entropy errors and by validating instance IDs before constructing cache file paths (mitigating path traversal).
Changes:
- Change
GenerateInstanceIDto return(string, error)and propagatecrypto/randfailures. - Add strict instance ID format validation in
GetInstanceCacheFile(accept 8-char lowercase hex + lowercase UUID for backwards compatibility). - Update CLI create flow and instance manager methods/tests to handle the new error-returning APIs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/instances/instances.go | Add ID format regex validation, return errors from ID generation and cache-path construction, and plumb errors through instance operations. |
| internal/instances/instances_test.go | Update tests for new signatures and add validation/traversal rejection coverage. |
| cmd/cli/create/create.go | Update create command to handle (string, error) and validated cache file path creation. |
| // Test deleting instance with invalid ID format | ||
| err = manager.DeleteInstance("nonexistent") | ||
| assert.Error(t, err) |
There was a problem hiding this comment.
This test case now exercises only the invalid-ID-format path ("nonexistent" is rejected by the new format validation) and no longer verifies behavior when a valid instance ID is provided but the cache file is missing. Consider adding a separate subtest that calls DeleteInstance with a valid-looking ID that has no cache file and asserting the expected error (e.g., wraps the read-cache-file failure).
| opts.cacheFile = manager.GetInstanceCacheFile(instanceID) | ||
| instanceID, err := manager.GenerateInstanceID() | ||
| if err != nil { | ||
| return fmt.Errorf("failed to generate instance ID: %w", err) |
There was a problem hiding this comment.
GenerateInstanceID already wraps rand.Read failures with failed to generate instance ID: ..., and this call site wraps again with the same prefix, producing a duplicated error message. Consider returning the original error directly here, or wrapping with different context to avoid repetition.
| return fmt.Errorf("failed to generate instance ID: %w", err) | |
| return err |
| // GetInstanceByFilename returns details for a specific instance by its filename | ||
| func (m *Manager) GetInstanceByFilename(filename string) (*Instance, error) { | ||
| cacheFile := m.GetInstanceCacheFile(filename) | ||
| cacheFile, err := m.GetInstanceCacheFile(filename) | ||
| if err != nil { |
There was a problem hiding this comment.
The GetInstanceByFilename docstring/parameter name implies callers can pass an actual filename, but the implementation validates the value as an instance ID and appends .yaml via GetInstanceCacheFile. This will fail if a caller passes something like abcd1234.yaml. Consider clarifying in the comment (and/or renaming the parameter) that this expects the cache filename stem / instance ID without the .yaml extension.
| // Test getting instance with invalid ID format | ||
| _, err = manager.GetInstance("nonexistent") | ||
| assert.Error(t, err) |
There was a problem hiding this comment.
These assertions now only cover the invalid-ID-format path ("nonexistent" is neither 8-char hex nor UUID). This drops coverage for the "valid ID but cache file does not exist" behavior in GetInstance, which is still an important error path after the refactor. Consider adding a subtest that uses a valid-looking ID (e.g. a1b2c3d4) that has no corresponding cache file and asserting the returned error.
Summary
GenerateInstanceIDto return(string, error)instead of silently returning empty string oncrypto/randfailure (which produced a cache file named.yaml)GetInstanceCacheFileto prevent path traversal attacks (rejects../../etc/passwd, shell metacharacters, etc.)cmd/cli/create/create.goand internal methods (GetInstance,DeleteInstance,GetInstanceByFilename)Audit findings: #8 (MEDIUM), #35 (LOW)
Test plan
TestGenerateInstanceID— verifies 8-char hex output and error returnTestGetInstanceCacheFile— verifies valid hex and UUID IDs acceptedTestGetInstanceCacheFile_RejectsTraversal— verifies../../etc/passwdrejectedTestGetInstanceCacheFile_RejectsInvalidFormat— verifies empty, short, uppercase, non-hex, shell injection, path separators all rejectedgofmtcleangolangci-lint— 0 issuesgo test ./internal/instances/... ./cmd/cli/create/... -v— PASSgo build -o bin/holodeck cmd/cli/main.go— compilesgo mod tidy && go mod verify— clean