fix(aws): fail instead of falling back to 0.0.0.0/0 for security group #650

Merged
ArangoGutierrez merged 1 commit into NVIDIA:main from ArangoGutierrez:fix/security-group-fallback
Feb 13, 2026

Conversation

@ArangoGutierrez
Collaborator

Summary

  • Remove dangerous 0.0.0.0/0 fallback when public IP detection fails
  • Now returns a clear error asking user to set ingressCidr explicitly

Audit Finding

Changes

  • pkg/provider/aws/create.go: Replace 0.0.0.0/0 fallback with error return

Test plan

  • gofmt — no formatting issues
  • go build — compiles
  • go test ./pkg/... — all tests pass

When public IP auto-detection failed, SSH and K8s API ports were
opened to the entire internet. Now fail with a clear error message
asking the user to set the CIDR explicitly.
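
The fail-closed behavior described above can be sketched as follows. This is a minimal illustration, not the code in pkg/provider/aws/create.go: `ingressCIDR`, `detectOK`, and `detectFails` are hypothetical names, and the assumption that detection yields a CIDR string plus an error is ours.

```go
package main

import (
	"errors"
	"fmt"
)

// detectOK and detectFails are hypothetical stand-ins for utils.GetIPAddress.
// The return shape (a CIDR string plus an error) is an assumption.
func detectOK() (string, error)    { return "203.0.113.7/32", nil }
func detectFails() (string, error) { return "", errors.New("lookup timed out") }

// ingressCIDR returns the detected CIDR, or an error when detection fails.
// It never substitutes 0.0.0.0/0.
func ingressCIDR(detect func() (string, error)) (string, error) {
	cidr, err := detect()
	if err != nil {
		return "", fmt.Errorf("could not detect public IP for security group (set ingressCidr explicitly): %w", err)
	}
	return cidr, nil
}

func main() {
	for _, detect := range []func() (string, error){detectOK, detectFails} {
		cidr, err := ingressCIDR(detect)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("allowing ingress from", cidr)
	}
}
```

Wrapping with `%w` keeps the underlying detection error available to callers through `errors.Is` and `errors.As`.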

Audit finding NVIDIA#34 (LOW).

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Copilot AI review requested due to automatic review settings February 12, 2026 20:06

Copilot AI left a comment

Pull request overview

This pull request aims to improve security by removing a dangerous fallback to 0.0.0.0/0 when public IP detection fails during AWS security group creation. Instead of silently falling back to allowing traffic from anywhere, the code now returns an error asking users to explicitly set ingressIpRanges.

Changes:

  • Removed 0.0.0.0/0 fallback when public IP detection fails in AWS security group creation
  • Changed from warning + fallback to immediate error return
  • Updated comment to reflect that only auto-detected IP (not fallback) is added

Comment on lines +355 to 358
publicIP, err := utils.GetIPAddress()
if err != nil {
	return fmt.Errorf("could not detect public IP for security group (set ingressCidr explicitly): %w", err)
}

Copilot AI Feb 12, 2026

This change breaks a valid use case where users provide explicit ingressIpRanges. According to the documentation (docs/guides/ip-detection.md and docs/commands/create.md), when users provide ingressIpRanges, the detected IP should be added in addition to those ranges. However, if IP detection fails (e.g., due to corporate firewall or proxy), the code will now error out even when the user has provided valid ingressIpRanges.

The fix should only fail when BOTH conditions are true: (1) IP detection fails AND (2) ingressIpRanges is empty. Otherwise, the code should continue with the user-provided ranges.

Suggested logic:

  • Try to detect public IP
  • If detection succeeds, add it to ipRanges
  • If detection fails AND len(p.Spec.IngressIpRanges) == 0, return error
  • If detection fails BUT user provided ingressIpRanges, log a warning and continue
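
The suggested logic could look roughly like the sketch below. It is not the project's code: `resolveIngress`, `printWarn`, and the detector stand-ins are hypothetical, and the warning callback only approximates a logger such as `p.log.Warning`.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for utils.GetIPAddress; the signature is assumed.
func detectOK() (string, error)    { return "203.0.113.7/32", nil }
func detectFails() (string, error) { return "", errors.New("lookup timed out") }

// printWarn is a hypothetical replacement for the provider's logger.
func printWarn(format string, args ...any) {
	fmt.Printf("warning: "+format+"\n", args...)
}

// resolveIngress returns the ranges to authorize. It fails only when
// detection fails AND the user supplied no ranges of their own.
func resolveIngress(userRanges []string, detect func() (string, error), warn func(string, ...any)) ([]string, error) {
	// Copy so the caller's slice is never modified.
	ranges := append([]string(nil), userRanges...)

	cidr, err := detect()
	switch {
	case err == nil:
		// Detection worked: add the detected address to the user ranges.
		ranges = append(ranges, cidr)
	case len(userRanges) == 0:
		// Nothing to fall back on, so refuse rather than open 0.0.0.0/0.
		return nil, fmt.Errorf("could not detect public IP for security group (set ingressIpRanges explicitly): %w", err)
	default:
		// User ranges exist: warn and continue without the detected address.
		warn("could not detect public IP, using %d user-provided range(s): %v", len(userRanges), err)
	}
	return ranges, nil
}

func main() {
	ranges, err := resolveIngress([]string{"10.0.0.0/8"}, detectFails, printWarn)
	fmt.Println(ranges, err)

	_, err = resolveIngress(nil, detectFails, printWarn)
	fmt.Println(err)
}
```

Passing the detector and the warning function as parameters keeps the three branches testable without network access.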

p.log.Warning("Could not detect public IP, using 0.0.0.0/0: %v", err)
publicIP, err := utils.GetIPAddress()
if err != nil {
return fmt.Errorf("could not detect public IP for security group (set ingressCidr explicitly): %w", err)

Copilot AI Feb 12, 2026

The error message references "ingressCidr" but the actual field name in the API is "ingressIpRanges" (as seen in api/holodeck/v1alpha1/types.go line 108). This inconsistency could confuse users. The error message should use the correct field name from the API spec.

Suggested change
- return fmt.Errorf("could not detect public IP for security group (set ingressCidr explicitly): %w", err)
+ return fmt.Errorf("could not detect public IP for security group (set ingressIpRanges explicitly): %w", err)

} else {
p.log.Warning("Could not detect public IP, using 0.0.0.0/0: %v", err)
publicIP, err := utils.GetIPAddress()
if err != nil {

Copilot AI Feb 12, 2026

Missing cancelLoading call before returning error. Following the codebase pattern (seen throughout this file and in pkg/provider/aws/cluster.go:222, pkg/provider/aws/nlb.go:68, etc.), errors that occur during a loading operation should call cancelLoading(logger.ErrLoadingFailed) before returning. The loading spinner was started at line 326 but is not properly cancelled when IP detection fails.

Suggested change
- if err != nil {
+ if err != nil {
+ 	cancelLoading(logger.ErrLoadingFailed)
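
In context, the pattern might look like the sketch below. The names are hypothetical: `errLoadingFailed` stands in for `logger.ErrLoadingFailed`, `createSecurityGroup` is a simplified placeholder, and the assumptions that the cancel function takes an error and is called with nil on success are ours.

```go
package main

import (
	"errors"
	"fmt"
)

// errLoadingFailed stands in for logger.ErrLoadingFailed (assumed to be an error value).
var errLoadingFailed = errors.New("loading failed")

// Hypothetical stand-ins for utils.GetIPAddress; the signature is assumed.
func detectOK() (string, error)    { return "203.0.113.7/32", nil }
func detectFails() (string, error) { return "", errors.New("lookup timed out") }

// createSecurityGroup is a simplified placeholder for the real function.
// cancelLoading is assumed to stop the spinner and record why it stopped.
func createSecurityGroup(detect func() (string, error), cancelLoading func(error)) error {
	cidr, err := detect()
	if err != nil {
		// Stop the spinner on every error path before returning.
		cancelLoading(errLoadingFailed)
		return fmt.Errorf("could not detect public IP for security group: %w", err)
	}

	fmt.Println("would authorize ingress from", cidr)
	cancelLoading(nil) // assumed convention: nil signals success
	return nil
}

func main() {
	cancel := func(cause error) { fmt.Println("spinner stopped, cause:", cause) }
	fmt.Println(createSecurityGroup(detectFails, cancel))
}
```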

Comment on lines +355 to +357
publicIP, err := utils.GetIPAddress()
if err != nil {
	return fmt.Errorf("could not detect public IP for security group (set ingressCidr explicitly): %w", err)

Copilot AI Feb 12, 2026

The PR description references "Audit Finding #34 (LOW)" suggesting a security audit, but the linked PR #34 is about bumping golang.org/x/net dependency, not a security audit of this code. This appears to be an error in the PR description - either the wrong issue/PR was referenced, or the audit finding number is incorrect.

@coveralls

Pull Request Test Coverage Report for Build 21962394536

Details

  • 5 of 6 (83.33%) changed or added relevant lines in 1 file are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.01%) to 47.511%

Changes Missing Coverage
File                          Covered Lines   Changed/Added Lines   %
pkg/provider/aws/create.go    5               6                     83.33%

Files with Coverage Reduction
File                          New Missed Lines   %
pkg/provider/aws/create.go    1                  88.42%

Totals
Change from base Build 21955389842: 0.01%
Covered Lines: 2501
Relevant Lines: 5264

💛 - Coveralls

@ArangoGutierrez ArangoGutierrez merged commit f9185f2 into NVIDIA:main Feb 13, 2026
25 checks passed

3 participants