Skip to content

[ni] harden netlink namespace handling with better error handling and cleanup#709

Open
notandy wants to merge 2 commits intomainfrom
harden-ni
Open

[ni] harden netlink namespace handling with better error handling and cleanup#709
notandy wants to merge 2 commits intomainfrom
harden-ni

Conversation

@notandy
Copy link
Collaborator

@notandy notandy commented Mar 23, 2026

Improve reliability and error handling in the Network Injection agent's netlink namespace management:

  • Add isLocked flag tracking to prevent thread lock leaks
  • Introduce lockThread/unlockThread helper methods for atomic lock state management
  • Add comprehensive error wrapping throughout netlink operations for better debugging
  • Implement cleanupFailedNamespace() for proper resource cleanup on failures
  • Add validation to ensure veth pairs exist even when namespace exists (fixes broken namespace state)
  • Ensure proper thread unlock on all error paths in EnableNetworkNamespace
  • Close origin namespace handle in DisableNetworkNamespace to prevent leaks
  • Add safety checks in Close() to prevent double-close
  • Refactor DeleteNetworkNamespace to extract deleteNamespace() helper
  • Fix deferred namespace disable in EnableInjection with proper error logging
  • Return errors from haproxy.AddInstance instead of swallowing them
  • Add "fmt" import to openstack.go for error wrapping

… cleanup

Improve reliability and error handling in the Network Injection agent's netlink namespace management:

- Add isLocked flag tracking to prevent thread lock leaks
- Introduce lockThread/unlockThread helper methods for atomic lock state management
- Add comprehensive error wrapping throughout netlink operations for better debugging
- Implement cleanupFailedNamespace() for proper resource cleanup on failures
- Add validation to ensure veth pairs exist even when namespace exists (fixes broken namespace state)
- Ensure proper thread unlock on all error paths in EnableNetworkNamespace
- Close origin namespace handle in DisableNetworkNamespace to prevent leaks
- Add safety checks in Close() to prevent double-close
- Refactor DeleteNetworkNamespace to extract deleteNamespace() helper
- Fix deferred namespace disable in EnableInjection with proper error logging
- Return errors from haproxy.AddInstance instead of swallowing them
- Add "fmt" import to openstack.go for error wrapping
@notandy notandy requested review from a team, m-kratochvil, notque and ronchi-oss as code owners March 23, 2026 20:16
Wrap existingNS.Close() in deferred function with blank identifier
to properly handle the error return value as per project conventions.
@github-actions
Copy link

Merging this branch changes the coverage (1 decrease, 1 increase)

Impacted Packages Coverage Δ 🤖
github.com/sapcc/archer/internal/agent/ni 24.86% (+1.40%) 👍
github.com/sapcc/archer/internal/agent/ni/netlink 8.28% (-5.18%) 👎

Coverage by file

Changed files (no unit tests)

Changed File Coverage Δ Total Covered Missed 🤖
github.com/sapcc/archer/internal/agent/ni/netlink/netlink_linux.go 1.27% (-0.90%) 157 (+65) 2 155 (+65) 👎
github.com/sapcc/archer/internal/agent/ni/openstack.go 77.97% (-1.28%) 531 (+54) 414 (+36) 117 (+18) 👎

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/sapcc/archer/internal/agent/ni/openstack_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant