fix(sshutil): add mutex to TOFU known_hosts to prevent race condition #644
Merged
ArangoGutierrez merged 1 commit into NVIDIA:main on Feb 13, 2026
Concurrent SSH connections (during cluster provisioning) could race on the known_hosts file read-then-write, causing duplicate entries or inconsistent state. Add a package-level mutex around the callback. Audit finding NVIDIA#13 (MEDIUM). Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Pull request overview
Adds in-process synchronization to Holodeck’s SSH TOFU host key callback to avoid concurrent read/write races on the cached known_hosts file during parallel provisioning.
Changes:
- Introduce a package-level `sync.Mutex` to serialize TOFU `known_hosts` access.
- Acquire/release the mutex at the start of the `ssh.HostKeyCallback` to prevent TOCTOU between read and append.
Comment on lines +43 to +45

    tofuMu.Lock()
    defer tofuMu.Unlock()

The new global mutex is intended to fix a concurrency race, but there’s no test exercising concurrent calls to the callback. Adding a unit test that runs many goroutines calling this callback against the same temp known_hosts path (and asserting the file ends up with a single correct entry and no errors) would prevent regressions and would fail under -race if the locking is removed/broken.
Pull Request Test Coverage Report for Build 21961219156 (Coveralls)
Summary
- Add a `sync.Mutex` to serialize access to the `known_hosts` file in the TOFU host key callback

Audit Finding
Finding #13 (MEDIUM): Concurrent SSH connections could race on the known_hosts file read-then-write, causing duplicate entries or inconsistent state.
Test plan
- Tests with `-race` flag
- `golangci-lint` clean
- `go build` compiles