Commit 53b85c4
security: handle transient files in certificate directory loading
TestDemoLocality was failing with "no certificates found; does certs
dir exist?" errors. These errors resulted in connection failures when
nodes attempted to establish RPC connections.
Root cause: The demo cluster stores both TLS certificates and Unix socket
files (e.g., .s.PGSQL.26267) in the same directory. When loading
certificates, readDir() lists all directory entries and then calls
entry.Info() to stat each file. Between these operations, transient socket
lock files (e.g., .s.PGSQL.26267.lock.887590299) can be deleted, causing
lstat() to fail with ENOENT. This caused the entire certificate loading to
fail, even though the actual certificate files existed and were valid.
Fix: This change modifies readDir() to skip files that disappear
between the directory listing and the stat operation (a standard pattern
for handling concurrent file-system modifications).
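The skip-on-ENOENT pattern described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this commit: statSurviving and runDemo are hypothetical names, the file names are borrowed from the examples above, and the check uses the standard library's errors.Is(err, fs.ErrNotExist), which may differ from the helper the real readDir() uses.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// statSurviving is a hypothetical helper (not the actual readDir()).
// It stats each listed entry and skips entries that vanished between
// the directory listing and the stat call. Any other error is returned.
func statSurviving(entries []fs.DirEntry) ([]fs.FileInfo, error) {
	infos := make([]fs.FileInfo, 0, len(entries))
	for _, entry := range entries {
		info, err := entry.Info()
		if err != nil {
			if errors.Is(err, fs.ErrNotExist) {
				// The file was deleted after it was listed; ignore it.
				continue
			}
			return nil, err
		}
		infos = append(infos, info)
	}
	return infos, nil
}

// runDemo simulates the race: it lists a directory, deletes a transient
// lock file, and then stats the previously listed entries.
func runDemo() ([]string, error) {
	dir, err := os.MkdirTemp("", "certs")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	lock := filepath.Join(dir, ".s.PGSQL.26267.lock.887590299")
	for _, p := range []string{filepath.Join(dir, "ca.crt"), lock} {
		if err := os.WriteFile(p, []byte("x"), 0o600); err != nil {
			return nil, err
		}
	}

	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	// Remove the lock file after listing but before stat.
	if err := os.Remove(lock); err != nil {
		return nil, err
	}

	infos, err := statSurviving(entries)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(infos))
	for _, info := range infos {
		names = append(names, info.Name())
	}
	return names, nil
}

func main() {
	names, err := runDemo()
	fmt.Println(names, err)
}
```

Whether the deleted lock file shows up in the result depends on the platform: some implementations of os.ReadDir capture file info at listing time, in which case Info() does not fail. Either way, the certificate file is still returned and no error is propagated.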
Fixes cockroachdb#155255
Epic: none
Release note: None
1 parent 21b75ac commit 53b85c4
1 file changed, +6 -0 lines changed
Added lines: 72-77 in the new file (inserted after original line 71).