Commit b6e9ed0

Close failed SSH handshakes to prevent socket buildup (#992)
Investigation:
- Worker heartbeats failed while the host showed heavy TCP retries.
- Checked TCP state counts:
  netstat -anp tcp | awk '{print }' | sort | uniq -c | sort -nr | head
- Identified top TIME_WAIT/SYN_SENT destinations:
  netstat -anp tcp | awk '=="TIME_WAIT"{print }' | sort | uniq -c | sort -nr | head
  netstat -anp tcp | awk '=="SYN_SENT"{print }' | sort | uniq -c | sort -nr | head
- Verified ephemeral port range:
  sysctl net.inet.ip.portrange.hifirst net.inet.ip.portrange.hilast
- Checked per-process socket usage:
  lsof -nP -iTCP | awk '{print }' | sort | uniq -c | sort -nr | head

Findings:
- Tart isolation retries SSH connections in a tight loop.
- When the TCP dial succeeds but the SSH handshake fails, the net.Conn was left open.

Fix:
- Close net.Conn on handshake failure so each retry releases its socket.
1 parent 978c2c6 commit b6e9ed0

1 file changed, +1 -0 lines changed


internal/executor/instance/persistentworker/remoteagent/remoteagent.go

Lines changed: 1 addition & 0 deletions
@@ -311,6 +311,7 @@ func WaitForSSH(
 
 		sshConn, chans, reqs, err = ssh.NewClientConn(netConn, addr, sshConfig)
 		if err != nil {
+			_ = netConn.Close()
 			err := fmt.Errorf("%w: failed to connect via SSH: %v", ErrFailed, err)
 
 			logger.Debugf("failed to perform SSH handshake with %s: %v", addr, err)
