Commit 530c20a

2ZZ and Ian Driver authored and committed
Fix #4396: Add timeout to establish_connection to prevent infinite loop (#5104)
**Which issue(s) this PR fixes**: Fixes #4396

**What this PR does / why we need it**: Adds a timeout mechanism to the `establish_connection` method to prevent an infinite loop when the handshake protocol gets stuck. In unstable network environments with proxy components, if the connection drops during the handshake after TLS establishment, Fluentd gets stuck in an infinite loop, which stops logs from being flushed. This fix uses the existing `hard_timeout` configuration to break the loop, disable problematic nodes, and maintain log flow through healthy nodes.

**Docs Changes**: None required; the fix uses the existing `hard_timeout` configuration parameter.

**Release Note**: Fix an infinite loop in the out_forward handshake protocol that could cause logs to stop being flushed in unstable network environments.

Signed-off-by: Ian Driver <[email protected]>
Co-authored-by: Ian Driver <[email protected]>
Signed-off-by: Shizuo Fujita <[email protected]>
1 parent 5114da6 commit 530c20a

2 files changed: +33 -0 lines changed


lib/fluent/plugin/out_forward.rb

Lines changed: 10 additions & 0 deletions
@@ -610,7 +610,17 @@ def verify_connection
       end
 
       def establish_connection(sock, ri)
+        start_time = Fluent::Clock.now
+        timeout = @sender.hard_timeout
+
         while ri.state != :established
+          # Check for timeout to prevent infinite loop
+          if Fluent::Clock.now - start_time > timeout
+            @log.warn "handshake timeout after #{timeout}s", host: @host, port: @port
+            disable!
+            break
+          end
+
           begin
             # TODO: On Ruby 2.2 or earlier, read_nonblock doesn't work expectedly.
             # We need rewrite around here using new socket/server plugin helper.
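The deadline check added above can be illustrated outside Fluentd with a small standalone sketch. It assumes that `Fluent::Clock.now` behaves like a monotonic clock and substitutes Ruby's `Process.clock_gettime`; the helper names `monotonic_now` and `wait_until_established` are hypothetical and do not exist in Fluentd.

```ruby
# Hypothetical stand-in for the bounded handshake loop; not Fluentd's code.

# Monotonic time in seconds, unaffected by wall-clock adjustments.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Calls the block repeatedly until it returns :established or until more
# than `timeout` seconds have elapsed. Returns true on success and false
# on timeout, so the caller can decide how to react (for example, by
# disabling the node).
def wait_until_established(timeout, interval: 0.01)
  start_time = monotonic_now
  loop do
    return true if yield == :established
    return false if monotonic_now - start_time > timeout
    sleep interval # avoid spinning while the state does not change
  end
end

# Usage: a handshake that progresses through its states succeeds...
states = [:helo, :pingpong, :established]
wait_until_established(1.0) { states.shift }   # => true

# ...while one that never leaves :helo gives up after the deadline.
wait_until_established(0.05) { :helo }         # => false
```

Measuring elapsed time against a monotonic clock, rather than counting iterations, keeps the bound meaningful even when each pass through the loop takes a variable amount of time.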

test/plugin/test_out_forward.rb

Lines changed: 23 additions & 0 deletions
@@ -1347,4 +1347,27 @@ def plugin_id_for_test?
       end
     end
   end
+
+  test 'establish_connection_timeout' do
+    @d = d = create_driver(%[
+      hard_timeout 1
+      <server>
+        host #{TARGET_HOST}
+        port #{@target_port}
+      </server>
+    ])
+
+    node = d.instance.nodes.first
+    mock_sock = flexmock('socket')
+    mock_sock.should_receive(:read_nonblock).with(512).and_return('').at_least.once
+
+    ri = Fluent::Plugin::ForwardOutput::ConnectionManager::RequestInfo.new(:helo)
+
+    assert_true node.available?
+    node.establish_connection(mock_sock, ri)
+    assert_false node.available?
+
+    logs = d.logs
+    assert{ logs.any?{|log| log.include?('handshake timeout after 1.0s') } }
+  end
 end
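The scenario this test exercises can be modelled without Fluentd or flexmock, as a rough sketch. `StuckSocket` and `TinyNode` are hypothetical classes written only for illustration; the real node parses handshake messages, which is reduced here to a placeholder. The timeout is stored as a Float on the assumption that this is why the test expects `1.0s` in the log for `hard_timeout 1`.

```ruby
# Hypothetical miniature of the tested scenario; none of this is Fluentd code.

# A socket whose reads always return an empty string, so the handshake
# state can never advance on its own.
class StuckSocket
  def read_nonblock(_maxlen)
    ''
  end
end

class TinyNode
  attr_reader :warnings

  def initialize(hard_timeout)
    @hard_timeout = hard_timeout.to_f # assumption: timeout is held as a Float
    @available = true
    @warnings = []
  end

  def available?
    @available
  end

  def establish_connection(sock, state)
    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    while state != :established
      elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time
      if elapsed > @hard_timeout
        @warnings << "handshake timeout after #{@hard_timeout}s"
        @available = false # mirrors disable! in the patch
        break
      end

      buf = sock.read_nonblock(512)
      if buf.empty?
        sleep 0.01 # nothing to parse yet
        next
      end
      state = :established # placeholder for real handshake parsing
    end
  end
end

node = TinyNode.new(0.1)
node.establish_connection(StuckSocket.new, :helo)
node.available?  # => false
node.warnings    # => ["handshake timeout after 0.1s"]
```

Without the elapsed-time check, the `while` loop in this model never terminates for `StuckSocket`, which is the behavior reported in #4396.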
