Description
Steps to reproduce
Abruptly bring down the otel collector while clients are exporting in a high-load setup.
What is the expected behavior?
Clients handle the export failure and return instead of blocking.
What is the actual behavior?
The export call hangs for 30+ seconds. We can see one connection stuck half-open (still ESTABLISHED on the client side) and one stuck in connect (SYN_SENT). Note that the server has been gone for some time at this point.
netstat -nap | grep 4318
tcp 0 0 192.168.67.174:51510 192.168.10.101:4318 ESTABLISHED 23/smfcc
tcp 0 1 192.168.67.174:43912 192.168.10.101:4318 SYN_SENT -
Additional context
We suspect this hang occurs because otel does not provide a way to enable keepalives on client connections (CURLOPT_TCP_KEEPALIVE). We have set the timeout (CURLOPT_TIMEOUT) to 5s, and ideally keepalive probing would then run at a 1s interval, but otel doesn't have an option to set that.
Request: provide a configuration option for TCP keepalive.
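For illustration, here is a minimal sketch of what the requested setting amounts to at the socket level on Linux. It is not otel or libcurl code: `enable_tcp_keepalive` is a hypothetical helper name, and the parameter values are assumptions chosen to match the 1s figure above. In libcurl the corresponding options are CURLOPT_TCP_KEEPALIVE, CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of otel or libcurl): turn on TCP keepalive
 * probing for an already-created socket, using the Linux socket options.
 *   idle_s  - seconds the connection must be idle before the first probe
 *   intvl_s - seconds between successive probes
 *   cnt     - unanswered probes after which the kernel drops the connection
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_tcp_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
    int on = 1;

    /* Enable keepalive on the socket itself. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) != 0)
        return -1;
    /* Idle time before the first probe is sent. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) != 0)
        return -1;
    /* Interval between probes. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof intvl_s) != 0)
        return -1;
    /* Number of unanswered probes tolerated. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof cnt) != 0)
        return -1;
    return 0;
}
```

A call such as `enable_tcp_keepalive(fd, 1, 1, 3)` would start probing after 1s of idle time and give up after three unanswered probes. Keepalive probes only apply to an established, idle connection, so the SYN_SENT case in the netstat output would presumably need a connect timeout (CURLOPT_CONNECTTIMEOUT in libcurl) rather than keepalives.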