fix(inputs.docker): Allow Telegraf to start when Docker daemon is unavailable #18181
base: master
…vailable

This fixes a regression introduced in v1.36.3 where Telegraf would fail to start if the Docker/Podman socket was unavailable. The Start() method now logs a warning instead of returning a fatal error, and the client connection is retried lazily on each Gather() cycle.
7c2a47a to ec111c6
srebhan
left a comment
Thanks @skartikey for the PR! However, I think this should be implemented using the startup-error-behavior framework, i.e. returning a StartupError with a flag denoting that the error is retryable and letting the user decide what to do. See this example on how to implement it.
Please only mark the error as retryable if it actually is, e.g. by using a Ping and checking IsErrConnectionFailed!
…on failures

Implement the startup-error-behavior framework (TSD-006) to handle Docker daemon unavailability during startup. This allows users to configure retry behavior via the startup_error_behavior option (error, retry, ignore, probe) instead of silently logging warnings and deferring connection to the first Gather.
f24d56a to 065bc90
@srebhan Implemented the startup-error-behavior framework (TSD-006) to handle Docker daemon unavailability. Please take a look.
srebhan
left a comment
Thanks @skartikey! Some more comments...
402ca62 to dffca5d
srebhan
left a comment
Thanks @skartikey! Some more minor comments...
Summary
- … unavailability
- Ping to check Docker connectivity during Start()
- StartupError with Retry flag based on whether the error is a connection failure
- startup_error_behavior option:
  - error (default): Fail startup if Docker is unavailable
  - retry: Keep retrying connection on each gather cycle
  - ignore: Remove plugin from processing if connection fails
  - probe: Probe plugin availability before deciding

This gives users control over how Telegraf handles Docker daemon unavailability at startup, rather than silently continuing with deferred initialization.
Related issues
resolves #18089