[filebeat][fix] propagate Start() errors from beat receivers #49757
barkhayot wants to merge 5 commits into elastic:main
barkhayot commented Mar 28, 2026
- When storage is configured for the Filebeat OTel receiver but the corresponding extension is not present on the host, the startup failure was silently swallowed. Both filebeatReceiver and metricbeatReceiver called BeatReceiver.Start() inside a goroutine, so any error (including the missing-extension error) was only logged and never propagated. As a result, Start() always returned nil and a misconfigured deployment appeared healthy at startup. See issue: [bug-hunter] Filebeat receiver Start returns nil when configured storage extension is missing #49617
- Split BeatReceiver.Start() into two methods:
  - Setup(host): runs all initialization synchronously (diagnostic hooks, storage extension lookup, metric reporter, stop callback) and returns errors immediately
  - Run(): calls beater.Run() and is meant to be executed in a goroutine
- Both filebeatReceiver and metricbeatReceiver now call Setup() synchronously before spawning the goroutine for Run(), ensuring startup errors are surfaced to the OTel collector as hard failures.
- Add suggested test from issue.
|
💚 CLA has been signed |
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
|
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
The BeatReceiver lifecycle was split: host-dependent initialization moved into Setup(host), which returns initialization errors immediately, and the main execution moved into Run(). The Filebeat and Metricbeat receivers now call Setup before spawning their goroutines and call Run inside the goroutine; they no longer start the goroutine when Setup fails. libbeat's BeatReceiver replaced Start with Setup and Run and added a groupReporter field. A test was added that verifies Start fails when a required storage extension is missing.
6e1d893 to 9fe11f5
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@x-pack/libbeat/cmd/instance/receiver.go`:
- Around line 121-125: The code unconditionally dereferences br.groupReporter in
the Run() error path; update the Run error handling to guard against nil by
checking br.groupReporter != nil before calling br.groupReporter.UpdateStatus
(or any method on it) so beat receivers that don't implement
cfgfile.WithOtelFactoryWrapper won't panic; locate the Run error handling in the
same receiver (method Run on BeatReceiver) and wrap the UpdateStatus call with a
nil-check (or alternative safe path such as logging the failure) to avoid
panics.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 96dd7d48-5cac-43fd-a977-457e0ef34e9e
📒 Files selected for processing (4)
- x-pack/filebeat/fbreceiver/receiver.go
- x-pack/filebeat/fbreceiver/receiver_test.go
- x-pack/libbeat/cmd/instance/receiver.go
- x-pack/metricbeat/mbreceiver/receiver.go
🚧 Files skipped from review as they are similar to previous changes (1)
- x-pack/filebeat/fbreceiver/receiver.go
|
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |