[9.2](backport #49632) Fix check config and take over interaction #49784
mergify[bot] wants to merge 4 commits into 9.2
When using Filestream's take_over feature with autodiscover, files were being re-ingested from the beginning instead of continuing from the offset recorded by the Log input.

Autodiscover validates each rendered configuration by instantiating the input with a temporary, suffixed ID before starting it. Because take_over ran during input initialisation, states were migrated to the temporary ID rather than the real input ID. When the real input started, the Log input states had already been consumed, so all files appeared new.

The fix moves the take_over migration step from input initialisation to input start. This ensures that config validation (CheckConfig) never triggers state migration, and only the input that actually runs performs the takeover.

Additionally, the Log input state is no longer deleted from the registry after migration. Instead, Filestream checks whether it already holds a state for the file before migrating, skipping the takeover if a state is found. This makes the mechanism idempotent and removes reliance on the TTL=-2 heuristic that was used to detect previously-migrated states.

Last, but not least, a few other issues in the TakeOver implementation are also fixed:
- Incorrect resource release
- ephemeralStore is now locked throughout the whole TakeOver duration

GenAI-Assisted: Yes
Human-Reviewed: Yes
Tool: Claude-CLI, Model: Claude 4.6 Opus (Thinking)
Tool: Cursor-CLI, Model: GPT-5.3 Codex Extra High

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit 8a648cf)

# Conflicts:
# filebeat/input/filestream/internal/input-logfile/input.go
# filebeat/input/filestream/internal/input-logfile/store.go
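To make the described behaviour concrete, here is a minimal Go sketch of a takeover that runs only at input start and is idempotent. It is a toy model, not the code from this PR: `registry`, `logKey`, `fsKey`, `takeOver`, `checkConfig`, and `start` are hypothetical names, the key format is invented, and the real logic lives in the `input-logfile` package listed under the conflicts.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a toy stand-in for the Filebeat registry (key -> byte offset).
// Every name in this sketch is hypothetical, not the real input-logfile API.
type registry struct {
	mu     sync.Mutex
	states map[string]int64
}

// logKey and fsKey use an invented key format for illustration only.
func logKey(path string) string { return "log::" + path }

func fsKey(inputID, path string) string {
	return "filestream::" + inputID + "::" + path
}

// takeOver copies the Log input offset of path to the Filestream input
// inputID. The lock is held for the whole operation, the Log input state is
// left in place, and the call is a no-op when Filestream already has a state.
// It reports whether a migration took place.
func (r *registry) takeOver(inputID, path string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()

	if _, ok := r.states[fsKey(inputID, path)]; ok {
		return false // Filestream already tracks this file: nothing to do
	}
	offset, ok := r.states[logKey(path)]
	if !ok {
		return false // no Log input state: the file is genuinely new
	}
	r.states[fsKey(inputID, path)] = offset
	return true
}

// checkConfig only validates. It never touches the registry, so validating a
// configuration under a temporary ID cannot consume any state.
func checkConfig(inputID string, paths []string) error {
	if inputID == "" || len(paths) == 0 {
		return fmt.Errorf("input needs an id and at least one path")
	}
	return nil
}

// start is the only place where migration happens in this sketch.
func start(reg *registry, inputID string, paths []string) {
	for _, p := range paths {
		migrated := reg.takeOver(inputID, p)
		fmt.Printf("%s: migrated=%v\n", p, migrated)
	}
}

func main() {
	paths := []string{"/var/log/app.log"}
	reg := &registry{states: map[string]int64{logKey(paths[0]): 4096}}

	// Validation under a temporary ID, as autodiscover does before starting.
	if err := checkConfig("my-input-validation-copy", paths); err != nil {
		panic(err)
	}
	start(reg, "my-input", paths) // first start migrates the offset
	start(reg, "my-input", paths) // a restart finds the state and skips
	fmt.Println(reg.states[fsKey("my-input", paths[0])])
}
```

Because `takeOver` checks for an existing Filestream state instead of deleting the Log input state, calling it repeatedly is harmless, which is the property the description refers to as idempotent.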
Cherry-pick of 8a648cf has failed:
To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
GenAI-Assisted: Yes Human-Reviewed: Yes Tool: Cursor-CLI, Model: GPT-5.3 Codex Extra High Fast
TL;DR
All 4 failing Buildkite jobs are failing before packaging starts because
Remediation
Investigation details
Root Cause
The packaging script invokes
In each failed Buildkite log, execution reaches
This happened consistently in all 4 failing jobs:
The repository's other Buildkite pipeline definitions commonly set
Evidence
Verification
Follow-up
From workflow: PR Buildkite Detective
Proposed commit message
Checklist
- I have made corresponding changes to the documentation
- I have made corresponding change to the default configuration files
- `stresstest.sh` script to run them under stress conditions and race detector to verify their stability.
- `./changelog/fragments` using the changelog tool.

## Disruptive User Impact

## Author's Checklist

How to test this PR locally
The integration test `TestAutodiscoverFilestreamTakeOverDoesNotReingest` requires Kind (and Docker) to create a K8s cluster for testing.

Run the tests
Manual Test: Filestream take_over does not re-ingest with autodiscover
Requirements: Linux, Docker, root (needs `/var/lib/docker/containers` read access)

Start a container that writes one log line per second:
Start Filebeat with the Log input via autodiscover, pointed at the container log file:
Start Filebeat:
Wait until at least 5 events appear in the output file, then stop Filebeat. Note the line count:
Restart Filebeat with the Filestream input and `take_over: enabled: true`, using the same output file (no rotation):

Start Filebeat:
Wait until at least 2 new lines appear in the output (check with `wc -l /tmp/fb-test/output*`), confirming Filestream picked up where the Log input left off.

Stop the container and count the total lines it generated:
Wait for Filebeat to log `"File is inactive. Closing."`, then stop it and count total ingested events:

Expected result

`TOTAL_INGESTED == GENERATED`

No lines should be duplicated or missing. If `TOTAL_INGESTED > GENERATED`, re-ingestion occurred: the Filestream input restarted from offset 0 instead of continuing from where the Log input stopped.

Related issues
## Use cases

## Screenshots

## Logs

This is an automatic backport of pull request #49632 done by [Mergify](https://mergify.com).