Skip to content

Add error handling for cri and podman engines when container events are disabled and go-channels are closed early - also add retry logic in fetcher channel to prevent missing container info due to timing issues with cri engine#900

Merged
poiana merged 6 commits intofalcosecurity:mainfrom
nenioscio:main
Jul 29, 2025

Conversation

@nenioscio
Copy link
Contributor

What type of PR is this?

/kind bug

Any specific area of the project related to this PR?

/area plugins

What this PR does / why we need it:

  1. For the engines podman and cri a possible error on event listener setup is not captured leading to and endless loop reading from a closed socket - this causes high CPU load
  2. For the cri engine a timing issue prevents reading the container info - workaround: add a retry loop for a fixed duration
  3. Enhance interaction between c++ and go when AskForContainerInfo is called by making the channel buffered

Which issue(s) this PR fixes:
Fixes

Fixes falcosecurity/falco#3610 and falcosecurity/falco#3630

Special notes for your reviewer:

@FedeDP : this pull request is based new commit on top of already discussed commit 28cb4d4 which already include the changes you requested via inline comments

@FedeDP
Copy link
Contributor

FedeDP commented Jul 28, 2025

Thank you very much, doing the codereview now! :)

Klaus Wagner added 2 commits July 28, 2025 15:33
Signed-off-by: Klaus Wagner <Klaus.Wagner@erstegroup.com>
…nd podman engines

Signed-off-by: Klaus Wagner <Klaus.Wagner@erstegroup.com>
Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all, a bunch of great fixes and improvements to make container plugin more reliable.
I already shared my huge appreciation on falcosecurity/falco#3610, but let me share it too :) Thank you very much for your great work!

I left some suggestions/small things to improve, but all in all this is looking really good!

@FedeDP
Copy link
Contributor

FedeDP commented Jul 28, 2025

Oh and it seems that we also broke podman tests somewhere :)

@github-actions
Copy link

Rules files suggestions

@FedeDP
Copy link
Contributor

FedeDP commented Jul 28, 2025

/hold

nenioscio and others added 2 commits July 28, 2025 16:12
Co-authored-by: Federico Di Pierro <nierro92@gmail.com>
Signed-off-by: Klaus Wagner <nenioscio@gmail.com>
Co-authored-by: Federico Di Pierro <nierro92@gmail.com>
Signed-off-by: Klaus Wagner <nenioscio@gmail.com>
@github-actions
Copy link

Rules files suggestions

…nt definition

Signed-off-by: Klaus Wagner <Klaus.Wagner@erstegroup.com>
@github-actions
Copy link

Rules files suggestions

Signed-off-by: Klaus Wagner <Klaus.Wagner@erstegroup.com>
@nenioscio
Copy link
Contributor Author

Oh and it seems that we also broke podman tests somewhere :)

I managed to fix this -

"system.Events" yields nil on successful init so I ignore it when the error side-channel is processed

@github-actions
Copy link

Rules files suggestions

@nenioscio nenioscio requested a review from FedeDP July 28, 2025 17:38
Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM!
We can further optimize the way we kill goroutines dynamically from workerLoop, but aside from that (that should be a corner case of a corner case btw), everything looks good!
/approve

@poiana
Copy link
Contributor

poiana commented Jul 29, 2025

LGTM label has been added.

DetailsGit tree hash: a514d8f94d579725020df8709ef6abdc32923a7a

@FedeDP
Copy link
Contributor

FedeDP commented Jul 29, 2025

/unhold

@poiana
Copy link
Contributor

poiana commented Jul 29, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ekoops, FedeDP, nenioscio

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit 7bb3847 into falcosecurity:main Jul 29, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Falco 0.41.1 with new container plugin causes false positives for incubating rules

4 participants