pkg/option: allow policy-filter-map-entries configurable via flag #4331
Conversation
Force-pushed from 7475a97 to 444c976
mtardy
left a comment
Thanks for the patch, I think it's the right direction and a good idea, just need a few changes in how it's done :)
struct {
    __uint(type, BPF_MAP_TYPE_HASH_OF_MAPS);
-   __uint(max_entries, POLICY_FILTER_MAX_POLICIES);
+   __uint(max_entries, 1); // will be resized by agent when needed
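The `max_entries = 1` placeholder above relies on the agent patching the map spec to the configured size before the BPF object is loaded. A minimal sketch of that pattern in plain Go follows; the `MapSpec` type and `ResizePolicyFilterMaps` function are illustrative models, not Tetragon's actual loader API.

```go
package main

import "fmt"

// MapSpec is a simplified stand-in for a BPF map specification as seen
// by a userspace loader before the object is loaded into the kernel.
type MapSpec struct {
	Name       string
	MaxEntries uint32
}

// ResizePolicyFilterMaps patches the named map spec to the size configured
// via --policy-filter-map-entries, if the map is present in the object.
func ResizePolicyFilterMaps(specs map[string]*MapSpec, entries uint32) {
	if s, ok := specs["policy_filter_maps"]; ok {
		s.MaxEntries = entries
	}
}

func main() {
	// The BPF object declares max_entries = 1; the loader resizes it.
	specs := map[string]*MapSpec{
		"policy_filter_maps": {Name: "policy_filter_maps", MaxEntries: 1},
	}
	ResizePolicyFilterMaps(specs, 256)
	fmt.Println(specs["policy_filter_maps"].MaxEntries)
}
```

In the real loader the equivalent step would mutate the collection spec loaded from the compiled object before creating the maps.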
This is used by policy_filter_cgroup_maps just nearby. Maybe we would need to update both and remove that const?
Sure, that makes sense to me. I'll update both of them and remove that const.
@mtardy On second thought, the POLICY_FILTER_MAX_POLICIES in policy_filter_cgroup_maps represents how many policies can apply to a single cgroup (the inner map size of policy_filter_cgroup_maps).
I'm fine with simplifying: driving both policy_filter_maps and policy_filter_cgroup_maps from the same policy-filter-map-entries setting and removing that const. However, I just want you to be aware that if we update both, it would mix a global capacity knob with a per-cgroup limit, which could waste memory if we configure more policies globally.
@mtardy Please feel free to add your comment. Thanks!
> I’m fine to simplify and drive both policy_filter_maps and policy_filter_cgroup_maps from the same policy-filter-map-entries setting and remove that const. However, I just want you to aware that if we want to update both, it would be mixing a global capacity knob with a per-cgroup limit, which could waste memory if we config more policies globally.
My reading is that policy_filter_cgroup_maps uses POLICY_FILTER_MAX_POLICIES since it's the most pessimistic approach. This means that there will be no capacity issue (i.e., a failure to insert) in policy_filter_cgroup_maps without first hitting a capacity issue in policy_filter_maps.
I think there are three options here:
- Have the run-time size determine the size of both policy_filter_maps and policy_filter_cgroup_maps. That is, use the run-time size instead of POLICY_FILTER_MAX_POLICIES.
- Have a different run-time size for each. In this case, we would need to check the error path for when the capacity of policy_filter_cgroup_maps is not enough and do something reasonable. This can be a follow-up PR.
- Leave it as is. In this case, we would still need to check the error path since we might end up with a case where the capacity of policy_filter_cgroup_maps is not enough.

Note that option (2.) can be done in a follow-up PR after (1.).
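The pessimistic-sizing argument can be illustrated with a toy model in plain Go (this is a simulation of the two maps' semantics, not Tetragon code): as long as the per-cgroup inner maps are at least as large as the global policy capacity, an insert can only fail at the global map first.

```go
package main

import "fmt"

// filterState models policy_filter_maps (policy id -> cgroup ids) and its
// reverse, policy_filter_cgroup_maps (cgroup id -> policy ids), with
// explicit capacities so the failure order can be observed.
type filterState struct {
	globalCap    int // capacity of the global policy map
	perCgroupCap int // capacity of each per-cgroup inner map
	policies     map[uint32]map[uint64]struct{}
	byCgroup     map[uint64]map[uint32]struct{}
}

func newFilterState(globalCap, perCgroupCap int) *filterState {
	return &filterState{
		globalCap:    globalCap,
		perCgroupCap: perCgroupCap,
		policies:     map[uint32]map[uint64]struct{}{},
		byCgroup:     map[uint64]map[uint32]struct{}{},
	}
}

// addPolicy fails once the global map is full.
func (s *filterState) addPolicy(id uint32) bool {
	if len(s.policies) >= s.globalCap {
		return false
	}
	s.policies[id] = map[uint64]struct{}{}
	return true
}

// attach links a policy to a cgroup, failing if the reverse map's inner
// map for that cgroup is already at capacity.
func (s *filterState) attach(policy uint32, cgroup uint64) bool {
	if _, ok := s.policies[policy]; !ok {
		return false
	}
	inner := s.byCgroup[cgroup]
	if inner == nil {
		inner = map[uint32]struct{}{}
		s.byCgroup[cgroup] = inner
	}
	if _, ok := inner[policy]; !ok && len(inner) >= s.perCgroupCap {
		return false
	}
	inner[policy] = struct{}{}
	s.policies[policy][cgroup] = struct{}{}
	return true
}

func main() {
	// With perCgroupCap == globalCap, attaching every existing policy
	// to a single cgroup always succeeds...
	s := newFilterState(4, 4)
	for id := uint32(0); id < 4; id++ {
		s.addPolicy(id)
		fmt.Println(s.attach(id, 100))
	}
	// ...and the first failure is in the *global* map, never the inner map.
	fmt.Println(s.addPolicy(4))
}
```

A cgroup can reference at most as many policies as exist globally, which is why sizing the inner maps with the global capacity is safe, if wasteful.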
CCing @tpapagian who I believe introduced this change.
My preference would be to leave it as-is for now, and keep policy_filter_cgroup_maps sized using POLICY_FILTER_MAX_POLICIES.
A few reasons for this:
- Using POLICY_FILTER_MAX_POLICIES for policy_filter_cgroup_maps is intentionally pessimistic and gives us a safety guarantee.
- While driving both maps from a single run-time knob is appealing, it mixes two different semantics: a global policy capacity and a per-cgroup upper bound. This coupling can indeed lead to wasted memory when the global configuration is large, and I'm not fully convinced that trade-off is worth it yet.
- Introducing separate run-time sizing (option 2) would require us to define and carefully handle a new failure mode (partial success where the global map accepts entries but the cgroup map does not). That feels like a larger behavioural change than this PR intends.
I agree that we should eventually harden the error path around policy_filter_cgroup_maps, but I think that can be done independently of changing sizing semantics, and safely as a follow-up.
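To make the memory concern concrete, here is a back-of-envelope sketch. All sizes and the per-entry overhead are assumed for illustration only; actual kernel hash-map costs differ and depend on preallocation flags.

```go
package main

import "fmt"

// Rough per-entry sizes for a policy_filter_cgroup_maps inner map.
// The overhead constant is an assumption, not a measured kernel figure.
const (
	keySize      = 4  // __u32 policy id
	valueSize    = 1  // __u8 empty marker
	perEntryOver = 48 // assumed bookkeeping overhead per hash entry
)

// innerMapBytes is a rough cost estimate for one per-cgroup inner map.
func innerMapBytes(maxEntries int) int {
	return maxEntries * (keySize + valueSize + perEntryOver)
}

func main() {
	// Each tracked cgroup gets its own inner map, so driving the inner
	// size from the global knob multiplies the cost by the cgroup count.
	cgroups := 16384
	for _, entries := range []int{128, 4096} {
		total := cgroups * innerMapBytes(entries)
		fmt.Printf("inner max_entries=%d -> ~%d MiB across %d cgroups\n",
			entries, total/(1<<20), cgroups)
	}
}
```

Under these assumptions, raising the inner size from 128 to 4096 scales the reverse-map footprint by the same factor of 32, which is the waste the thread is worried about.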
Force-pushed from 2f156b5 to 0453fe8
Thanks, I think we can merge it like this; I'd just make a small modification to the code part where we had the loader before merging. :)
Sorry for the delay in the reviews
Force-pushed from 0453fe8 to cc2b381
This commit introduces a new flag to configure the number of entries in policy filter maps. This allows users to tune the map size based on workload scale and system resources, improving flexibility in policy handling.
Note: this commit only affects policies with k8s segmentation primitives (i.e., either podSelectors or namespaced policies).
Fixes: cilium#4260
Signed-off-by: Kyle Dong <[email protected]>
Signed-off-by: Kyle Dong <[email protected]>
Force-pushed from cc2b381 to 53bbe52
@mtardy Thank you so much for taking the time to review and discuss this PR. I have edited the comment as you suggested. Please take another look when you have time. Thanks!
Thanks! I'll just let @olsajiri ack this before merging. (This might take some time since people are on PTO for Christmas; please ping back here at the beginning of January if there's no progress, thanks!!)
kkourt
left a comment
Thanks! I have a small comment (please see below).
Config.RetprobesCacheSize = viper.GetInt(KeyRetprobesCacheSize)
Config.PolicyFilterMapEntries = viper.GetInt(KeyPolicyFilterMapEntries)
Given the discussion in 9dc9735#r2629371514, should we reject values that are larger than POLICY_FILTER_MAX_POLICIES?
I don’t think we should reject values larger than POLICY_FILTER_MAX_POLICIES. The goal of this PR is to allow users to configure the number of entries in the policy filter maps. The original limit of 128 is sometimes too small, and users may legitimately have more policies in their maps.
OK, but given that we have:
// This map keeps exactly the same information as policy_filter_maps
// but keeps the reverse mappings. i.e.
// policy_filter_maps maps policy_id to cgroup_ids
// policy_filter_cgroup_maps maps cgroup_id to policy_ids
struct {
    __uint(type, BPF_MAP_TYPE_HASH_OF_MAPS);
    __uint(max_entries, POLICY_FILTER_MAX_CGROUP_IDS);
    __type(key, __u64); /* cgroup id */
    __array(
        values, struct {
            __uint(type, BPF_MAP_TYPE_HASH);
            __uint(max_entries, POLICY_FILTER_MAX_POLICIES);
            __type(key, __u32); /* policy id */
            __type(value, __u8); /* empty */
        });
} policy_filter_cgroup_maps SEC(".maps");
Doesn't this mean that for a value larger than POLICY_FILTER_MAX_POLICIES, things will not work as expected when considering the policy_filter_cgroup_maps?
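One way to surface this mismatch at startup is sketched below. This is a hypothetical guard, not the actual Tetragon validation code; the Go constant simply mirrors the BPF-side POLICY_FILTER_MAX_POLICIES, and warn-vs-reject is exactly the open question in this thread.

```go
package main

import (
	"fmt"
	"log"
)

// policyFilterMaxPolicies mirrors POLICY_FILTER_MAX_POLICIES, the inner-map
// cap of policy_filter_cgroup_maps on the BPF side (value assumed here).
const policyFilterMaxPolicies = 128

// validatePolicyFilterMapEntries checks the value of a hypothetical
// --policy-filter-map-entries flag: it rejects non-positive values and
// warns when the value exceeds the per-cgroup cap, since the global map
// would then be able to hold more policies than any single cgroup can
// reference in the reverse map.
func validatePolicyFilterMapEntries(entries int) (int, error) {
	switch {
	case entries <= 0:
		return 0, fmt.Errorf("policy-filter-map-entries must be positive, got %d", entries)
	case entries > policyFilterMaxPolicies:
		// Alternatives: clamp to the cap, hard-reject, or resize the
		// inner maps too (the options discussed above).
		log.Printf("warning: policy-filter-map-entries=%d exceeds per-cgroup cap %d",
			entries, policyFilterMaxPolicies)
	}
	return entries, nil
}

func main() {
	n, err := validatePolicyFilterMapEntries(256)
	fmt.Println(n, err)
}
```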
I sent a message to Jiri to progress on this.
    }
}

// TODO: remove this special case handling (see #4398)
Yeah, for info, it was discussed here: #4331 (comment)
oh right, I was waiting for this to land in first, just to avoid any potential conflict here.
Description
This commit introduces a new flag to configure the number of entries in policy filter maps. This allows users to tune the map size based on workload scale and system resources, improving flexibility in policy handling.
Note: this commit only affects policies with k8s segmentation primitives (i.e., either podSelectors or namespaced policies).
Fixes: #4260