Signed-off-by: Richard Wall <richard.wall@cyberark.com>
wallrj
commented
May 28, 2025
    resource-type:
      version: v1
      resource: replicasets
      group: apps
There is no dedicated Role or ClusterRole for this resource, because there is a ClusterRoleBinding to the standard cluster-view ClusterRole:
inteon
approved these changes
May 28, 2025
I've updated the PR description with some information about how I simulated a large cluster with many replicasets and measured the peak memory usage before and after removing the replicaset datagatherer. Perhaps in future we can automate those steps. I will attach some heap profiles and metrics to the Jira issue so that we can try to understand why so much memory is used by the agent. I also ran the
Replicaset resources are ignored by the TLSPK backend service so there's no point collecting them.
In a Kind cluster with 20k Replicasets, this reduced the peak resident set memory usage from 522 MB to 85 MB.
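To put those two figures in proportion, here is a small sketch that computes the relative saving. The function name `memory_reduction` is hypothetical and only used for illustration; the inputs are the before and after values quoted above.

```shell
# memory_reduction BEFORE_MB AFTER_MB
# Print the percentage saved and the shrink factor between two measurements.
memory_reduction() {
  awk -v b="$1" -v a="$2" 'BEGIN {
    printf "reduction: %.1f%% (%.1fx smaller)\n", (b - a) / b * 100, b / a
  }'
}

memory_reduction 522 85
# prints "reduction: 83.7% (6.1x smaller)"
```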
On a busy cluster where there are frequent Helm upgrades and / or deployment rollouts, there may be a large number of Replicaset resources for previous revisions of each Deployment. This depends on the Deployment revisionHistoryLimit, which is 10 by default.
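As a rough steady-state estimate, each Deployment can hold its current ReplicaSet plus up to revisionHistoryLimit old ones. The sketch below turns that into a number; `max_replicasets` is a hypothetical helper name, and the estimate assumes the Deployment controller has finished cleaning up old revisions.

```shell
# max_replicasets DEPLOYMENTS [REVISION_HISTORY_LIMIT]
# Estimate the retained ReplicaSets: one current ReplicaSet per Deployment
# plus up to REVISION_HISTORY_LIMIT old ones (Kubernetes default: 10).
max_replicasets() {
  echo $(( $1 * (${2:-10} + 1) ))
}

max_replicasets 1000      # 1000 Deployments with the default limit
max_replicasets 1000 2    # the same Deployments with revisionHistoryLimit: 2
```

Lowering revisionHistoryLimit is one way to shrink that number, at the cost of fewer revisions to roll back to.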
It may also depend on the helm upgrade --history-max value, which is 10 by default:
    $ helm upgrade --help
    ...
          --history-max int   limit the maximum number of revisions saved per release. Use 0 for no limit (default 10)
xref: https://venafi.atlassian.net/browse/VC-41078
Testing
Before:
After:
Create a Kind cluster:
Deploy venafi-kubernetes-agent:
Remove memory limit:
Enable pprof:
Use kwok to allow me to create some fake nodes and realistic pods:
Create ~45 nodes and ~45 Deployments by running the following command repeatedly:
Run kubectl rollout restart deployment in a while loop, to start creating replicasets:
This created ~20k Replicasets in the default namespace:
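The original commands for these two steps are not preserved here. The following is only a sketch of how such a restart loop and count could be written; the function names `count_replicasets` and `restart_until`, the target argument, and the sleep interval are all assumptions for illustration.

```shell
# count_replicasets: number of ReplicaSets in the current namespace.
count_replicasets() {
  kubectl get replicasets --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# restart_until TARGET [INTERVAL_SECONDS]
# Restart every Deployment in the current namespace until at least TARGET
# ReplicaSets exist, then print the final count.
restart_until() {
  while [ "$(count_replicasets)" -lt "$1" ]; do
    kubectl rollout restart deployment >/dev/null
    sleep "${2:-5}"
  done
  count_replicasets
}

# Example (hypothetical target and interval):
#   restart_until 20000 5
```

Note that the loop never terminates if the ReplicaSet count cannot reach the target, so it is best run interactively.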
Which is ~80Mi of JSON data:
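One way to take such a measurement is to pipe the JSON output through a byte counter. The helper below is a sketch with a hypothetical name, `size_mib`; it is not necessarily the command that was used originally.

```shell
# size_mib: read bytes on stdin and print their total size in MiB.
size_mib() {
  wc -c | awk '{ printf "%.1f\n", $1 / (1024 * 1024) }'
}

# Example:
#   kubectl get replicasets -o json | size_mib
```

At ~80Mi for ~20k Replicasets, that works out to roughly 4 KiB of JSON per Replicaset.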
Measure the memory usage of the venafi-kubernetes-agent process (preflight) by reading from /proc/<pid>/status:
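As a rough sketch of that measurement, the helper below extracts the VmHWM field from a status file. The name `vmhwm_kb` is hypothetical, and the usage comment assumes the process is named preflight and that pidof is available wherever the command runs.

```shell
# vmhwm_kb STATUS_FILE
# Print the VmHWM value (peak resident set size, in kB) from a
# /proc/<pid>/status-style file.
vmhwm_kb() {
  awk '$1 == "VmHWM:" { print $2 }' "$1"
}

# Example (run where the agent's /proc entry is visible):
#   vmhwm_kb "/proc/$(pidof preflight)/status"
```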
Note the VmHWM value, which is the "peak resident set size ("high water mark")".
Remove the replicaset datagatherer from the configmap:
Restart venafi-kubernetes-agent: