[VC-41078] Reduce memory usage by removing the Replicaset data gatherer from the default config#658

Merged
wallrj merged 1 commit into master from remove-replicasets
May 29, 2025

Conversation


@wallrj wallrj commented May 28, 2025

ReplicaSet resources are ignored by the TLSPK backend service, so there is no point in collecting them.

In a Kind cluster with 20k Replicasets, this reduced the peak resident set memory usage from 522 MB to 85 MB.

See the testing section below. Other resources in the cluster: 53 Nodes, 52 Deployments, 350 Pods, 4 Secrets.

On a busy cluster with frequent Helm upgrades and/or Deployment rollouts, there may be a large number of ReplicaSet resources for previous revisions of each Deployment. This depends on the Deployment revisionHistoryLimit, which is 10 by default.
It may also depend on the helm upgrade --history-max value, which is also 10 by default:

$ helm upgrade --help
...
      --history-max int                            limit the maximum number of revisions saved per release. Use 0 for no limit (default 10)
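To get a feel for the scale, here is some illustrative worst-case arithmetic using the default revisionHistoryLimit and the Deployment count from the test cluster below (the numbers are assumptions for the sketch, not a measurement):

```shell
# Worst-case stale ReplicaSets retained with the Kubernetes default
# revisionHistoryLimit, for the 52 Deployments in the test cluster below.
deployments=52
revision_history_limit=10   # Kubernetes default
echo "$(( deployments * revision_history_limit )) retained old ReplicaSets"
```

With a higher limit (or frequent rollouts before garbage collection), the count grows proportionally.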

xref: https://venafi.atlassian.net/browse/VC-41078

Testing

Before:

$ cat /proc/$(pidof preflight)/status | grep Vm
VmPeak:  1753432 kB
VmSize:  1753432 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:    522736 kB
VmRSS:    403084 kB
VmData:   532024 kB
VmStk:       132 kB
VmExe:     26984 kB
VmLib:         8 kB
VmPTE:      1156 kB
VmSwap:        0 kB

After:

$ cat /proc/$(pidof preflight)/status | grep Vm
VmPeak:  1287844 kB
VmSize:  1287844 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     65176 kB
VmRSS:     65176 kB
VmData:    66436 kB
VmStk:       132 kB
VmExe:     26984 kB
VmLib:         8 kB
VmPTE:       248 kB
VmSwap:        0 kB

# After running for a few minutes

$ cat /proc/$(pidof preflight)/status | grep Vm
VmPeak:  1287844 kB
VmSize:  1287844 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     77720 kB
VmRSS:     76424 kB
VmData:    82820 kB
VmStk:       132 kB
VmExe:     26984 kB
VmLib:         8 kB
VmPTE:       276 kB
VmSwap:        0 kB

# Some time later

$ cat /proc/$(pidof preflight)/status | grep Vm
VmPeak:  1288100 kB
VmSize:  1288100 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     84828 kB
VmRSS:     84828 kB
VmData:    87172 kB
VmStk:       132 kB
VmExe:     26984 kB
VmLib:         8 kB
VmPTE:       284 kB
VmSwap:        0 kB
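The headline reduction can be sanity-checked from the VmHWM lines above (a quick arithmetic sketch, using the kB figures from the before and after dumps):

```shell
# Peak resident set size (VmHWM) before and after, in kB, from the dumps above.
before_kb=522736
after_kb=84828
echo "reduction: $(( (before_kb - after_kb) * 100 / before_kb ))%"   # prints "reduction: 83%"
```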

Create a Kind cluster:

$ kind version
kind v0.27.0 go1.23.6 linux/amd64

$ kind create cluster
...

$ kubectl version
Client Version: v1.32.3
Kustomize Version: v5.5.0
Server Version: v1.32.2

Deploy venafi-kubernetes-agent:

$ venctl installation cluster connect \
    --name "richardw-cluster-connect-test-20" \
    --api-key $VEN_API_KEY \
    --no-prompts \
    --owning-team RichardW

Remove memory limit:

kubectl patch deployment venafi-kubernetes-agent \
    -n venafi \
    --type=json \
    -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits"}]'

Enable pprof:

kubectl patch deployment venafi-kubernetes-agent \
        -n venafi \
        --type=json \
        -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-pprof"}]'

Use kwok to create some fake nodes and realistic pods:

$ kwok \
   --kubeconfig=~/.kube/config \
   --manage-all-nodes=false \
   --manage-nodes-with-annotation-selector=kwok.x-k8s.io/node=fake \
   --manage-nodes-with-label-selector= \
   --manage-single-node= \
   --cidr=10.0.0.1/24 \
   --node-ip=10.0.0.1 \
   --node-lease-duration-seconds=40

Create ~45 Nodes and ~45 Deployments by running the following commands repeatedly:

kubectl create -f - <<EOF
apiVersion: v1
kind: Node
metadata:
  annotations:
    node.alpha.kubernetes.io/ttl: "0"
    kwok.x-k8s.io/node: fake
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/os: linux
    kubernetes.io/role: agent
    node-role.kubernetes.io/agent: ""
    type: kwok
  generateName: kwok-node-
spec:
  taints: # Avoid scheduling actual running pods to fake Node
  - effect: NoSchedule
    key: kwok.x-k8s.io/node
    value: fake
status:
  allocatable:
    cpu: 32
    memory: 256Gi
    pods: 110
  capacity:
    cpu: 32
    memory: 256Gi
    pods: 110
  nodeInfo:
    architecture: amd64
    bootID: ""
    containerRuntimeVersion: ""
    kernelVersion: ""
    kubeProxyVersion: fake
    kubeletVersion: fake
    machineID: ""
    operatingSystem: linux
    osImage: ""
    systemUUID: ""
  phase: Running
EOF


kubectl create -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  generateName: fake-
  namespace: default
spec:
  replicas: 1
  revisionHistoryLimit: 1000
  selector:
    matchLabels:
      app: fake-pod
  template:
    metadata:
      labels:
        app: fake-pod
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: type
                operator: In
                values:
                - kwok
      # A taint is added to the automatically created Node.
      # Remove the taint from the Node, or add this toleration.
      tolerations:
      - key: "kwok.x-k8s.io/node"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: fake-container
        image: fake-image
EOF

Run kubectl rollout restart deployment in a while loop, to start creating replicasets:

while kubectl rollout restart deploy; do sleep 1; done
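Each pass of the loop adds one retained ReplicaSet per Deployment (up to revisionHistoryLimit: 1000), so roughly this many passes are needed to reach ~20k ReplicaSets (illustrative arithmetic, not an exact count):

```shell
# Restart passes needed to accumulate ~20k ReplicaSets across 45 Deployments,
# assuming one new retained ReplicaSet per Deployment per pass.
deployments=45
target_replicasets=20000
echo "$(( (target_replicasets + deployments - 1) / deployments )) restart passes"
```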

This created ~20k Replicasets in the default namespace:

$ kubectl get --raw "/metrics" | grep apiserver_storage_objects | grep replicasets
apiserver_storage_objects{resource="replicasets.apps"} 20000
$ kubectl get rs -oname  | wc -l
19987

That is ~80 MiB of JSON data:

$ kubectl get rs -o json > replicasets.json
$ ls -lrth replicasets.json
-rw-r--r-- 1 richard richard 80M May 29 09:36 replicasets.json
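That works out to roughly 4 KiB of JSON per ReplicaSet (rough division of the ls figure, not an exact per-object measurement):

```shell
# Approximate per-object size: ~80 MiB of JSON across ~20k ReplicaSets.
total_bytes=$(( 80 * 1024 * 1024 ))
count=20000
echo "$(( total_bytes / count )) bytes per ReplicaSet (approx)"
```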

Measure the memory usage of the venafi-kubernetes-agent process (preflight) by reading /proc/<pid>/status:

$ cat /proc/$(pidof preflight)/status | grep Vm
VmPeak:  1753432 kB
VmSize:  1753432 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:    522736 kB
VmRSS:    505508 kB
VmData:   532024 kB
VmStk:       132 kB
VmExe:     26984 kB
VmLib:         8 kB
VmPTE:      1156 kB
VmSwap:        0 kB

Note the VmHWM value, which is the peak resident set size ("high water mark").
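To pull just that number out of a status dump, grep can be replaced with an awk match (a small convenience shown here on a saved sample line, not part of the original testing steps):

```shell
# Extract the VmHWM value (kB) from a saved copy of /proc/<pid>/status.
status_line='VmHWM:    522736 kB'
echo "$status_line" | awk '/^VmHWM/ {print $2}'   # prints 522736
```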

Remove the replicaset datagatherer from the configmap:

helm get values venafi-kubernetes-agent -n venafi -o yaml > values.yaml
helm template venafi-kubernetes-agent deploy/charts/venafi-kubernetes-agent \
    -n venafi \
    --show-only templates/configmap.yaml \
    --show-only templates/rbac.yaml \
    --values values.yaml \
  | kubectl apply -f -

Restart venafi-kubernetes-agent:

kubectl rollout restart deploy -n venafi venafi-kubernetes-agent
$ cat /proc/$(pidof preflight)/status | grep Vm
VmPeak:  1287844 kB
VmSize:  1287844 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     77720 kB
VmRSS:     76424 kB
VmData:    82820 kB
VmStk:       132 kB
VmExe:     26984 kB
VmLib:         8 kB
VmPTE:       276 kB
VmSwap:        0 kB

Signed-off-by: Richard Wall <richard.wall@cyberark.com>
wallrj commented on the data-gatherer config for this resource in the diff:

resource-type:
  version: v1
  resource: replicasets
  group: apps
There is no dedicated Role or ClusterRole for this resource, because there is a ClusterRoleBinding to the built-in view ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ include "venafi-kubernetes-agent.fullname" . }}-cluster-viewer
  labels:
    {{- include "venafi-kubernetes-agent.labels" . | nindent 4 }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
- kind: ServiceAccount
  name: {{ include "venafi-kubernetes-agent.serviceAccountName" . }}
  namespace: {{ .Release.Namespace }}

@wallrj wallrj changed the title WIP: [VC-41078] Remove ReplicaSet from the default config WIP: [VC-41078] Reduce memory usage by removing the Replicaset data gatherer from the default config May 29, 2025
@wallrj wallrj changed the title WIP: [VC-41078] Reduce memory usage by removing the Replicaset data gatherer from the default config [VC-41078] Reduce memory usage by removing the Replicaset data gatherer from the default config May 29, 2025

wallrj commented May 29, 2025

I've updated the PR description with some information about how I simulated a large cluster with many replicasets and measured the peak memory usage before and after removing the replicaset datagatherer.

Perhaps in future we can automate those steps.

I will attach some heap profiles and metrics to the Jira issue so that we can try to understand why so much memory is used by the agent.

I also ran the ./hack/e2e/test.sh script and observed the test pass.

@wallrj wallrj merged commit 8c2091c into master May 29, 2025
3 of 4 checks passed
@wallrj wallrj deleted the remove-replicasets branch May 29, 2025 10:01
