
Commit 400615f

Merge pull request #7921 from mosbahmajed/workitem-85945
AB#3203: Workitem 85945 - AKS DocReview: Increased memory usage reported in Kubernetes 1.25 or later versions
2 parents cec5788 + c36f199 commit 400615f

support/azure/azure-kubernetes/create-upgrade-delete/aks-increased-memory-usage-cgroup-v2.md

Lines changed: 31 additions & 5 deletions
@@ -1,8 +1,8 @@
 ---
 title: Increased memory usage reported in Kubernetes 1.25 or later versions
 description: Resolve an increase in memory usage that's reported after you upgrade an Azure Kubernetes Service (AKS) cluster to Kubernetes 1.25.x.
-ms.date: 07/13/2023
-editor: v-jsitser
+ms.date: 03/03/2025
+editor: momajed
 ms.reviewer: aritraghosh, cssakscic, v-leedennis
 ms.service: azure-kubernetes-service
 ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)
@@ -23,23 +23,49 @@ You experience one or more of the following symptoms:

 ## Cause

-This increase is caused by a change in memory accounting within version 2 of the Linux control group (cgroup) API. [Cgroup v2](https://kubernetes.io/docs/concepts/architecture/cgroups/) is now the default cgroup version for Kubernetes 1.25 on AKS.
+This increase is caused by a change in memory accounting within version 2 of the Linux control group (`cgroup`) API. [Cgroup v2](https://kubernetes.io/docs/concepts/architecture/cgroups/) is now the default `cgroup` version for Kubernetes 1.25 on AKS.

 > [!NOTE]
-> This issue is distinct from the memory saturation in nodes that's caused by applications or frameworks that aren't aware of cgroup v2. For more information, see [Memory saturation occurs in pods after cluster upgrade to Kubernetes 1.25](./aks-memory-saturation-after-upgrade.md).
+> This issue is distinct from the memory saturation in nodes that's caused by applications or frameworks that aren't aware of `cgroup` v2. For more information, see [Memory saturation occurs in pods after cluster upgrade to Kubernetes 1.25](./aks-memory-saturation-after-upgrade.md).

 ## Solution

 - If you observe frequent memory pressure on the nodes, upgrade your subscription to increase the amount of memory that's available to your virtual machines (VMs).

 - If you see a higher eviction rate on the pods, [use higher limits and requests for pods](/azure/aks/developer-best-practices-resource-management#define-pod-resource-requests-and-limits).

+- `cgroup` v2 uses a different API than `cgroup` v1. If any applications directly access the `cgroup` file system, update them to later versions that support `cgroup` v2. For example:
+
+  - **Third-party monitoring and security agents**:
+
+    Some monitoring and security agents depend on the `cgroup` file system. Update these agents to versions that support `cgroup` v2.
+
+  - **Java applications**:
+
+    Use versions that fully support `cgroup` v2:
+    - OpenJDK/HotSpot: `jdk8u372`, `11.0.16`, `15`, and later versions.
+    - IBM Semeru Runtimes: `8.0.382.0`, `11.0.20.0`, `17.0.8.0`, and later versions.
+    - IBM Java: `8.0.8.6` and later versions.
+
+  - **uber-go/automaxprocs**:
+    If you're using the `uber-go/automaxprocs` package, ensure that the version is `v1.5.1` or later.
+
+- An alternative temporary solution is to revert the `cgroup` version on your nodes by using a DaemonSet. For more information, see [Revert to cgroup v1 DaemonSet](https://github.com/Azure/AKS/blob/master/examples/cgroups/revert-cgroup-v1.yaml).
+
+> [!IMPORTANT]
+> - Use the DaemonSet cautiously. Test it in a lower environment before you apply it to production to ensure compatibility and prevent disruptions.
+> - By default, the DaemonSet applies to all nodes in the cluster and reboots them to implement the `cgroup` change.
+> - To control how the DaemonSet is applied, configure a `nodeSelector` to target specific nodes.
+
+
 > [!NOTE]
 > If you experience only an increase in memory use without any of the other symptoms that are mentioned in the "Symptoms" section, you don't have to take any action.

 ## Status

-We're actively working with the Kubernetes community to fix the underlying issue, and we'll keep you updated on our progress. We also plan to change the eviction thresholds or [resource reservations](/azure/aks/concepts-clusters-workloads#resource-reservations), depending on the outcome of the fix.
+We're actively working with the Kubernetes community to resolve the underlying issue. Progress on this effort can be tracked at [Kubernetes issue #118916](https://github.com/kubernetes/kubernetes/issues/118916).
+
+As part of the resolution, we plan to adjust the eviction thresholds or update [resource reservations](/azure/aks/concepts-clusters-workloads#resource-reservations), depending on the outcome of the fix.

 ## Reference

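The Cause section in this change states that `cgroup` v2 is the default for Kubernetes 1.25 on AKS. The following minimal Go sketch shows one way to confirm which `cgroup` version a node is using. It assumes it runs on the node itself (or in a debug pod that mounts the host's `/sys/fs/cgroup`), and it relies on the fact that the `cgroup.controllers` file exists only at the root of a `cgroup` v2 unified hierarchy.

```go
// cgroupcheck: minimal sketch that reports whether the local node uses
// cgroup v1 or cgroup v2. Assumes the unified hierarchy (if any) is
// mounted at the common default path /sys/fs/cgroup.
package main

import (
	"fmt"
	"os"
)

func main() {
	// On a cgroup v2 (unified hierarchy) host, cgroup.controllers exists
	// at the root of /sys/fs/cgroup. On cgroup v1, it does not.
	if _, err := os.Stat("/sys/fs/cgroup/cgroup.controllers"); err == nil {
		fmt.Println("cgroup v2 (unified hierarchy) detected")
	} else if os.IsNotExist(err) {
		fmt.Println("cgroup v1 (legacy hierarchy) detected")
	} else {
		fmt.Println("unable to determine cgroup version:", err)
	}
}
```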
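For the new `uber-go/automaxprocs` guidance, the sketch below shows the package's usual blank-import pattern. The module path `go.uber.org/automaxprocs` and the `v1.5.1` minimum come from the added bullet; the printed `GOMAXPROCS` value is only illustrative. With a version that understands `cgroup` v2, the adjustment picks up the container's CPU quota on Kubernetes 1.25+ nodes instead of being skipped.

```go
// Minimal sketch of the typical uber-go/automaxprocs usage. The blank
// import adjusts GOMAXPROCS at startup to match the container's CPU quota.
package main

import (
	"fmt"
	"runtime"

	// Require go.uber.org/automaxprocs v1.5.1 or later in go.mod so the
	// quota is read from cgroup v2 as well as cgroup v1.
	_ "go.uber.org/automaxprocs"
)

func main() {
	// With the package imported, GOMAXPROCS reflects the pod's CPU limit
	// rather than the node's full core count.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```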
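For scoping the revert DaemonSet with a `nodeSelector`, the simplest path is to edit the YAML manifest before applying it. As an alternative illustration only, the following is a hypothetical client-go sketch that patches an already-applied DaemonSet; the DaemonSet name `revert-cgroup-v1`, the `kube-system` namespace, and the `cgroup=v1` node label are assumptions, not values taken from the AKS example.

```go
// Hypothetical sketch: restrict the revert DaemonSet to nodes labeled
// cgroup=v1 (for example, labeled with `kubectl label node <name> cgroup=v1`).
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig (for example, written by `az aks get-credentials`).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	ctx := context.Background()
	// Assumed name and namespace; use whatever the revert manifest created in your cluster.
	ds, err := clientset.AppsV1().DaemonSets("kube-system").Get(ctx, "revert-cgroup-v1", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// Restrict scheduling (and therefore the reboots) to the labeled nodes.
	ds.Spec.Template.Spec.NodeSelector = map[string]string{"cgroup": "v1"}

	if _, err := clientset.AppsV1().DaemonSets("kube-system").Update(ctx, ds, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
	fmt.Println("nodeSelector applied to the revert DaemonSet")
}
```

Setting the same `nodeSelector` directly in the YAML before you apply it achieves the same result with less machinery.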
