weird OOMs after update to EKS 1.33 #4550
-
Hi @makarov-roman! We made several changes in the Bottlerocket variants with k8s-1.33, such as containerd 2.0 and the 6.12 kernel. The full changelog is here. This is concerning, and we will create an issue and investigate it.
-
Link to the issue: #4551
-
@makarov-roman ,
-
Issue #4549 is another OOM report and may be related, especially if your workloads expect cgroups v1 to govern their memory use (as the JVM does in #4549).
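To help triage that angle, here is a minimal Python sketch (not from Bottlerocket or either issue, just an illustration) that can be run inside an affected pod, e.g. via `kubectl exec`, to see which cgroup version the container is actually on and what memory limit it observes. It reads the standard kernel interface files under `/sys/fs/cgroup`; the exact mount layout can vary by runtime, so treat it as a diagnostic sketch rather than anything authoritative.

```python
#!/usr/bin/env python3
"""Diagnostic sketch: report the cgroup version and memory limit a container sees.

Assumes the standard /sys/fs/cgroup mount inside the container.
"""
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")


def cgroup_version() -> int:
    # On the cgroup v2 unified hierarchy there is a top-level
    # cgroup.controllers file; on v1 there is not.
    return 2 if (CGROUP_ROOT / "cgroup.controllers").exists() else 1


def memory_limit(version: int) -> str:
    # v2 exposes memory.max ("max" means unlimited); v1 exposes
    # memory/memory.limit_in_bytes (a very large number means unlimited).
    path = (
        CGROUP_ROOT / "memory.max"
        if version == 2
        else CGROUP_ROOT / "memory" / "memory.limit_in_bytes"
    )
    try:
        return path.read_text().strip()
    except OSError as exc:
        return f"unreadable ({exc})"


if __name__ == "__main__":
    v = cgroup_version()
    print(f"cgroup version seen by this container: v{v}")
    print(f"memory limit reported by the kernel:   {memory_limit(v)}")
```

If this reports v2 but the workload only knows how to read the v1 files (as older, non-cgroup-v2-aware JVMs do), it may size its heap as if it were unconstrained, which would be consistent with the OOM pattern described in this thread.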
-
Hi guys,
I've updated the EKS cluster from 1.32 to 1.33 and noticed strange behaviour: around 20% of workloads went into crash loops with OOMs. Some of them I consider very stable, as they haven't had any changes for months, if not years. In some cases I also had to double the requests/limits (e.g. 3 -> 6, 2 -> 4) to fix them. The nodes have plenty of spare memory, so does this point at containerd?
I've checked the containerd, Bottlerocket, and EKS/k8s 1.33 changelogs and haven't found a good enough explanation yet. Do you have any ideas about what I should check to understand what's going on?
Thanks in advance