OOM Counter Incrementing Incorrectly #10
Description
Hi!
Thank you for the project!
There seems to be a weird bug. Steps to reproduce:
- I installed MCM on a Kubernetes v1.21.2 cluster with the Docker runtime
- Port-forward one MCM pod to check its metrics:
k port-forward monitoring-missingcm-h257l 3001:3001
- Connect to a container running on the same node as the MCM pod and trigger an OOM event with the stress command:
stress --vm 1 --vm-bytes 3024M
stress: info: [389] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [389] (415) <-- worker 390 got signal 9
stress: WARN: [389] (417) now reaping child worker processes
stress: FAIL: [389] (451) failed run completed in 2s
Note: the above command must be run several times to reproduce the issue.
- Check the container_ooms metric for the above container
Expected result: the container_ooms counter should equal the number of times the stress command was executed.
Actual result: container_ooms is greater than the number of times the stress command was executed. I got a value of 13 even though I ran the command only 3 times.
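To make the comparison repeatable, the check can be scripted. Below is a minimal sketch that parses Prometheus text-format output and pulls out the container_ooms samples. It assumes the metrics are served at http://localhost:3001/metrics through the port-forward above (the /metrics path is an assumption, only the port comes from this report), and the label names in the comments are hypothetical.

```python
import re
from urllib.request import urlopen

# Matches Prometheus text-format sample lines such as:
#   container_ooms{name="stress",pod="p1"} 13
# (the label names above are hypothetical examples)
SAMPLE_RE = re.compile(r'^container_ooms(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_container_ooms(text):
    """Return a list of (labels, value) pairs for every container_ooms sample."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # a different metric
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((labels, float(match.group("value"))))
    return samples


def fetch_container_ooms(url="http://localhost:3001/metrics"):
    """Fetch the metrics page (assumed URL) and parse the container_ooms samples."""
    with urlopen(url, timeout=5) as resp:
        return parse_container_ooms(resp.read().decode("utf-8"))


# Usage, with the port-forward running:
#   for labels, value in fetch_container_ooms():
#       print(labels, value)
```

Recording the value before and after each stress run shows how much the counter grows per OOM kill, which is easier to compare against the docker events count than a single final reading.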
Additional info:
I checked docker events on the node while reproducing the issue. The number of oom events matches the number of stress runs.
I also checked /var/log/messages on the node. The result is as expected: the number of OOM log entries matches the number of stress runs.
Any idea what could be wrong here?