-
I've converted this to a discussion because it feels more like a support question than a bug report.
-
@rshad is this new with v16.19.0, or can you reproduce this under v16.18.1 as well?
-
Hi @tilgovi! Actually, we never tested it with AWS EC2 instance configurations other than the one we got working, but this is an excellent approach to try. I will be back with more details once we revert to v16.18.1 and see what happens.
-
Hi @tilgovi, we've just tested it with v16.18.1 and also with v18, and it still fails! Docker images:
-
@rshad did you get to the bottom of this, or do you have a workaround?
-
Version
v16.19.0
Platform
amzn2.x86_64
Subsystem
No response
What steps will reproduce the bug?
When running a job as a pipeline in a GitLab Runner K8s pod, the job completes successfully only on a small instance like m5*.large, which offers 2 vCPUs and 8 GB of RAM. We set limits for the build, helper, and services containers mentioned below. Still, the job fails with an Out Of Memory (OOM) error, with the node process getting killed by cgroup, when running on a far more powerful instance such as m5d*.2xlarge, which offers 8 vCPUs and 32 GB of RAM.
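To illustrate where the numbers can diverge, here is a minimal diagnostic sketch (not our actual job script; the cgroup paths are the standard v1/v2 locations and may differ on a given node image): inside the build container, os.totalmem() reports the host's memory rather than the pod's cgroup limit, so comparing the two against V8's effective heap limit shows what the process actually believes it can use.

```js
// check-memory.js - compare what Node sees with the container's cgroup memory limit.
// Run inside the build container: node check-memory.js
const fs = require('fs');
const os = require('os');
const v8 = require('v8');

function readFirst(paths) {
  for (const p of paths) {
    try {
      return fs.readFileSync(p, 'utf8').trim();
    } catch {
      // not present on this cgroup version, try the next path
    }
  }
  return 'unknown';
}

// cgroup v2 and cgroup v1 memory limit locations
const cgroupLimit = readFirst([
  '/sys/fs/cgroup/memory.max',
  '/sys/fs/cgroup/memory/memory.limit_in_bytes',
]);

const gib = (n) => (n / 1024 ** 3).toFixed(2) + ' GiB';

console.log('os.totalmem()      :', gib(os.totalmem()));   // host memory, e.g. ~32 GiB on m5d*.2xlarge
console.log('cgroup memory limit:', cgroupLimit);           // the limit the kernel actually enforces on the pod
console.log('V8 heap_size_limit :', gib(v8.getHeapStatistics().heap_size_limit)); // effective JS heap cap
```

On the failing instance we would expect the first number to track the host's 32 GB while the cgroup limit stays at the pod's limit; if heap sizing or any tooling keys off the host total rather than the cgroup value, that alone could explain why larger instances push the process over the pod's limit and into the cgroup OOM killer.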
Note that we tried to dedicate more resources to the containers, especially the build container (the node process is a child process of it), and nothing changed when running on the powerful instances: the node process still got killed because of OOM. Each time we gave it more memory, the node process simply consumed more, and so on.

Also, regarding CPU usage: on the powerful instances, the more vCPUs we gave the job, the more it consumed, and we noticed CPU throttling at ~100% almost all the time. On the small instances like m5*.large, however, CPU throttling never exceeded 3%.
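A possibly related detail, shown as an illustrative sketch rather than anything from our pipeline: os.cpus().length inside the container reports the host's vCPUs, not the pod's CPU limit, so any part of the toolchain that sizes worker or thread counts from it will scale with the instance and can produce exactly this kind of throttling. Comparing it with the cgroup CPU quota makes the gap visible (again, the cgroup paths are the usual v1/v2 locations).

```js
// check-cpu.js - compare the CPU count Node reports with the container's CPU quota.
const fs = require('fs');
const os = require('os');

function readFirst(paths) {
  for (const p of paths) {
    try {
      return fs.readFileSync(p, 'utf8').trim();
    } catch {
      // not this cgroup version, try the next path
    }
  }
  return 'unknown';
}

// cgroup v2: "<quota> <period>" (e.g. "200000 100000" = 2 CPUs); cgroup v1: quota in microseconds
const cpuQuota = readFirst([
  '/sys/fs/cgroup/cpu.max',
  '/sys/fs/cgroup/cpu/cpu.cfs_quota_us',
]);

console.log('os.cpus().length  :', os.cpus().length); // host vCPUs, e.g. 8 on m5d*.2xlarge
console.log('cgroup CPU quota  :', cpuQuota);          // what the pod is actually allowed to use
console.log('UV_THREADPOOL_SIZE:', process.env.UV_THREADPOOL_SIZE || '(default: 4)');
```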
Note that we specified a maximum amount of memory to be used by the node process, but it looks like it does not take any effect. We tried setting it to 1 GB, 1.5 GB, and 3 GB. (See the verification sketch after the resource screenshots below.)

resources request/limits configuration
Resource consumption of a successful job running on m5d.large
Resource consumption of a failing job running on m5d.2xlarge
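For completeness, a small sketch of how one can check whether a heap cap actually reached the node process that gets killed. We are not reproducing our exact setup here; the snippet assumes the cap is applied with --max-old-space-size or NODE_OPTIONS, which is the usual mechanism. Since the failing node process is a child of the build job, a flag passed only to a parent process does not carry over, whereas NODE_OPTIONS is an environment variable and is inherited by child Node processes.

```js
// check-heap-cap.js - run with the same environment as the failing process, e.g.:
//   NODE_OPTIONS=--max-old-space-size=3072 node check-heap-cap.js
const v8 = require('v8');

const mib = (n) => Math.round(n / 1024 ** 2) + ' MiB';

console.log('NODE_OPTIONS      :', process.env.NODE_OPTIONS || '(not set)');
console.log('process.execArgv  :', JSON.stringify(process.execArgv)); // CLI flags this node process was started with
console.log('V8 heap_size_limit:', mib(v8.getHeapStatistics().heap_size_limit));
// If heap_size_limit still shows the default rather than the configured cap,
// the flag never reached this process (for example, it was set on a parent that
// spawned node without forwarding it, and NODE_OPTIONS was not exported).
```

Note also that even when the cap is in effect, --max-old-space-size only bounds the V8 old space; buffers, the new space, and native memory sit on top of it, so the container's memory limit needs headroom above the heap cap or the cgroup OOM killer can still fire.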
How often does it reproduce? Is there a required condition?
Always
What is the expected behavior?
No response
What do you see instead?
Logs of the host where the job runs