Description
After reading the Ballooning documentation, my understanding of deflate_on_oom is that if the parameter is set to true, the balloon device will be deflated automatically when a process in the guest requires memory pages that cannot otherwise be provided:
deflate_on_oom: if this is set to true and a guest process wants to allocate some memory which would make the guest enter an out-of-memory state, the kernel will take some pages from the balloon and give them to said process
However, if I run Firecracker with e.g. 2 vCPUs, 1 GB of memory and a balloon device of 900 MiB:
{
"target_pages": 230400,
"actual_pages": 230400,
"target_mib": 900,
"actual_mib": 900,
"swap_in": 0,
"swap_out": 0,
"major_faults": 92,
"minor_faults": 3103,
"free_memory": 66572288,
"total_memory": 84398080,
"available_memory": 0,
"disk_caches": 151552,
"hugetlb_allocations": 0,
"hugetlb_failures": 0
}
...and then try to start a Java process in the guest with -Xms800m -Xmx800m (i.e. with a heap size of 800 MB), the Java process in the guest hangs and Firecracker uses ~200% CPU time, but the size occupied by the balloon device in the guest does not change and remains at 900 MiB:
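For reference, the numbers in the statistics above can be put into the same unit with a short Python sketch. It assumes 4 KiB balloon pages (which matches 230400 pages = 900 MiB); the helper names are hypothetical, and the estimate ignores any JVM overhead beyond the 800 MB heap, so it is only a rough lower bound on how much the balloon would have to deflate.

```python
PAGE_SIZE = 4096      # bytes per balloon page (assumed 4 KiB)
MIB = 1024 * 1024     # bytes per MiB


def pages_to_mib(pages: int) -> float:
    """Convert a balloon page count to MiB."""
    return pages * PAGE_SIZE / MIB


def pages_to_release(request_bytes: int, free_bytes: int) -> int:
    """Pages the balloon would have to give back so that a request of
    request_bytes fits into free_bytes (0 if it already fits)."""
    shortfall = max(0, request_bytes - free_bytes)
    return -(-shortfall // PAGE_SIZE)  # ceiling division


# Values copied from the balloon statistics above.
stats = {"target_pages": 230400, "free_memory": 66572288, "total_memory": 84398080}

balloon_mib = pages_to_mib(stats["target_pages"])
free_mib = stats["free_memory"] / MIB
needed_pages = pages_to_release(800 * MIB, stats["free_memory"])

print(f"balloon size : {balloon_mib:.0f} MiB")
print(f"guest free   : {free_mib:.1f} MiB")
print(f"pages needed : {needed_pages} (~{pages_to_mib(needed_pages):.1f} MiB)")
```

So for the 800 MB heap to fit, most of the balloon would have to be deflated, which is what I expected deflate_on_oom=true to trigger.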
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2552550 xxxxxxxx 20 0 1063092 82992 82096 R 99,9 0,3 3:44.76 fc_vcpu 1
2552544 xxxxxxxx 20 0 1063092 82992 82096 R 90,9 0,3 3:12.17 firecracker
2552549 xxxxxxxx 20 0 1063092 82992 82096 S 25,0 0,3 0:43.47 fc_vcpu 0
Once I reset the target size of the balloon device to 100 MiB, the Java process becomes unblocked and starts.
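Resetting the target size can be reproduced with a PATCH request to the /balloon endpoint of the Firecracker API, as described in the ballooning documentation. The following is a minimal Python sketch, not the exact commands I ran: the socket path /tmp/firecracker.socket and the helper names are assumptions for illustration.

```python
import http.client
import json
import socket

API_SOCKET = "/tmp/firecracker.socket"  # assumed path of the Firecracker API socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks over a Unix domain socket."""

    def __init__(self, socket_path: str):
        super().__init__("localhost")
        self._socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self._socket_path)
        self.sock = sock


def balloon_patch_request(amount_mib: int):
    """Build the method, path and JSON body for updating the balloon target."""
    return "PATCH", "/balloon", json.dumps({"amount_mib": amount_mib})


def set_balloon_target(amount_mib: int, socket_path: str = API_SOCKET) -> int:
    """Send the update to a running Firecracker and return the HTTP status."""
    method, path, body = balloon_patch_request(amount_mib)
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request(method, path, body=body,
                     headers={"Content-Type": "application/json"})
        return conn.getresponse().status
    finally:
        conn.close()
```

With a running microVM, set_balloon_target(100) corresponds to the reset described above and set_balloon_target(900) to inflating the balloon again.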
However, from the documentation of the deflate_on_oom option I would have expected that the guest kernel would deflate the ballooning device automatically, if deflate_on_oom=true?
If I run the same experiment with deflate_on_oom=false, I instantly get an out-of-memory error when I try to start the Java process:
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000ce000000, 279576576, 0) failed; error='Not enough space' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 279576576 bytes for committing reserved memory.
which is what I would have expected.
Also, if I inflate the balloon to 900 MiB again after the Java process has started, I start getting warnings from the balloon driver (as documented):
[ 282.580254] virtio_balloon virtio0: Out of puff! Can't get 1 pages
...but the CPU usage again goes up to almost 200%. Is this expected? The warnings are fine, but I wouldn't expect Firecracker to burn all its CPU shares while trying to inflate the balloon.
So to summarize, is the described behavior with deflate_on_oom=true a bug in the implementation or have I misunderstood the behavior of the ballooning device in the event of low memory in the guest?
PS: I've used the following kernel and Firecracker versions for the experiments:
Guest kernel: 5.19.8
Host kernel : 6.5.7 (Ubuntu 20.04)
Firecracker : 1.5.1 and 1.6.0-dev (from today, 036d9906)