SapMachine High Memory Reports
SapMachine High Memory Reports - HiMemReport - is a VM-internal facility that generates reports in low-memory situations. It only exists in SapMachine and SAP JVM (for now; we may still contribute it upstream).
High Memory Reports are enabled with -XX:+HiMemReport.
A dedicated reporter thread will periodically poll the VM process's resident set and swap sizes when enabled. If the sum of both values hits certain threshold levels, it generates reports.
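As a rough illustration of what "resident set plus swap" means on Linux, the sketch below sums the VmRSS and VmSwap fields found in the text of /proc/<pid>/status. This is an assumption made for illustration only: the text above does not say how the reporter thread obtains these values, and the function name rss_plus_swap_kb is hypothetical.

```python
def rss_plus_swap_kb(status_text):
    """Sum VmRSS and VmSwap (both in kB) from /proc/<pid>/status text.

    Illustration only: how the VM itself samples these values is not
    described in the document.
    """
    values = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmRSS", "VmSwap"):
            # The value looks like "   204800 kB"; the first token is the number.
            values[key] = int(rest.split()[0])
    # Missing fields are treated as zero.
    return values.get("VmRSS", 0) + values.get("VmSwap", 0)
```

On a Linux host, the input could be obtained with open("/proc/self/status").read() for the current process.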
The thresholds are staggered at 66%, 75%, and 90% of a maximum X. By default, X is:
- for containerized VMs, the container memory limit
- for non-containerized VMs, half of the total physical memory of the host.
The user can override X with an arbitrary value using the option
-XX:HiMemReportMax=<X>
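The threshold arithmetic described above can be sketched as follows. This is a minimal illustration, not the VM's implementation: the function names are hypothetical, sizes are plain byte counts, and treating "hits" as "greater than or equal" is an assumption.

```python
LEVELS_PERCENT = (66, 75, 90)  # threshold levels stated in the text


def default_max(physical_bytes, container_limit_bytes=None):
    """Default maximum X: the container limit if present, else half of physical memory."""
    if container_limit_bytes is not None:
        return container_limit_bytes
    return physical_bytes // 2


def thresholds(max_bytes):
    """Absolute thresholds (in bytes) for the 66/75/90% levels of X."""
    return [max_bytes * p // 100 for p in LEVELS_PERCENT]


def highest_level_reached(rss_bytes, swap_bytes, max_bytes):
    """Return the highest percentage level reached by RSS+swap, or None."""
    used = rss_bytes + swap_bytes
    reached = [p for p, t in zip(LEVELS_PERCENT, thresholds(max_bytes)) if used >= t]
    return reached[-1] if reached else None
```

For example, with X set to 8 GB (decimal), thresholds() yields 5.28 GB, 6 GB, and 7.2 GB.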
The HiMemReport base report contains VM arguments, SapMachine Vitals, and NMT (for the latter, NMT has to be enabled with -XX:NativeMemoryTracking=summary or =detail).
In addition to the base report, the user can specify arbitrary jcmds with -XX:HiMemReportExec=command[;command...]. These jcmds run after the base report has been generated. Typical examples are VM.info, VM.metaspace, or GC.heap_dump.
By default, all reports - base report and output from optional jcmds - get written to stderr. With -XX:HiMemReportDir=<directory>, the user can specify a directory into which reports are redirected instead. In that case the base report is named <report directory>/sapmachine_himemalert_pid<pid>_<timestamp>.log. Output from additional jcmds is written into separate files as <report directory>/<command>_pid<pid>_<timestamp>.(out|err).
Example for -XX:HiMemReportDir=/tmp/himem "-XX:HiMemReportExec=VM.info;VM.flags -all":
thomas@starfish$ ls -al /tmp/himem/
total 176
drwxr-xr-x 2 thomas thomas 4096 Jun 17 15:56 .
drwxrwxrwt 26 root root 36864 Jun 17 15:56 ..
-rw-rw-r-- 1 thomas thomas 2485 Jun 17 15:56 sapmachine_himemalert_pid11015_2022_06_17_15_56_13.log
-rw-rw-r-- 1 thomas thomas 0 Jun 17 15:56 VM.flags_pid11015_2022_06_17_15_56_13.err
-rw-rw-r-- 1 thomas thomas 63568 Jun 17 15:56 VM.flags_pid11015_2022_06_17_15_56_13.out
-rw-rw-r-- 1 thomas thomas 0 Jun 17 15:56 VM.info_pid11015_2022_06_17_15_56_13.err
-rw-rw-r-- 1 thomas thomas 60764 Jun 17 15:56 VM.info_pid11015_2022_06_17_15_56_13.out
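When post-processing a report directory, it can help to group files that belong to the same report. The sketch below parses file names of the shape shown in the listing. The timestamp layout (year_month_day_hour_minute_second) is inferred from that listing, and the function names are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# <name>_pid<pid>_<YYYY_MM_DD_HH_MM_SS>.<ext>, as seen in the listing above.
_REPORT_RE = re.compile(
    r"(?P<name>.+?)_pid(?P<pid>\d+)_(?P<stamp>\d{4}(?:_\d{2}){5})"
    r"\.(?P<ext>log|out|err|dump)"
)


def parse_report_name(filename):
    """Return (name, pid, timestamp, ext), or None if the name does not match."""
    m = _REPORT_RE.fullmatch(filename)
    if m is None:
        return None
    when = datetime.strptime(m["stamp"], "%Y_%m_%d_%H_%M_%S")
    return m["name"], int(m["pid"]), when, m["ext"]


def group_reports(filenames):
    """Group matching file names by (pid, timestamp); other names are skipped."""
    groups = defaultdict(list)
    for fn in filenames:
        parsed = parse_report_name(fn)
        if parsed is not None:
            _, pid, when, _ = parsed
            groups[(pid, when)].append(fn)
    return dict(groups)
```

Calling group_reports(os.listdir("/tmp/himem")) on the directory above would put all five files into a single group for pid 11015.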
Use -XX:HiMemReportExec to specify additional jcmds to run after the base report. Multiple commands are separated with semicolons (;). Commands can contain arguments. Make sure to quote the argument correctly on the command line.
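One way to sidestep shell quoting problems is to build the argument list programmatically and let a library do the quoting. The sketch below assembles the options from this page into an argument vector and uses Python's shlex.join to render a shell-safe command line. The helper name himem_java_args is hypothetical, and the main class or jar that would normally follow the options is left out.

```python
import shlex


def himem_java_args(commands=(), report_dir=None, max_size=None):
    """Build a java argument vector that enables HiMemReport.

    commands:   jcmds such as "VM.info" or "VM.flags -all"
    report_dir: value for -XX:HiMemReportDir, if any
    max_size:   value for -XX:HiMemReportMax, if any
    """
    args = ["java", "-XX:+HiMemReport"]
    if max_size is not None:
        args.append(f"-XX:HiMemReportMax={max_size}")
    if report_dir is not None:
        args.append(f"-XX:HiMemReportDir={report_dir}")
    if commands:
        # All commands go into ONE argument, separated by semicolons.
        args.append("-XX:HiMemReportExec=" + ";".join(commands))
    return args


args = himem_java_args(["VM.info", "VM.flags -all"], report_dir="/tmp/himem")
print(shlex.join(args))  # the Exec option is quoted because it contains ';' and a space
```

Passing the list directly to subprocess.run (without shell=True) avoids the shell entirely, so no quoting is needed in that case.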
A special case is "GC.heap_dump". If specified as a jcmd to run, and if arguments are omitted, it will write the dump file as GC.heap_dump_pid<pid>_<timestamp>.dump into the report directory or the current working directory if the user did not specify a report directory.
-XX:+HiMemReport enables high memory reports.
-XX:HiMemReportMax=<memory size>
overrides the maximum memory size the reporter references.
-XX:HiMemReportDir=<dir>
redirects reports into separate files in a common reporting directory; by default, reports are written to stderr.
-XX:HiMemReportExec=<command>[;<command2> ...]
specifies one or more jcmds to run after the base report.
If Rss+Swap reaches 66/75/90% of the "natural" limit (container limit or 1/2 total physical memory), write a report to stderr.
Like before, but also execute jcmd VM.info and write its output to stderr.
If Rss+Swap reaches 66/75/90% of 8 GB (that is, 5.28 GB, 6 GB, or 7.2 GB), write reports to ./himem-reports-2/sapmachine_himemalert_pid<pid>_<timestamp>.log
java -XX:+HiMemReport -XX:HiMemReportDir=himem-reports '-XX:HiMemReportExec=VM.info;VM.flags -all;GC.heap_dump'
If Rss+Swap reaches 66/75/90% of either the container limit or 1/2 of the total physical memory, write reports to ./himem-reports/sapmachine_himemalert_pid<pid>_<timestamp>.log
In addition:
- run jcmd VM.info on the VM and write the output to ./himem-reports/VM.info_pid<pid>_<timestamp>.out.
- run jcmd VM.flags -all on the VM and write the output to ./himem-reports/VM.flags_pid<pid>_<timestamp>.out.
- run jcmd GC.heap_dump on the VM and write the heap dump to ./himem-reports/GC.heap_dump_pid<pid>_<timestamp>.dump.
If the VM footprint recovers, high memory reporting is reset, but only after a grace period of five minutes. Each report contains a "spike number", which is the number of times the VM has recovered from earlier spikes.
After a fixed number of resets, high memory reporting will be disabled. This prevents flooding the disk with high memory reports.
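The reset behavior can be pictured as a small state machine. The sketch below is one possible reading, not the VM's actual logic. It assumes that "recovers" means dropping below the lowest threshold, that each level is reported once per spike, and that max_spikes (hypothetical name and value) stands in for the unspecified "fixed number of resets".

```python
from dataclasses import dataclass
from typing import Optional

LEVELS_PERCENT = (66, 75, 90)
GRACE_SECONDS = 5 * 60  # grace period stated in the text


@dataclass
class SpikeTracker:
    limit: int                           # the maximum X, in bytes
    max_spikes: int = 10                 # hypothetical; the text gives no number
    spike_no: int = 0                    # recoveries from earlier spikes so far
    reported: int = 0                    # highest level already reported in this spike
    below_since: Optional[float] = None  # when usage first dropped below 66%

    def observe(self, now, used):
        """Return the level (66/75/90) to report at time `now`, or None."""
        if self.spike_no >= self.max_spikes:
            return None  # reporting disabled after too many resets
        pct = used * 100 // self.limit
        crossed = max((lv for lv in LEVELS_PERCENT if pct >= lv), default=0)
        if crossed == 0:
            if self.reported:
                if self.below_since is None:
                    self.below_since = now  # start the grace period
                elif now - self.below_since >= GRACE_SECONDS:
                    # Recovered for long enough: reset and count the spike.
                    self.reported = 0
                    self.below_since = None
                    self.spike_no += 1
            return None
        self.below_since = None  # still (or again) above a threshold
        if crossed > self.reported:
            self.reported = crossed
            return crossed
        return None
```

Here observe() would be called once per polling interval with the current time in seconds and the current RSS+swap value.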
-XX:HiMemReportExec jcmds are spawned as sub-processes (the VM calls jcmd <pid> <command> on itself). The reporter uses posix_spawn() for this, which should use vfork() or clone() internally, so spawning should be cheap and low-memory friendly. Still, both forking and the command itself need memory and may therefore malfunction in low-memory scenarios.
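To show the general pattern of spawning a command with posix_spawn and redirecting its output into .out and .err files, here is a minimal sketch using Python's os.posix_spawnp (Unix only). It is not the VM's code: the helper name spawn_with_redirect and the file mode are assumptions, and the VM does this natively.

```python
import os


def spawn_with_redirect(argv, out_path, err_path):
    """Run argv via posix_spawn with stdout/stderr sent to files; return the exit code."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    file_actions = [
        # Open the target files as fd 1 (stdout) and fd 2 (stderr) in the child.
        (os.POSIX_SPAWN_OPEN, 1, out_path, flags, 0o664),
        (os.POSIX_SPAWN_OPEN, 2, err_path, flags, 0o664),
    ]
    pid = os.posix_spawnp(argv[0], argv, os.environ, file_actions=file_actions)
    _, status = os.waitpid(pid, 0)
    # -1 signals that the child did not exit normally (e.g. killed by a signal).
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


# Hypothetical usage against a running VM with process id 11015:
# spawn_with_redirect(["jcmd", "11015", "VM.info"], "VM.info.out", "VM.info.err")
```

An empty .err file next to a populated .out file, as in the directory listing earlier, is what a successful run of this pattern looks like.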
Did the VM process live long enough to generate these reports? For example, the VM may have been killed by the OOM killer, or it may have ended naturally, or the enclosing container may have been stopped.