Fix: Reporting the correct number of utilized CPUs #6809
JosuaCarl wants to merge 5 commits into nextflow-io:master
Conversation
…d on system level information at execution time
Signed-off-by: Josua Carl <josua.carl@uni-tuebingen.de>
Signed-off-by: Josua Carl <josua.carl@uni-tuebingen.de>
…`command-trace.txt`
Signed-off-by: Josua Carl <josua.carl@uni-tuebingen.de>
Signed-off-by: Josua Carl <josua.carl@uni-tuebingen.de>
(I'm not on the Nextflow team, I'm just a Nextflow user.) I would like to plead to leave the current … as it is. I see the metrics you're trying to add as an "advanced" trace that gives valuable insights, alongside the existing scheduler and runtime traces, like ….
@muffato So would you rather opt for an additional trace value, like …?
…n of `cpus`
Refactor: Utilized `nxf_trace_write` in `nxf_trace_linux` to align it with `nxf_trace_mac`
Test: Added additional testing condition for parsing of `used_cpus` from memory after bash script
Signed-off-by: Josua Carl <josua.carl@uni-tuebingen.de>
Personally, yes, but again, I'm just a mere (moderately advanced) user. The call should come from Seqera.
Closing for the same reason as #6731 -- this seems like a lot of added complexity for not much added value, consider doing it through a plugin instead. |
As stated in #6743, a problem with flexible scheduling of tasks is that execution constraints on logical cores are not enforced, or are enforced with wiggle room.
In the case of the Docker executor, `--cpu-shares 2048` is used instead of `--cpus 2`, which prioritizes resource utilization but imposes no hard cutoff (see https://docs.docker.com/engine/containers/resource_constraints/#configure-the-default-cfs-scheduler).

nextflow/modules/nextflow/src/main/groovy/nextflow/container/DockerBuilder.groovy
Lines 128 to 129 in 887443e
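To illustrate the difference between the two flags, here is a minimal sketch (the `cpu_shares` helper is hypothetical, not code from `DockerBuilder.groovy`; it only mirrors the share-per-core convention where one core corresponds to 1024 shares):

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a requested core count into a CFS share weight,
# following Docker's convention that 1024 shares ~= one core's default weight.
cpu_shares() { echo $(( $1 * 1024 )); }

# Soft limit: the task is *weighted* toward 2 cores, but may use more
# logical CPUs when the host is idle.
echo "docker run --cpu-shares $(cpu_shares 2) ..."

# Hard limit: the CFS quota caps the container at 2 CPUs' worth of time.
echo "docker run --cpus 2 ..."
```

Under `--cpu-shares`, the number of logical CPUs a task actually touches can therefore differ from `process.cpus` in either direction.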
This behavior has many upsides, but the number of `cpus` in the trace is often reported incorrectly, as it takes the value set by the user in `process.cpus` for granted, even if it was not enforced.

The PR adds a sampling script which uses breadth-first search to extract the last used logical CPU of the task and its children and record them in a list. In the end, this list is reduced to the total number of utilized CPUs.
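The breadth-first walk described above can be sketched roughly as follows (this is a simplified illustration, not the PR's actual trace script; it assumes Linux procps, using `ps -o psr=` for the last logical CPU a process ran on and `ps --ppid` to enumerate children):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: BFS over a task's process tree, collecting the last
# logical CPU ("psr") each process ran on, then counting distinct CPUs.
used_cpus() {
    local queue=("$1") cpus=() pid cpu c children
    while (( ${#queue[@]} )); do
        pid=${queue[0]}; queue=("${queue[@]:1}")            # dequeue
        # last logical CPU this process executed on (Linux procps only)
        cpu=$(ps -o psr= -p "$pid" 2>/dev/null | tr -d ' ')
        [[ -n $cpu ]] && cpus+=("$cpu")
        # enqueue direct children of this process
        children=$(ps -o pid= --ppid "$pid" 2>/dev/null)
        for c in $children; do queue+=("$c"); done
    done
    # distinct logical CPUs touched by the whole tree
    printf '%s\n' "${cpus[@]}" | sort -un | wc -l
}

used_cpus $$
```

A single pass like this only sees each process's *current* CPU, which is why the PR repeats it at a fixed sampling interval and aggregates across samples.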
The current sampling rate is `0.1s`. I tried `1.0s`, but this missed several short processes in the `demo` pipeline, although I suspect this effect may vanish for longer processes. I could see no large changes in execution time or memory utilization for the tasks with either option.

Sampling was chosen because the other approaches either require root permission (tracking at kernel level with `perf`) or only report minima (`ceil(%cpu / 100)`) and maxima (`grep Cpus_allowed_list /proc/$pid/status`) for the number of utilized CPUs. In practice I observed that the number of actually used CPUs was close to the maximum most of the time, so if an approximation that does not rely on sampling were to be used, I would suggest `grep Cpus_allowed_list /proc/$pid/status`, which was my original solution in the first commit (have a look if you want).

`cpus` trace value does not correspond to used logical cores of the task #6743

Edits:
- … `cpus` to a new `TraceRecord` field `used_cpus`.
- Refactor: `nxf_write_trace` to align `nxf_trace_linux` with `nxf_trace_mac` and save a few lines. This could also be put into a new PR, if deemed too much deviation from the main task.