
Subcommand --import-owl-models shows poor performance in CPU-rich environments #549

@kltm

I have three machines:

  • "New" is a new heavy compute server; it is the fastest machine, with oodles of memory and cores and SSD in a fast setup
  • "Old" is an aging compute server; it is very fast, with cores and memory, but has filesystem i/o issues (long story)
  • "Lap" is an old laptop; it has little memory and not much CPU, but at least has an SSD

I recently tested filesystem i/o on these three machines. "New" was fastest, with "Lap" just behind; "Old" was a factor of four slower than both.

With this context, when running this command:

java -Xmx64G -jar ../minerva/minerva-cli/bin/minerva-cli.jar --import-owl-models -j /tmp/blazegraph.jnl -f ~/local/src/git/noctua-models/models/

I get something like this for the times:

  • "Lap": 30min (using only 8G, as it is a small machine)
  • "New": 60min
  • "Old": 120min

That is unexpected. "New" should be smoking all comers. Filesystem testing indicates that disk i/o shouldn't be the problem. Looking at the processing on "New", I noticed that almost all cores were in use (~100). I'm wondering if there is degenerate utilization of CPUs or memory.

I figured out how to "hide" CPUs from java (e.g. taskset -a -c 0,1,2,3,4,5,6,7 COMMAND) and started checking whether I could get better numbers by tuning. The answer is a baffling "yes".
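To confirm that the mask is actually taking effect, a small check like the one below can be run under the same taskset prefix. This is only a sketch and not part of Minerva; the class name CpuCheck and its two helper methods are made up for illustration. It prints the processor count the JVM reports and the default parallelism of the common fork-join pool, which parallel streams and a number of libraries use to size their thread pools. On Linux, HotSpot generally derives the processor count from the process affinity mask, so both numbers should shrink as CPUs are hidden.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper, not part of Minerva: reports how many CPUs the JVM sees.
public class CpuCheck {
    /** Number of processors the JVM believes it may use. */
    static int visibleCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Default parallelism of the shared fork-join pool (used by parallel streams). */
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors   = " + visibleCpus());
        System.out.println("commonPoolParallelism = " + commonPoolParallelism());
    }
}
```

With JDK 11 or later this can be launched directly from source, e.g. taskset -c 0 java CpuCheck.java. HotSpot also accepts -XX:ActiveProcessorCount=N (JDK 10 and later), which changes the count the JVM reports without pinning threads to particular cores; whether that has the same effect on this import as taskset has not been tested here.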

CPUs   Mem     Runtime (mm:ss.ss)
8      8G      57:04.77
4      16G     52:24.90
2      32G     38:09.33
2      64G     36:15.91
1      64G     26:15.49
1      128G    26:31.99

It seems that the job is single-CPU bound, and adding more CPUs makes things worse. More memory does nothing once there is enough.
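To put numbers on that, the runtimes in the table above can be converted to seconds and compared against the 1-CPU/64G run. The snippet below is just a sketch of that arithmetic; the class name RuntimeTable and its methods are illustrative, and the only data in it are the timings already listed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative arithmetic only: compares each run in the table to the 1-CPU/64G run.
public class RuntimeTable {
    /** Convert a "mm:ss.ss" wall-clock time to seconds. */
    static double toSeconds(String mmss) {
        String[] parts = mmss.split(":");
        return Integer.parseInt(parts[0]) * 60 + Double.parseDouble(parts[1]);
    }

    /** Ratio of a run's wall-clock time to the baseline's. */
    static double slowdown(String run, String baseline) {
        return toSeconds(run) / toSeconds(baseline);
    }

    public static void main(String[] args) {
        // Timings copied from the table, in the same order.
        Map<String, String> runs = new LinkedHashMap<>();
        runs.put("8 CPUs, 8G", "57:04.77");
        runs.put("4 CPUs, 16G", "52:24.90");
        runs.put("2 CPUs, 32G", "38:09.33");
        runs.put("2 CPUs, 64G", "36:15.91");
        runs.put("1 CPU, 64G", "26:15.49");
        runs.put("1 CPU, 128G", "26:31.99");

        String baseline = runs.get("1 CPU, 64G");
        for (Map.Entry<String, String> e : runs.entrySet()) {
            System.out.printf("%-12s %8.2f s  x%.2f%n",
                    e.getKey(), toSeconds(e.getValue()), slowdown(e.getValue(), baseline));
        }
    }
}
```

By this measure the 8-CPU run takes roughly 2.2 times as long as the 1-CPU/64G run, while doubling memory from 64G to 128G on one CPU changes the runtime by about 1%. Note that the first three rows change CPUs and memory together, so only the 2-CPU and 1-CPU pairs isolate the effect of memory.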

Tagging @balhoff
