Description
At work, in the BIOS we disable hyperthreading and turn off the power-saving CPU C-states. If we have multiple NUMA nodes, we sometimes run one process on NUMA node 0 and the other on NUMA node 1.
At home when testing, hyperthreading is usually enabled and CPU power-saving states are enabled. Modern processors often have a Turbo mode that only engages after a certain amount of sustained activity, possibly complicating performance testing.
Whether entire arrays live inside the L2 or L3 cache may further complicate performance testing.
To see whether hyperthreading is on ("pip install psutil"):
In [1]: import psutil
In [2]: psutil.cpu_count(logical=False)
Out[2]: 6
In [3]: psutil.cpu_count(logical=True)
Out[3]: 12
If the numbers are different, hyperthreading is turned on.
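That comparison can be wrapped in a small helper. This is just a sketch (the function name is mine, not from the code being written for this issue); the counts can be passed in explicitly, otherwise they come from psutil:

```python
def hyperthreading_enabled(physical=None, logical=None):
    """Return True when the logical CPU count exceeds the physical core count.

    Counts default to this machine's values via psutil; pass them
    explicitly for testing. (Helper name is illustrative only.)
    """
    if physical is None or logical is None:
        import psutil  # assumed installed via "pip install psutil"
        physical = psutil.cpu_count(logical=False)
        logical = psutil.cpu_count(logical=True)
    return logical > physical
```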
To see cache sizes and other information ("pip install py-cpuinfo"):
In [1]: import cpuinfo
In [2]: cpuinfo.get_cpu_info()
In [3]: cpuinfo.get_cpu_info()['l2_cache_size']
Out[3]: 1572864
In [4]: cpuinfo.get_cpu_info()['l3_cache_size']
Out[4]: 15728640
On my home computer hyperthreading is turned on. I am in the process of writing code (completed for Windows, now working on Linux) to detect hyperthreading, read L1/L2/L3 cache sizes, read NUMA information, and change the way threading works based on this.
If hyperthreading is turned on, we will start only as many threads as there are physical (not logical) cores and set thread affinity to every other logical CPU so that two threads do not clash on the same physical core. So far in testing, this does appear to speed up calculations.
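One way the "every other logical CPU" scheme could look, as a hedged sketch: it assumes SMT siblings are numbered adjacently (CPU 0/1 share a core, 2/3 share a core, and so on), which is common on Windows and many Intel Linux machines but is not guaranteed; real code should read the actual topology. It also shows process-level pinning via `os.sched_setaffinity` (Linux only); per-thread affinity needs OS-specific calls such as `SetThreadAffinityMask` on Windows.

```python
import os

def every_other_cpu(logical_count, smt_enabled):
    """Pick one logical CPU per physical core.

    ASSUMPTION: SMT siblings are numbered adjacently (0/1 on one core,
    2/3 on the next, ...). Real code should verify this against the
    topology, e.g. /sys/devices/system/cpu/*/topology on Linux.
    """
    step = 2 if smt_enabled else 1
    return list(range(0, logical_count, step))

def pin_process_to(cpu_ids):
    """Restrict the current process to the given logical CPUs (Linux only)."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cpu_ids))
```

With 12 logical CPUs and hyperthreading on, `every_other_cpu(12, True)` yields `[0, 2, 4, 6, 8, 10]` -- one logical CPU per physical core.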
In the future, if we detect multiple NUMA nodes, we can give the user the option of which NUMA node to run on.
I am not sure yet exactly how knowledge of cache sizes will be used, but it will probably help determine how many threads to wake up for a given array size.
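One possible heuristic along those lines, purely illustrative since the issue has not settled on a policy: wake only as many threads as leave each thread a chunk above some minimum size (so wake-up overhead is amortized), capped at the physical core count. The function name and the threshold are my own; the threshold might reasonably be derived from the per-core L2 size reported by py-cpuinfo.

```python
def threads_to_wake(array_bytes, physical_cores, min_chunk_bytes=256 * 1024):
    """Heuristic: don't split small arrays across many threads.

    min_chunk_bytes is an illustrative threshold (e.g. a fraction of
    the per-core L2 cache size); tune it empirically.
    """
    if array_bytes <= 0:
        return 1
    threads = array_bytes // min_chunk_bytes or 1
    return min(threads, physical_cores)
```

A 100 KB array would then run single-threaded, while a 10 MB array would use all physical cores.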