pcm: allow core 0 to be offlined#957

Merged
rdementi merged 1 commit into intel:master from 144026:master
Jun 16, 2025

144026 commented Jun 15, 2025

On servers with Intel SST-PP enabled, core 0 can apparently be taken offline, in which case pcm fails to start. This patch changes ref_core from a hardcoded 0 to socketRefCore and relaxes the aff0 thread-affinity requirement during system topology discovery. With these changes, pcm works in my test environment.
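To illustrate the idea of not assuming core 0 is available, here is a minimal sketch that derives a reference core from a Linux cpulist string (the format used by /sys/devices/system/cpu/online). The helper names `parseCpuList` and `firstOnlineCore` are hypothetical and are not part of PCM; the actual patch uses socketRefCore.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not PCM code): expand a Linux cpulist string
// such as "2-3,5-6,8-9" into individual core ids.
std::vector<int> parseCpuList(const std::string& list)
{
    std::vector<int> cores;
    std::istringstream in(list);
    std::string range;
    while (std::getline(in, range, ','))
    {
        if (range.empty()) continue;
        const auto dash = range.find('-');
        // A single id like "5" has no dash; treat it as the range 5-5.
        const int first = std::stoi(range.substr(0, dash));
        const int last = (dash == std::string::npos)
                             ? first
                             : std::stoi(range.substr(dash + 1));
        for (int c = first; c <= last; ++c) cores.push_back(c);
    }
    return cores;
}

// Returns the first listed core id, or -1 if the list is empty.
// Assumes the list is in ascending order, as the kernel prints it.
int firstOnlineCore(const std::string& list)
{
    const auto cores = parseCpuList(list);
    return cores.empty() ? -1 : cores.front();
}
```

For an illustrative list such as "2-3,5-6,8-9", `firstOnlineCore` yields 2 rather than the offline core 0. In real use the string would be read from sysfs with `std::getline`, which drops the trailing newline.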

Before:

$ sudo ./pcm

=====  Processor information  =====
Linux arch_perfmon flag  : yes
Hybrid processor         : no
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 36
CPU family               : 6
CPU model number         : 173
ERROR: pthread_setaffinity_np for core 0 failed with code 22
PCM ERROR. Exception pthread_setaffinity_np failed

After:

$ sudo ./pcm

=====  Processor information  =====
Linux arch_perfmon flag  : yes
Hybrid processor         : no
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 36
CPU family               : 6
CPU model number         : 173
Number of logical cores: 240

....

 Core (SKT) | UTIL | IPC  | CFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3MPI | L2MPI |   L3OCC |   LMB  |   RMB  | TEMP

   2    0     0.14   0.17   ...
   3    0     0.01   0.29   ...
   5    0     0.00   0.18   ...
   6    0     0.00   0.17   ...
   8    0     0.00   0.22   ...
   9    0     0.00   0.29   ...

opcm left a comment

thanks for the patch

144026 force-pushed the master branch 2 times, most recently from 70d8242 to ce7b97d, June 16, 2025 10:47
opcm left a comment

Thanks. Please address the remaining issue.

std::unordered_map<int, domain> topologyDomainMap;
{
TemporalThreadAffinity aff0(0);
const int32 maxTopoDomainAff = 8;

This will fail if the first 8 cores are offlined, which is a realistic scenario. Please increase the max to 1<<16.
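The concern can be shown with a small sketch, assuming maxTopoDomainAff bounds how many core ids are tried when looking for a core to pin to. The function `firstPinnableCore` and its predicate parameter are hypothetical and only model the search; they are not PCM's implementation.

```cpp
#include <functional>

// Hypothetical model of the search (not PCM code): return the first core id
// in [0, maxCore) for which tryPin succeeds, or -1 if none does.
int firstPinnableCore(const std::function<bool(int)>& tryPin,
                      int maxCore = 1 << 16)
{
    for (int core = 0; core < maxCore; ++core)
    {
        if (tryPin(core)) return core;
    }
    return -1;
}
```

If cores 0 through 7 are offline, a predicate that succeeds only for ids of 8 and above makes the search return -1 with a cap of 8, while a cap of 1<<16 finds core 8. In real use the predicate would wrap an affinity call such as pthread_setaffinity_np, which on Linux returns EINVAL (22, the code in the log above) when the mask names no usable CPU.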

@rdementi rdementi merged commit f8ecbea into intel:master Jun 16, 2025
29 checks passed