[llvm-exegesis] Add Pfm Counters for SapphireRapids #113847
This patch adds the appropriate hookups in X86PfmCounters.td for SapphireRapids. This is mostly to fix errors when some of my jobs that only really need dummy counters get scheduled on Sapphire Rapids machines, but I figured I might as well do it properly while here. I do not have hardware access to test this currently, but this matches exactly what is in the libpfm source code.
@llvm/pr-subscribers-backend-x86
Author: Aiden Grossman (boomanaiden154)
Full diff: https://github.com/llvm/llvm-project/pull/113847.diff
2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86PfmCounters.td b/llvm/lib/Target/X86/X86PfmCounters.td
index c30e989cdc2af1..38d8d19091e0fd 100644
--- a/llvm/lib/Target/X86/X86PfmCounters.td
+++ b/llvm/lib/Target/X86/X86PfmCounters.td
@@ -220,6 +220,22 @@ def AlderLakePfmCounters : ProcPfmCounters {
}
def : PfmCountersBinding<"alderlake", AlderLakePfmCounters>;
+def SapphireRapidsPfmCounters : ProcPfmCounters {
+ let CycleCounter = UnhaltedCoreCyclesPfmCounter;
+ let UopsCounter = UopsIssuedPfmCounter;
+ let IssueCounters = [
+ PfmIssueCounter<"SPRPort00", "uops_dispatched_port:port_0">,
+ PfmIssueCounter<"SPRPort01", "uops_dispatched_port:port_1">,
+ PfmIssueCounter<"SPRPort02_03_10", "uops_dispatched_port:port_2_3_10">,
+ PfmIssueCounter<"SPRPort04_09", "uops_dispatched_port:port_4_9">,
+ PfmIssueCounter<"SPRPort05_11", "uops_dispatched_port:port_5_11">,
+ PfmIssueCounter<"SPRPort06", "uops_dispatched_port:port_6">,
+ PfmIssueCounter<"SPRPort07_08", "uops_dispatched_port:port_7_8">,
+ ];
+ let ValidationCounters = DefaultIntelPfmValidationCounters;
+}
+def : PfmCountersBinding<"sapphirerapids", SapphireRapidsPfmCounters>;
+
// AMD X86 Counters.
defvar DefaultAMDPfmValidationCounters = [
PfmValidationCounter<InstructionRetired, "RETIRED_INSTRUCTIONS">,
diff --git a/llvm/lib/Target/X86/X86SchedSapphireRapids.td b/llvm/lib/Target/X86/X86SchedSapphireRapids.td
index 6e292da4e293db..b0ebe70c31fd44 100644
--- a/llvm/lib/Target/X86/X86SchedSapphireRapids.td
+++ b/llvm/lib/Target/X86/X86SchedSapphireRapids.td
@@ -59,6 +59,8 @@ def SPRPort01_05 : ProcResGroup<[SPRPort01, SPRPort05]>;
def SPRPort01_05_10 : ProcResGroup<[SPRPort01, SPRPort05, SPRPort10]>;
def SPRPort02_03 : ProcResGroup<[SPRPort02, SPRPort03]>;
def SPRPort02_03_11 : ProcResGroup<[SPRPort02, SPRPort03, SPRPort11]>;
+def SPRPort02_03_10 : ProcResGroup<[SPRPort02, SPRPort03, SPRPort10]>;
+def SPRPort05_11 : ProcResGroup<[SPRPort05, SPRPort11]>;
def SPRPort07_08 : ProcResGroup<[SPRPort07, SPRPort08]>;
// EU has 112 reservation stations.
@@ -78,6 +80,10 @@ def SPRPort02_03_07_08_11 : ProcResGroup<[SPRPort02, SPRPort03, SPRPort07,
let BufferSize = 72;
}
+def SPRPortAny : ProcResGroup<[SPRPort00, SPRPort01, SPRPort02, SPRPort03,
+ SPRPort04, SPRPort05, SPRPort06, SPRPort07,
+ SPRPort08, SPRPort09, SPRPort10, SPRPort11]>;
+
// Integer loads are 5 cycles, so ReadAfterLd registers needn't be available
// until 5 cycles after the memory operand.
def : ReadAdvance<ReadAfterLd, 5>;
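As a quick sanity check on the X86PfmCounters.td hunk above, here is a minimal Python sketch (not part of the patch) that parses each libpfm event string into the port numbers it names and compares them with the ports encoded in the resource name. The helper names `ports_in_event`, `ports_in_resource`, and `mismatches` are hypothetical; the name/event pairs are copied from the diff. It only checks that the two spellings agree with each other, not which port numbering is correct on real hardware.

```python
# (resource name, libpfm event) pairs copied from the SapphireRapidsPfmCounters hunk.
ISSUE_COUNTERS = [
    ("SPRPort00", "uops_dispatched_port:port_0"),
    ("SPRPort01", "uops_dispatched_port:port_1"),
    ("SPRPort02_03_10", "uops_dispatched_port:port_2_3_10"),
    ("SPRPort04_09", "uops_dispatched_port:port_4_9"),
    ("SPRPort05_11", "uops_dispatched_port:port_5_11"),
    ("SPRPort06", "uops_dispatched_port:port_6"),
    ("SPRPort07_08", "uops_dispatched_port:port_7_8"),
]


def ports_in_event(event):
    """Return the port numbers named after 'port_' in a libpfm event string."""
    umask = event.split(":", 1)[1]            # e.g. "port_2_3_10"
    return [int(n) for n in umask[len("port_"):].split("_")]


def ports_in_resource(name):
    """Return the port numbers encoded in a resource name such as 'SPRPort02_03_10'."""
    return [int(n) for n in name[len("SPRPort"):].split("_")]


def mismatches(pairs):
    """Return the pairs whose resource name and event string name different ports."""
    return [(r, e) for r, e in pairs if ports_in_resource(r) != ports_in_event(e)]
```

Calling `mismatches(ISSUE_COUNTERS)` returns an empty list when every resource name lines up with its event string, which is the case for the seven pairs in this patch.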
RKSimon
left a comment
LGTM
def SPRPort02_03 : ProcResGroup<[SPRPort02, SPRPort03]>;
def SPRPort02_03_11 : ProcResGroup<[SPRPort02, SPRPort03, SPRPort11]>;
def SPRPort02_03_10 : ProcResGroup<[SPRPort02, SPRPort03, SPRPort10]>;
def SPRPort05_11 : ProcResGroup<[SPRPort05, SPRPort11]>;
If I recall correctly, ports 2, 3, and 11 form a group according to the Intel optimization manual: https://www.intel.com/content/www/us/en/content-details/814198/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html
uops.info, some other websites, and even some Intel tools mistakenly swap ports 10 and 11.
I fixed this in https://reviews.llvm.org/D130897. See llvm/utils/schedtool/tools/add_uops_uopsinfo.py:29:
print('Warning: port 10 and port 11 are reversed on uops.info.', "Let's swap them.", file=sys.stderr)
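The swap described above can be sketched in Python as follows. This is an illustration only, not the code in add_uops_uopsinfo.py; the function name `swap_ports_10_11` and the choice to represent a port group as a list of integers are assumptions made for the example.

```python
def swap_ports_10_11(ports):
    """Return a new list with port 10 and port 11 exchanged; other ports are unchanged."""
    swap = {10: 11, 11: 10}
    return [swap.get(p, p) for p in ports]
```

For instance, `swap_ports_10_11([2, 3, 10])` gives `[2, 3, 11]`, turning the group as one source numbers it into the group as the other source numbers it. Applying the function twice returns the original list.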
Ports should be based on the optimization manual.
Are we sure this isn't a typo in the optimization manual there? All of the other tools being different makes it seem like something might be up here.
I just took the names of the performance counters from libpfm, so it would probably need to be adjusted there too first.

Going to merge this as I need this to do cycle measurements on Sapphire Rapids. Happy to continue discussing the optimization manual discrepancy post-commit. If it's actually an issue, the Alder Lake mappings need to be updated too, so this is probably better as a follow-up PR, assuming there's an issue.

@boomanaiden154 Please can you raise an issue (or issues) for the port naming?

Good point. I've filed #113941.