- 
                Notifications
    
You must be signed in to change notification settings  - Fork 15.1k
 
[llvm-profgen] Add an option to mark all the profile context as preinlined #156501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| 
           surprised that we don't have flag for this already. how did we do probe only (implies replay inlining) experiments earlier?  | 
    
          
 Good question! Let me first confirm we’re aligned on term: the previous "probe-only" refers to using  yes, we have done something to make it behave like "replay inlining", there are some settings on compiler side to remove the limits: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/SampleProfile.cpp#L2053-L2056, set the max/min inlinelimit to int::max, so it then works as it will replay without any limit. However, there are still some important and subtle differences: one major is the inlining callsite threshold: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/SampleProfile.cpp#L1386-L1406 
 Another one is for the external/importing functions: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/SampleProfile.cpp#L1060-L1062 Therefore, the previous "probe-only" was not a pure replay. I don't remember whether this is a "bug"(that we intentionally want a pure reply but missed the above setting) or it's just by design as we may just want it to do more like the classic line-number based AutoFDO which has those limits. Yeah, alternatively we can have a "fix" in compiler to remove the limits for all pseudo-probe based profile if it's a "bug".  | 
    
          
 I was thinking about  I thought we run pre-inliner as long as  Judging from your change, it seems like we don't run pre-inliner for probe-only (  | 
    
          
 I see, thanks for the check! Seems like it's indeed equivalent, IIRC, the  
 IIRC, csspgo-preinliner is only available in CS mode:  https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-profgen/ProfileGenerator.cpp#L1109C7-L1115, so we don't run/have this for   | 
    
          
 Ok.. that explains it. In this case the change looks good. Good fine, thanks!  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with a nit on naming.
| cl::cat(ProfGenCategory)); | ||
| 
               | 
          ||
| static cl::opt<bool> ForceProfilePreinlined( | ||
| "force-profile-preinlined", | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: mark-all-context-preinlined? we are not force doing actual reinline, but only marking preinlined flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks for the suggestion!
| 
          
 @llvm/pr-subscribers-pgo Author: Lei Wang (wlei-llvm) ChangesAdd a new flag ( Full diff: https://github.com/llvm/llvm-project/pull/156501.diff 6 Files Affected: 
 diff --git a/llvm/include/llvm/ProfileData/SampleProf.h b/llvm/include/llvm/ProfileData/SampleProf.h
index fb2d4d3cc50ed..a626071d23915 100644
--- a/llvm/include/llvm/ProfileData/SampleProf.h
+++ b/llvm/include/llvm/ProfileData/SampleProf.h
@@ -868,6 +868,17 @@ class FunctionSamples {
     }
   }
 
+  // Propagate the given attribute to this profile context and all callee
+  // contexts.
+  void setContextAttribute(ContextAttributeMask Attr) {
+    Context.setAttribute(Attr);
+    for (auto &I : CallsiteSamples) {
+      for (auto &CS : I.second) {
+        CS.second.setContextAttribute(Attr);
+      }
+    }
+  }
+
   // Query the stale profile matching results and remap the location.
   const LineLocation &mapIRLocToProfileLoc(const LineLocation &IRLoc) const {
     // There is no remapping if the profile is not stale or the matching gives
diff --git a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
index 205e467091352..5d0d3d3f3a9ca 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
@@ -6,6 +6,9 @@
 ; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER-OFFSET2
 ; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --profile-summary-cold-count=0 --csspgo-preinliner=0 --gen-cs-nested-profile=0
 ; RUN: FileCheck %s --input-file %t
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --profile-summary-cold-count=0 --csspgo-preinliner=0 --gen-cs-nested-profile=0 --mark-all-context-preinlined
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-MARK-PREINLINED
+
 
 ; CHECK:     [main:2 @ foo]:74:0
 ; CHECK-NEXT: 1: 0
@@ -18,10 +21,30 @@
 ; CHECK-NEXT: 8: 14 bar:14
 ; CHECK-NEXT: 9: 0
 ; CHECK-NEXT: !CFGChecksum: 563088904013236
-; CHECK:[main:2 @ foo:8 @ bar]:28:14
+; CHECK-NEXT: !Attributes: 1
+; CHECK-NEXT:[main:2 @ foo:8 @ bar]:28:14
 ; CHECK-NEXT: 1: 14
 ; CHECK-NEXT: 4: 14
 ; CHECK-NEXT: !CFGChecksum: 72617220756
+; CHECK-NEXT: !Attributes: 1
+
+; CHECK-MARK-PREINLINED:     [main:2 @ foo]:74:0
+; CHECK-MARK-PREINLINED-NEXT: 1: 0
+; CHECK-MARK-PREINLINED-NEXT: 2: 15
+; CHECK-MARK-PREINLINED-NEXT: 3: 15
+; CHECK-MARK-PREINLINED-NEXT: 4: 14
+; CHECK-MARK-PREINLINED-NEXT: 5: 1
+; CHECK-MARK-PREINLINED-NEXT: 6: 15
+; CHECK-MARK-PREINLINED-NEXT: 7: 0
+; CHECK-MARK-PREINLINED-NEXT: 8: 14 bar:14
+; CHECK-MARK-PREINLINED-NEXT: 9: 0
+; CHECK-MARK-PREINLINED-NEXT: !CFGChecksum: 563088904013236
+; CHECK-MARK-PREINLINED-NEXT: !Attributes: 3
+; CHECK-MARK-PREINLINED-NEXT:[main:2 @ foo:8 @ bar]:28:14
+; CHECK-MARK-PREINLINED-NEXT: 1: 14
+; CHECK-MARK-PREINLINED-NEXT: 4: 14
+; CHECK-MARK-PREINLINED-NEXT: !CFGChecksum: 72617220756
+; CHECK-MARK-PREINLINED-NEXT: !Attributes: 3
 
 ; CHECK-UNWINDER:      3
 ; CHECK-UNWINDER-NEXT: 201800-201858:1
diff --git a/llvm/test/tools/llvm-profgen/inline-noprobe.test b/llvm/test/tools/llvm-profgen/inline-noprobe.test
index f960f074fefff..421dfcdb78e33 100644
--- a/llvm/test/tools/llvm-profgen/inline-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-noprobe.test
@@ -8,26 +8,48 @@
 ; RUN: FileCheck %s --input-file %t1 --check-prefix=CHECK-UPDATE-TOTAL-SAMPLE
 ; RUN: llvm-profgen --format=text --use-dwarf-correlation --perfscript=%S/Inputs/inline-noprobe.perfscript --binary=%S/Inputs/inline-noprobe.perfbin --output=%t
 ; RUN: FileCheck %s --input-file %t --check-prefix=CHECK
+; RUN: llvm-profgen --format=text --use-dwarf-correlation --perfscript=%S/Inputs/inline-noprobe.perfscript --binary=%S/Inputs/inline-noprobe.perfbin --output=%t --mark-all-context-preinlined
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-MARK-PREINLINED
+
 ; RUN: echo -e "0\n0" > %t
 ; RUN: llvm-profgen --format=text --unsymbolized-profile=%t --binary=%S/Inputs/inline-noprobe.perfbin --output=%t1 --fill-zero-for-all-funcs
 ; RUN: FileCheck %s --input-file %t1 --check-prefix=CHECK-ALL-ZERO
 ; RUN: llvm-profgen --format=text --unsymbolized-profile=%S/Inputs/out-of-bounds.raw.prof --binary=%S/Inputs/inline-noprobe.perfbin --output=%t1
 ; RUN: FileCheck %s --input-file %t1 --check-prefix=CHECK-OB
 
-CHECK: main:2609:0
-CHECK:  0: 0
-CHECK:  2: 0
-CHECK:  1: foo:2609
-CHECK:   2.1: 42
-CHECK:   3: 62
-CHECK:   3.2: 21
-CHECK:   4: 0
-CHECK:   65526: 62
-CHECK:   3.1: bar:546
-CHECK:    1: 42
-CHECK:    65533: 42
-CHECK:   3.2: bar:189
-CHECK:    1: 21
+CHECK:      main:2609:0
+CHECK-NEXT:  0: 0
+CHECK-NEXT:  2: 0
+CHECK-NEXT:  1: foo:2609
+CHECK-NEXT:   2.1: 42
+CHECK-NEXT:   3: 62
+CHECK-NEXT:   3.2: 21
+CHECK-NEXT:   4: 0
+CHECK-NEXT:   65526: 62
+CHECK-NEXT:   3.1: bar:546
+CHECK-NEXT:    1: 42
+CHECK-NEXT:    65533: 42
+CHECK-NEXT:   3.2: bar:189
+CHECK-NEXT:    1: 21
+
+CHECK-MARK-PREINLINED:      main:2609:0
+CHECK-MARK-PREINLINED-NEXT:  0: 0
+CHECK-MARK-PREINLINED-NEXT:  2: 0
+CHECK-MARK-PREINLINED-NEXT:  1: foo:2609
+CHECK-MARK-PREINLINED-NEXT:   2.1: 42
+CHECK-MARK-PREINLINED-NEXT:   3: 62
+CHECK-MARK-PREINLINED-NEXT:   3.2: 21
+CHECK-MARK-PREINLINED-NEXT:   4: 0
+CHECK-MARK-PREINLINED-NEXT:   65526: 62
+CHECK-MARK-PREINLINED-NEXT:   3.1: bar:546
+CHECK-MARK-PREINLINED-NEXT:    1: 42
+CHECK-MARK-PREINLINED-NEXT:    65533: 42
+CHECK-MARK-PREINLINED-NEXT:    !Attributes: 2
+CHECK-MARK-PREINLINED-NEXT:   3.2: bar:189
+CHECK-MARK-PREINLINED-NEXT:    1: 21
+CHECK-MARK-PREINLINED-NEXT:    !Attributes: 2
+CHECK-MARK-PREINLINED-NEXT:   !Attributes: 2
+CHECK-MARK-PREINLINED-NEXT:  !Attributes: 2
 
 CHECK-UPDATE-TOTAL-SAMPLE: main:292:0
 CHECK-UPDATE-TOTAL-SAMPLE:  0: 0
diff --git a/llvm/test/tools/llvm-profgen/inline-pseudoprobe.test b/llvm/test/tools/llvm-profgen/inline-pseudoprobe.test
index 2dacf7fef8f13..c417b6c7f47c6 100644
--- a/llvm/test/tools/llvm-profgen/inline-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-pseudoprobe.test
@@ -1,5 +1,7 @@
 ; RUN: llvm-profgen --format=text --ignore-stack-samples --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t
+; RUN: llvm-profgen --format=text --ignore-stack-samples --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t1 --profile-summary-cold-count=0 --mark-all-context-preinlined
+; RUN: FileCheck %s --input-file %t1 --check-prefix=CHECK-MARK-PREINLINED
 
 ; CHECK:     main:88:0
 ; CHECK-NEXT: 1: 0
@@ -19,6 +21,26 @@
 ; CHECK-NEXT:  !CFGChecksum: 563088904013236
 ; CHECK-NEXT: !CFGChecksum: 281479271677951
 
+; CHECK-MARK-PREINLINED:     main:88:0
+; CHECK-MARK-PREINLINED-NEXT: 1: 0
+; CHECK-MARK-PREINLINED-NEXT: 2: foo:88
+; CHECK-MARK-PREINLINED-NEXT:  1: 0
+; CHECK-MARK-PREINLINED-NEXT:  2: 15
+; CHECK-MARK-PREINLINED-NEXT:  3: 15
+; CHECK-MARK-PREINLINED-NEXT:  4: 14
+; CHECK-MARK-PREINLINED-NEXT:  5: 1
+; CHECK-MARK-PREINLINED-NEXT:  6: 15
+; CHECK-MARK-PREINLINED-NEXT:  7: 0
+; CHECK-MARK-PREINLINED-NEXT:  9: 0
+; CHECK-MARK-PREINLINED-NEXT:  8: bar:28
+; CHECK-MARK-PREINLINED-NEXT:   1: 14
+; CHECK-MARK-PREINLINED-NEXT:   4: 14
+; CHECK-MARK-PREINLINED-NEXT:   !CFGChecksum: 72617220756
+; CHECK-MARK-PREINLINED-NEXT:   !Attributes: 2 
+; CHECK-MARK-PREINLINED-NEXT:  !CFGChecksum: 563088904013236
+; CHECK-MARK-PREINLINED-NEXT:  !Attributes: 2
+; CHECK-MARK-PREINLINED-NEXT: !CFGChecksum: 281479271677951
+; CHECK-MARK-PREINLINED-NEXT: !Attributes: 2
 
 ; clang -O3 -fuse-ld=lld -fpseudo-probe-for-profiling
 ; -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Xclang -mdisable-tail-calls
diff --git a/llvm/tools/llvm-profgen/ProfileGenerator.cpp b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
index 33575b9c67625..c611e410ebd58 100644
--- a/llvm/tools/llvm-profgen/ProfileGenerator.cpp
+++ b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
@@ -64,6 +64,12 @@ static cl::opt<bool>
                              "than threshold, it will be trimmed."),
                     cl::cat(ProfGenCategory));
 
+static cl::opt<bool> MarkAllContextPreinlined(
+    "mark-all-context-preinlined",
+    cl::desc("Mark all function samples as preinlined(set "
+             "ContextShouldBeInlined attribute)."),
+    cl::init(false));
+
 static cl::opt<bool> CSProfMergeColdContext(
     "csprof-merge-cold-context", cl::init(true),
     cl::desc("If the total count of context profile is smaller than "
@@ -511,10 +517,19 @@ void ProfileGenerator::generateProfile() {
   postProcessProfiles();
 }
 
+void ProfileGeneratorBase::markAllContextPreinlined(
+    SampleProfileMap &ProfileMap) {
+  for (auto &I : ProfileMap)
+    I.second.setContextAttribute(ContextShouldBeInlined);
+  FunctionSamples::ProfileIsPreInlined = true;
+}
+
 void ProfileGenerator::postProcessProfiles() {
   computeSummaryAndThreshold(ProfileMap);
   trimColdProfiles(ProfileMap, ColdCountThreshold);
   filterAmbiguousProfile(ProfileMap);
+  if (MarkAllContextPreinlined)
+    markAllContextPreinlined(ProfileMap);
   calculateAndShowDensity(ProfileMap);
 }
 
@@ -1130,6 +1145,8 @@ void CSProfileGenerator::postProcessProfiles() {
     FunctionSamples::ProfileIsCS = false;
   }
   filterAmbiguousProfile(ProfileMap);
+  if (MarkAllContextPreinlined)
+    markAllContextPreinlined(ProfileMap);
   ProfileGeneratorBase::calculateAndShowDensity(ProfileMap);
 }
 
diff --git a/llvm/tools/llvm-profgen/ProfileGenerator.h b/llvm/tools/llvm-profgen/ProfileGenerator.h
index d3e04563a81c2..dbf9d469063e1 100644
--- a/llvm/tools/llvm-profgen/ProfileGenerator.h
+++ b/llvm/tools/llvm-profgen/ProfileGenerator.h
@@ -123,6 +123,8 @@ class ProfileGeneratorBase {
 
   void showDensitySuggestion(double Density);
 
+  void markAllContextPreinlined(SampleProfileMap &ProfileMap);
+
   void collectProfiledFunctions();
 
   bool collectFunctionsFromRawProfile(
 | 
    
Add a new option (under
--mark-all-context-preinlined) that marks all function samples with theContextShouldBeInlinedattribute during post-processing to make the profile as preinlined. This can be useful for experiments outside of the CS preinliner, e.g. to fully replay the inlining for a given profile.