@teresajohnson
Contributor

When matching memprof profiles, for indirect calls we use the callee
guids recorded on callsites in the profile to synthesize indirect call
VP metadata when none exists. However, we only do this for the first
matching CallSiteEntry from the profile.

In some cases there can be multiple matching entries, for example when
the current function was eventually inlined into multiple callers in
the profiled binary. Profile generation propagates the CallSiteEntry
from each of those callers into the inlined callee's profile, because
the callee may not yet have been inlined in the new compile.

To capture all of these potential indirect call targets, merge callee
guids across all matching CallSiteEntries.
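
The merge described above can be sketched in standalone C++. This is a simplified model rather than the LLVM code: `CallSiteEntry` here is a hypothetical struct, the `Matches` flag stands in for the real frame comparison, and a `std::vector` paired with a `std::unordered_set` plays the role of an insertion-ordered set.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for a profile call site entry.
struct CallSiteEntry {
  bool Matches;                       // Assumed result of frame matching.
  std::vector<uint64_t> CalleeGuids;  // Indirect call targets recorded here.
};

// Merge callee GUIDs from every matching entry, dropping duplicates while
// keeping first-seen order so the result is deterministic.
std::vector<uint64_t>
mergeCalleeGuids(const std::vector<CallSiteEntry> &Entries) {
  std::vector<uint64_t> Merged;
  std::unordered_set<uint64_t> Seen;
  for (const CallSiteEntry &Entry : Entries) {
    if (!Entry.Matches)
      continue;
    for (uint64_t Guid : Entry.CalleeGuids)
      if (Seen.insert(Guid).second)
        Merged.push_back(Guid);
  }
  return Merged;
}
```

With the previous behavior, only the GUIDs of the first matching entry would have been used; here every matching entry contributes, and a GUID that appears in several entries is kept once.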

@llvmbot added the PGO (Profile Guided Optimizations) and llvm:transforms labels on Dec 6, 2025
@llvmbot
Member

llvmbot commented Dec 6, 2025

@llvm/pr-subscribers-pgo

Author: Teresa Johnson (teresajohnson)


Full diff: https://github.com/llvm/llvm-project/pull/170964.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/MemProfUse.cpp (+33-18)
  • (modified) llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test (+13-1)
diff --git a/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp b/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
index 31e69784262da..011aeca9ef0f2 100644
--- a/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
@@ -528,35 +528,50 @@ static void handleCallSite(
     Module &M, std::set<std::vector<uint64_t>> &MatchedCallSites,
     OptimizationRemarkEmitter &ORE) {
   auto &Ctx = M.getContext();
+  // Set of Callee GUIDs to attach to indirect calls. We accumulate all of them
+  // to support cases where the instruction's inlined frames match multiple call
+  // site entries, which can happen if the profile was collected from a binary
+  // where this instruction was eventually inlined into multiple callers.
+  SetVector<GlobalValue::GUID> CalleeGuids;
+  bool CallsiteMDAdded = false;
   for (const auto &CallSiteEntry : CallSiteEntries) {
     // If we found and thus matched all frames on the call, create and
     // attach call stack metadata.
     if (stackFrameIncludesInlinedCallStack(CallSiteEntry.Frames,
                                            InlinedCallStack)) {
       NumOfMemProfMatchedCallSites++;
-      addCallsiteMetadata(I, InlinedCallStack, Ctx);
-
-      // Try to attach indirect call metadata if possible.
-      if (!CalledFunction)
-        addVPMetadata(M, I, CallSiteEntry.CalleeGuids);
-
       // Only need to find one with a matching call stack and add a single
       // callsite metadata.
-
-      // Accumulate call site matching information upon request.
-      if (ClPrintMemProfMatchInfo) {
-        std::vector<uint64_t> CallStack;
-        append_range(CallStack, InlinedCallStack);
-        MatchedCallSites.insert(std::move(CallStack));
+      if (!CallsiteMDAdded) {
+        addCallsiteMetadata(I, InlinedCallStack, Ctx);
+
+        // Accumulate call site matching information upon request.
+        if (ClPrintMemProfMatchInfo) {
+          std::vector<uint64_t> CallStack;
+          append_range(CallStack, InlinedCallStack);
+          MatchedCallSites.insert(std::move(CallStack));
+        }
+        ORE.emit(OptimizationRemark(DEBUG_TYPE, "MemProfUse", &I)
+                 << ore::NV("CallSite", &I) << " in function "
+                 << ore::NV("Caller", I.getFunction())
+                 << " matched callsite with frame count "
+                 << ore::NV("Frames", InlinedCallStack.size()));
+
+        // If this is a direct call, we're done.
+        if (CalledFunction)
+          break;
+        CallsiteMDAdded = true;
       }
-      ORE.emit(OptimizationRemark(DEBUG_TYPE, "MemProfUse", &I)
-               << ore::NV("CallSite", &I) << " in function "
-               << ore::NV("Caller", I.getFunction())
-               << " matched callsite with frame count "
-               << ore::NV("Frames", InlinedCallStack.size()));
-      break;
+
+      assert(!CalledFunction && "Didn't expect direct call");
+
+      // Collect Callee GUIDs from all matching CallSiteEntries.
+      CalleeGuids.insert(CallSiteEntry.CalleeGuids.begin(),
+                         CallSiteEntry.CalleeGuids.end());
     }
   }
+  // Try to attach indirect call metadata if possible.
+  addVPMetadata(M, I, CalleeGuids.getArrayRef());
 }
 
 static void readMemprof(Module &M, Function &F,
diff --git a/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test b/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test
index ad83da285694a..54f3ed2d9e65e 100644
--- a/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test
+++ b/llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test
@@ -18,6 +18,18 @@ HeapProfileRecords:
       - Frames:
           - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: false }
         CalleeGuids:   [0x123456789abcdef0, 0x23456789abcdef01]
+      # The next 2 sets of frames simulate the case where this function was
+      # eventually inlined into multiple callers. We would have propagated the
+      # resulting frames and callee guids here for matching with the not yet
+      # inlined bar. We should aggregate all callee guids into the metadata.
+      - Frames:
+          - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: true }
+          - { Function: _Z3foov, LineOffset: 1, Column: 6, IsInlineFrame: false }
+        CalleeGuids:   [0x1234, 0x2345]
+      - Frames:
+          - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: true }
+          - { Function: _Z3foov, LineOffset: 10, Column: 7, IsInlineFrame: false }
+        CalleeGuids:   [0x3456, 0x4567]
 ...
 
 ;--- basic.ll
@@ -31,7 +43,7 @@ entry:
   ret void
 }
 
-; CHECK-ENABLE: !6 = !{!"VP", i32 0, i64 2, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1}
+; CHECK-ENABLE: !6 = !{!"VP", i32 0, i64 6, i64 13398, i64 1, i64 17767, i64 1, i64 4660, i64 1, i64 9029, i64 1, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1}
 
 !llvm.module.flags = !{!2, !3}
 
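
The updated CHECK-ENABLE line can be cross-checked by hand. The sketch below assumes the usual layout of value profile metadata (a "VP" tag, the value kind, a total count, then value/count pairs) and rebuilds the numeric operands from a list of callee GUIDs. The helper name `buildVPOperands` is hypothetical and is not part of LLVM.

```cpp
#include <cstdint>
#include <vector>

// Build the numeric operands that follow the "VP" tag: kind, total count,
// then one (value, count) pair per callee GUID. Each synthesized target is
// given a count of 1, matching the expected test output.
std::vector<uint64_t> buildVPOperands(const std::vector<uint64_t> &Guids) {
  std::vector<uint64_t> Ops;
  Ops.push_back(0);             // Value kind 0: indirect call target.
  Ops.push_back(Guids.size());  // Total is the sum of the per-target counts.
  for (uint64_t Guid : Guids) {
    Ops.push_back(Guid);
    Ops.push_back(1);
  }
  return Ops;
}
```

Passing the six GUIDs from the YAML profile in the order they appear in the CHECK line (0x3456, 0x4567, 0x1234, 0x2345, 0x123456789abcdef0, 0x23456789abcdef01) gives a total of 6 and the decimal values 13398, 17767, 4660, 9029, 1311768467463790320 and 2541551405711093505, which agrees with the expected metadata.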

@snehasish snehasish left a comment

lgtm

passed using a std::unordered_set, which defeated the MapVector's
attempt at deterministic ordering. Change the unordered_set to a vector.
In theory there should not be duplicated sets of call site frames within
a function's profile, but even if there are, there is no harm: we only
create callsite metadata for the first match, and we want to include
all callee guids anyway.
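
The argument that duplicate frame sets in a vector are harmless can be illustrated with a small model. Everything here is an assumption made for illustration: `ProfiledCallSite`, `MatchResult` and `matchCallSites` are hypothetical names, and exact frame equality replaces the real matching logic.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified profile entry: a frame list plus callee GUIDs.
struct ProfiledCallSite {
  std::vector<uint64_t> Frames;
  std::vector<uint64_t> CalleeGuids;
};

struct MatchResult {
  unsigned CallsiteMetadataCount = 0;  // How many times metadata was created.
  std::vector<uint64_t> CalleeGuids;   // De-duplicated, in first-seen order.
};

// Walk the entries in vector (insertion) order. Exact frame equality is a
// simplification of the real matching.
MatchResult matchCallSites(const std::vector<ProfiledCallSite> &Entries,
                           const std::vector<uint64_t> &InlinedCallStack) {
  MatchResult Result;
  for (const ProfiledCallSite &Entry : Entries) {
    if (Entry.Frames != InlinedCallStack)
      continue;
    // Callsite metadata is created only for the first match.
    if (Result.CallsiteMetadataCount == 0)
      Result.CallsiteMetadataCount = 1;
    // Callee GUIDs are merged from every match; repeats are skipped.
    for (uint64_t Guid : Entry.CalleeGuids)
      if (std::find(Result.CalleeGuids.begin(), Result.CalleeGuids.end(),
                    Guid) == Result.CalleeGuids.end())
        Result.CalleeGuids.push_back(Guid);
  }
  return Result;
}
```

In this model, a vector that contains the same entry twice produces the same metadata count and the same GUID list as a vector that contains it once, and because a vector is iterated in insertion order the output does not depend on hashing.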
@teresajohnson merged commit e3905a4 into llvm:main on Dec 8, 2025
10 checks passed
@llvm-ci
Collaborator

llvm-ci commented Dec 8, 2025

LLVM Buildbot has detected a new failure on builder clang-hip-vega20 running on hip-vega20-0 while building llvm at step 3 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/31873

Here is the relevant piece of the build log for the reference
Step 3 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/hip-build.sh --jobs=' (failure)
...
[59/61] Linking CXX executable External/HIP/math_h-hip-7.0.2
[60/61] Building CXX object External/HIP/CMakeFiles/TheNextWeek-hip-7.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o
[61/61] Linking CXX executable External/HIP/TheNextWeek-hip-7.0.2
+ build_step 'Testing HIP test-suite'
+ echo '@@@BUILD_STEP Testing HIP test-suite@@@'
+ ninja check-hip-simple
Step 12 (Testing HIP test-suite) failure: Testing HIP test-suite (failure)
@@@BUILD_STEP Testing HIP test-suite@@@
[0/1] cd /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP && /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/llvm/bin/llvm-lit -sv array-hip-7.0.2.test empty-hip-7.0.2.test with-fopenmp-hip-7.0.2.test saxpy-hip-7.0.2.test memmove-hip-7.0.2.test memset-hip-7.0.2.test split-kernel-args-hip-7.0.2.test builtin-logb-scalbn-hip-7.0.2.test TheNextWeek-hip-7.0.2.test algorithm-hip-7.0.2.test cmath-hip-7.0.2.test complex-hip-7.0.2.test math_h-hip-7.0.2.test new-hip-7.0.2.test blender.test
-- Testing: 15 tests, 15 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90
FAIL: test-suite :: External/HIP/blender.test (15 of 15)
******************** TEST 'test-suite :: External/HIP/blender.test' FAILED ********************

/home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.out --redirect-input /dev/null --summary /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.time /bin/bash test_blender.sh
/bin/bash verify_blender.sh /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.out
Begin Blender test.
TEST_SUITE_HIP_ROOT=/opt/botworker/llvm/External/hip
Render /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo_release.blend
Blender 4.1.1 (hash e1743a0317bc built 2024-04-15 23:47:45)
Read blend: "/opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo_release.blend"
Could not open as Ogawa file from provided streams.
Unable to open /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.002", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.003", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.004", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.001", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
Could not open as Ogawa file from provided streams.
Unable to open /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.002", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.003", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.004", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.001", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
I1208 20:56:07.748772 3592721 device.cpp:39] HIPEW initialization succeeded
I1208 20:56:07.752877 3592721 device.cpp:45] Found HIPCC hipcc
I1208 20:56:07.837369 3592721 device.cpp:207] Device has compute preemption or is not used for display.
I1208 20:56:07.837386 3592721 device.cpp:211] Added device "" with id "HIP__0000:83:00".
I1208 20:56:07.837467 3592721 device.cpp:568] Mapped host memory limit set to 1,009,924,165,632 bytes. (940.56G)
I1208 20:56:07.837749 3592721 device_impl.cpp:63] Using AVX2 CPU kernels.
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Eyepiece_rim
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Rivets.008
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Rivets.026
Fra:1 Mem:524.16M (Peak 524.70M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Hoses.003
Fra:1 Mem:532.18M (Peak 532.18M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors
Fra:1 Mem:533.23M (Peak 533.23M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | Cylinder.029
Fra:1 Mem:533.32M (Peak 533.31M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Weapon_thingie
Fra:1 Mem:533.54M (Peak 533.54M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Head_greeble
Fra:1 Mem:534.72M (Peak 534.72M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Head_greeble.005
Fra:1 Mem:534.42M (Peak 534.72M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Head_greeble.010
Fra:1 Mem:534.67M (Peak 534.72M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Wires
Fra:1 Mem:535.53M (Peak 535.53M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Head_plates
Fra:1 Mem:540.75M (Peak 540.75M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Mouth_inside
Fra:1 Mem:541.52M (Peak 541.52M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Pistons.001
Fra:1 Mem:541.75M (Peak 541.75M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_wires
Fra:1 Mem:545.50M (Peak 545.49M) | Time:00:00.49 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | ENV-fog

honeygoyal pushed a commit to honeygoyal/llvm-project that referenced this pull request Dec 9, 2025
…170964)
