Skip to content

Conversation

@kazutakahirata
Copy link
Contributor

This patch does two things:

  • During deserialization, we accept a function name for Frame as an
    alternative to the usual GUID expressed as a hexadecimal number.

  • During serialization, we print a GUID of Frame as a 16-digit
    hexadecimal number prefixed with 0x in the usual way. (Without this
    patch, we print a decimal number, which is not customary.)

The patch uses a machinery called "normalization" in YAML I/O, which
lets us serialize and deserialize into an alternative data structure.
For our use case, we have an alternative Frame data structure, which
is identical to "struct Frame" except that Function is of type
GUIDHex64 instead of GlobalValue::GUID. This alternative type
supports the two bullet points above without modifying "struct Frame"
at all.

This patch does two things:

- During deserialization, we accept a function name for Frame as an
  alternative to the usual GUID expressed as a hexadecimal number.

- During serialization, we print a GUID of Frame as a 16-digit
  hexadecimal number prefixed with 0x in the usual way.  (Without this
  patch, we print a decimal number, which is not customary.)

The patch uses a machinery called "normalization" in YAML I/O, which
lets us serialize and deserialize into an alternative data structure.
For our use case, we have an alternative Frame data structure, which
is identical to "struct Frame" except that Function is of type
GUIDHex64 instead of GlobalValue::GUID.  This alternative type
supports the two bullet points above without modifying "struct Frame"
at all.
@llvmbot llvmbot added the PGO Profile Guided Optimizations label Dec 11, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 11, 2024

@llvm/pr-subscribers-pgo

Author: Kazu Hirata (kazutakahirata)

Changes

This patch does two things:

  • During deserialization, we accept a function name for Frame as an
    alternative to the usual GUID expressed as a hexadecimal number.

  • During serialization, we print a GUID of Frame as a 16-digit
    hexadecimal number prefixed with 0x in the usual way. (Without this
    patch, we print a decimal number, which is not customary.)

The patch uses a machinery called "normalization" in YAML I/O, which
lets us serialize and deserialize into an alternative data structure.
For our use case, we have an alternative Frame data structure, which
is identical to "struct Frame" except that Function is of type
GUIDHex64 instead of GlobalValue::GUID. This alternative type
supports the two bullet points above without modifying "struct Frame"
at all.


Full diff: https://github.com/llvm/llvm-project/pull/119629.diff

4 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/MemProfYAML.h (+29-4)
  • (modified) llvm/test/tools/llvm-profdata/memprof-yaml.test (+8-8)
  • (modified) llvm/tools/llvm-profdata/llvm-profdata.cpp (+3-1)
  • (modified) llvm/unittests/ProfileData/MemProfTest.cpp (+2-2)
diff --git a/llvm/include/llvm/ProfileData/MemProfYAML.h b/llvm/include/llvm/ProfileData/MemProfYAML.h
index 4568385fc6f71f..fa1b7dd4738435 100644
--- a/llvm/include/llvm/ProfileData/MemProfYAML.h
+++ b/llvm/include/llvm/ProfileData/MemProfYAML.h
@@ -52,11 +52,36 @@ template <> struct ScalarTraits<memprof::GUIDHex64> {
 };
 
 template <> struct MappingTraits<memprof::Frame> {
+  // Essentially the same as memprof::Frame except that Function is of type
+  // memprof::GUIDHex64 instead of GlobalValue::GUID.  This class helps in two
+  // ways.  During serialization, we print Function as a 16-digit hexadecimal
+  // number.  During deserialization, we accept a function name as an
+  // alternative to the usual GUID expressed as a hexadecimal number.
+  class FrameWithHex64 {
+  public:
+    FrameWithHex64(IO &) {}
+    FrameWithHex64(IO &, const memprof::Frame &F)
+        : Function(F.Function), LineOffset(F.LineOffset), Column(F.Column),
+          IsInlineFrame(F.IsInlineFrame) {}
+    memprof::Frame denormalize(IO &) {
+      return memprof::Frame(Function, LineOffset, Column, IsInlineFrame);
+    }
+
+    memprof::GUIDHex64 Function = 0;
+    static_assert(std::is_same_v<decltype(Function.value),
+                                 decltype(memprof::Frame::Function)>);
+    decltype(memprof::Frame::LineOffset) LineOffset = 0;
+    decltype(memprof::Frame::Column) Column = 0;
+    decltype(memprof::Frame::IsInlineFrame) IsInlineFrame = false;
+  };
+
   static void mapping(IO &Io, memprof::Frame &F) {
-    Io.mapRequired("Function", F.Function);
-    Io.mapRequired("LineOffset", F.LineOffset);
-    Io.mapRequired("Column", F.Column);
-    Io.mapRequired("IsInlineFrame", F.IsInlineFrame);
+    MappingNormalization<FrameWithHex64, memprof::Frame> Keys(Io, F);
+
+    Io.mapRequired("Function", Keys->Function);
+    Io.mapRequired("LineOffset", Keys->LineOffset);
+    Io.mapRequired("Column", Keys->Column);
+    Io.mapRequired("IsInlineFrame", Keys->IsInlineFrame);
 
     // Assert that the definition of Frame matches what we expect.  The
     // structured bindings below detect changes to the number of fields.
diff --git a/llvm/test/tools/llvm-profdata/memprof-yaml.test b/llvm/test/tools/llvm-profdata/memprof-yaml.test
index 0c040ab3a1f9d9..c6ec7935ec64c8 100644
--- a/llvm/test/tools/llvm-profdata/memprof-yaml.test
+++ b/llvm/test/tools/llvm-profdata/memprof-yaml.test
@@ -11,24 +11,24 @@ HeapProfileRecords:
   - GUID:            0xdeadbeef12345678
     AllocSites:
       - Callstack:
-          - { Function: 100, LineOffset: 11, Column: 10, IsInlineFrame: true }
-          - { Function: 200, LineOffset: 22, Column: 20, IsInlineFrame: false }
+          - { Function: 0x1111111111111111, LineOffset: 11, Column: 10, IsInlineFrame: true }
+          - { Function: 0x2222222222222222, LineOffset: 22, Column: 20, IsInlineFrame: false }
         MemInfoBlock:
           AllocCount:      111
           TotalSize:       222
           TotalLifetime:   333
           TotalLifetimeAccessDensity: 444
       - Callstack:
-          - { Function: 300, LineOffset: 33, Column: 30, IsInlineFrame: false }
-          - { Function: 400, LineOffset: 44, Column: 40, IsInlineFrame: true }
+          - { Function: 0x3333333333333333, LineOffset: 33, Column: 30, IsInlineFrame: false }
+          - { Function: 0x4444444444444444, LineOffset: 44, Column: 40, IsInlineFrame: true }
         MemInfoBlock:
           AllocCount:      555
           TotalSize:       666
           TotalLifetime:   777
           TotalLifetimeAccessDensity: 888
     CallSites:
-      - - { Function: 500, LineOffset: 55, Column: 50, IsInlineFrame: true }
-        - { Function: 600, LineOffset: 66, Column: 60, IsInlineFrame: false }
-      - - { Function: 700, LineOffset: 77, Column: 70, IsInlineFrame: true }
-        - { Function: 800, LineOffset: 88, Column: 80, IsInlineFrame: false }
+      - - { Function: 0x5555555555555555, LineOffset: 55, Column: 50, IsInlineFrame: true }
+        - { Function: 0x6666666666666666, LineOffset: 66, Column: 60, IsInlineFrame: false }
+      - - { Function: 0x7777777777777777, LineOffset: 77, Column: 70, IsInlineFrame: true }
+        - { Function: 0x8888888888888888, LineOffset: 88, Column: 80, IsInlineFrame: false }
 ...
diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index 6b9e2349899a44..ffc481f071857a 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -3307,7 +3307,9 @@ static int showMemProfProfile(ShowFormat SFormat, raw_fd_ostream &OS) {
 
   auto Reader = std::move(ReaderOrErr.get());
   memprof::AllMemProfData Data = Reader->getAllMemProfData();
-  yaml::Output Yout(OS);
+  // Construct yaml::Output with the maximum column width of 80 so that each
+  // Frame fits in one line.
+  yaml::Output Yout(OS, nullptr, 80);
   Yout << Data;
 
   return 0;
diff --git a/llvm/unittests/ProfileData/MemProfTest.cpp b/llvm/unittests/ProfileData/MemProfTest.cpp
index 456b093362b50f..7dac0eb7ca87f2 100644
--- a/llvm/unittests/ProfileData/MemProfTest.cpp
+++ b/llvm/unittests/ProfileData/MemProfTest.cpp
@@ -804,11 +804,11 @@ template <typename T> std::string serializeInYAML(T &Val) {
 }
 
 TEST(MemProf, YAMLWriterFrame) {
-  Frame F(11, 22, 33, true);
+  Frame F(0x0123456789abcdefULL, 22, 33, true);
 
   std::string Out = serializeInYAML(F);
   EXPECT_EQ(Out, R"YAML(---
-{ Function: 11, LineOffset: 22, Column: 33, IsInlineFrame: true }
+{ Function: 0x0123456789abcdef, LineOffset: 22, Column: 33, IsInlineFrame: true }
 ...
 )YAML");
 }

Copy link

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@kazutakahirata kazutakahirata merged commit 9040dd4 into llvm:main Dec 12, 2024
10 checks passed
@kazutakahirata kazutakahirata deleted the memprof_YAML_GUID branch December 12, 2024 01:58
@llvm-ci
Copy link
Collaborator

llvm-ci commented Dec 12, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/10108

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: offloading/schedule.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/schedule.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/schedule.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/schedule.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/schedule.c
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/schedule.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/schedule.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/schedule.c.tmp
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/schedule.c
# .---command stderr------------
# | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/schedule.c:80:12: error: CHECK: expected string not found in input
# |  // CHECK: test no order OK
# |            ^
# | <stdin>:1:1: note: scanning from here
# | Fail to schedule in order.
# | ^
# | <stdin>:2:1: note: possible intended match here
# | test ordered OK
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/schedule.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Fail to schedule in order. 
# | check:80'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             2: test ordered OK 
# | check:80'0     ~~~~~~~~~~~~~~~~
# | check:80'1     ?                possible intended match
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

********************


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PGO Profile Guided Optimizations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants