Skip to content

Conversation

@kazutakahirata
Copy link
Contributor

This patch does two things:

  • During deserialization, we accept a function name as an alternative
    to the usual GUID represented as a hexadecimal number.

  • During serialization, we print a GUID as a 16-digit hexadecimal
    number prefixed with 0x in the usual way. (Without this patch, we
    print a decimal number, which is not customary.)

In YAML, the MemProf profile is a vector of pairs of GUID and
MemProfRecord. This patch accepts a function name for the GUID, but
it does not accept a function name for the GUID used in Frames yet.
That will be addressed in a subsequent patch.

This patch does two things:

- During deserialization, we accept a function name as an alternative
  to the usual GUID represented as a hexadecimal number.

- During serialization, we print a GUID as a 16-digit hexadecimal
  number prefixed with 0x in the usual way.  (Without this patch, we
  print a decimal number, which is not customary.)

In YAML, the MemProf profile is a vector of pairs of GUID and
MemProfRecord.  This patch accepts a function name for the GUID, but
it does not accept a function name for the GUID used in Frames yet.
That will be addressed in a subsequent patch.
@llvmbot llvmbot added the PGO Profile Guided Optimizations label Dec 10, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 10, 2024

@llvm/pr-subscribers-pgo

Author: Kazu Hirata (kazutakahirata)

Changes

This patch does two things:

  • During deserialization, we accept a function name as an alternative
    to the usual GUID represented as a hexadecimal number.

  • During serialization, we print a GUID as a 16-digit hexadecimal
    number prefixed with 0x in the usual way. (Without this patch, we
    print a decimal number, which is not customary.)

In YAML, the MemProf profile is a vector of pairs of GUID and
MemProfRecord. This patch accepts a function name for the GUID, but
it does not accept a function name for the GUID used in Frames yet.
That will be addressed in a subsequent patch.


Full diff: https://github.com/llvm/llvm-project/pull/119453.diff

3 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/MemProf.h (+27-1)
  • (modified) llvm/test/tools/llvm-profdata/memprof-yaml.test (+1-1)
  • (modified) llvm/unittests/ProfileData/MemProfTest.cpp (+38)
diff --git a/llvm/include/llvm/ProfileData/MemProf.h b/llvm/include/llvm/ProfileData/MemProf.h
index 8a85196e57044d..a4d8b5479061d5 100644
--- a/llvm/include/llvm/ProfileData/MemProf.h
+++ b/llvm/include/llvm/ProfileData/MemProf.h
@@ -11,6 +11,7 @@
 #include "llvm/Support/BLAKE3.h"
 #include "llvm/Support/Endian.h"
 #include "llvm/Support/EndianStream.h"
+#include "llvm/Support/Format.h"
 #include "llvm/Support/HashBuilder.h"
 #include "llvm/Support/YAMLTraits.h"
 #include "llvm/Support/raw_ostream.h"
@@ -491,11 +492,15 @@ struct MemProfRecord {
   }
 };
 
+// A "typedef" for GUID.  See ScalarTraits<memprof::GUIDHex64> for how a GUID is
+// serialized and deserialized in YAML.
+LLVM_YAML_STRONG_TYPEDEF(uint64_t, GUIDHex64)
+
 // Helper struct for AllMemProfData.  In YAML, we treat the GUID and the fields
 // within MemProfRecord at the same level as if the GUID were part of
 // MemProfRecord.
 struct GUIDMemProfRecordPair {
-  GlobalValue::GUID GUID;
+  GUIDHex64 GUID;
   MemProfRecord Record;
 };
 
@@ -1166,6 +1171,27 @@ template <typename FrameIdTy> class CallStackRadixTreeBuilder {
 } // namespace memprof
 
 namespace yaml {
+template <> struct ScalarTraits<memprof::GUIDHex64> {
+  static void output(const memprof::GUIDHex64 &Val, void *, raw_ostream &Out) {
+    // Print GUID as a 16-digit hexadecimal number.
+    Out << format("0x%016" PRIx64, (uint64_t)Val);
+  }
+  static StringRef input(StringRef Scalar, void *, memprof::GUIDHex64 &Val) {
+    uint64_t Num;
+    if (Scalar.starts_with_insensitive("0x")) {
+      // Accept hexadecimal numbers starting with 0x or 0X.
+      if (Scalar.getAsInteger(0, Num))
+        return "invalid hex64 number";
+      Val = Num;
+    } else {
+      // Otherwise, treat the input as a string containing a function name.
+      Val = memprof::IndexedMemProfRecord::getGUID(Scalar);
+    }
+    return StringRef();
+  }
+  static QuotingType mustQuote(StringRef) { return QuotingType::None; }
+};
+
 template <> struct MappingTraits<memprof::Frame> {
   static void mapping(IO &Io, memprof::Frame &F) {
     Io.mapRequired("Function", F.Function);
diff --git a/llvm/test/tools/llvm-profdata/memprof-yaml.test b/llvm/test/tools/llvm-profdata/memprof-yaml.test
index fbbf8a6e7e5b52..0c040ab3a1f9d9 100644
--- a/llvm/test/tools/llvm-profdata/memprof-yaml.test
+++ b/llvm/test/tools/llvm-profdata/memprof-yaml.test
@@ -8,7 +8,7 @@
 ;--- memprof-in.yaml
 ---
 HeapProfileRecords:
-  - GUID:            16045690981402826360
+  - GUID:            0xdeadbeef12345678
     AllocSites:
       - Callstack:
           - { Function: 100, LineOffset: 11, Column: 10, IsInlineFrame: true }
diff --git a/llvm/unittests/ProfileData/MemProfTest.cpp b/llvm/unittests/ProfileData/MemProfTest.cpp
index 093b3a26b5f872..bd9a2d0c5d6292 100644
--- a/llvm/unittests/ProfileData/MemProfTest.cpp
+++ b/llvm/unittests/ProfileData/MemProfTest.cpp
@@ -35,6 +35,7 @@ using ::llvm::StringRef;
 using ::llvm::object::SectionedAddress;
 using ::llvm::symbolize::SymbolizableModule;
 using ::testing::ElementsAre;
+using ::testing::IsEmpty;
 using ::testing::Pair;
 using ::testing::Return;
 using ::testing::SizeIs;
@@ -760,6 +761,43 @@ TEST(MemProf, YAMLParser) {
               ElementsAre(hashCallStack(CS3), hashCallStack(CS4)));
 }
 
+// Verify that the YAML parser accepts a GUID expressed as a function name.
+TEST(MemProf, YAMLParserGUID) {
+  StringRef YAMLData = R"YAML(
+---
+HeapProfileRecords:
+- GUID: _Z3fooi
+  AllocSites:
+  - Callstack:
+    - {Function: 0x100, LineOffset: 11, Column: 10, IsInlineFrame: true}
+    MemInfoBlock: {}
+  CallSites: []
+)YAML";
+
+  YAMLMemProfReader YAMLReader;
+  YAMLReader.parse(YAMLData);
+  IndexedMemProfData MemProfData = YAMLReader.takeMemProfData();
+
+  Frame F1(0x100, 11, 10, true);
+
+  llvm::SmallVector<FrameId> CS1 = {F1.hash()};
+
+  // Verify the entire contents of MemProfData.Frames.
+  EXPECT_THAT(MemProfData.Frames, UnorderedElementsAre(Pair(F1.hash(), F1)));
+
+  // Verify the entire contents of MemProfData.Frames.
+  EXPECT_THAT(MemProfData.CallStacks,
+              UnorderedElementsAre(Pair(hashCallStack(CS1), CS1)));
+
+  // Verify the entire contents of MemProfData.Records.
+  ASSERT_THAT(MemProfData.Records, SizeIs(1));
+  const auto &[GUID, Record] = *MemProfData.Records.begin();
+  EXPECT_EQ(GUID, IndexedMemProfRecord::getGUID("_Z3fooi"));
+  ASSERT_THAT(Record.AllocSites, SizeIs(1));
+  EXPECT_EQ(Record.AllocSites[0].CSId, hashCallStack(CS1));
+  EXPECT_THAT(Record.CallSiteIds, IsEmpty());
+}
+
 template <typename T> std::string serializeInYAML(T &Val) {
   std::string Out;
   llvm::raw_string_ostream OS(Out);

Copy link

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with some minor comments

@kazutakahirata kazutakahirata merged commit 76b4931 into llvm:main Dec 11, 2024
8 checks passed
@kazutakahirata kazutakahirata deleted the memprof_YAML_GUID branch December 11, 2024 05:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PGO Profile Guided Optimizations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants