Skip to content

Conversation

@wlei-llvm
Copy link
Contributor

@wlei-llvm wlei-llvm commented Apr 15, 2025

The function base name could be way long which overflows and leads to a crash. Update to extend the max size.

Also changed to use heap allocation( std::vector<char> ) to avoid stack overflow.

@wlei-llvm wlei-llvm requested a review from WenleiHe April 15, 2025 21:43
@llvmbot llvmbot added PGO Profile Guided Optimizations llvm:transforms labels Apr 15, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-pgo

Author: Lei Wang (wlei-llvm)

Changes

The function base name could be way long which leads to a crash. Update to extend the max size.

Also changed to use dynamic allocation( std::vector&lt;char&gt; ) to avoid stack overflow.


Full diff: https://github.com/llvm/llvm-project/pull/135863.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp (+3-3)
diff --git a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
index d6d1b7c51d4c0..963c321772d6e 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
@@ -737,11 +737,11 @@ bool SampleProfileMatcher::functionMatchesProfileHelper(
     auto FunctionName = FName.str();
     if (Demangler.partialDemangle(FunctionName.c_str()))
       return std::string();
-    constexpr size_t MaxBaseNameSize = 4096;
-    char BaseNameBuf[MaxBaseNameSize] = {};
+    constexpr size_t MaxBaseNameSize = 65536;
+    std::vector<char> BaseNameBuf(MaxBaseNameSize, 0);
     size_t BaseNameSize = MaxBaseNameSize;
     char *BaseNamePtr =
-        Demangler.getFunctionBaseName(BaseNameBuf, &BaseNameSize);
+        Demangler.getFunctionBaseName(BaseNameBuf.data(), &BaseNameSize);
     return (BaseNamePtr && BaseNameSize)
                ? std::string(BaseNamePtr, BaseNameSize)
                : std::string();

@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Lei Wang (wlei-llvm)

Changes

The function base name could be way long which leads to a crash. Update to extend the max size.

Also changed to use dynamic allocation( std::vector&lt;char&gt; ) to avoid stack overflow.


Full diff: https://github.com/llvm/llvm-project/pull/135863.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp (+3-3)
diff --git a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
index d6d1b7c51d4c0..963c321772d6e 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
@@ -737,11 +737,11 @@ bool SampleProfileMatcher::functionMatchesProfileHelper(
     auto FunctionName = FName.str();
     if (Demangler.partialDemangle(FunctionName.c_str()))
       return std::string();
-    constexpr size_t MaxBaseNameSize = 4096;
-    char BaseNameBuf[MaxBaseNameSize] = {};
+    constexpr size_t MaxBaseNameSize = 65536;
+    std::vector<char> BaseNameBuf(MaxBaseNameSize, 0);
     size_t BaseNameSize = MaxBaseNameSize;
     char *BaseNamePtr =
-        Demangler.getFunctionBaseName(BaseNameBuf, &BaseNameSize);
+        Demangler.getFunctionBaseName(BaseNameBuf.data(), &BaseNameSize);
     return (BaseNamePtr && BaseNameSize)
                ? std::string(BaseNamePtr, BaseNameSize)
                : std::string();

@HighW4y2H3ll HighW4y2H3ll self-requested a review April 15, 2025 22:20
@wlei-llvm wlei-llvm merged commit 80855eb into llvm:main Apr 16, 2025
9 of 11 checks passed
amharc added a commit that referenced this pull request Apr 28, 2025
Currently the backing buffer of a `std::vector<char>` is passed[1] to
`Demangler.getFunctionBaseName`. However, deeply inside the call stack
`OutputBuffer::grow` will call[2] `std::realloc` if it needs to grow the
buffer, leading to UB.

The demangler APIs specify[3] that "`Buf` and `N` behave like the second
and third parameters to `__cxa_demangle`" and the docs for the latter
say[4] that the output buffer must be allocated with `malloc` (but can
also be `NULL` and will then be realloced accordingly).

Note: PR #135863 changed this from a stack array to a `std::vector` and
increased the size to 65K, but this can still lead to a crash if the
demangled name is longer than that - yes, I'm surprised that a >65K-long
function name happens in practice...

[1]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744
[2]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50
[3]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93
[4]:
https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Currently the backing buffer of a `std::vector<char>` is passed[1] to
`Demangler.getFunctionBaseName`. However, deeply inside the call stack
`OutputBuffer::grow` will call[2] `std::realloc` if it needs to grow the
buffer, leading to UB.

The demangler APIs specify[3] that "`Buf` and `N` behave like the second
and third parameters to `__cxa_demangle`" and the docs for the latter
say[4] that the output buffer must be allocated with `malloc` (but can
also be `NULL` and will then be realloced accordingly).

Note: PR llvm#135863 changed this from a stack array to a `std::vector` and
increased the size to 65K, but this can still lead to a crash if the
demangled name is longer than that - yes, I'm surprised that a >65K-long
function name happens in practice...

[1]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744
[2]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50
[3]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93
[4]:
https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Currently the backing buffer of a `std::vector<char>` is passed[1] to
`Demangler.getFunctionBaseName`. However, deeply inside the call stack
`OutputBuffer::grow` will call[2] `std::realloc` if it needs to grow the
buffer, leading to UB.

The demangler APIs specify[3] that "`Buf` and `N` behave like the second
and third parameters to `__cxa_demangle`" and the docs for the latter
say[4] that the output buffer must be allocated with `malloc` (but can
also be `NULL` and will then be realloced accordingly).

Note: PR llvm#135863 changed this from a stack array to a `std::vector` and
increased the size to 65K, but this can still lead to a crash if the
demangled name is longer than that - yes, I'm surprised that a >65K-long
function name happens in practice...

[1]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744
[2]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50
[3]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93
[4]:
https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:transforms PGO Profile Guided Optimizations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants