Skip to content

Conversation

@amharc
Copy link
Contributor

@amharc amharc commented Apr 28, 2025

Currently the backing buffer of a std::vector<char> is passed1 to Demangler.getFunctionBaseName. However, deeply inside the call stack OutputBuffer::grow will call2 std::realloc if it needs to grow the buffer, leading to UB.

The demangler APIs specify3 that "Buf and N behave like the second and third parameters to __cxa_demangle" and the docs for the latter say4 that the output buffer must be allocated with malloc (but can also be NULL and will then be realloced accordingly).

Note: PR #135863 changed this from a stack array to a std::vector and increased the size to 65K, but this can still lead to a crash if the demangled name is longer than that - yes, I'm surprised that a >65K-long function name happens in practice...

Currently the backing buffer of a std::vector<char> is passed[1] to
Demangler.getFunctionBaseName. However, deeply inside the call stack
OutputBuffer::grow will call[2] std::realloc if it needs to grow the
buffer, leading to UB.

The demangler APIs specify[3] that "Buf and N behave like the second
and third parameters to __cxa_demangle" and the docs for the latter
say[4] that the output buffer must be allocated with malloc (but can
also be NULL and will then be realloced accordingly).

[1]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744
[2]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50
[3]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93
[4]: https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
@llvmbot llvmbot added PGO Profile Guided Optimizations llvm:transforms labels Apr 28, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 28, 2025

@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-transforms

Author: Krzysztof Pszeniczny (amharc)

Changes

Currently the backing buffer of a std::vector&lt;char&gt; is passed1 to Demangler.getFunctionBaseName. However, deeply inside the call stack OutputBuffer::grow will call2 std::realloc if it needs to grow the buffer, leading to UB.

The demangler APIs specify3 that "Buf and N behave like the second and third parameters to __cxa_demangle" and the docs for the latter say4 that the output buffer must be allocated with malloc (but can also be NULL and will then be realloced accordingly).

Note: PR #135863 changed this from a stack array to a std::vector and increased the size to 65K, but this can still lead to a crash if the demangled name is longer than that - yes, I'm surprised that a >65K-long function name happens in practice...


Full diff: https://github.com/llvm/llvm-project/pull/137659.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp (+7-8)
diff --git a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
index 963c321772d6e..fdb5631f415a0 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
@@ -737,14 +737,13 @@ bool SampleProfileMatcher::functionMatchesProfileHelper(
     auto FunctionName = FName.str();
     if (Demangler.partialDemangle(FunctionName.c_str()))
       return std::string();
-    constexpr size_t MaxBaseNameSize = 65536;
-    std::vector<char> BaseNameBuf(MaxBaseNameSize, 0);
-    size_t BaseNameSize = MaxBaseNameSize;
-    char *BaseNamePtr =
-        Demangler.getFunctionBaseName(BaseNameBuf.data(), &BaseNameSize);
-    return (BaseNamePtr && BaseNameSize)
-               ? std::string(BaseNamePtr, BaseNameSize)
-               : std::string();
+    size_t BaseNameSize = 0;
+    char *BaseNamePtr = Demangler.getFunctionBaseName(nullptr, &BaseNameSize);
+    std::string Result = (BaseNamePtr && BaseNameSize)
+                             ? std::string(BaseNamePtr, BaseNameSize)
+                             : std::string();
+    free(BaseNamePtr);
+    return Result;
   };
   auto IRBaseName = GetBaseName(IRFunc.getName());
   auto ProfBaseName = GetBaseName(ProfFunc.stringRef());

Copy link

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@amharc amharc requested a review from wlei-llvm April 28, 2025 16:35
Copy link
Contributor

@wlei-llvm wlei-llvm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the fix!

Copy link
Contributor

@alexfh alexfh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix!

@amharc amharc merged commit acaf403 into llvm:main Apr 28, 2025
7 of 11 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 28, 2025

LLVM Buildbot has detected a new failure on builder openmp-s390x-linux running on systemz-1 while building llvm at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/11019

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libomp :: tasking/issue-94260-2.c' FAILED ********************
Exit Code: -11

Command Output (stdout):
--
# RUN: at line 1
/home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp   -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic && /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11

--

********************


IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Currently the backing buffer of a `std::vector<char>` is passed[1] to
`Demangler.getFunctionBaseName`. However, deeply inside the call stack
`OutputBuffer::grow` will call[2] `std::realloc` if it needs to grow the
buffer, leading to UB.

The demangler APIs specify[3] that "`Buf` and `N` behave like the second
and third parameters to `__cxa_demangle`" and the docs for the latter
say[4] that the output buffer must be allocated with `malloc` (but can
also be `NULL` and will then be realloced accordingly).

Note: PR llvm#135863 changed this from a stack array to a `std::vector` and
increased the size to 65K, but this can still lead to a crash if the
demangled name is longer than that - yes, I'm surprised that a >65K-long
function name happens in practice...

[1]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744
[2]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50
[3]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93
[4]:
https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Currently the backing buffer of a `std::vector<char>` is passed[1] to
`Demangler.getFunctionBaseName`. However, deeply inside the call stack
`OutputBuffer::grow` will call[2] `std::realloc` if it needs to grow the
buffer, leading to UB.

The demangler APIs specify[3] that "`Buf` and `N` behave like the second
and third parameters to `__cxa_demangle`" and the docs for the latter
say[4] that the output buffer must be allocated with `malloc` (but can
also be `NULL` and will then be realloced accordingly).

Note: PR llvm#135863 changed this from a stack array to a `std::vector` and
increased the size to 65K, but this can still lead to a crash if the
demangled name is longer than that - yes, I'm surprised that a >65K-long
function name happens in practice...

[1]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744
[2]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50
[3]:
https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93
[4]:
https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:transforms PGO Profile Guided Optimizations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants