-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[SampleProfile] Fix UB in Demangler invocation. #137659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Currently the backing buffer of a std::vector<char> is passed[1] to Demangler.getFunctionBaseName. However, deeply inside the call stack OutputBuffer::grow will call[2] std::realloc if it needs to grow the buffer, leading to UB. The demangler APIs specify[3] that "Buf and N behave like the second and third parameters to __cxa_demangle" and the docs for the latter say[4] that the output buffer must be allocated with malloc (but can also be NULL and will then be realloced accordingly). [1]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744 [2]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50 [3]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93 [4]: https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
|
@llvm/pr-subscribers-pgo @llvm/pr-subscribers-llvm-transforms Author: Krzysztof Pszeniczny (amharc) ChangesCurrently the backing buffer of a The demangler APIs specify3 that " Note: PR #135863 changed this from a stack array to a Full diff: https://github.com/llvm/llvm-project/pull/137659.diff 1 Files Affected:
diff --git a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
index 963c321772d6e..fdb5631f415a0 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
@@ -737,14 +737,13 @@ bool SampleProfileMatcher::functionMatchesProfileHelper(
auto FunctionName = FName.str();
if (Demangler.partialDemangle(FunctionName.c_str()))
return std::string();
- constexpr size_t MaxBaseNameSize = 65536;
- std::vector<char> BaseNameBuf(MaxBaseNameSize, 0);
- size_t BaseNameSize = MaxBaseNameSize;
- char *BaseNamePtr =
- Demangler.getFunctionBaseName(BaseNameBuf.data(), &BaseNameSize);
- return (BaseNamePtr && BaseNameSize)
- ? std::string(BaseNamePtr, BaseNameSize)
- : std::string();
+ size_t BaseNameSize = 0;
+ char *BaseNamePtr = Demangler.getFunctionBaseName(nullptr, &BaseNameSize);
+ std::string Result = (BaseNamePtr && BaseNameSize)
+ ? std::string(BaseNamePtr, BaseNameSize)
+ : std::string();
+ free(BaseNamePtr);
+ return Result;
};
auto IRBaseName = GetBaseName(IRFunc.getName());
auto ProfBaseName = GetBaseName(ProfFunc.stringRef());
|
snehasish
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
wlei-llvm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the fix!
alexfh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the fix!
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/11019 Here is the relevant piece of the build log for the reference |
Currently the backing buffer of a `std::vector<char>` is passed[1] to `Demangler.getFunctionBaseName`. However, deeply inside the call stack `OutputBuffer::grow` will call[2] `std::realloc` if it needs to grow the buffer, leading to UB. The demangler APIs specify[3] that "`Buf` and `N` behave like the second and third parameters to `__cxa_demangle`" and the docs for the latter say[4] that the output buffer must be allocated with `malloc` (but can also be `NULL` and will then be realloced accordingly). Note: PR llvm#135863 changed this from a stack array to a `std::vector` and increased the size to 65K, but this can still lead to a crash if the demangled name is longer than that - yes, I'm surprised that a >65K-long function name happens in practice... [1]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744 [2]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50 [3]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93 [4]: https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
Currently the backing buffer of a `std::vector<char>` is passed[1] to `Demangler.getFunctionBaseName`. However, deeply inside the call stack `OutputBuffer::grow` will call[2] `std::realloc` if it needs to grow the buffer, leading to UB. The demangler APIs specify[3] that "`Buf` and `N` behave like the second and third parameters to `__cxa_demangle`" and the docs for the latter say[4] that the output buffer must be allocated with `malloc` (but can also be `NULL` and will then be realloced accordingly). Note: PR llvm#135863 changed this from a stack array to a `std::vector` and increased the size to 65K, but this can still lead to a crash if the demangled name is longer than that - yes, I'm surprised that a >65K-long function name happens in practice... [1]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L744 [2]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Utility.h#L50 [3]: https://github.com/llvm/llvm-project/blob/d7e631c7cd6d9c13b9519991ec6becf08bc6b8aa/llvm/include/llvm/Demangle/Demangle.h#L92-L93 [4]: https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html
Currently the backing buffer of a
std::vector<char>is passed1 toDemangler.getFunctionBaseName. However, deeply inside the call stackOutputBuffer::growwill call2std::reallocif it needs to grow the buffer, leading to UB.The demangler APIs specify3 that "
BufandNbehave like the second and third parameters to__cxa_demangle" and the docs for the latter say4 that the output buffer must be allocated withmalloc(but can also beNULLand will then be realloced accordingly).Note: PR #135863 changed this from a stack array to a
std::vectorand increased the size to 65K, but this can still lead to a crash if the demangled name is longer than that - yes, I'm surprised that a >65K-long function name happens in practice...