-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[mlir][gpu][nvptx] Remove null terminator when outputting PTX #133019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-mlir-gpu @llvm/pr-subscribers-mlir-llvm Author: None (modiking) ChangesPTX source files are expected to only contain ASCII text (https://docs.nvidia.com/cuda/parallel-thread-execution/#source-format).
Full diff: https://github.com/llvm/llvm-project/pull/133019.diff 2 Files Affected:
diff --git a/mlir/lib/Target/LLVM/NVVM/Target.cpp b/mlir/lib/Target/LLVM/NVVM/Target.cpp
index cccb2c276ae37..ebdc993a93caf 100644
--- a/mlir/lib/Target/LLVM/NVVM/Target.cpp
+++ b/mlir/lib/Target/LLVM/NVVM/Target.cpp
@@ -722,11 +722,8 @@ NVPTXSerializer::moduleToObject(llvm::Module &llvmModule) {
// Return PTX if the compilation target is `assembly`.
if (targetOptions.getCompilationTarget() ==
- gpu::CompilationTarget::Assembly) {
- // Make sure to include the null terminator.
- StringRef bin(serializedISA->c_str(), serializedISA->size() + 1);
- return SmallVector<char, 0>(bin.begin(), bin.end());
- }
+ gpu::CompilationTarget::Assembly)
+ return SmallVector<char, 0>(serializedISA->begin(), serializedISA->end());
std::optional<SmallVector<char, 0>> result;
moduleToObjectTimer.startTimer();
diff --git a/mlir/unittests/Target/LLVM/SerializeNVVMTarget.cpp b/mlir/unittests/Target/LLVM/SerializeNVVMTarget.cpp
index eabfd1c4d32eb..1231019e5dadf 100644
--- a/mlir/unittests/Target/LLVM/SerializeNVVMTarget.cpp
+++ b/mlir/unittests/Target/LLVM/SerializeNVVMTarget.cpp
@@ -129,7 +129,10 @@ TEST_F(MLIRTargetLLVMNVVM, SKIP_WITHOUT_NVPTX(SerializeNVVMToPTX)) {
ASSERT_TRUE(!object->empty());
ASSERT_TRUE(
- StringRef(object->data(), object->size()).contains("nvvm_kernel"));
+ StringRef(object->data(), object->size()).contains("nvvm_kernel"));
+ ASSERT_TRUE(
+ StringRef(object->data(), object->size()).count('\0') == 0);
+
}
}
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
|
Looking around I couldn't find if there was a convention to what should be outputted by |
They don't have it because there's no JIT path for ROCm.
Yes, that was added for the ptx-JIT execution path, see the driver function https://github.com/llvm/llvm-project/blob/main/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp#L143-L144
Also, FWIW |
|
If this is about Here is how we prepare the input to ptxas: llvm-project/mlir/lib/Target/LLVM/NVVM/Target.cpp Lines 390 to 403 in 1c9fe8c
It's not clear to me why there would be the extra null character in the file though? |
That's good to know, thanks! I'm thinking then if this is specific to the JIT execution path it may be better to add the null there instead of to the ptx under
That's true 😅. Nevertheless ptxas team does want to exclude '\0' as well so a way to do that would still be great. I'll make sure the documentation gets updated to reflect this as well.
On that path we're in llvm-project/mlir/lib/Target/LLVM/NVVM/Target.cpp Lines 724 to 739 in 1c9fe8c
On the specific path that's causing issues, |
I think it's cheaper to just not print the terminator, rather than having to copy the string to add one for the JIT path, you can do that with: fileStream << StringRef(ptxData).drop_back();
// Or
fileStream << StringRef(ptxData).consume_back('\0');See Mehdi's comment as well. But I'm ok with either option. |
I don't follow what you're describing. |
|
I dug a bit more to see how things are fitting. The When the output is embedded in the IR, we will always add a null terminator since we store it as a However we currently have also a null terminator within the string: See the So this change LGTM. |
joker-eph
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but can you rebase to retrigger CI?
Thanks for taking a look! I'm double checking that the integration tests in JIT mode pass with the change and I'm noticing that: Checking out the dumped object file I'm seeing that when the assembly is embedded into the JIT program it doesn't automatically get a null character at the end: So when it gets embedded in the binary we don't get a null character. Fortunately when this gets embedded as a |
abdafae to
c2dad87
Compare
joker-eph
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Thanks for the reviews! |
PTX source files are expected to only contain ASCII text (https://docs.nvidia.com/cuda/parallel-thread-execution/#source-format).
ptxashas so far not enforced this but is moving towards doing so. This revealed a problem where the null terminator is getting printed out in the output file in MLIR path when outputting ptx directly. Remove the extraneous null.