-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[NVPTX] Implement isTruncateFree and isZExtFree for i32/i64 Optimizations #114683
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
@llvm/pr-subscribers-backend-nvptx Author: None (Quark-69) ChangesSolves #114339 Implemented the isTruncateFree and isZExtFree virtual functions in the NVPTXTargetLowering class. This implementation optimizes truncation from i64 to i32 as a free operation for NVPTX arch. Details:
Testing:
Full diff: https://github.com/llvm/llvm-project/pull/114683.diff 3 Files Affected:
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index d3bf0ecfe2cc92..d208caebbd1151 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -3340,6 +3340,23 @@ bool NVPTXTargetLowering::splitValueIntoRegisterParts(
return false;
}
+bool llvm::NVPTXTargetLowering::isTruncateFree(EVT FromVT, EVT ToVT) const {
+ if (FromVT.isVector() || ToVT.isVector() || !FromVT.isInteger() ||
+ !ToVT.isInteger()) {
+ return false;
+ }
+
+ return FromVT.getSizeInBits() == 64 && ToVT.getSizeInBits() == 32;
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(EVT FromVT, EVT ToVT) const {
+ return false;
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(Type *SrcTy, Type *DstTy) const {
+ return false;
+}
+
// This creates target external symbol for a function parameter.
// Name of the symbol is composed from its index and the function name.
// Negative index corresponds to special parameter (unsized array) used for
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
index c8b589ae39413e..fa73938a35a168 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
@@ -616,6 +616,10 @@ class NVPTXTargetLowering : public TargetLowering {
return true;
}
+ bool isTruncateFree(EVT FromVT, EVT ToVT) const override;
+ bool isZExtFree(EVT FromVT, EVT ToVT) const override;
+ bool isZExtFree(Type *SrcTy, Type *DstTy) const override;
+
private:
const NVPTXSubtarget &STI; // cache the subtarget here
SDValue getParamSymbol(SelectionDAG &DAG, int idx, EVT) const;
diff --git a/llvm/test/CodeGen/NVPTX/truncate_zext.ll b/llvm/test/CodeGen/NVPTX/truncate_zext.ll
new file mode 100644
index 00000000000000..decc02c5840491
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/truncate_zext.ll
@@ -0,0 +1,17 @@
+; RUN: llc -march=nvptx64 < %s | FileCheck %s
+
+; Test for truncation from i64 to i32
+define i32 @test_trunc_i64_to_i32(i64 %val) {
+ ; CHECK-LABEL: test_trunc_i64_to_i32
+ ; CHECK: trunc
+ %trunc = trunc i64 %val to i32
+ ret i32 %trunc
+}
+
+; Test for zero-extension from i32 to i64
+define i64 @test_zext_i32_to_i64(i32 %val) {
+ ; CHECK-LABEL: test_zext_i32_to_i64
+ ; CHECK: zext
+ %zext = zext i32 %val to i64
+ ret i64 %zext
+}
\ No newline at end of file
|
| bool llvm::NVPTXTargetLowering::isTruncateFree(EVT FromVT, EVT ToVT) const { | ||
| if (FromVT.isVector() || ToVT.isVector() || !FromVT.isInteger() || | ||
| !ToVT.isInteger()) { | ||
| return false; | ||
| } | ||
|
|
||
| return FromVT.getSizeInBits() == 64 && ToVT.getSizeInBits() == 32; | ||
| } | ||
|
|
||
| bool llvm::NVPTXTargetLowering::isZExtFree(EVT FromVT, EVT ToVT) const { | ||
| return false; | ||
| } | ||
|
|
||
| bool llvm::NVPTXTargetLowering::isZExtFree(Type *SrcTy, Type *DstTy) const { | ||
| return false; | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, let's move the implementation of these functions to NVPTXISelLowering.h. They are small enough that it should be okay, and it's the existing convention.
| bool isTruncateFree(EVT FromVT, EVT ToVT) const override; | ||
| bool isZExtFree(EVT FromVT, EVT ToVT) const override; | ||
| bool isZExtFree(Type *SrcTy, Type *DstTy) const override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this up to where the existing implementation of isTruncateFree() resides.
| ; CHECK: zext | ||
| %zext = zext i32 %val to i64 | ||
| ret i64 %zext | ||
| } No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a new line at the end of the file.
| @@ -0,0 +1,17 @@ | |||
| ; RUN: llc -march=nvptx64 < %s | FileCheck %s | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test looks like it's checking that the correct PTX is generated for trunc and zext. This is of course nice to have, but not what we need for this PR.
The purpose of the test associated with this PR should be to ensure that your implementation of isTruncateFree() has an impact. Thus, the test should fail without your change and pass with it.
I like to find a place where isTruncateFree(EVT, EVT) is used in a simple peephole and check against that. For example, you could piggyback off of this peephole (Added and tested in 3332b70).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to that. The test should have some code which has to choose between free ops vs an alternative which would be used otherwise (e.g. i32 trunc(add(i64,i64)) vs i32 add(trunc(i64), trunc(i64)))
| bool llvm::NVPTXTargetLowering::isZExtFree(EVT FromVT, EVT ToVT) const { | ||
| return false; | ||
| } | ||
|
|
||
| bool llvm::NVPTXTargetLowering::isZExtFree(Type *SrcTy, Type *DstTy) const { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm following up offline on what our approach for these should be. I'll update this thread later.
|
|
||
| bool llvm::NVPTXTargetLowering::isTruncateFree(EVT FromVT, EVT ToVT) const { | ||
| if (FromVT.isVector() || ToVT.isVector() || !FromVT.isInteger() || | ||
| !ToVT.isInteger()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit. LLVM prefers no braces around single-statement bodies.
https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
Also, my personal nit: I prefer minimizing the number of negations. .. || !(FromVT.isInteger() && ToVT.isInteger()) is, IMO, better at conveying the intent ( as in "we want both to be integers") instead of eliminating unwanted parts piecemeal.
| @@ -0,0 +1,17 @@ | |||
| ; RUN: llc -march=nvptx64 < %s | FileCheck %s | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to use llvm/utils/update_llc_test_checks.py to generate the checks.
| @@ -0,0 +1,17 @@ | |||
| ; RUN: llc -march=nvptx64 < %s | FileCheck %s | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to that. The test should have some code which has to choose between free ops vs an alternative which would be used otherwise (e.g. i32 trunc(add(i64,i64)) vs i32 add(trunc(i64), trunc(i64)))
|
@justinfargnoli, @Artem-B can you specify what checks should I append to my test case. |
|
@justinfargnoli has also pointed you at a the code processing IR which would be affected by your changes, You should create an function which does that (or, find an existing example that may already exist). |
|
Thanks I'll look into it. |
Solves #114339
Implemented the isTruncateFree and isZExtFree virtual functions in the NVPTXTargetLowering class. This implementation optimizes truncation from i64 to i32 as a free operation for NVPTX arch.
Details:
Testing: