[NVPTX] Implement isTruncateFree and isZExtFree for i32/i64 Optimizations #114683

SergeantCooper · 2024-11-02T20:52:48Z

Implemented the isTruncateFree and isZExtFree virtual functions in the NVPTXTargetLowering class. This implementation treats truncation from i64 to i32 as a free operation on the NVPTX architecture.

Details:

The isTruncateFree function returns true for truncation from i64 to i32.
The isZExtFree function is implemented to return false for zero-extension from i32 to i64.
These changes improve the efficiency of code generation by leveraging hardware capabilities.

Testing:

Added lit tests to validate the functionality of these optimizations.

github-actions · 2024-11-02T20:53:13Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2024-11-02T20:53:46Z

@llvm/pr-subscribers-backend-nvptx

Author: None (Quark-69)

Changes

Solves #114339

Implemented the isTruncateFree and isZExtFree virtual functions in the NVPTXTargetLowering class. This implementation treats truncation from i64 to i32 as a free operation on the NVPTX architecture.

Details:

The isTruncateFree function returns true for truncation from i64 to i32.
The isZExtFree function is implemented to return false for zero-extension from i32 to i64.
These changes improve the efficiency of code generation by leveraging hardware capabilities.

Testing:

Added lit tests to validate the functionality of these optimizations.

Full diff: https://github.com/llvm/llvm-project/pull/114683.diff

3 Files Affected:

(modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp (+17)
(modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.h (+4)
(added) llvm/test/CodeGen/NVPTX/truncate_zext.ll (+17)

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index d3bf0ecfe2cc92..d208caebbd1151 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -3340,6 +3340,23 @@ bool NVPTXTargetLowering::splitValueIntoRegisterParts(
   return false;
 }
 
+bool llvm::NVPTXTargetLowering::isTruncateFree(EVT FromVT, EVT ToVT) const {
+  if (FromVT.isVector() || ToVT.isVector() || !FromVT.isInteger() ||
+      !ToVT.isInteger()) {
+    return false;
+  }
+
+  return FromVT.getSizeInBits() == 64 && ToVT.getSizeInBits() == 32;
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(EVT FromVT, EVT ToVT) const {
+  return false;
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(Type *SrcTy, Type *DstTy) const {
+  return false;
+}
+
 // This creates target external symbol for a function parameter.
 // Name of the symbol is composed from its index and the function name.
 // Negative index corresponds to special parameter (unsized array) used for
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
index c8b589ae39413e..fa73938a35a168 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
@@ -616,6 +616,10 @@ class NVPTXTargetLowering : public TargetLowering {
     return true;
   }
 
+  bool isTruncateFree(EVT FromVT, EVT ToVT) const override;
+  bool isZExtFree(EVT FromVT, EVT ToVT) const override;
+  bool isZExtFree(Type *SrcTy, Type *DstTy) const override;
+
 private:
   const NVPTXSubtarget &STI; // cache the subtarget here
   SDValue getParamSymbol(SelectionDAG &DAG, int idx, EVT) const;
diff --git a/llvm/test/CodeGen/NVPTX/truncate_zext.ll b/llvm/test/CodeGen/NVPTX/truncate_zext.ll
new file mode 100644
index 00000000000000..decc02c5840491
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/truncate_zext.ll
@@ -0,0 +1,17 @@
+; RUN: llc -march=nvptx64 < %s | FileCheck %s
+
+; Test for truncation from i64 to i32
+define i32 @test_trunc_i64_to_i32(i64 %val) {
+  ; CHECK-LABEL: test_trunc_i64_to_i32
+  ; CHECK: trunc
+  %trunc = trunc i64 %val to i32
+  ret i32 %trunc
+}
+
+; Test for zero-extension from i32 to i64
+define i64 @test_zext_i32_to_i64(i32 %val) {
+  ; CHECK-LABEL: test_zext_i32_to_i64
+  ; CHECK: zext
+  %zext = zext i32 %val to i64
+  ret i64 %zext
+}
\ No newline at end of file

justinfargnoli · 2024-11-04T18:34:08Z

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp

+bool llvm::NVPTXTargetLowering::isTruncateFree(EVT FromVT, EVT ToVT) const {
+  if (FromVT.isVector() || ToVT.isVector() || !FromVT.isInteger() ||
+      !ToVT.isInteger()) {
+    return false;
+  }
+
+  return FromVT.getSizeInBits() == 64 && ToVT.getSizeInBits() == 32;
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(EVT FromVT, EVT ToVT) const {
+  return false;
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(Type *SrcTy, Type *DstTy) const {
+  return false;
+}
+


First, let's move the implementation of these functions to NVPTXISelLowering.h. They are small enough that it should be okay, and it's the existing convention.
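
For illustration, hooks defined inline in the class body might look roughly like the sketch below. It is standalone and does not use LLVM: ToyVT, ToyLoweringBase and ToyNVPTXLowering are hypothetical stand-ins for EVT, TargetLowering and NVPTXTargetLowering, and the predicate only mirrors the logic in the diff above.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::EVT: just enough to express the predicate.
struct ToyVT {
  unsigned Bits;
  bool Integer;
  bool Vector;
  constexpr bool isInteger() const { return Integer; }
  constexpr bool isVector() const { return Vector; }
  constexpr unsigned getSizeInBits() const { return Bits; }
};

// Hypothetical stand-in for the TargetLowering base class.
struct ToyLoweringBase {
  virtual ~ToyLoweringBase() = default;
  virtual bool isTruncateFree(ToyVT, ToyVT) const { return false; }
  virtual bool isZExtFree(ToyVT, ToyVT) const { return false; }
};

// Small overrides written directly in the class body instead of a .cpp file.
struct ToyNVPTXLowering : ToyLoweringBase {
  bool isTruncateFree(ToyVT FromVT, ToyVT ToVT) const override {
    if (FromVT.isVector() || ToVT.isVector() ||
        !(FromVT.isInteger() && ToVT.isInteger()))
      return false;
    return FromVT.getSizeInBits() == 64 && ToVT.getSizeInBits() == 32;
  }
  bool isZExtFree(ToyVT, ToyVT) const override { return false; }
};

// Usage: query the hook through the base class, as a combiner would.
inline void exampleQuery() {
  ToyNVPTXLowering TL;
  const ToyLoweringBase &Base = TL;
  assert(Base.isTruncateFree(ToyVT{64, true, false}, ToyVT{32, true, false}));
}
```

Keeping the bodies this small is what makes an in-header definition reasonable; anything longer would normally stay in the .cpp file.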

justinfargnoli · 2024-11-04T18:34:48Z

llvm/lib/Target/NVPTX/NVPTXISelLowering.h

+  bool isTruncateFree(EVT FromVT, EVT ToVT) const override;
+  bool isZExtFree(EVT FromVT, EVT ToVT) const override;
+  bool isZExtFree(Type *SrcTy, Type *DstTy) const override;


Let's move this up to where the existing implementation of isTruncateFree() resides.

justinfargnoli · 2024-11-04T18:35:15Z

llvm/test/CodeGen/NVPTX/truncate_zext.ll

+  ; CHECK: zext
+  %zext = zext i32 %val to i64
+  ret i64 %zext
+}


Add a newline at the end of the file.

justinfargnoli · 2024-11-04T18:44:51Z

llvm/test/CodeGen/NVPTX/truncate_zext.ll

@@ -0,0 +1,17 @@
+; RUN: llc -march=nvptx64 < %s | FileCheck %s
+


This test looks like it's checking that the correct PTX is generated for trunc and zext. This is of course nice to have, but not what we need for this PR.

The purpose of the test associated with this PR should be to ensure that your implementation of isTruncateFree() has an impact. Thus, the test should fail without your change and pass with it.

I like to find a place where isTruncateFree(EVT, EVT) is used in a simple peephole and check against that. For example, you could piggyback off of this peephole (Added and tested in 3332b70).

justinfargnoli · 2024-11-04T18:51:05Z

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp

+bool llvm::NVPTXTargetLowering::isZExtFree(EVT FromVT, EVT ToVT) const {
+  return false;
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(Type *SrcTy, Type *DstTy) const {
+  return false;
+}


I'm following up offline on what our approach for these should be. I'll update this thread later.

Artem-B · 2024-11-04T19:15:39Z

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp


+bool llvm::NVPTXTargetLowering::isTruncateFree(EVT FromVT, EVT ToVT) const {
+  if (FromVT.isVector() || ToVT.isVector() || !FromVT.isInteger() ||
+      !ToVT.isInteger()) {


Style nit. LLVM prefers no braces around single-statement bodies.
https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

Also, my personal nit: I prefer minimizing the number of negations. ... || !(FromVT.isInteger() && ToVT.isInteger()) is, IMO, better at conveying the intent (as in "we want both to be integers") than eliminating unwanted parts piecemeal.
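
The two spellings of the guard are logically identical by De Morgan's law, so the suggestion is purely about readability. A minimal standalone sketch that checks this over every input combination follows; the function names are hypothetical and plain bools stand in for the EVT queries.

```cpp
#include <cassert>

// Guard as written in the PR: each unwanted case is rejected separately.
constexpr bool rejectPiecemeal(bool FromVec, bool ToVec, bool FromInt,
                               bool ToInt) {
  return FromVec || ToVec || !FromInt || !ToInt;
}

// Guard as suggested in the review: one negation around "both are integers".
constexpr bool rejectGrouped(bool FromVec, bool ToVec, bool FromInt,
                             bool ToInt) {
  return FromVec || ToVec || !(FromInt && ToInt);
}

// Compare the two forms over all 16 combinations of the four inputs.
constexpr bool guardsAgree() {
  for (unsigned Mask = 0; Mask < 16; ++Mask) {
    bool A = Mask & 1, B = Mask & 2, C = Mask & 4, D = Mask & 8;
    if (rejectPiecemeal(A, B, C, D) != rejectGrouped(A, B, C, D))
      return false;
  }
  return true;
}

// Compile-time check (needs C++14 or later for the loop in constexpr).
static_assert(guardsAgree(), "the two guards are equivalent");

// The same check at run time.
inline void checkGuardsAtRuntime() { assert(guardsAgree()); }
```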

Artem-B · 2024-11-04T19:17:25Z

llvm/test/CodeGen/NVPTX/truncate_zext.ll

@@ -0,0 +1,17 @@
+; RUN: llc -march=nvptx64 < %s | FileCheck %s


You may want to use llvm/utils/update_llc_test_checks.py to generate the checks.

Artem-B · 2024-11-04T19:21:09Z

llvm/test/CodeGen/NVPTX/truncate_zext.ll

@@ -0,0 +1,17 @@
+; RUN: llc -march=nvptx64 < %s | FileCheck %s
+


+1 to that. The test should have some code which has to choose between free ops vs an alternative which would be used otherwise (e.g. i32 trunc(add(i64,i64)) vs i32 add(trunc(i64), trunc(i64)))
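
To see why the two shapes are interchangeable, note that the low 32 bits of a 64-bit sum depend only on the low 32 bits of the operands. The standalone sketch below models both shapes with fixed-width unsigned integers; the helper names are hypothetical and it says nothing about which shape the backend actually picks.

```cpp
#include <cassert>
#include <cstdint>

// Truncation from i64 to i32 keeps the low 32 bits.
constexpr std::uint32_t trunc32(std::uint64_t V) {
  return static_cast<std::uint32_t>(V);
}

// Shape 1: add in 64 bits, then truncate the result.
constexpr std::uint32_t truncOfAdd(std::uint64_t A, std::uint64_t B) {
  return trunc32(A + B);
}

// Shape 2: truncate both operands, then add in 32 bits (wrapping).
constexpr std::uint32_t addOfTrunc(std::uint64_t A, std::uint64_t B) {
  return static_cast<std::uint32_t>(trunc32(A) + trunc32(B));
}

// A carry out of bit 31 is discarded by both shapes.
static_assert(truncOfAdd(0xFFFFFFFFull, 1) == addOfTrunc(0xFFFFFFFFull, 1),
              "both shapes agree when the low halves overflow");

// Run-time spot check with high bits set in both operands.
inline void checkShapesAgree() {
  assert(truncOfAdd(0x123456789ABCDEF0ull, 0x0FEDCBA987654321ull) ==
         addOfTrunc(0x123456789ABCDEF0ull, 0x0FEDCBA987654321ull));
}
```

Because the results match, choosing between the shapes is only a cost question, and that is the decision a hook such as isTruncateFree feeds into.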

SergeantCooper · 2024-11-04T19:44:24Z

@justinfargnoli, @Artem-B can you specify what checks I should append to my test case?

Artem-B · 2024-11-04T19:57:19Z

@justinfargnoli has also pointed you at the code processing IR which would be affected by your changes.
His example is similar to what I've suggested, only it uses select instead of add: trunc (select c, a, b) -> select c, (trunc a), (trunc b)

You should create a function which does that (or find an existing example).
I would check the git blame for the code @justinfargnoli pointed at, and look at the tests that were committed when the optimization was implemented. That may give you an example of what kind of tests you may want to add to this PR.
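
As a rough illustration of why a test built on this fold would be sensitive to the hook, here is a toy, standalone model. ToyTarget and combineTruncOfSelect are hypothetical names, and the real combiner checks more conditions than this single flag; the sketch only shows that the resulting shape flips with the hook's answer while the computed value stays the same.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the target hook consulted by the combiner.
struct ToyTarget {
  bool TruncFree;
  bool isTruncateFree(unsigned FromBits, unsigned ToBits) const {
    return TruncFree && FromBits == 64 && ToBits == 32;
  }
};

// Toy fold: returns the expression shape left behind for an i64 -> i32 trunc.
inline std::string combineTruncOfSelect(const ToyTarget &TL) {
  if (TL.isTruncateFree(64, 32))
    return "select c, (trunc a), (trunc b)";
  return "trunc (select c, a, b)";
}

// Both shapes compute the same value, so the fold is safe whenever it fires.
constexpr std::uint32_t truncOfSelect(bool C, std::uint64_t A,
                                      std::uint64_t B) {
  return static_cast<std::uint32_t>(C ? A : B);
}
constexpr std::uint32_t selectOfTrunc(bool C, std::uint64_t A,
                                      std::uint64_t B) {
  return C ? static_cast<std::uint32_t>(A) : static_cast<std::uint32_t>(B);
}

// Usage: the shape differs with the hook, the value does not.
inline void checkFoldModel() {
  assert(combineTruncOfSelect(ToyTarget{true}) !=
         combineTruncOfSelect(ToyTarget{false}));
  assert(truncOfSelect(true, 0x1FFFFFFFFull, 7) ==
         selectOfTrunc(true, 0x1FFFFFFFFull, 7));
}
```

A lit test with the same property would produce different output before and after the change, which is the behavior the reviewers are asking for.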

SergeantCooper · 2024-11-04T20:15:20Z

Thanks, I'll look into it.

SergeantCooper added 2 commits November 2, 2024 15:07

Implemented isZextFree and IsTruncateFree in NVPTX target lowering.

f43171b

fixed the implementations of istruncatefree and iszextfree according …

401f834

…to nvptx arch.

llvmbot added the backend:NVPTX label Nov 2, 2024

justinfargnoli self-requested a review November 4, 2024 17:41

justinfargnoli assigned SergeantCooper Nov 4, 2024

justinfargnoli reviewed Nov 4, 2024

Artem-B reviewed Nov 4, 2024

Merge branch 'llvm:main' into dev

519453e

SergeantCooper force-pushed the dev branch from a3e8fd4 to f43171b Compare November 6, 2024 05:34

Implemented the changes based on the feedback accordingly.

6550044

SergeantCooper closed this Nov 6, 2024

SergeantCooper deleted the dev branch November 6, 2024 09:21
