[RISCV] Tune flag for fast vrgather.vv #124664

ppenzin · 2025-01-28T00:53:33Z

Add tune knob for N*Log2(N) vrgather.vv cost.

llvmbot · 2025-01-28T00:54:03Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-risc-v

Author: Petr Penzin (ppenzin)

Changes

WIP, for review

Add tune knob for N*Log2(N) vrgather.vv cost.

Full diff: https://github.com/llvm/llvm-project/pull/124664.diff

4 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVFeatures.td (+4)
(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+5)
(modified) llvm/lib/Target/RISCV/RISCVProcessors.td (+1)
(modified) llvm/test/CodeGen/RISCV/features-info.ll (+1)

diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 4119dd77804f1a..966c56185b3fd9 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1365,6 +1365,10 @@ def FeatureUnalignedVectorMem
                       "true", "Has reasonably performant unaligned vector "
                       "loads and stores">;
 
+def TuneFastVRGather
+   : SubtargetFeature<"fast-vrgather", "HasFastVRGather",
+                      "true", "Has vrgather.vv with LMUL*log2(LMUL) latency">;
+
 def TunePostRAScheduler : SubtargetFeature<"use-postra-scheduler",
     "UsePostRAScheduler", "true", "Schedule again after register allocation">;
 
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 5e5bc0819a10cc..e49c1e7ce9edbc 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -2848,6 +2848,11 @@ InstructionCost RISCVTargetLowering::getLMULCost(MVT VT) const {
 /// is generally quadratic in the number of vreg implied by LMUL.  Note that
 /// operand (index and possibly mask) are handled separately.
 InstructionCost RISCVTargetLowering::getVRGatherVVCost(MVT VT) const {
+  auto LMULCost = getLMULCost(VT);
+  if (true && Subtarget.hasFastVRGather() && LMULCost.isValid()) {
+    unsigned Log = Log2_64(*LMULCost.getValue());
+    return LMULCost * Log;
+  }
   return getLMULCost(VT) * getLMULCost(VT);
 }
 
diff --git a/llvm/lib/Target/RISCV/RISCVProcessors.td b/llvm/lib/Target/RISCV/RISCVProcessors.td
index 6dfed7ddeb9f63..4f6c9a0229d51b 100644
--- a/llvm/lib/Target/RISCV/RISCVProcessors.td
+++ b/llvm/lib/Target/RISCV/RISCVProcessors.td
@@ -490,6 +490,7 @@ def TENSTORRENT_ASCALON_D8 : RISCVProcessorModel<"tt-ascalon-d8",
                                                   FeatureUnalignedScalarMem,
                                                   FeatureUnalignedVectorMem]),
                                                  [TuneNoDefaultUnroll,
+                                                  TuneFastVRGather,
                                                   TuneOptimizedZeroStrideLoad,
                                                   TunePostRAScheduler]>;
 
diff --git a/llvm/test/CodeGen/RISCV/features-info.ll b/llvm/test/CodeGen/RISCV/features-info.ll
index 70fbda47a14a14..dab9bf92cef17d 100644
--- a/llvm/test/CodeGen/RISCV/features-info.ll
+++ b/llvm/test/CodeGen/RISCV/features-info.ll
@@ -31,6 +31,7 @@
 ; CHECK:   experimental-zvbc32e             - 'Zvbc32e' (Vector Carryless Multiplication with 32-bits elements).
 ; CHECK:   experimental-zvkgs               - 'Zvkgs' (Vector-Scalar GCM instructions for Cryptography).
 ; CHECK:   f                                - 'F' (Single-Precision Floating-Point).
+; CHECK:   fast-vrgather                    - Has vrgather.vv with LMUL*log2(LMUL) latency
 ; CHECK:   forced-atomics                   - Assume that lock-free native-width atomics are available.
 ; CHECK:   h                                - 'H' (Hypervisor).
 ; CHECK:   i                                - 'I' (Base Integer Instruction Set).

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

lukel97

Thanks, could you add a new RUN line in test/Analysis/CostModel/RISCV/rvv/shuffle-permute that has the fast flag enabled?

ppenzin · 2025-01-28T18:07:09Z

@lukel97 does this work and should I try to remove undef from the test?

github-actions · 2025-01-28T18:07:35Z

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:

git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' cb4f24b0e5c4e7c463e59120af4f13ab81519047 8a48a6c6edf2c317d1a18e1a7db72a6cae84d154 llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/lib/Target/RISCV/RISCVSubtarget.h llvm/test/Analysis/CostModel/RISCV/shuffle-permute.ll llvm/test/CodeGen/RISCV/features-info.ll

The following files introduce new uses of undef:

llvm/test/Analysis/CostModel/RISCV/shuffle-permute.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

ppenzin · 2025-01-29T02:18:21Z

I took a brief look; it doesn't look like it would be very difficult to convert undef into arguments. I can give that a go in a separate PR.

lukel97

I think you can ignore the undef warning for now, pretty much every test in that directory is guilty of using undef!

llvm/test/Analysis/CostModel/RISCV/shuffle-permute.ll

llvm/lib/Target/RISCV/RISCVProcessors.td

lukel97

LGTM

topperc · 2025-01-29T06:05:44Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

Can we reuse the LMULCost local here? No reason to call it 3 times.

Good point, addressed

preames

LGTM

ppenzin · 2025-01-30T16:09:56Z

Since there were a few approvals, I've removed WIP. @topperc what do you think?

topperc · 2025-01-30T16:13:24Z

llvm/test/Analysis/CostModel/RISCV/shuffle-permute.ll

Should we test larger LMUL? This is only testing up to LMUL=2?

We are testing 16 x i64 and 16 x double, which get an LMULCost of 8; with the quadratic estimate this leads to a total cost of 139, and with the fast flag it is 59.
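As a rough check of the vrgather.vv term behind these numbers, the sketch below compares the two estimates for a given LMUL cost. It is a standalone approximation, not the LLVM code: `floorLog2`, `quadraticCost`, and `logCost` are hypothetical names, and the totals of 139 and 59 presumably include index and other costs that are not modeled here.

```cpp
#include <cassert>
#include <cstdint>

// Floor of log2(x); x must be nonzero.
inline unsigned floorLog2(std::uint64_t x) {
  assert(x != 0 && "log2 of zero is undefined");
  unsigned result = 0;
  while (x > 1) {
    x >>= 1;
    ++result;
  }
  return result;
}

// Default estimate: quadratic in the LMUL cost.
inline std::uint64_t quadraticCost(std::uint64_t lmulCost) {
  return lmulCost * lmulCost;
}

// Estimate under the tune flag: LMUL cost times log2 of the LMUL cost.
// For an LMUL cost of 1 this yields 0; the sketch does not special-case that.
inline std::uint64_t logCost(std::uint64_t lmulCost) {
  return lmulCost * floorLog2(lmulCost);
}
```

For an LMUL cost of 8, `quadraticCost(8)` is 64 and `logCost(8)` is 24, so the vrgather.vv term alone shrinks by 40.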

topperc · 2025-01-30T16:16:20Z

llvm/test/Analysis/CostModel/RISCV/shuffle-permute.ll

Why is 256-bit i64 vector cheaper than 256-bit i32 and i16 vectors?

Because IndexCost is cheaper by one; this is the same as before this patch.

wangpc-pp · 2025-02-05T05:14:05Z

llvm/lib/Target/RISCV/RISCVFeatures.td

Not blocking, but what if a vrgather.vv implementation does not have LMUL*log2(LMUL) complexity?

IDK, that is the only other alternative I know of personally. We can rename the flag to something like 'HasLog2Vrgather' instead of just "fast", or change the description to say something like "non-quadratic". I don't have a strong preference.

What about adding an enum to represent the cost model?

enum VRGatherCostModel {
  Quadratic, // Default
  Log2,
  ....
}

def TuneLog2VRGather
    : SubtargetFeature<"log2-vrgather", "VRGatherCostModel", "Log2",
                       "Has vrgather.vv with LMUL*log2(LMUL) latency">;

This is more extensible, I think. I don't know the details, but I remember that Ventana's V2 has an optimized vrgather.vv implementation as well.

I think I can do that. @topperc and @lukel97, any thoughts from a maintainability point of view? Also, would anyone from Ventana like to share their input? (@mgudim ?)

How many different types of cost model would we have? I would have thought LMUL * log2(LMUL) would be as good as a uarch can get. I'm not strongly opinionated on this though, an enum sounds reasonable to me.

I don't have a strong preference here, and would tend to bias towards supporting what we know of now, then changing later if multiple options show up.

Early x280 cost is linear in VL. Later x280 is constant time for VL<= (VLEN/2)/SEW (single DLEN), but linear in VL for everything else. Later x280 is quadratic in LMUL*2 except for fractional LMUL or when element at a time would be faster (i.e. there are less elements in a destination DLEN than the number of possible source DLENs for them).

p470/p670 is quadratic in LMUL in the worst case. But there is effort made to be linear in number of source VLEN actually used with some overhead.

Enum it is then, I suppose 😄

That should be addressed, PTAL
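For readers following the thread, here is a minimal standalone sketch of what an enum-selected vrgather.vv cost could look like. It is not the code that landed: `VRGatherCostModel`, `NLog2N`, `floorLog2`, and `vrgatherVVCost` are illustrative names, and falling back to the quadratic estimate when the log term would be zero is an assumption based only on the commit title "Guard against zero and negative log values".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the cost-model selector discussed above.
enum class VRGatherCostModel { Quadratic, NLog2N };

// Floor of log2(x); x must be nonzero.
inline unsigned floorLog2(std::uint64_t x) {
  assert(x != 0 && "log2 of zero is undefined");
  unsigned result = 0;
  while (x > 1) {
    x >>= 1;
    ++result;
  }
  return result;
}

// Cost of one vrgather.vv for a given LMUL cost under the selected model.
inline std::uint64_t vrgatherVVCost(VRGatherCostModel model,
                                    std::uint64_t lmulCost) {
  switch (model) {
  case VRGatherCostModel::NLog2N:
    // Assumption: only use the log estimate when log2 is positive,
    // otherwise fall through to the quadratic default below.
    if (lmulCost > 1)
      return lmulCost * floorLog2(lmulCost);
    break;
  case VRGatherCostModel::Quadratic:
    break;
  }
  return lmulCost * lmulCost;
}
```

Adding another microarchitecture-specific model would then mean one more enumerator and one more `case`, which is the extensibility argument made in the thread.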

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

topperc · 2025-02-11T18:53:56Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

Comment needs to be updated

topperc · 2025-02-11T19:15:29Z

llvm/lib/Target/RISCV/RISCVFeatures.td

Early x280 cost is linear in VL. Later x280 is constant time for VL<= (VLEN/2)/SEW (single DLEN), but linear in VL for everything else. Later x280 is quadratic in LMUL*2 except for fractional LMUL or when element at a time would be faster (i.e. there are less elements in a destination DLEN than the number of possible source DLENs for them).

p470/p670 is quadratic in LMUL in the worst case. But there is effort made to be linear in number of source VLEN actually used with some overhead.
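To make the single-DLEN threshold in this comment concrete, here is a small sketch of the "later x280" rule as stated in its first sentence: constant time when VL <= (VLEN/2)/SEW, linear in VL otherwise. The names `GatherScaling`, `elementsPerDLEN`, and `classifyLaterX280` are hypothetical, DLEN = VLEN/2 is taken from the comment, and the other cases it mentions (fractional LMUL, the quadratic regime, p470/p670) are not modeled.

```cpp
#include <cassert>
#include <cstdint>

// How the vrgather.vv cost scales in this simplified view.
enum class GatherScaling { Constant, LinearInVL };

// Number of SEW-bit elements that fit in one DLEN, with DLEN = VLEN / 2.
inline std::uint64_t elementsPerDLEN(std::uint64_t vlenBits,
                                     std::uint64_t sewBits) {
  assert(sewBits != 0 && "SEW must be nonzero");
  return (vlenBits / 2) / sewBits;
}

// Constant time if the whole VL fits in a single DLEN, linear in VL otherwise.
inline GatherScaling classifyLaterX280(std::uint64_t vl,
                                       std::uint64_t vlenBits,
                                       std::uint64_t sewBits) {
  return vl <= elementsPerDLEN(vlenBits, sewBits) ? GatherScaling::Constant
                                                  : GatherScaling::LinearInVL;
}
```

With an example VLEN of 512 bits and SEW of 32 bits, one DLEN holds 8 elements, so VL = 8 falls in the constant-time case and VL = 9 in the linear one.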

github-actions · 2025-02-21T08:42:19Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff cb4f24b0e5c4e7c463e59120af4f13ab81519047 8a48a6c6edf2c317d1a18e1a7db72a6cae84d154 --extensions cpp,h -- llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/lib/Target/RISCV/RISCVSubtarget.h

View the diff from clang-format here.

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 6d3241e79a..21a71f926a 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -2867,7 +2867,6 @@ InstructionCost RISCVTargetLowering::getLMULCost(MVT VT) const {
   return Cost;
 }
 
-
 /// Return the cost of a vrgather.vv instruction for the type VT.  vrgather.vv
 /// may be quadratic in the number of vreg implied by LMUL, and is assumed to
 /// be by default.  VRGatherCostModel reflects available options.  Note that
diff --git a/llvm/lib/Target/RISCV/RISCVSubtarget.h b/llvm/lib/Target/RISCV/RISCVSubtarget.h
index cc9aef2d52..d03afe1c23 100644
--- a/llvm/lib/Target/RISCV/RISCVSubtarget.h
+++ b/llvm/lib/Target/RISCV/RISCVSubtarget.h
@@ -160,7 +160,9 @@ public:
   /// initializeProperties().
   RISCVProcFamilyEnum getProcFamily() const { return RISCVProcFamily; }
 
-  RISCVVRGatherCostModelEnum getVRGatherCostModel() const { return RISCVVRGatherCostModel; }
+  RISCVVRGatherCostModelEnum getVRGatherCostModel() const {
+    return RISCVVRGatherCostModel;
+  }
 
 #define GET_SUBTARGETINFO_MACRO(ATTRIBUTE, DEFAULT, GETTER) \
   bool GETTER() const { return ATTRIBUTE; }

topperc · 2025-02-21T17:58:37Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

VRGarther -> VRGather

topperc · 2025-02-21T17:58:58Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

Something bad happened to this comment.

Indeed, good catch. Should be addressed.

wangpc-pp

LGTM.

topperc

LGTM

ppenzin · 2025-03-03T18:00:55Z

Can anyone help me merge this? I can rebase if needed.

mshockwave · 2025-03-03T18:05:00Z

Can anyone help me merge this? I can rebase if needed.

Could you fix the test failure on test/CodeGen/RISCV/features-info.ll? I think that is relevant

ppenzin · 2025-03-03T18:42:41Z

Should be addressed now, but I did add a line to the release notes, which requires a rebase on top of current main.

mshockwave · 2025-03-03T18:54:54Z

Should be addressed now, but I did add a line to the release notes, which requires a rebase on top of current main.

LGTM, please rebase then I'll help you to merge

Add tune knob for N*Log2(N) vrgather.vv cost.

Rename option to explicitly say "log", change tune flag to set one of the values.

mshockwave · 2025-03-04T00:05:58Z

Can anyone help me merge this? I can rebase if needed.

I think there are several patches under your belt now. Feel free to request commit access.

llvmbot added the backend:RISC-V label Jan 28, 2025

topperc reviewed Jan 28, 2025

llvm/lib/Target/RISCV/RISCVISelLowering.cpp (outdated, resolved)

lukel97 reviewed Jan 28, 2025

llvmbot added the llvm:analysis label Jan 28, 2025

lukel97 reviewed Jan 29, 2025

llvm/test/Analysis/CostModel/RISCV/shuffle-permute.ll (outdated, resolved)

llvm/lib/Target/RISCV/RISCVProcessors.td (outdated, resolved)

lukel97 approved these changes Jan 29, 2025

topperc reviewed Jan 29, 2025

preames approved these changes Jan 29, 2025

ppenzin changed the title ~~[WIP][RISCV] Tune flag for fast vrgather.vv~~ [RISCV] Tune flag for fast vrgather.vv Jan 30, 2025

topperc reviewed Jan 30, 2025

wangpc-pp reviewed Feb 5, 2025

arcbbb reviewed Feb 5, 2025

llvm/lib/Target/RISCV/RISCVISelLowering.cpp (outdated, resolved)

topperc reviewed Feb 11, 2025

topperc reviewed Feb 21, 2025

wangpc-pp approved these changes Feb 25, 2025

topperc approved these changes Feb 25, 2025

ppenzin and others added 3 commits March 3, 2025 15:51

[RISCV] Tune flag for fast vrgather.vv

227745c

Add tune knob for N*Log2(N) vrgather.vv cost.

Fix typo

472f7a5

Add run line to RISCV shuffle-permute.ll test

f6a3698

ppenzin and others added 7 commits March 3, 2025 15:51

Remove code size from analysis test

65c419d

Reuse calculated LMUL cost

4e1738a

Guard against zero and negative log values

cdb48f4

Change vrgather cost model to use an enum

49ca050

Rename option to explicitly say "log", change tune flag to set one of the values.

Update comment

f62b83a

Address typos

52547d5

Fix test flag spelling, add to release notes

8a48a6c

ppenzin force-pushed the fast-vrgather branch from 1d27ecc to 8a48a6c Compare March 3, 2025 23:54

mshockwave merged commit b44fbde into llvm:main Mar 4, 2025
6 of 10 checks passed

ppenzin deleted the fast-vrgather branch March 4, 2025 05:01

ppenzin mentioned this pull request Mar 4, 2025

Request Commit Access For ppenzin #129647

Closed
