[AArch64][NFC] NFC for const vector as Instruction operand #116790

sushgokh · 2024-11-19T11:57:20Z

The current cost model does not account for the cost of materializing a constant vector. As a result, some cases, as the test shows, are vectorized even though vectorization may not always be profitable. A future patch will try to address this issue.
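
To make the concern concrete, here is a toy sketch in Python. It is not LLVM's cost model: the helper `is_profitable` and every cost number below are hypothetical, chosen only to show how adding a constant-materialization term could flip the decision for a case shaped like `v2f32_diff_consts`.

```python
# Toy profitability check. All names and numbers are hypothetical;
# they are NOT taken from the AArch64 target cost tables.

def is_profitable(scalar_cost, vector_cost, const_materialization_cost=0):
    """Return True when the vector form is estimated cheaper than the scalar form."""
    return vector_cost + const_materialization_cost < scalar_cost

# Two scalar fmuls at a made-up cost of 2 each.
SCALAR_COST = 2 * 2
# One vector fmul (made-up cost 2) plus a made-up cost of 1 for building
# the <2 x float> input with insertelement.
VECTOR_COST = 2 + 1
# Made-up cost of materializing <22.0, 23.0>, e.g. via a constant-pool load.
CONST_VECTOR_COST = 2

# Ignoring the constant vector, the vector form looks cheaper.
ignoring_const = is_profitable(SCALAR_COST, VECTOR_COST)
# Charging for the constant vector, the same case no longer looks cheaper.
including_const = is_profitable(SCALAR_COST, VECTOR_COST, CONST_VECTOR_COST)
```

With these assumed numbers the decision changes once the constant is charged for. Whether that happens on real hardware depends on how cheaply the target can materialize the particular constant.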

llvmbot · 2024-11-19T11:57:57Z

@llvm/pr-subscribers-llvm-transforms

Author: Sushant Gokhale (sushgokh)

Full diff: https://github.com/llvm/llvm-project/pull/116790.diff

1 Files Affected:

(added) llvm/test/Transforms/SLPVectorizer/materialize-vector-of-consts.ll (+100)

diff --git a/llvm/test/Transforms/SLPVectorizer/materialize-vector-of-consts.ll b/llvm/test/Transforms/SLPVectorizer/materialize-vector-of-consts.ll
new file mode 100644
index 00000000000000..2f58bd25b75647
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/materialize-vector-of-consts.ll
@@ -0,0 +1,100 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: %if aarch64-registered-target %{ opt -passes=slp-vectorizer -mtriple=aarch64 -S %s | FileCheck %s %}
+
+define <2 x float> @v2f32_diff_consts(float %a, float %b)
+; CHECK-LABEL: define <2 x float> @v2f32_diff_consts(
+; CHECK-SAME: float [[A:%.*]], float [[B:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <2 x float> poison, float [[A]], i32 0
+; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <2 x float> [[TMP1]], float [[B]], i32 1
+; CHECK-NEXT:    [[TMP3:%.*]] = fmul <2 x float> [[TMP2]], <float 2.200000e+01, float 2.300000e+01>
+; CHECK-NEXT:    ret <2 x float> [[TMP3]]
+;
+{
+  %1 = fmul float %a, 22.0
+  %2 = fmul float %b, 23.0
+  %3 = insertelement <2 x float> poison, float %1, i32 0
+  %4 = insertelement <2 x float> %3, float %2, i32 1
+  ret <2 x float> %4
+}
+
+define <2 x float> @v2f32_const_splat(float %a, float %b)
+; CHECK-LABEL: define <2 x float> @v2f32_const_splat(
+; CHECK-SAME: float [[A:%.*]], float [[B:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <2 x float> poison, float [[A]], i32 0
+; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <2 x float> [[TMP1]], float [[B]], i32 1
+; CHECK-NEXT:    [[TMP3:%.*]] = fmul <2 x float> [[TMP2]], splat (float 2.200000e+01)
+; CHECK-NEXT:    ret <2 x float> [[TMP3]]
+;
+{
+  %1 = fmul float %a, 22.0
+  %2 = fmul float %b, 22.0
+  %3 = insertelement <2 x float> poison, float %1, i32 0
+  %4 = insertelement <2 x float> %3, float %2, i32 1
+  ret <2 x float> %4
+}
+
+define <4 x double> @v4f64_illegal_type(double %a, double %b, double %c, double %d)
+; CHECK-LABEL: define <4 x double> @v4f64_illegal_type(
+; CHECK-SAME: double [[A:%.*]], double [[B:%.*]], double [[C:%.*]], double [[D:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <4 x double> poison, double [[A]], i32 0
+; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <4 x double> [[TMP1]], double [[B]], i32 1
+; CHECK-NEXT:    [[TMP3:%.*]] = insertelement <4 x double> [[TMP2]], double [[C]], i32 2
+; CHECK-NEXT:    [[TMP4:%.*]] = insertelement <4 x double> [[TMP3]], double [[D]], i32 3
+; CHECK-NEXT:    [[TMP5:%.*]] = fmul <4 x double> [[TMP4]], <double 2.100000e+01, double 2.200000e+01, double 2.300000e+01, double 2.400000e+01>
+; CHECK-NEXT:    ret <4 x double> [[TMP5]]
+;
+{
+  %1 = fmul double %a, 21.0
+  %2 = fmul double %b, 22.0
+  %3 = fmul double %c, 23.0
+  %4 = fmul double %d, 24.0
+  %5 = insertelement <4 x double> poison, double %1, i32 0
+  %6 = insertelement <4 x double> %5, double %2, i32 1
+  %7 = insertelement <4 x double> %6, double %3, i32 2
+  %8 = insertelement <4 x double> %7, double %4, i32 3
+  ret <4 x double> %8
+}
+
+define <2 x double> @v2f64_dup_const_vector_case1(double %a, double %b, double %c, double %d)
+; CHECK-LABEL: define <2 x double> @v2f64_dup_const_vector_case1(
+; CHECK-SAME: double [[A:%.*]], double [[B:%.*]], double [[C:%.*]], double [[D:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <2 x double> poison, double [[A]], i32 0
+; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <2 x double> [[TMP1]], double [[B]], i32 1
+; CHECK-NEXT:    [[TMP3:%.*]] = fmul <2 x double> [[TMP2]], <double 2.100000e+01, double 2.200000e+01>
+; CHECK-NEXT:    [[TMP4:%.*]] = insertelement <2 x double> poison, double [[C]], i32 0
+; CHECK-NEXT:    [[TMP5:%.*]] = insertelement <2 x double> [[TMP4]], double [[D]], i32 1
+; CHECK-NEXT:    [[TMP6:%.*]] = fmul <2 x double> [[TMP5]], <double 2.100000e+01, double 2.200000e+01>
+; CHECK-NEXT:    [[TMP7:%.*]] = fadd <2 x double> [[TMP3]], [[TMP6]]
+; CHECK-NEXT:    ret <2 x double> [[TMP7]]
+;
+{
+  %1 = fmul double %a, 21.0
+  %2 = fmul double %b, 22.0
+  %3 = fmul double %c, 21.0
+  %4 = fmul double %d, 22.0
+  %5 = insertelement <2 x double> poison, double %1, i32 0
+  %6 = insertelement <2 x double> %5, double %2, i32 1
+  %7 = insertelement <2 x double> poison, double %3, i32 0
+  %8 = insertelement <2 x double> %7, double %4, i32 1
+  %9 = fadd <2 x double> %6, %8
+  ret <2 x double> %9
+}
+
+define <2 x double> @v2f64_dup_const_vector_case2(double %a, double %b, double %c, double %d)
+; CHECK-LABEL: define <2 x double> @v2f64_dup_const_vector_case2(
+; CHECK-SAME: double [[A:%.*]], double [[B:%.*]], double [[C:%.*]], double [[D:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <2 x double> poison, double [[A]], i32 0
+; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <2 x double> [[TMP1]], double [[B]], i32 1
+; CHECK-NEXT:    [[TMP3:%.*]] = fmul <2 x double> [[TMP2]], <double 2.100000e+01, double 2.200000e+01>
+; CHECK-NEXT:    [[TMP4:%.*]] = fadd <2 x double> [[TMP3]], <double 2.100000e+01, double 2.200000e+01>
+; CHECK-NEXT:    ret <2 x double> [[TMP4]]
+;
+{
+  %1 = fmul double %a, 21.0
+  %2 = fmul double %b, 22.0
+  %3 = fadd double %1, 21.0
+  %4 = fadd double %2, 22.0
+  %5 = insertelement <2 x double> poison, double %3, i32 0
+  %6 = insertelement <2 x double> %5, double %4, i32 1
+  ret <2 x double> %6
+}

davemgreen

I have been a little sceptical about costing constants, given how they are currently represented in LLVM and how often they can be "free". They can certainly be more expensive at times; I look forward to being proved incorrect about it.

The tests LGTM.

sushgokh requested review from davemgreen, david-arm, madhur13490 and sjoerdmeijer November 19, 2024 11:57

llvmbot added the llvm:transforms label Nov 19, 2024

davemgreen approved these changes Nov 20, 2024

sushgokh merged commit 197fb27 into llvm:main Nov 21, 2024
8 of 10 checks passed

sushgokh deleted the nfc-materializing-const-vect branch November 21, 2024 04:59
