[AMDGPU] Update cost model gfx950 min/max tests. NFC. #139310

rampitec · 2025-05-09T18:57:15Z

No description provided.

rampitec · 2025-05-09T18:57:31Z

This pull request (#139310) is stacked on main. The stack is managed by Graphite.

llvmbot · 2025-05-09T18:58:04Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/139310.diff

2 Files Affected:

(modified) llvm/test/Analysis/CostModel/AMDGPU/maximum.ll (+19)
(modified) llvm/test/Analysis/CostModel/AMDGPU/minimum.ll (+19)

diff --git a/llvm/test/Analysis/CostModel/AMDGPU/maximum.ll b/llvm/test/Analysis/CostModel/AMDGPU/maximum.ll
index 3774c6c0cbbee..0cbc395933efb 100644
--- a/llvm/test/Analysis/CostModel/AMDGPU/maximum.ll
+++ b/llvm/test/Analysis/CostModel/AMDGPU/maximum.ll
@@ -3,6 +3,7 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=ALL,GFX9,GFX90A-FASTF64 %s
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=ALL,GFX9,FASTF64 %s
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mattr=-half-rate-64-ops < %s | FileCheck -check-prefixes=ALL,SLOWF64 %s
+; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx950 -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=SIZE,GFX950-SIZE %s
 ; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=SIZE,GFX9-SIZE,GFX90A-SIZE %s
 ; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=SIZE,GFX9-SIZE %s
 ; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mattr=-half-rate-64-ops < %s | FileCheck -check-prefixes=SIZE,SLOW-SIZE %s
@@ -35,6 +36,15 @@ define void @maximum_f16() {
 ; SLOWF64-NEXT:  Cost Model: Found an estimated cost of 176 for instruction: %v16f16 = call <16 x half> @llvm.maximum.v16f16(<16 x half> undef, <16 x half> undef)
 ; SLOWF64-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: ret void
 ;
+; GFX950-SIZE-LABEL: 'maximum_f16'
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %f16 = call half @llvm.maximum.f16(half undef, half undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v2f16 = call <2 x half> @llvm.maximum.v2f16(<2 x half> undef, <2 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v3f16 = call <3 x half> @llvm.maximum.v3f16(<3 x half> undef, <3 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f16 = call <4 x half> @llvm.maximum.v4f16(<4 x half> undef, <4 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f16 = call <8 x half> @llvm.maximum.v8f16(<8 x half> undef, <8 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v16f16 = call <16 x half> @llvm.maximum.v16f16(<16 x half> undef, <16 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
+;
 ; GFX9-SIZE-LABEL: 'maximum_f16'
 ; GFX9-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %f16 = call half @llvm.maximum.f16(half undef, half undef)
 ; GFX9-SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2f16 = call <2 x half> @llvm.maximum.v2f16(<2 x half> undef, <2 x half> undef)
@@ -98,6 +108,15 @@ define void @maximum_bf16() {
 ; SLOWF64-NEXT:  Cost Model: Found an estimated cost of 176 for instruction: %v16bf16 = call <16 x bfloat> @llvm.maximum.v16bf16(<16 x bfloat> undef, <16 x bfloat> undef)
 ; SLOWF64-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: ret void
 ;
+; GFX950-SIZE-LABEL: 'maximum_bf16'
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %bf16 = call bfloat @llvm.maximum.bf16(bfloat undef, bfloat undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2bf16 = call <2 x bfloat> @llvm.maximum.v2bf16(<2 x bfloat> undef, <2 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v3bf16 = call <3 x bfloat> @llvm.maximum.v3bf16(<3 x bfloat> undef, <3 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v4bf16 = call <4 x bfloat> @llvm.maximum.v4bf16(<4 x bfloat> undef, <4 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %v8bf16 = call <8 x bfloat> @llvm.maximum.v8bf16(<8 x bfloat> undef, <8 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 31 for instruction: %v16bf16 = call <16 x bfloat> @llvm.maximum.v16bf16(<16 x bfloat> undef, <16 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
+;
 ; GFX9-SIZE-LABEL: 'maximum_bf16'
 ; GFX9-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %bf16 = call bfloat @llvm.maximum.bf16(bfloat undef, bfloat undef)
 ; GFX9-SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2bf16 = call <2 x bfloat> @llvm.maximum.v2bf16(<2 x bfloat> undef, <2 x bfloat> undef)
diff --git a/llvm/test/Analysis/CostModel/AMDGPU/minimum.ll b/llvm/test/Analysis/CostModel/AMDGPU/minimum.ll
index 24b9549dfe3a4..64520379e6d55 100644
--- a/llvm/test/Analysis/CostModel/AMDGPU/minimum.ll
+++ b/llvm/test/Analysis/CostModel/AMDGPU/minimum.ll
@@ -3,6 +3,7 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=ALL,GFX9,GFX90A-FASTF64 %s
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=ALL,GFX9,FASTF64 %s
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mattr=-half-rate-64-ops < %s | FileCheck -check-prefixes=ALL,SLOWF64 %s
+; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx950 -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=SIZE,GFX950-SIZE %s
 ; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=SIZE,GFX9-SIZE,GFX90A-SIZE %s
 ; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 -mattr=+half-rate-64-ops < %s | FileCheck -check-prefixes=SIZE,GFX9-SIZE %s
 ; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mattr=-half-rate-64-ops < %s | FileCheck -check-prefixes=SIZE,SLOW-SIZE %s
@@ -35,6 +36,15 @@ define void @minimum_f16() {
 ; SLOWF64-NEXT:  Cost Model: Found an estimated cost of 176 for instruction: %v16f16 = call <16 x half> @llvm.minimum.v16f16(<16 x half> undef, <16 x half> undef)
 ; SLOWF64-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: ret void
 ;
+; GFX950-SIZE-LABEL: 'minimum_f16'
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %f16 = call half @llvm.minimum.f16(half undef, half undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v2f16 = call <2 x half> @llvm.minimum.v2f16(<2 x half> undef, <2 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v3f16 = call <3 x half> @llvm.minimum.v3f16(<3 x half> undef, <3 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f16 = call <4 x half> @llvm.minimum.v4f16(<4 x half> undef, <4 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f16 = call <8 x half> @llvm.minimum.v8f16(<8 x half> undef, <8 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v16f16 = call <16 x half> @llvm.minimum.v16f16(<16 x half> undef, <16 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
+;
 ; GFX9-SIZE-LABEL: 'minimum_f16'
 ; GFX9-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %f16 = call half @llvm.minimum.f16(half undef, half undef)
 ; GFX9-SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2f16 = call <2 x half> @llvm.minimum.v2f16(<2 x half> undef, <2 x half> undef)
@@ -98,6 +108,15 @@ define void @minimum_bf16() {
 ; SLOWF64-NEXT:  Cost Model: Found an estimated cost of 176 for instruction: %v16bf16 = call <16 x bfloat> @llvm.minimum.v16bf16(<16 x bfloat> undef, <16 x bfloat> undef)
 ; SLOWF64-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: ret void
 ;
+; GFX950-SIZE-LABEL: 'minimum_bf16'
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %bf16 = call bfloat @llvm.minimum.bf16(bfloat undef, bfloat undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2bf16 = call <2 x bfloat> @llvm.minimum.v2bf16(<2 x bfloat> undef, <2 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v3bf16 = call <3 x bfloat> @llvm.minimum.v3bf16(<3 x bfloat> undef, <3 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v4bf16 = call <4 x bfloat> @llvm.minimum.v4bf16(<4 x bfloat> undef, <4 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %v8bf16 = call <8 x bfloat> @llvm.minimum.v8bf16(<8 x bfloat> undef, <8 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 31 for instruction: %v16bf16 = call <16 x bfloat> @llvm.minimum.v16bf16(<16 x bfloat> undef, <16 x bfloat> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
+;
 ; GFX9-SIZE-LABEL: 'minimum_bf16'
 ; GFX9-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %bf16 = call bfloat @llvm.minimum.bf16(bfloat undef, bfloat undef)
 ; GFX9-SIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2bf16 = call <2 x bfloat> @llvm.minimum.v2bf16(<2 x bfloat> undef, <2 x bfloat> undef)

arsenm · 2025-05-09T18:59:15Z

llvm/test/Analysis/CostModel/AMDGPU/maximum.ll

+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %f16 = call half @llvm.maximum.f16(half undef, half undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v2f16 = call <2 x half> @llvm.maximum.v2f16(<2 x half> undef, <2 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v3f16 = call <3 x half> @llvm.maximum.v3f16(<3 x half> undef, <3 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f16 = call <4 x half> @llvm.maximum.v4f16(<4 x half> undef, <4 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f16 = call <8 x half> @llvm.maximum.v8f16(<8 x half> undef, <8 x half> undef)
+; GFX950-SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v16f16 = call <16 x half> @llvm.maximum.v16f16(<16 x half> undef, <16 x half> undef)


These values look broken: one element is more expensive than two, and every other vector size also costs 2.

rampitec · 2025-05-09

Probably, but that's what we have. Downstream behaves exactly the same. I'd say we should land the tests and then fix it separately.

That is because fmaximum.f16 is custom and fmaximum.v2f16 is legal. Generic code multiplies the cost by 2 for custom operations, and that is not so wrong. Our tests:

define <2 x half> @v_maximum_v2f16(<2 x half> %src0, <2 x half> %src1) {
; GFX950-LABEL: v_maximum_v2f16:
; GFX950:       ; %bb.0:
; GFX950-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GFX950-NEXT:    v_pk_maximum3_f16 v0, v0, v1, v1
; GFX950-NEXT:    s_setpc_b64 s[30:31]
  %op = call <2 x half> @llvm.maximum.v2f16(<2 x half> %src0, <2 x half> %src1)
  ret <2 x half> %op
}

define half @v_maximum_f16(half %src0, half %src1) {
; GFX950-LABEL: v_maximum_f16:
; GFX950:       ; %bb.0:
; GFX950-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GFX950-NEXT:    v_max_f16_e32 v2, v0, v1
; GFX950-NEXT:    v_mov_b32_e32 v3, 0x7e00
; GFX950-NEXT:    v_cmp_o_f16_e32 vcc, v0, v1
; GFX950-NEXT:    s_nop 1
; GFX950-NEXT:    v_cndmask_b32_e32 v0, v3, v2, vcc
; GFX950-NEXT:    s_setpc_b64 s[30:31]
  %op = call half @llvm.maximum.f16(half %src0, half %src1)
  ret half %op
}

I'd say the cost should be 4 here, rather than 2.

The v8 and v16 cases are wrong, of course: they should be 4 and 8 respectively, but we do not handle that anywhere ourselves.
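
The arithmetic behind those expected numbers can be sketched as follows. This is only an illustration of the reasoning in the comment, not the LLVM cost-model code: the helper names `packed_ops` and `scalar_f16_cost` are hypothetical, and it assumes a cost unit of one instruction with v2f16 as the widest legal packed type.

```python
import math

# Assumption: v2f16 is the widest legal packed type for maximum/minimum
# on gfx950, as the review comment states.
PACKED_LANES = 2


def packed_ops(num_elements: int) -> int:
    """Number of packed v2f16 operations needed to cover a vector."""
    return math.ceil(num_elements / PACKED_LANES)


def scalar_f16_cost(legal_cost: int = 1, custom_multiplier: int = 2) -> int:
    """Generic rule quoted in the review: a custom operation costs 2x."""
    return legal_cost * custom_multiplier


expected_vector_costs = {n: packed_ops(n) for n in (2, 3, 4, 8, 16)}
print(expected_vector_costs)  # {2: 1, 3: 2, 4: 2, 8: 4, 16: 8}
print(scalar_f16_cost())      # 2
```

Under these assumptions the v2, v3, and v4 entries agree with the GFX950-SIZE check lines, while v8 and v16 come out as 4 and 8 instead of the recorded 2.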

github-actions · 2025-05-09T18:59:32Z

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:

git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/test/Analysis/CostModel/AMDGPU/maximum.ll llvm/test/Analysis/CostModel/AMDGPU/minimum.ll

The following files introduce new uses of undef:

llvm/test/Analysis/CostModel/AMDGPU/maximum.ll
llvm/test/Analysis/CostModel/AMDGPU/minimum.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.
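
To see which lines the pattern flags, the regular expression from the `git diff` command above can be tried on individual lines. The sketch below is a rough approximation only: `introduces_undef` is a hypothetical helper, and it checks a single line with Python's `re` module, whereas `git diff -S` looks for changes in the number of matches between commits.

```python
import re

# Pattern copied from the -S argument of the git diff command above.
UNDEF_PATTERN = re.compile(
    r"([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)"
)


def introduces_undef(line: str) -> bool:
    """Return True if the line contains a use of undef per the pattern."""
    return UNDEF_PATTERN.search(line) is not None


flagged = "%f16 = call half @llvm.maximum.f16(half undef, half undef)"
clean = "%f16 = call half @llvm.maximum.f16(half poison, half poison)"

print(introduces_undef(flagged))  # True
print(introduces_undef(clean))    # False
```

The second line shows the replacement the bot recommends: the same call with poison operands no longer matches.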

[AMDGPU] Update cost model gfx950 min/max tests. NFC.

0ca888a

rampitec requested a review from arsenm May 9, 2025 18:57

rampitec marked this pull request as ready for review May 9, 2025 18:57

llvmbot added the backend:AMDGPU and llvm:analysis (includes value tracking, cost tables and constant folding) labels May 9, 2025

arsenm reviewed May 9, 2025

arsenm approved these changes May 9, 2025

rampitec merged commit 14be7a7 into main May 9, 2025
13 of 15 checks passed

rampitec deleted the users/rampitec/05-09-_amdgpu_update_cost_model_gfx950_min_max_tests._nfc branch May 9, 2025 20:04
