@MacDue MacDue commented Nov 7, 2025

This patch adds cost model tests for bfloat operations with +sve-b16b16. Currently, some of these costs are higher than they should be because the cost model assumes bfloat operations need promotion, even though some of them are natively supported with +sve-b16b16.

@MacDue MacDue requested review from c-rhodes and davemgreen November 7, 2025 15:10
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Nov 7, 2025

llvmbot commented Nov 7, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Benjamin Maxwell (MacDue)

Full diff: https://github.com/llvm/llvm-project/pull/166951.diff

1 Files Affected:

  • (modified) llvm/test/Analysis/CostModel/AArch64/sve-arith-fp.ll (+59-1)
diff --git a/llvm/test/Analysis/CostModel/AArch64/sve-arith-fp.ll b/llvm/test/Analysis/CostModel/AArch64/sve-arith-fp.ll
index ec848c2c08305..41e08fa903604 100644
--- a/llvm/test/Analysis/CostModel/AArch64/sve-arith-fp.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/sve-arith-fp.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
-; RUN: opt < %s -enable-no-nans-fp-math -passes="print<cost-model>" -cost-kind=all 2>&1 -disable-output -mtriple=aarch64 -mattr=+fullfp16 -mattr=+sve | FileCheck %s
+; RUN: opt < %s -enable-no-nans-fp-math -passes="print<cost-model>" -cost-kind=all 2>&1 -disable-output -mtriple=aarch64 -mattr=+fullfp16 -mattr=+sve-b16b16 -mattr=+sve | FileCheck %s
 
 target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
 
@@ -8,6 +8,9 @@ define void @fadd() {
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V4F16 = fadd <vscale x 4 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V8F16 = fadd <vscale x 8 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V16F16 = fadd <vscale x 16 x half> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:11 CodeSize:1 Lat:3 SizeLat:1 for: %V4BF16 = fadd <vscale x 4 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:27 CodeSize:1 Lat:3 SizeLat:1 for: %V8BF16 = fadd <vscale x 8 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:54 CodeSize:1 Lat:3 SizeLat:1 for: %V16BF16 = fadd <vscale x 16 x bfloat> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of Invalid for: %V1F32 = fadd <vscale x 1 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V2F32 = fadd <vscale x 2 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V4F32 = fadd <vscale x 4 x float> poison, poison
@@ -20,6 +23,10 @@ define void @fadd() {
   %V8F16 = fadd <vscale x 8 x half> poison, poison
   %V16F16 = fadd <vscale x 16 x half> poison, poison
 
+  %V4BF16 = fadd <vscale x 4 x bfloat> poison, poison
+  %V8BF16 = fadd <vscale x 8 x bfloat> poison, poison
+  %V16BF16 = fadd <vscale x 16 x bfloat> poison, poison
+
   %V1F32 = fadd <vscale x 1 x float> poison, poison
   %V2F32 = fadd <vscale x 2 x float> poison, poison
   %V4F32 = fadd <vscale x 4 x float> poison, poison
@@ -36,6 +43,9 @@ define void @fsub() {
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V4F16 = fsub <vscale x 4 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V8F16 = fsub <vscale x 8 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V16F16 = fsub <vscale x 16 x half> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:11 CodeSize:1 Lat:3 SizeLat:1 for: %V4BF16 = fsub <vscale x 4 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:27 CodeSize:1 Lat:3 SizeLat:1 for: %V8BF16 = fsub <vscale x 8 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:54 CodeSize:1 Lat:3 SizeLat:1 for: %V16BF16 = fsub <vscale x 16 x bfloat> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of Invalid for: %V1F32 = fsub <vscale x 1 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V2F32 = fsub <vscale x 2 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V4F32 = fsub <vscale x 4 x float> poison, poison
@@ -48,6 +58,10 @@ define void @fsub() {
   %V8F16 = fsub <vscale x 8 x half> poison, poison
   %V16F16 = fsub <vscale x 16 x half> poison, poison
 
+  %V4BF16 = fsub <vscale x 4 x bfloat> poison, poison
+  %V8BF16 = fsub <vscale x 8 x bfloat> poison, poison
+  %V16BF16 = fsub <vscale x 16 x bfloat> poison, poison
+
   %V1F32 = fsub <vscale x 1 x float> poison, poison
   %V2F32 = fsub <vscale x 2 x float> poison, poison
   %V4F32 = fsub <vscale x 4 x float> poison, poison
@@ -65,6 +79,10 @@ define void @fneg() {
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V4F16 = fneg <vscale x 4 x half> poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V8F16 = fneg <vscale x 8 x half> poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V16F16 = fneg <vscale x 16 x half> poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V2BF16 = fneg <vscale x 2 x bfloat> poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V4BF16 = fneg <vscale x 4 x bfloat> poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V8BF16 = fneg <vscale x 8 x bfloat> poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V16BF16 = fneg <vscale x 16 x bfloat> poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V2F32 = fneg <vscale x 2 x float> poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:1 CodeSize:1 Lat:3 SizeLat:1 for: %V4F32 = fneg <vscale x 4 x float> poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V8F32 = fneg <vscale x 8 x float> poison
@@ -77,6 +95,11 @@ define void @fneg() {
   %V8F16 = fneg <vscale x 8 x half> poison
   %V16F16 = fneg <vscale x 16 x half> poison
 
+  %V2BF16 = fneg <vscale x 2 x bfloat> poison
+  %V4BF16 = fneg <vscale x 4 x bfloat> poison
+  %V8BF16 = fneg <vscale x 8 x bfloat> poison
+  %V16BF16 = fneg <vscale x 16 x bfloat> poison
+
   %V2F32 = fneg <vscale x 2 x float> poison
   %V4F32 = fneg <vscale x 4 x float> poison
   %V8F32 = fneg <vscale x 8 x float> poison
@@ -92,6 +115,9 @@ define void @fmul() {
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V4F16 = fmul <vscale x 4 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V8F16 = fmul <vscale x 8 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:3 SizeLat:1 for: %V16F16 = fmul <vscale x 16 x half> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:12 CodeSize:1 Lat:3 SizeLat:1 for: %V4BF16 = fmul <vscale x 4 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:29 CodeSize:1 Lat:3 SizeLat:1 for: %V8BF16 = fmul <vscale x 8 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:58 CodeSize:1 Lat:3 SizeLat:1 for: %V16BF16 = fmul <vscale x 16 x bfloat> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V2F32 = fmul <vscale x 2 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V4F32 = fmul <vscale x 4 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:3 SizeLat:1 for: %V8F32 = fmul <vscale x 8 x float> poison, poison
@@ -103,6 +129,10 @@ define void @fmul() {
   %V8F16 = fmul <vscale x 8 x half> poison, poison
   %V16F16 = fmul <vscale x 16 x half> poison, poison
 
+  %V4BF16 = fmul <vscale x 4 x bfloat> poison, poison
+  %V8BF16 = fmul <vscale x 8 x bfloat> poison, poison
+  %V16BF16 = fmul <vscale x 16 x bfloat> poison, poison
+
   %V2F32 = fmul <vscale x 2 x float> poison, poison
   %V4F32 = fmul <vscale x 4 x float> poison, poison
   %V8F32 = fmul <vscale x 8 x float> poison, poison
@@ -118,6 +148,9 @@ define void @fdiv() {
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:4 Lat:4 SizeLat:4 for: %V4F16 = fdiv <vscale x 4 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:4 Lat:4 SizeLat:4 for: %V8F16 = fdiv <vscale x 8 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of 4 for: %V16F16 = fdiv <vscale x 16 x half> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:12 CodeSize:4 Lat:4 SizeLat:4 for: %V4BF16 = fdiv <vscale x 4 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:29 CodeSize:4 Lat:4 SizeLat:4 for: %V8BF16 = fdiv <vscale x 8 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:58 CodeSize:4 Lat:4 SizeLat:4 for: %V16BF16 = fdiv <vscale x 16 x bfloat> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:4 Lat:4 SizeLat:4 for: %V2F32 = fdiv <vscale x 2 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:4 Lat:4 SizeLat:4 for: %V4F32 = fdiv <vscale x 4 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of 4 for: %V8F32 = fdiv <vscale x 8 x float> poison, poison
@@ -129,6 +162,10 @@ define void @fdiv() {
   %V8F16 = fdiv <vscale x 8 x half> poison, poison
   %V16F16 = fdiv <vscale x 16 x half> poison, poison
 
+  %V4BF16 = fdiv <vscale x 4 x bfloat> poison, poison
+  %V8BF16 = fdiv <vscale x 8 x bfloat> poison, poison
+  %V16BF16 = fdiv <vscale x 16 x bfloat> poison, poison
+
   %V2F32 = fdiv <vscale x 2 x float> poison, poison
   %V4F32 = fdiv <vscale x 4 x float> poison, poison
   %V8F32 = fdiv <vscale x 8 x float> poison, poison
@@ -144,6 +181,9 @@ define void @frem() {
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V4F16 = frem <vscale x 4 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V8F16 = frem <vscale x 8 x half> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V16F16 = frem <vscale x 16 x half> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V4BF16 = frem <vscale x 4 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V8BF16 = frem <vscale x 8 x bfloat> poison, poison
+; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V16BF16 = frem <vscale x 16 x bfloat> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V2F32 = frem <vscale x 2 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V4F32 = frem <vscale x 4 x float> poison, poison
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:Invalid CodeSize:4 Lat:4 SizeLat:4 for: %V8F32 = frem <vscale x 8 x float> poison, poison
@@ -155,6 +195,10 @@ define void @frem() {
   %V8F16 = frem <vscale x 8 x half> poison, poison
   %V16F16 = frem <vscale x 16 x half> poison, poison
 
+  %V4BF16 = frem <vscale x 4 x bfloat> poison, poison
+  %V8BF16 = frem <vscale x 8 x bfloat> poison, poison
+  %V16BF16 = frem <vscale x 16 x bfloat> poison, poison
+
   %V2F32 = frem <vscale x 2 x float> poison, poison
   %V4F32 = frem <vscale x 4 x float> poison, poison
   %V8F32 = frem <vscale x 8 x float> poison, poison
@@ -170,6 +214,9 @@ define void @fma() {
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V4F16 = call <vscale x 4 x half> @llvm.fma.nxv4f16(<vscale x 4 x half> poison, <vscale x 4 x half> poison, <vscale x 4 x half> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V8F16 = call <vscale x 8 x half> @llvm.fma.nxv8f16(<vscale x 8 x half> poison, <vscale x 8 x half> poison, <vscale x 8 x half> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:3 SizeLat:1 for: %V16F16 = call <vscale x 16 x half> @llvm.fma.nxv16f16(<vscale x 16 x half> poison, <vscale x 16 x half> poison, <vscale x 16 x half> poison)
+; CHECK-NEXT:  Cost Model: Found costs of 2 for: %V4BF16 = call <vscale x 4 x bfloat> @llvm.fma.nxv4bf16(<vscale x 4 x bfloat> poison, <vscale x 4 x bfloat> poison, <vscale x 4 x bfloat> poison)
+; CHECK-NEXT:  Cost Model: Found costs of 2 for: %V8BF16 = call <vscale x 8 x bfloat> @llvm.fma.nxv8bf16(<vscale x 8 x bfloat> poison, <vscale x 8 x bfloat> poison, <vscale x 8 x bfloat> poison)
+; CHECK-NEXT:  Cost Model: Found costs of 4 for: %V16BF16 = call <vscale x 16 x bfloat> @llvm.fma.nxv16bf16(<vscale x 16 x bfloat> poison, <vscale x 16 x bfloat> poison, <vscale x 16 x bfloat> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V2F32 = call <vscale x 2 x float> @llvm.fma.nxv2f32(<vscale x 2 x float> poison, <vscale x 2 x float> poison, <vscale x 2 x float> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V4F32 = call <vscale x 4 x float> @llvm.fma.nxv4f32(<vscale x 4 x float> poison, <vscale x 4 x float> poison, <vscale x 4 x float> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:3 SizeLat:1 for: %V8F32 = call <vscale x 8 x float> @llvm.fma.nxv8f32(<vscale x 8 x float> poison, <vscale x 8 x float> poison, <vscale x 8 x float> poison)
@@ -181,6 +228,10 @@ define void @fma() {
   %V8F16 = call <vscale x 8 x half> @llvm.fma.v8f16(<vscale x 8 x half> poison, <vscale x 8 x half> poison, <vscale x 8 x half> poison)
   %V16F16 = call <vscale x 16 x half> @llvm.fma.v16f16(<vscale x 16 x half> poison, <vscale x 16 x half> poison, <vscale x 16 x half> poison)
 
+  %V4BF16 = call <vscale x 4 x bfloat> @llvm.fma.v4BF16(<vscale x 4 x bfloat> poison, <vscale x 4 x bfloat> poison, <vscale x 4 x bfloat> poison)
+  %V8BF16 = call <vscale x 8 x bfloat> @llvm.fma.v8BF16(<vscale x 8 x bfloat> poison, <vscale x 8 x bfloat> poison, <vscale x 8 x bfloat> poison)
+  %V16BF16 = call <vscale x 16 x bfloat> @llvm.fma.v16BF16(<vscale x 16 x bfloat> poison, <vscale x 16 x bfloat> poison, <vscale x 16 x bfloat> poison)
+
   %V2F32 = call <vscale x 2 x float> @llvm.fma.v2f32(<vscale x 2 x float> poison, <vscale x 2 x float> poison, <vscale x 2 x float> poison)
   %V4F32 = call <vscale x 4 x float> @llvm.fma.v4f32(<vscale x 4 x float> poison, <vscale x 4 x float> poison, <vscale x 4 x float> poison)
   %V8F32 = call <vscale x 8 x float> @llvm.fma.v8f32(<vscale x 8 x float> poison, <vscale x 8 x float> poison, <vscale x 8 x float> poison)
@@ -196,6 +247,9 @@ define void @fmuladd() {
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V4F16 = call <vscale x 4 x half> @llvm.fmuladd.nxv4f16(<vscale x 4 x half> poison, <vscale x 4 x half> poison, <vscale x 4 x half> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V8F16 = call <vscale x 8 x half> @llvm.fmuladd.nxv8f16(<vscale x 8 x half> poison, <vscale x 8 x half> poison, <vscale x 8 x half> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:3 SizeLat:1 for: %V16F16 = call <vscale x 16 x half> @llvm.fmuladd.nxv16f16(<vscale x 16 x half> poison, <vscale x 16 x half> poison, <vscale x 16 x half> poison)
+; CHECK-NEXT:  Cost Model: Found costs of 2 for: %V4BF16 = call <vscale x 4 x bfloat> @llvm.fmuladd.nxv4bf16(<vscale x 4 x bfloat> poison, <vscale x 4 x bfloat> poison, <vscale x 4 x bfloat> poison)
+; CHECK-NEXT:  Cost Model: Found costs of 2 for: %V8BF16 = call <vscale x 8 x bfloat> @llvm.fmuladd.nxv8bf16(<vscale x 8 x bfloat> poison, <vscale x 8 x bfloat> poison, <vscale x 8 x bfloat> poison)
+; CHECK-NEXT:  Cost Model: Found costs of 4 for: %V16BF16 = call <vscale x 16 x bfloat> @llvm.fmuladd.nxv16bf16(<vscale x 16 x bfloat> poison, <vscale x 16 x bfloat> poison, <vscale x 16 x bfloat> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V2F32 = call <vscale x 2 x float> @llvm.fmuladd.nxv2f32(<vscale x 2 x float> poison, <vscale x 2 x float> poison, <vscale x 2 x float> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:2 CodeSize:1 Lat:3 SizeLat:1 for: %V4F32 = call <vscale x 4 x float> @llvm.fmuladd.nxv4f32(<vscale x 4 x float> poison, <vscale x 4 x float> poison, <vscale x 4 x float> poison)
 ; CHECK-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:1 Lat:3 SizeLat:1 for: %V8F32 = call <vscale x 8 x float> @llvm.fmuladd.nxv8f32(<vscale x 8 x float> poison, <vscale x 8 x float> poison, <vscale x 8 x float> poison)
@@ -207,6 +261,10 @@ define void @fmuladd() {
   %V8F16 = call <vscale x 8 x half> @llvm.fmuladd.v8f16(<vscale x 8 x half> poison, <vscale x 8 x half> poison, <vscale x 8 x half> poison)
   %V16F16 = call <vscale x 16 x half> @llvm.fmuladd.v16f16(<vscale x 16 x half> poison, <vscale x 16 x half> poison, <vscale x 16 x half> poison)
 
+  %V4BF16 = call <vscale x 4 x bfloat> @llvm.fmuladd.v4BF16(<vscale x 4 x bfloat> poison, <vscale x 4 x bfloat> poison, <vscale x 4 x bfloat> poison)
+  %V8BF16 = call <vscale x 8 x bfloat> @llvm.fmuladd.v8BF16(<vscale x 8 x bfloat> poison, <vscale x 8 x bfloat> poison, <vscale x 8 x bfloat> poison)
+  %V16BF16 = call <vscale x 16 x bfloat> @llvm.fmuladd.v16BF16(<vscale x 16 x bfloat> poison, <vscale x 16 x bfloat> poison, <vscale x 16 x bfloat> poison)
+
   %V2F32 = call <vscale x 2 x float> @llvm.fmuladd.v2f32(<vscale x 2 x float> poison, <vscale x 2 x float> poison, <vscale x 2 x float> poison)
   %V4F32 = call <vscale x 4 x float> @llvm.fmuladd.v4f32(<vscale x 4 x float> poison, <vscale x 4 x float> poison, <vscale x 4 x float> poison)
   %V8F32 = call <vscale x 8 x float> @llvm.fmuladd.v8f32(<vscale x 8 x float> poison, <vscale x 8 x float> poison, <vscale x 8 x float> poison)

@c-rhodes c-rhodes left a comment

A couple of minor comments, but otherwise LGTM. Cheers.

@davemgreen davemgreen left a comment

When adding the fp16 tests for fp/neon (which is in a similar situation to bf16 for sve), we separated the fp16 tests into their own function and then added two RUN lines, with and without +fullfp16. Can we do the same here so we can test both costs?
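
A minimal sketch of the layout this suggests, for illustration only. The function name `@fadd_bf16` and the prefix names `CHECK-BASE` and `CHECK-B16B16` are hypothetical and not taken from the patch. The `opt` flags are copied from the existing RUN line, with `+sve-b16b16` added on the second run only. The CHECK lines are left out because `utils/update_analyze_test_checks.py` would generate them.

```llvm
; Sketch only: prefix names and the function name are assumptions.
; Run 1: without +sve-b16b16 (bfloat arithmetic is expected to be promoted).
; RUN: opt < %s -enable-no-nans-fp-math -passes="print<cost-model>" -cost-kind=all 2>&1 -disable-output -mtriple=aarch64 -mattr=+fullfp16 -mattr=+sve | FileCheck %s --check-prefixes=CHECK,CHECK-BASE
; Run 2: with +sve-b16b16 (some bfloat operations are natively supported).
; RUN: opt < %s -enable-no-nans-fp-math -passes="print<cost-model>" -cost-kind=all 2>&1 -disable-output -mtriple=aarch64 -mattr=+fullfp16 -mattr=+sve-b16b16 -mattr=+sve | FileCheck %s --check-prefixes=CHECK,CHECK-B16B16

target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

; bfloat cases kept in their own function so the two runs can diverge here
; while the half/float/double functions keep sharing the common CHECK prefix.
define void @fadd_bf16() {
  %V4BF16 = fadd <vscale x 4 x bfloat> poison, poison
  %V8BF16 = fadd <vscale x 8 x bfloat> poison, poison
  %V16BF16 = fadd <vscale x 16 x bfloat> poison, poison
  ret void
}
```

With this shape, any cost that differs between the two configurations would land under its own prefix, while identical costs could stay under the shared `CHECK` prefix.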

MacDue commented Nov 10, 2025

> When adding the fp16 tests for fp/neon (which is in a similar situation to bf16 for sve), we separated the fp16 tests into their own function and then added two RUN lines, with and without +fullfp16. Can we do the same here so we can test both costs?

Done 👍

@davemgreen davemgreen left a comment

Thanks - this has worked quite well for FP16, hopefully it can work well here too.

LGTM

@MacDue MacDue merged commit efc0ab0 into llvm:main Nov 11, 2025
10 checks passed
@MacDue MacDue deleted the tests_base branch November 11, 2025 10:43

llvm-ci commented Nov 11, 2025

LLVM Buildbot has detected a new failure on builder ppc64le-flang-rhel-clang running on ppc64le-flang-rhel-test while building llvm at step 6 "test-build-unified-tree-check-flang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/42112

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-flang) failure: 1200 seconds without output running [b'ninja', b'check-flang'], attempting to kill
...
PASS: Flang :: Transforms/DoConcurrent/use_loop_bounds_in_body.f90 (3974 of 3984)
PASS: Flang :: Transforms/external-name-interop-symref-array.fir (3975 of 3984)
PASS: Flang :: Driver/fopenmp.f90 (3976 of 3984)
PASS: Flang :: Semantics/modfile73.f90 (3977 of 3984)
PASS: Flang :: Transforms/dlti-dependency.fir (3978 of 3984)
PASS: Flang :: Transforms/constant-argument-globalisation.fir (3979 of 3984)
PASS: Flang :: Transforms/debug-dwarf-version.fir (3980 of 3984)
PASS: Flang :: Driver/omp-driver-offload.f90 (3981 of 3984)
PASS: Flang :: Driver/linker-options.f90 (3982 of 3984)
PASS: Flang :: Intrinsics/math-codegen.fir (3983 of 3984)
command timed out: 1200 seconds without output running [b'ninja', b'check-flang'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=2335.107099
