-
Notifications
You must be signed in to change notification settings - Fork 15.2k
Add cross builtins and cross HLSL function to DirectX and SPIR-V backend #109180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
3c58861
9109b48
b5dc3e7
a8f8faf
32d6add
c13d2b1
f31ecc6
81aa399
15defe2
8a367a5
514c40b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1704,6 +1704,20 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) { | |
| return true; | ||
| break; | ||
| } | ||
| case Builtin::BI__builtin_hlsl_cross: { | ||
| if (SemaRef.checkArgCount(TheCall, 2)) | ||
| return true; | ||
| if (CheckVectorElementCallArgs(&SemaRef, TheCall)) | ||
| return true; | ||
| if (CheckFloatOrHalfRepresentations(&SemaRef, TheCall)) | ||
| return true; | ||
|
|
||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we should do a vector size == 3 check to prevent misuse of the builtin? Downsize is the check isn't triggerable from the hlsl function name.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are also other asserts to catch when the size isn't 3, I don't think this is needed.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If thats the case could you add a test to float2 builtin_cross_float2(float2 p1, float2 p2)
{
return __builtin_hlsl_cross(p1, p2);
}
float4 builtin_cross_float4(float4 p1, float4 p2)
{
return __builtin_hlsl_cross(p1, p2);
}
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added a diagnostic and put the test in cross-errors.hlsl instead.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can't find a Do |
||
| ExprResult A = TheCall->getArg(0); | ||
| QualType ArgTyA = A.get()->getType(); | ||
| // return type is the same as the input type | ||
| TheCall->setType(ArgTyA); | ||
| break; | ||
| } | ||
| case Builtin::BI__builtin_hlsl_dot: { | ||
| if (SemaRef.checkArgCount(TheCall, 2)) | ||
| return true; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \ | ||
| // RUN: dxil-pc-shadermodel6.3-library %s -fnative-half-type \ | ||
| // RUN: -emit-llvm -disable-llvm-passes -o - | FileCheck %s \ | ||
| // RUN: --check-prefixes=CHECK,NATIVE_HALF \ | ||
| // RUN: -DFNATTRS=noundef -DTARGET=dx | ||
| // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \ | ||
| // RUN: dxil-pc-shadermodel6.3-library %s -emit-llvm -disable-llvm-passes \ | ||
| // RUN: -o - | FileCheck %s --check-prefixes=CHECK,NO_HALF \ | ||
| // RUN: -DFNATTRS=noundef -DTARGET=dx | ||
| // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \ | ||
| // RUN: spirv-unknown-vulkan-compute %s -fnative-half-type \ | ||
| // RUN: -emit-llvm -disable-llvm-passes -o - | FileCheck %s \ | ||
| // RUN: --check-prefixes=CHECK,NATIVE_HALF \ | ||
| // RUN: -DFNATTRS="spir_func noundef" -DTARGET=spv | ||
| // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \ | ||
| // RUN: spirv-unknown-vulkan-compute %s -emit-llvm -disable-llvm-passes \ | ||
| // RUN: -o - | FileCheck %s --check-prefixes=CHECK,NO_HALF \ | ||
| // RUN: -DFNATTRS="spir_func noundef" -DTARGET=spv | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's harmless, but given that these run lines are long and hard to compare as-is, I'll note for next time that |
||
|
|
||
| // NATIVE_HALF: define [[FNATTRS]] <3 x half> @ | ||
| // NATIVE_HALF: call <3 x half> @llvm.[[TARGET]].cross.v3f16(<3 x half> | ||
| // NO_HALF: call <3 x float> @llvm.[[TARGET]].cross.v3f32(<3 x float> | ||
| // NATIVE_HALF: ret <3 x half> %hlsl.cross | ||
| // NO_HALF: ret <3 x float> %hlsl.cross | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's probably clearer to put the |
||
| half3 test_cross_half3(half3 p0, half3 p1) | ||
| { | ||
| return cross(p0, p1); | ||
| } | ||
|
|
||
| // CHECK: define [[FNATTRS]] <3 x float> @ | ||
| // CHECK: %hlsl.cross = call <3 x float> @llvm.[[TARGET]].cross.v3f32( | ||
| // CHECK: ret <3 x float> %hlsl.cross | ||
| float3 test_cross_float3(float3 p0, float3 p1) | ||
| { | ||
| return cross(p0, p1); | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| // RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -disable-llvm-passes -verify -verify-ignore-unexpected | ||
|
||
|
|
||
| void test_too_few_arg() | ||
| { | ||
| return __builtin_hlsl_cross(); | ||
| // expected-error@-1 {{too few arguments to function call, expected 2, have 0}} | ||
| } | ||
|
|
||
| void test_too_many_arg(float3 p0) | ||
| { | ||
| return __builtin_hlsl_cross(p0, p0, p0); | ||
| // expected-error@-1 {{too many arguments to function call, expected 2, have 3}} | ||
| } | ||
|
|
||
| bool builtin_bool_to_float_type_promotion(bool p1) | ||
| { | ||
| return __builtin_hlsl_cross(p1, p1); | ||
| // expected-error@-1 {passing 'bool' to parameter of incompatible type 'float'}} | ||
| } | ||
|
|
||
| bool builtin_cross_int_to_float_promotion(int p1) | ||
| { | ||
| return __builtin_hlsl_cross(p1, p1); | ||
| // expected-error@-1 {{passing 'int' to parameter of incompatible type 'float'}} | ||
| } | ||
|
|
||
| bool2 builtin_cross_int2_to_float2_promotion(int2 p1) | ||
| { | ||
| return __builtin_hlsl_cross(p1, p1); | ||
| // expected-error@-1 {{passing 'int2' (aka 'vector<int, 2>') to parameter of incompatible type '__attribute__((__vector_size__(2 * sizeof(float)))) float' (vector of 2 'float' values)}} | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what happens if its float3 builtin_cross_float3_int3(float3 p1, int3 p2)
{
return __builtin_hlsl_cross(p1, p2);
}
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. emits an error that all args must be the same type. Added this case to cross-errors.hlsl
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah ok it got promoted in DXC so wanted t check: https://hlsl.godbolt.org/z/3nvGve7WP |
||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -43,6 +43,7 @@ static bool isIntrinsicExpansion(Function &F) { | |
| case Intrinsic::dx_all: | ||
| case Intrinsic::dx_any: | ||
| case Intrinsic::dx_clamp: | ||
| case Intrinsic::dx_cross: | ||
| case Intrinsic::dx_uclamp: | ||
| case Intrinsic::dx_lerp: | ||
| case Intrinsic::dx_length: | ||
|
|
@@ -73,6 +74,42 @@ static Value *expandAbs(CallInst *Orig) { | |
| "dx.max"); | ||
| } | ||
|
|
||
| static Value *expandCrossIntrinsic(CallInst *Orig) { | ||
|
|
||
| VectorType *VT = cast<VectorType>(Orig->getType()); | ||
| if (cast<FixedVectorType>(VT)->getNumElements() != 3) | ||
| report_fatal_error(Twine("return vector must have exactly 3 elements"), | ||
| /* gen_crash_diag=*/false); | ||
|
|
||
| Value *op0 = Orig->getOperand(0); | ||
| Value *op1 = Orig->getOperand(1); | ||
| IRBuilder<> Builder(Orig); | ||
|
|
||
| Value *op0_x = Builder.CreateExtractElement(op0, (uint64_t)0, "x0"); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just curious why the 0 needs a cast and the others don't.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Without the cast I get:
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't C++ fun! |
||
| Value *op0_y = Builder.CreateExtractElement(op0, 1, "x1"); | ||
| Value *op0_z = Builder.CreateExtractElement(op0, 2, "x2"); | ||
|
|
||
| Value *op1_x = Builder.CreateExtractElement(op1, (uint64_t)0, "y0"); | ||
| Value *op1_y = Builder.CreateExtractElement(op1, 1, "y1"); | ||
| Value *op1_z = Builder.CreateExtractElement(op1, 2, "y2"); | ||
|
|
||
| auto MulSub = [&](Value *x0, Value *y0, Value *x1, Value *y1) -> Value * { | ||
| Value *xy = Builder.CreateFMul(x0, y1); | ||
| Value *yx = Builder.CreateFMul(y0, x1); | ||
| return Builder.CreateFSub(xy, yx); | ||
|
||
| }; | ||
|
|
||
| Value *yz_zy = MulSub(op0_y, op0_z, op1_y, op1_z); | ||
| Value *zx_xz = MulSub(op0_z, op0_x, op1_z, op1_x); | ||
| Value *xy_yx = MulSub(op0_x, op0_y, op1_x, op1_y); | ||
|
|
||
| Value *cross = UndefValue::get(VT); | ||
| cross = Builder.CreateInsertElement(cross, yz_zy, (uint64_t)0); | ||
| cross = Builder.CreateInsertElement(cross, zx_xz, 1); | ||
| cross = Builder.CreateInsertElement(cross, xy_yx, 2); | ||
| return cross; | ||
| } | ||
|
|
||
| // Create appropriate DXIL float dot intrinsic for the given A and B operands | ||
| // The appropriate opcode will be determined by the size of the operands | ||
| // The dot product is placed in the position indicated by Orig | ||
|
|
@@ -434,6 +471,9 @@ static bool expandIntrinsic(Function &F, CallInst *Orig) { | |
| case Intrinsic::dx_any: | ||
| Result = expandAnyOrAllIntrinsic(Orig, IntrinsicId); | ||
| break; | ||
| case Intrinsic::dx_cross: | ||
| Result = expandCrossIntrinsic(Orig); | ||
| break; | ||
| case Intrinsic::dx_uclamp: | ||
| case Intrinsic::dx_clamp: | ||
| Result = expandClampIntrinsic(Orig, IntrinsicId); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,57 @@ | ||
| ; RUN: opt -S -dxil-intrinsic-expansion < %s | FileCheck %s --check-prefix=CHECK | ||
|
||
| ; RUN: opt -S -dxil-op-lower -mtriple=dxil-pc-shadermodel6.3-library < %s | FileCheck %s --check-prefix=CHECK | ||
|
|
||
| ; Make sure dxil operation function calls for cross are generated for half/float. | ||
|
|
||
| declare <3 x half> @llvm.dx.cross.v3f16(<3 x half>, <3 x half>) | ||
| declare <3 x float> @llvm.dx.cross.v3f32(<3 x float>, <3 x float>) | ||
|
|
||
| define noundef <3 x half> @test_cross_half3(<3 x half> noundef %p0, <3 x half> noundef %p1) { | ||
| entry: | ||
| ; CHECK: %x0 = extractelement <3 x half> %p0, i64 0 | ||
| ; CHECK: %x1 = extractelement <3 x half> %p0, i64 1 | ||
| ; CHECK: %x2 = extractelement <3 x half> %p0, i64 2 | ||
| ; CHECK: %y0 = extractelement <3 x half> %p1, i64 0 | ||
| ; CHECK: %y1 = extractelement <3 x half> %p1, i64 1 | ||
| ; CHECK: %y2 = extractelement <3 x half> %p1, i64 2 | ||
| ; CHECK: %0 = fmul half %x1, %y2 | ||
| ; CHECK: %1 = fmul half %x2, %y1 | ||
| ; CHECK: %2 = fsub half %0, %1 | ||
| ; CHECK: %3 = fmul half %x2, %y0 | ||
| ; CHECK: %4 = fmul half %x0, %y2 | ||
| ; CHECK: %5 = fsub half %3, %4 | ||
| ; CHECK: %6 = fmul half %x0, %y1 | ||
| ; CHECK: %7 = fmul half %x1, %y0 | ||
| ; CHECK: %8 = fsub half %6, %7 | ||
| ; CHECK: %9 = insertelement <3 x half> undef, half %2, i64 0 | ||
| ; CHECK: %10 = insertelement <3 x half> %9, half %5, i64 1 | ||
| ; CHECK: %11 = insertelement <3 x half> %10, half %8, i64 2 | ||
| ; CHECK: ret <3 x half> %11 | ||
| %hlsl.cross = call <3 x half> @llvm.dx.cross.v3f16(<3 x half> %p0, <3 x half> %p1) | ||
| ret <3 x half> %hlsl.cross | ||
| } | ||
|
|
||
| define noundef <3 x float> @test_cross_float3(<3 x float> noundef %p0, <3 x float> noundef %p1) { | ||
| entry: | ||
| ; CHECK: %x0 = extractelement <3 x float> %p0, i64 0 | ||
| ; CHECK: %x1 = extractelement <3 x float> %p0, i64 1 | ||
| ; CHECK: %x2 = extractelement <3 x float> %p0, i64 2 | ||
| ; CHECK: %y0 = extractelement <3 x float> %p1, i64 0 | ||
| ; CHECK: %y1 = extractelement <3 x float> %p1, i64 1 | ||
| ; CHECK: %y2 = extractelement <3 x float> %p1, i64 2 | ||
| ; CHECK: %0 = fmul float %x1, %y2 | ||
| ; CHECK: %1 = fmul float %x2, %y1 | ||
| ; CHECK: %2 = fsub float %0, %1 | ||
| ; CHECK: %3 = fmul float %x2, %y0 | ||
| ; CHECK: %4 = fmul float %x0, %y2 | ||
| ; CHECK: %5 = fsub float %3, %4 | ||
| ; CHECK: %6 = fmul float %x0, %y1 | ||
| ; CHECK: %7 = fmul float %x1, %y0 | ||
| ; CHECK: %8 = fsub float %6, %7 | ||
| ; CHECK: %9 = insertelement <3 x float> undef, float %2, i64 0 | ||
| ; CHECK: %10 = insertelement <3 x float> %9, float %5, i64 1 | ||
| ; CHECK: %11 = insertelement <3 x float> %10, float %8, i64 2 | ||
| ; CHECK: ret <3 x float> %11 | ||
| %hlsl.cross = call <3 x float> @llvm.dx.cross.v3f32(<3 x float> %p0, <3 x float> %p1) | ||
| ret <3 x float> %hlsl.cross | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| ; RUN: llc -O0 -mtriple=spirv-unknown-unknown %s -o - | FileCheck %s | ||
| ; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv-unknown-unknown %s -o - -filetype=obj | spirv-val %} | ||
|
|
||
| ; Make sure SPIRV operation function calls for cross are lowered correctly. | ||
|
|
||
| ; CHECK-DAG: %[[#op_ext_glsl:]] = OpExtInstImport "GLSL.std.450" | ||
| ; CHECK-DAG: %[[#float_32:]] = OpTypeFloat 32 | ||
| ; CHECK-DAG: %[[#float_16:]] = OpTypeFloat 16 | ||
| ; CHECK-DAG: %[[#vec3_float_16:]] = OpTypeVector %[[#float_16]] 3 | ||
| ; CHECK-DAG: %[[#vec3_float_32:]] = OpTypeVector %[[#float_32]] 3 | ||
|
|
||
| define noundef <3 x half> @cross_half4(<3 x half> noundef %a, <3 x half> noundef %b) { | ||
| entry: | ||
| ; CHECK: %[[#]] = OpFunction %[[#vec3_float_16]] None %[[#]] | ||
| ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec3_float_16]] | ||
| ; CHECK: %[[#arg1:]] = OpFunctionParameter %[[#vec3_float_16]] | ||
| ; CHECK: %[[#]] = OpExtInst %[[#vec3_float_16]] %[[#op_ext_glsl]] Cross %[[#arg0]] %[[#arg1]] | ||
| %hlsl.cross = call <3 x half> @llvm.spv.cross.v4f16(<3 x half> %a, <3 x half> %b) | ||
| ret <3 x half> %hlsl.cross | ||
| } | ||
|
|
||
| define noundef <3 x float> @cross_float4(<3 x float> noundef %a, <3 x float> noundef %b) { | ||
| entry: | ||
| ; CHECK: %[[#]] = OpFunction %[[#vec3_float_32]] None %[[#]] | ||
| ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec3_float_32]] | ||
| ; CHECK: %[[#arg1:]] = OpFunctionParameter %[[#vec3_float_32]] | ||
| ; CHECK: %[[#]] = OpExtInst %[[#vec3_float_32]] %[[#op_ext_glsl]] Cross %[[#arg0]] %[[#arg1]] | ||
| %hlsl.cross = call <3 x float> @llvm.spv.cross.v4f32(<3 x float> %a, <3 x float> %b) | ||
| ret <3 x float> %hlsl.cross | ||
| } | ||
|
|
||
| declare <3 x half> @llvm.spv.cross.v4f16(<3 x half>, <3 x half>) | ||
| declare <3 x float> @llvm.spv.cross.v4f32(<3 x float>, <3 x float>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like an unrelated change sliding back in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the result of a merge, I'll remove it.