Implement the msad4 HLSL Function #99137

@farzonl

Description

  • Implement msad4 clang builtin
  • Link msad4 clang builtin with hlsl_intrinsics.h
  • Add sema checks for msad4 to CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
  • Add codegen for msad4 to EmitHLSLBuiltinExpr in CGBuiltin.cpp
  • Add codegen tests to clang/test/CodeGenHLSL/builtins/msad4.hlsl
  • Add sema tests to clang/test/SemaHLSL/BuiltIns/msad4-errors.hlsl
  • Create the int_dx_msad4 intrinsic in IntrinsicsDirectX.td
  • Create the DXILOpMapping of int_dx_msad4 to 53 in DXIL.td
  • Create the msad4.ll and msad4_errors.ll tests in llvm/test/CodeGen/DirectX/
  • Create the int_spv_msad4 intrinsic in IntrinsicsSPIRV.td
  • In SPIRVInstructionSelector.cpp, create the msad4 lowering and map it to int_spv_msad4 in SPIRVInstructionSelector::selectIntrinsic
  • Create SPIR-V backend test case in llvm/test/CodeGen/SPIRV/hlsl-intrinsics/msad4.ll

DirectX

DXIL Opcode | DXIL OpName | Shader Model | Shader Stages
53 | Bfi | 6.0 | ()

SPIR-V

SAbs:

Description:

SAbs

Result is x if x ≥ 0; otherwise result is -x, where x is
interpreted as a signed integer.

Result Type and the type of x must both be integer scalar or integer
vector types. Result Type and operand types must have the same number
of components with the same component width. Results are computed per
component.

Number | Operand 1 | Operand 2 | Operand 3 | Operand 4
5 | <id> x | | |

Test Case(s)

Example 1

//dxc msad4_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export uint4 fn(uint p1, uint2 p2, uint4 p3) {
    return msad4(p1, p2, p3);
}

HLSL:

Compares a 4-byte reference value and an 8-byte source value and accumulates a vector of 4 sums. Each sum corresponds to the masked sum of absolute differences of a different byte alignment between the reference value and the source value.

uint4 result = msad4(uint reference, uint2 source, uint4 accum);

Parameters

reference

[in] The reference array of 4 bytes in one uint value.

source

[in] The source array of 8 bytes in one uint2 value (two uint components).

accum

[in] A vector of 4 values. msad4 adds this vector to the masked sum of absolute differences of the different byte alignments between the reference value and the source value.

Return Value

A vector of 4 sums. Each sum corresponds to the masked sum of absolute differences of different byte alignments between the reference value and the source value. msad4 doesn't include a difference in the sum if that difference is masked (that is, the reference byte is 0).

Remarks

To use the msad4 intrinsic in your shader code, call the ID3D11Device::CheckFeatureSupport method with D3D11_FEATURE_D3D11_OPTIONS to verify that the Direct3D device supports the SAD4ShaderInstructions feature option. The msad4 intrinsic requires a WDDM 1.2 display driver, and all WDDM 1.2 display drivers must support msad4. If your app creates a rendering device with feature level 11.0 or 11.1 and the compilation target is shader model 5 or later, the HLSL source code can use the msad4 intrinsic.

Return values are only accurate up to 65535. If you call the msad4 intrinsic with inputs that might result in return values greater than 65535, msad4 produces undefined results.

Minimum Shader Model

This function is supported in the following shader models.

Shader Model | Supported
Shader model 5 or later | yes

Examples

Here is an example result calculation for msad4:

reference = 0xA100B2C3
source.x = 0xD7B0C372
source.y = 0x4F57C2A3
accum = {1,2,3,4}
result.x alignment source: 0xD7B0C372
result.x = accum.x + |0xD7 - 0xA1| + 0 (masked) + |0xC3 - 0xB2| + |0x72 - 0xC3| = 1 + 54 + 0 + 17 + 81 = 153
result.y alignment source: 0xA3D7B0C3
result.y = accum.y + |0xA3 - 0xA1| + 0 (masked) + |0xB0 - 0xB2| + |0xC3 - 0xC3| = 2 + 2 + 0 + 2 + 0 = 6
result.z alignment source: 0xC2A3D7B0
result.z = accum.z + |0xC2 - 0xA1| + 0 (masked) + |0xD7 - 0xB2| + |0xB0 - 0xC3| = 3 + 33 + 0 + 37 + 19 = 92
result.w alignment source: 0x57C2A3D7
result.w = accum.w + |0x57 - 0xA1| + 0 (masked) + |0xA3 - 0xB2| + |0xD7 - 0xC3| = 4 + 74 + 0 + 15 + 20 = 113
result = {153,6,92,113}
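
The calculation above can be checked with a small CPU-side emulation. The following C++ is only a sketch under assumptions drawn from that example: the names masked_sad and msad4_ref are hypothetical, the four alignments are taken to be the 64-bit value source.y:source.x shifted right by 0, 8, 16, and 24 bits, and no attempt is made to model the undefined behavior for sums above 65535. It is not the lowering used by DXC or by the LLVM backends.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>

// Masked sum of absolute differences over the four bytes of ref and src.
// A byte position contributes nothing when the reference byte is 0.
inline uint32_t masked_sad(uint32_t ref, uint32_t src) {
    uint32_t sum = 0;
    for (int b = 0; b < 4; ++b) {
        int r = static_cast<int>((ref >> (8 * b)) & 0xFF);
        int s = static_cast<int>((src >> (8 * b)) & 0xFF);
        if (r != 0) sum += static_cast<uint32_t>(std::abs(s - r));
    }
    return sum;
}

// Hypothetical CPU emulation of msad4(reference, source, accum).
// source[0] plays the role of source.x, source[1] of source.y.
inline std::array<uint32_t, 4> msad4_ref(uint32_t reference,
                                         std::array<uint32_t, 2> source,
                                         std::array<uint32_t, 4> accum) {
    // Treat the two source words as one 8-byte window, source.x in the low half.
    const uint64_t window =
        (static_cast<uint64_t>(source[1]) << 32) | source[0];
    for (int k = 0; k < 4; ++k) {
        // Alignment k starts k bytes into the window.
        uint32_t aligned = static_cast<uint32_t>(window >> (8 * k));
        accum[k] += masked_sad(reference, aligned);
    }
    return accum;
}
```

Calling msad4_ref(0xA100B2C3, {0xD7B0C372, 0x4F57C2A3}, {1, 2, 3, 4}) reproduces the values worked out above, which makes it usable as an oracle when writing the codegen tests.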

Here is an example of how you can use msad4 to search for a reference pattern within a buffer:

uint4 accum = {0,0,0,0};
for(uint i=0;i<REF_SIZE;i++)
    accum = msad4(
        buf_ref[i], 
        uint2(buf_src[DTid.x+i], buf_src[DTid.x+i+1]), 
        accum);
buf_accum[DTid.x] = accum;
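
To see what one thread of that search computes, the loop can be mirrored on the CPU. This C++ sketch makes several assumptions: msad4_ref is a hypothetical stand-in for the intrinsic (using the same byte-alignment rule as the worked example), search_at and its tid parameter are illustrative names for the per-thread body and DTid.x, and REF_SIZE is taken to be the length of buf_ref. A sum of 0 at component k means the non-zero reference bytes match the source exactly at byte offset k from word tid.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

using U4 = std::array<uint32_t, 4>;

// Compact CPU stand-in for msad4 (hypothetical helper, not a real API).
inline U4 msad4_ref(uint32_t ref, uint32_t src_lo, uint32_t src_hi, U4 accum) {
    const uint64_t window = (static_cast<uint64_t>(src_hi) << 32) | src_lo;
    for (int k = 0; k < 4; ++k) {          // byte alignment of the source
        for (int b = 0; b < 4; ++b) {      // byte within the reference
            int r = static_cast<int>((ref >> (8 * b)) & 0xFF);
            int s = static_cast<int>((window >> (8 * (k + b))) & 0xFF);
            if (r != 0) accum[k] += static_cast<uint32_t>(std::abs(s - r));
        }
    }
    return accum;
}

// Mirrors the loop for a single thread; tid plays the role of DTid.x.
// Requires buf_src.size() >= tid + buf_ref.size() + 1.
inline U4 search_at(const std::vector<uint32_t>& buf_ref,
                    const std::vector<uint32_t>& buf_src, std::size_t tid) {
    U4 accum = {0, 0, 0, 0};
    for (std::size_t i = 0; i < buf_ref.size(); ++i)
        accum = msad4_ref(buf_ref[i], buf_src[tid + i], buf_src[tid + i + 1],
                          accum);
    return accum;
}
```

Because each call needs the word after the current one, the source buffer has to extend at least one word past the last reference word, which the HLSL loop above also relies on.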

Requirements

Requirement | Value
Minimum supported client | Windows 8 [desktop apps | UWP apps]
Minimum supported server | Windows Server 2012 [desktop apps | UWP apps]

See also

Intrinsic Functions
