Description
According to #120119, the DXIL Shader Flags pass needs to be executed before the DXIL Op Lowering pass in order to simplify its implementation by being able to work directly with DirectX target intrinsics. However, this dependency creates a challenge, as the shader flag analysis is based on instructions that may not exist after the lowering pass.
This issue was discovered while implementing the Int64Ops Shader Flags Analysis, when the resulting DXIL failed validation by dxv due to mismatched flags (#129089 (comment)). The Shader Flags Analysis currently enables the Int64Ops shader flag in the presence of extractelement instructions introduced by the Scalarizer pass. These extractelement instructions are subsequently removed by the DXIL Op Lowering pass.
RWBuffer<float4> In : register(u0, space0);
RWBuffer<float4> Out : register(u1, space4);
[numthreads(1,1,1)]
void main(uint GI : SV_GroupIndex) {
  Out[GI] = In[GI] * In[GI];
}
> clang-dxc -T cs_6_0 -Fo simple.dxil simple.hlsl
error: Flags must match usage.
note: Flags declared=1056768, actual=8192
Validation failed.
clang-dxc: error: dxv command failed with exit code 1 (use -v to see invocation)
For reference: 1056768 is 0x00102000, and 8192 is 0x00002000.
0x00100000 corresponds to the Int64Ops DXIL module flag.
0x00002000 corresponds to the TypedUAVLoadAdditionalFormats DXIL module flag.
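The arithmetic above can be checked with a short sketch. It uses only the two flag values quoted in this issue; the helper name `decode_flags` is hypothetical and is not part of LLVM or dxv.

```python
# Flag bit values exactly as quoted in this issue.
INT64_OPS = 0x00100000
TYPED_UAV_LOAD_ADDITIONAL_FORMATS = 0x00002000

FLAG_NAMES = {
    INT64_OPS: "Int64Ops",
    TYPED_UAV_LOAD_ADDITIONAL_FORMATS: "TypedUAVLoadAdditionalFormats",
}
KNOWN_BITS = INT64_OPS | TYPED_UAV_LOAD_ADDITIONAL_FORMATS


def decode_flags(mask):
    """Return (names of known flags set in mask, leftover unknown bits)."""
    names = [name for bit, name in FLAG_NAMES.items() if mask & bit]
    return names, mask & ~KNOWN_BITS


declared, actual = 1056768, 8192  # values from the dxv note
extra = declared & ~actual        # bits declared but not actually used

print(decode_flags(declared))
print(decode_flags(actual))
print(hex(extra))
```

The only bit present in the declared mask but missing from the actual mask is the Int64Ops bit, which is the mismatch dxv reports.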
The Int64Ops flag is being set due to extractelement instructions with i64 indices introduced by the Scalarizer pass.
The setting of the TypedUAVLoadAdditionalFormats flag relies on the presence of DirectX target intrinsics before DXIL Op Lowering, which occurs after Scalarization.
Are there other shader flags that could produce incorrect results when performed before DXIL Op Lowering?
- Probably not; issues like the one with Int64Ops are likely to be one-offs.
Potential Solutions:
1. Perform Shader Flag Analysis before Scalarization: This would ensure that the extractelement and insertelement instructions from the Scalarizer are not yet introduced, thereby avoiding the need to account for their removal later. But it may impact the implementation of current and/or future Shader Flag Analyses. There may also be other sources of i64s we have not accounted for.
2. Split the Shader Flag Analysis into two stages: one before the DXIL Op Lowering pass and one after. This would also require moving the DXIL Translate Metadata pass to follow the later Shader Flag Analysis. Shader Flag Analyses that benefit from the DirectX target intrinsics could be performed before DXIL Op Lowering, and the Shader Flag Analyses that don't benefit from them should be performed after DXIL Op Lowering.
3. Complicate the logic for the Int64Ops Shader Flag Analysis: Detect when an instruction returning an i64 or using i64 operands will be removed by a subsequent DXIL Op Lowering pass. This would probably be very ugly.
4. Modify the Scalarizer pass to use i32 for extractelement and insertelement indices instead of i64.
5. Modify the IRBuilder to default to i32-typed indices when constructing ExtractElement and InsertElement instructions.
6. Make the Int64Ops Shader Flag Analysis ignore constant i64 operands whose values fit within the range of an i32.
7. Add a pass to convert i64 indices into i32 for extractelement and insertelement (after scalarization and before metadata analysis).
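To illustrate the idea behind option 7, here is a toy model in Python. It does not use the LLVM API: `Operand`, `Inst`, `narrow_vector_indices`, and `uses_int64` are all hypothetical names, and the Int64Ops check is a deliberately simplified stand-in (any i64 result or operand) for the real analysis. The index is assumed to be the last operand, as in LLVM IR's extractelement and insertelement.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Operand:
    type: str                    # e.g. "i32", "i64", "<4 x float>"
    const: Optional[int] = None  # constant value, or None if not a constant


@dataclass(frozen=True)
class Inst:
    opcode: str
    result_type: str
    operands: Tuple[Operand, ...]


I32_MIN, I32_MAX = -(2**31), 2**31 - 1


def narrow_vector_indices(insts):
    """Toy version of option 7: retype constant i64 vector indices as i32."""
    out = []
    for inst in insts:
        if inst.opcode in ("extractelement", "insertelement"):
            *rest, idx = inst.operands  # index is the last operand
            if (idx.type == "i64" and idx.const is not None
                    and I32_MIN <= idx.const <= I32_MAX):
                inst = replace(inst, operands=(*rest, replace(idx, type="i32")))
        out.append(inst)
    return out


def uses_int64(insts):
    """Simplified stand-in for the Int64Ops check (an assumption)."""
    return any(
        inst.result_type == "i64" or any(op.type == "i64" for op in inst.operands)
        for inst in insts
    )


# An extractelement with an i64 index, as the Scalarizer is described to emit.
scalarized = [
    Inst("extractelement", "float", (Operand("<4 x float>"), Operand("i64", 0))),
]
# A genuine 64-bit operation, which should still set the flag.
real_i64 = [Inst("add", "i64", (Operand("i64"), Operand("i64")))]

print(uses_int64(scalarized))
print(uses_int64(narrow_vector_indices(scalarized)))
print(uses_int64(narrow_vector_indices(real_i64)))
```

In this model the flag check no longer fires on scalarizer-introduced indices once they are narrowed, while a genuine i64 operation is left untouched and still sets the flag.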
Chosen Solution: 7