Description
After resolving #146974, validation errors of the form "Pointer type bitcast must be have same size" and "Bitcast on struct types is not allowed" appear.
There are 16 occurrences of the two errors in total, and they all originate from four DML shaders: OneHot_256_uint16_native_int64_emulated_{4/8} and OneHot_256_uint16_native_int32_native_{4/8}.
The issue is that Clang is emitting an alloca for structs that contain one or more array-typed fields.
For example, take this shader: https://godbolt.org/z/o5he6GnG9
// compile args: -E CSMain -T cs_6_2 -enable-16bit-types -Xclang -emit-llvm
RWStructuredBuffer<uint> output;
struct MyStruct {
  uint arr[2];
};
[numthreads(1, 1, 1)]
void CSMain(uint3 Tid: SV_DispatchThreadID) {
  MyStruct s = {Tid.x, 0};
  uint d = s.arr[Tid.y];
  output[0] = d;
}
A struct definition and an alloca of the struct appear in the IR that is fed to the DirectX backend:
%struct.MyStruct = type { [2 x i32] }
define void @CSMain() local_unnamed_addr #0 {
  %1 = alloca %struct.MyStruct, align 4, !DIAssignID !56
  ...
This causes problems because the DirectX backend does not handle struct allocas, which results in a cascade of validation errors:
Function: CSMain: error: Pointer type bitcast must be have same size.
note: at '%3 = bitcast %struct.MyStruct* %s.i to i32*' in block 'entry' of function 'CSMain'.
Function: CSMain: error: Bitcast on struct types is not allowed.
note: at '%4 = bitcast %struct.MyStruct* %s.i to [1 x %struct.MyStruct]*' in block 'entry' of function 'CSMain'.
Function: CSMain: error: Pointer type bitcast must be have same size.
note: at '%5 = bitcast %struct.MyStruct* %arrayinit.element.i1 to i32*' in block 'entry' of function 'CSMain'.
Function: CSMain: error: Bitcast on struct types is not allowed.
note: at '%6 = bitcast %struct.MyStruct* %s.i to [2 x i32]*' in block 'entry' of function 'CSMain'.
Validation failed.
DXC will instead eliminate the struct:
define void @CSMain() {
  %s.0 = alloca [2 x i32], align 4
Additionally, if you compile with -O0, Clang will emit structs and struct allocas regardless of whether the struct contains an array-typed field.
The root cause is that the SROA pass is responsible for eliminating the structs, but it fails to do so when the struct contains an array-typed field. Furthermore, the SROA pass does not run at all when the flag -O0 is specified.
Unknown load/store offsets prevent struct decomposition
SROA will not decompose the struct into separate fields in this instance because of the dynamic index into the array field.
As an example, take this struct:
struct S { int arr[2]; int x; };
S s;
s.arr[i] = 1; // i is unknown, s.arr[i] may alias s.x
s.x = 2;
The SROA pass will not decompose the struct into
int arr[2];
int x;
arr[i] = 1; // i is unknown, but arr[i] cannot alias x (x may even get removed)
x = 2;
because there exists a value of i that could be used to access s.x from s.arr, since the two fields are laid out contiguously in memory by virtue of being in the same struct. This property is not guaranteed to be the case when x and arr are independent variables.
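The contiguity argument above can be checked with a small C++ sketch. It is only an illustration, not part of SROA: the helper names are hypothetical, and the result assumes a common ABI where int fields of the same alignment are packed without padding. It deliberately avoids evaluating arr[2], which would be undefined behavior in C++, and compares offsets and addresses instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same shape as the struct in the example above.
struct S {
    int arr[2];
    int x;
};

// Byte offset one past the last element of S::arr.
constexpr std::size_t endOfArrOffset() {
    return offsetof(S, arr) + sizeof(int[2]);
}

// True when x begins exactly where arr ends (no padding in between).
bool fieldsAreContiguous() {
    return offsetof(S, x) == endOfArrOffset();
}

// Same check on a live object: the one-past-the-end pointer of arr
// is compared with the address of x. Forming arr + 2 is valid; it is
// never dereferenced here.
bool onePastArrIsX() {
    S s{};
    auto end = reinterpret_cast<std::uintptr_t>(s.arr + 2);
    auto addrX = reinterpret_cast<std::uintptr_t>(&s.x);
    return end == addrX;
}
```

When both checks hold, an index of 2 into arr would land on the storage of x, which is the situation an optimizer has to assume is possible when it cannot bound i.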
Perhaps we can implement a way to decompose the struct if the access is known to be inbounds?
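As a rough illustration of that idea, the sketch below compares the struct form with a hand-decomposed form for indices that are known to be in bounds. It is a toy model in C++, not SROA code; the names Whole, isInBounds, viaStruct and viaSplit are hypothetical.

```cpp
#include <cassert>

struct Whole {
    int arr[2];
    int x;
};

// The precondition the proposal relies on: i addresses an element of arr.
bool isInBounds(unsigned i) {
    return i < 2;
}

// Original form: the array and the scalar live in one struct.
int viaStruct(unsigned i) {
    Whole s{};
    s.arr[i] = 1; // caller guarantees isInBounds(i)
    s.x = 2;
    return s.arr[0] + s.arr[1] + s.x;
}

// Decomposed form: the fields become independent variables.
int viaSplit(unsigned i) {
    int arr[2] = {};
    int x = 0;
    arr[i] = 1; // caller guarantees isInBounds(i)
    x = 2;
    return arr[0] + arr[1] + x;
}
```

For every index that satisfies isInBounds, the two functions return the same value, so the decomposition is not observable. The two forms can only diverge for an out-of-bounds index, which is the case the question proposes to rule out.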
SROA will not run on functions with the OptimizeNone attribute (-O0)
llvm-project/llvm/lib/Transforms/Scalar/SROA.cpp
Lines 5824 to 5826 in c93d166
bool runOnFunction(Function &F) override {
  if (skipFunction(F))
    return false;
The SROA pass explicitly refuses to run on functions with the OptimizeNone attribute.