Open
Description
Consider the following:
RWStructuredBuffer<uint4> Out : register(u0);
groupshared uint4 SharedData;

[numthreads(128,4,1)]
void main(uint3 ThreadID : SV_GroupThreadID) {
  if (ThreadID.x == 0 && ThreadID.y == 0) {
    SharedData = 0;
  }
  GroupMemoryBarrierWithGroupSync();
  for (uint I = 0; I < 128; I++) {
    if (ThreadID.x == I) {
      SharedData[ThreadID.y] = SharedData[ThreadID.y] + 1;
    }
    GroupMemoryBarrierWithGroupSync();
  }
  if (ThreadID.x == 0) {
    Out[0][ThreadID.y] = SharedData[ThreadID.y];
  }
}
We would expect Out[0] = {128, 128, 128, 128}, and this is observed when using WARP, NV and AMD with DXC.
Specific to Intel, the output is {0, 0, 0, 0}, demonstrated here.
Hence, this is suspected to be a runtime (driver) bug specific to Intel.
This issue tracks further investigation to confirm this is the case. For further reference, please see the ComponentAccumulationDataRace test case, introduced here.
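For anyone double-checking the expected value, here is a minimal CPU sketch of the shader's accumulation logic. It is not part of the test suite; the names `SimulateGroup`, `kGroupX` and `kGroupY` are hypothetical, and it assumes each barrier-delimited loop iteration can be modelled as a sequential step, which holds here because each thread with a given `ThreadID.y` only touches its own component of `SharedData`.

```cpp
#include <array>
#include <cstdint>

// Hypothetical CPU reference model of the shader in this issue.
// Group dimensions mirror [numthreads(128,4,1)].
constexpr std::uint32_t kGroupX = 128;
constexpr std::uint32_t kGroupY = 4;

std::array<std::uint32_t, 4> SimulateGroup() {
    // Models `groupshared uint4 SharedData` after `SharedData = 0`.
    std::array<std::uint32_t, 4> shared{};

    // Each outer iteration models one phase between two group barriers.
    for (std::uint32_t i = 0; i < 128; ++i) {
        for (std::uint32_t y = 0; y < kGroupY; ++y) {
            for (std::uint32_t x = 0; x < kGroupX; ++x) {
                // Only the threads with ThreadID.x == I write in this phase,
                // and each of them writes a distinct component (ThreadID.y).
                if (x == i) {
                    shared[y] = shared[y] + 1;
                }
            }
        }
    }

    // Threads with ThreadID.x == 0 copy component y into Out[0][y].
    return shared;
}
```

Calling `SimulateGroup()` yields 128 in every component, matching the result reported for WARP, NV and AMD. In each phase exactly one thread per component performs the read-modify-write, so the shader has no overlapping accesses at component granularity.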