Commit 1bab309

Document Virtual Workgroups
1 parent afae8d0 commit 1bab309

1 file changed

+28
-2
lines changed

include/nbl/video/utilities/CScanner.h

Lines changed: 28 additions & 2 deletions
@@ -35,8 +35,7 @@ The scheduling relies on two principles:
3535
- Virtual and Persistent Workgroups
3636
- Atomic Counters as Semaphores
3737
38-
# Virtual Workgroups
39-
TODO: Move this Paragraph somewhere else.
38+
# Virtual Workgroups TODO: Move this Paragraph somewhere else.
4039
Generally speaking, launching a new workgroup has non-trivial overhead.
4140
4241
Also, most IHVs, especially AMD, have silly limits on the ranges of dispatches (like 64k workgroups), which also apply to 1D dispatches.
@@ -55,6 +54,33 @@ for (uint virtualWorkgroupIndex=gl_GlobalInvocationID.x; virtualWorkgroupIndex<v
5554
// do actual work for a single workgroup
5655
}
5756
```
57+
58+
This actually opens some avenues to abusing the system to achieve customized scheduling.
59+
60+
The GLSL and underlying spec give no guarantees and explicitly warn AGAINST assuming that a workgroup with a lower ID will begin executing
61+
no later than a workgroup with a higher ID. Actually attempting to enforce this, for example with
62+
```glsl
63+
layout() buffer coherent Sched
64+
{
65+
uint nextWorkgroup; // initial value is 0 before the dispatch
66+
};
67+
68+
while (nextWorkgroup!=gl_GlobalInvocationID.x) {}
69+
atomicMax(nextWorkgroup,gl_GlobalInvocationID.x+1);
70+
```
71+
has the potential to deadlock and trigger a TDR (Timeout Detection and Recovery) on your GPU.
72+
73+
However, if you use a global counter of dispatched workgroups in an SSBO and `atomicAdd` to assign the `virtualWorkgroupIndex`
74+
```glsl
75+
uint virtualWorkgroupIndex;
76+
while ((virtualWorkgroupIndex=atomicAdd(nextWorkgroup,1u))<virtualWorkgroupCount)
77+
{
78+
// do actual work for a single workgroup
79+
}
80+
```
81+
the order in which work starts is now enforced (this still won't guarantee the order of completion).
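As a rough CPU-side analogy (not part of the shader or of this header), the counter-based assignment can be sketched in C++ with `std::atomic` and one `std::thread` standing in for each persistent workgroup. The names `simulateDispatch`, `everyIndexClaimedOnce` and `claimsAreIncreasing` are hypothetical helpers invented for this illustration; only `nextWorkgroup`, `virtualWorkgroupIndex` and `virtualWorkgroupCount` mirror the snippet above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical CPU analogy: each std::thread plays one persistent workgroup.
// Returns, per persistent workgroup, the virtual indices it claimed, in claim order.
std::vector<std::vector<uint32_t>> simulateDispatch(uint32_t persistentWorkgroups, uint32_t virtualWorkgroupCount)
{
    assert(persistentWorkgroups > 0u);
    // Stands in for the SSBO counter, which is 0 before the dispatch.
    std::atomic<uint32_t> nextWorkgroup{0u};
    std::vector<std::vector<uint32_t>> claimed(persistentWorkgroups);

    std::vector<std::thread> workgroups;
    for (uint32_t wg = 0u; wg < persistentWorkgroups; ++wg)
    {
        workgroups.emplace_back([&nextWorkgroup, &claimed, wg, virtualWorkgroupCount]()
        {
            uint32_t virtualWorkgroupIndex;
            // Same shape as the shader loop: claim the next index, stop once past the end.
            while ((virtualWorkgroupIndex = nextWorkgroup.fetch_add(1u)) < virtualWorkgroupCount)
                claimed[wg].push_back(virtualWorkgroupIndex); // "do actual work" placeholder
        });
    }
    for (auto& t : workgroups)
        t.join();
    return claimed;
}

// True if every index in [0, virtualWorkgroupCount) was handed out exactly once.
bool everyIndexClaimedOnce(const std::vector<std::vector<uint32_t>>& claimed, uint32_t virtualWorkgroupCount)
{
    std::vector<uint32_t> hits(virtualWorkgroupCount, 0u);
    for (const auto& perWorkgroup : claimed)
        for (uint32_t index : perWorkgroup)
        {
            if (index >= virtualWorkgroupCount)
                return false;
            ++hits[index];
        }
    for (uint32_t h : hits)
        if (h != 1u)
            return false;
    return true;
}

// True if each persistent workgroup saw strictly increasing virtual indices.
bool claimsAreIncreasing(const std::vector<std::vector<uint32_t>>& claimed)
{
    for (const auto& perWorkgroup : claimed)
        for (size_t i = 1; i < perWorkgroup.size(); ++i)
            if (perWorkgroup[i] <= perWorkgroup[i - 1])
                return false;
    return true;
}
```

Calling `simulateDispatch(4u, 1000u)` hands out each of the 1000 virtual indices exactly once, whichever thread the OS happens to run first. A thread only receives an index once it is already running, so no thread ever waits on one that has not started, which is what the spin-wait version above cannot guarantee.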
82+
83+
# Atomic Counters as Semaphores
5884
**/
5985
class NBL_API CScanner final : public core::IReferenceCounted
6086
{
