chapters/buffer_device_address.adoc (82 additions & 1 deletion)
@@ -81,7 +81,88 @@ OpStore %ptr %obj Aligned 16
 
 Shading languages will have a default, but can allow you to align it explicitly (e.g. `buffer_reference_align`).
 
-The goal of this alignment is this is a promise for how aligned this specific pointer is. The user is responsible to confirm the address they use is aligned to it.
+This alignment is a promise about how aligned this specific pointer is.
+The compiler has no idea what the address will be when the shader is compiled.
+By providing an alignment, you let it generate valid code that matches the requirement.
+The user is responsible for ensuring the address they use is aligned to it.
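
Because the compiler cannot check the promise, the check has to happen on the host before the address reaches the shader. A minimal sketch of such a check, assuming the address is a 64-bit integer (as returned by `vkGetBufferDeviceAddress`) and the alignment is a power of two. The helper name `is_aligned` is hypothetical and not part of any API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if `address` is a multiple of `alignment`.
// Assumes `alignment` is a non-zero power of two, as SPIR-V `Aligned` values are.
constexpr bool is_aligned(std::uint64_t address, std::uint64_t alignment) {
    return (address & (alignment - 1)) == 0;
}

// Stand-in values; a real base address would come from vkGetBufferDeviceAddress.
static_assert(is_aligned(0x10000 + 32, 16), "offset 32 keeps 16-byte alignment");
static_assert(!is_aligned(0x10000 + 36, 16), "offset 36 breaks 16-byte alignment");
static_assert(is_aligned(0x10000 + 36, 4), "offset 36 is still 4-byte aligned");
```

An application could run a check like this (for example inside an `assert`) on `base + offset` before writing the address into a push constant or buffer.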
+Here we have 2 options: we could set `Aligned` to `4` or `16`.
+
+If we set the alignment to `16`, we are letting the compiler know it can load 16 bytes at a time, so it will hopefully do a vector load/store on the memory.
+
+If we set the alignment to `4`, the compiler will likely have no way to infer the real alignment and will instead do 4 scalar `int` load/stores on the memory.
+
+[NOTE]
+====
+Some GPUs can do vector load/store even on unaligned addresses.
+====
+
+For the next case, if we had `uvec3` instead of `uvec4` such as
+We know that an alignment of `16` would be violated at `data[1]`, and therefore we need to use an alignment of `4` in this case.
+Luckily, shading languages will do this for you, as seen in both link:https://godbolt.org/z/jWGKax1ed[glslang] and link:https://godbolt.org/z/Y7xW3Mfd4[slang].
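
The arithmetic behind the `uvec3` case can be sketched on the host side. This assumes the array is tightly packed, so each `uvec3` element has a 12-byte stride (as with a scalar block layout), and that the base address itself is 16-byte aligned. The function name `common_alignment` is hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: largest power-of-two alignment, capped at `max_align`,
// that holds for every element of an array with the given byte stride.
// Assumes the array's base address is itself aligned to `max_align`.
constexpr std::uint64_t common_alignment(std::uint64_t stride, std::uint64_t max_align) {
    std::uint64_t align = max_align;
    // Halve the candidate until the stride is a multiple of it.
    while (align > 1 && stride % align != 0) {
        align /= 2;
    }
    return align;
}

// uvec4: 16-byte stride, every element stays 16-byte aligned.
static_assert(common_alignment(16, 16) == 16, "uvec4 elements keep alignment 16");
// uvec3 (assumed 12-byte stride): data[1] sits at offset 12, so only 4 holds.
static_assert(common_alignment(12, 16) == 4, "uvec3 elements only keep alignment 4");
```

With a 12-byte stride, `data[1]` starts at byte offset 12, which is a multiple of 4 but not of 8 or 16, so `4` is the largest alignment that is valid for every element.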