// - Sin and Cos in HLSL take 32-bit floats. Using this library with 64-bit floats works perfectly fine, but DXC will emit warnings
@@ -274,14 +274,17 @@ template<bool Inverse, typename consteval_params_t, class device_capabilities=vo
 struct FFT;
 
 // For the FFT methods below, we assume:
-// - Accessor is a global memory accessor to an array fitting 2 * WorkgroupSize elements of type complex_t<Scalar>, used to get inputs / set outputs of the FFT,
-// that is, one "lo" and one "hi" complex numbers per thread, essentially 4 Scalars per thread. The arrays it accesses with `get` and `set` can optionally be
-// different, if you don't want the FFT to be done in-place.
-// The Accessor MUST provide a typename `Accessor::scalar_t`, and this type MUST be the same as the `Scalar` template parameter of the FFT struct's consteval parameters
+// - Accessor is an accessor to an array fitting 2 * WorkgroupSize elements of type complex_t<Scalar>, used to get inputs / set outputs of the FFT,
+// that is, one "lo" and one "hi" complex numbers per thread, essentially 4 Scalars per thread. If `ConstevalParameters::ElementsPerInvocationLog2 == 1`,
+// the arrays it accesses with `get` and `set` can optionally be different, if you don't want the FFT to be done in-place. Otherwise, you MUST make it in-place
+// (this is because if using more than 2 elements per invocation, we use the same array to store intermediate operations).
 // The Accessor MUST provide the following methods:
 // * void set(uint32_t index, in complex_t<Scalar> value);
 // * void memoryBarrier();
+// For it to work correctly, this memory barrier must use `AcquireRelease` semantics, with the proper flags set for the memory type.
+// If using `ConstevalParameters::ElementsPerInvocationLog2 == 1` or otherwise not needing it (such as when using preloaded accessors) we still require the method to exist
+// but you can just make it do nothing.
 
 // - SharedMemoryAccessor accesses a workgroup-shared memory array of size `2 * sizeof(Scalar) * WorkgroupSize`.
 // The SharedMemoryAccessor MUST provide the following methods:
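For readers who want to see the shape of the Accessor contract described in the updated comments, here is a minimal host-side sketch in plain C++. It is a mock, not HLSL and not the library's code. The `complex_t` stand-in, the names `InPlaceAccessor`, `NoBarrierAccessor` and `demo`, and the signature of `get` are all assumptions made for illustration; the diff only shows `set` and `memoryBarrier`. The fence stands in for the `AcquireRelease` barrier the comments call for.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the library's complex_t<Scalar> (assumption, illustration only).
template<typename Scalar>
struct complex_t
{
    Scalar real;
    Scalar imag;
};

// Hypothetical accessor backed by a single array, so `get` and `set` touch the
// same storage (in-place), as the comments require when using more than
// 2 elements per invocation.
template<typename Scalar, uint32_t WorkgroupSize>
struct InPlaceAccessor
{
    // Room for 2 * WorkgroupSize complex elements: one "lo" and one "hi" per thread.
    std::vector<complex_t<Scalar>> data = std::vector<complex_t<Scalar>>(2 * WorkgroupSize);

    // Assumed signature: the diff shows `set` and `memoryBarrier` but not `get`.
    complex_t<Scalar> get(uint32_t index) const { return data[index]; }

    // HLSL's `in` qualifier is modelled here as a const reference.
    void set(uint32_t index, const complex_t<Scalar>& value) { data[index] = value; }

    // Host-side analogue of an acquire-release memory barrier.
    void memoryBarrier() { std::atomic_thread_fence(std::memory_order_acq_rel); }
};

// Hypothetical accessor for the case where no barrier is needed: the method
// must still exist, but it may do nothing.
template<typename Scalar, uint32_t WorkgroupSize>
struct NoBarrierAccessor : InPlaceAccessor<Scalar, WorkgroupSize>
{
    void memoryBarrier() {}
};

// Round-trips one value through each accessor; returns true on success.
inline bool demo()
{
    InPlaceAccessor<float, 64> a;
    a.set(3, complex_t<float>{1.0f, -2.0f});
    a.memoryBarrier();
    const complex_t<float> v = a.get(3);

    NoBarrierAccessor<float, 64> b;
    b.set(127, complex_t<float>{0.5f, 0.25f});
    b.memoryBarrier();
    const complex_t<float> w = b.get(127);

    return v.real == 1.0f && v.imag == -2.0f && w.real == 0.5f && w.imag == 0.25f;
}
```

A real accessor would read and write GPU memory and issue a barrier with the flags appropriate to that memory type; the mock only illustrates which methods must exist and why the no-op variant is acceptable when the barrier is not needed.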