- * When writing to an output array stored in linear memory, we can reinterpret the array as a three-dimensional logical view containing `L` independent sub-sequences having two "columns" corresponding to the real and imaginary parts of a folded complex vector and where each "column" has `M` elements.
+ * When writing to an output array stored in linear memory, we can reinterpret the array as a three-dimensional logical view containing `L` independent sub-sequences having two "rows" corresponding to the two parts of the butterfly (`even + odd` and `even - odd`, respectively) and where each "row" has `M*L` elements.
  *
- * Accordingly, the following is a logical view of an input array (zero-based indexing) which contains `L = 3` transforms and in which each "column" sub-sequence has length `M = 4`:
+ * Accordingly, the following is a logical view of an output array (zero-based indexing) which contains `L = 3` transforms and in which each sub-sequence has length `M = 4`:
+ *
+ * ```text
+ * │ k = 0 k = 1 k = 2
+ * │ ───────────────────────────────────────────────────────────────────────────────→ k
+ * └────────────────────────────────────────────────────────────────────────────────→ i
+ * ↑ ↑ ↑ ↑ ↑ ↑
+ * i = 0 M-1 0 M-1 0 M-1
+ * ```
+ *
+ * In the above,
+ *
+ * - `i` is the fastest varying index, which walks within one short sub-sequence corresponding to either the `even + odd` or `even - odd` part of the butterfly.
+ * - `j` selects between the `even + odd` and `even - odd` part of the butterfly.
+ * - `k` specifies the index of one of the `L` independent transforms we are processing.
+ *
+ * In linear memory, the three-dimensional logical view is arranged as follows:
+ * As may be observed, when resolving an index in the output array, the `j` and `k` dimensions are swapped. This stems from `radb2` being only one stage in a multi-stage driver which alternates between using `ch` and `out` as workspace buffers. After each stage, the next stage reads what the previous stage wrote.
+ *
+ * Each stage expects a transpose, and, in order to avoid explicit transposition between the stages, we swap the last two logical dimensions while still maintaining cache locality within the inner loop logical dimension, as indexed by `i`.
  *
  * @private
  * @param {NonNegativeInteger} i - index of an element within a sub-sequence
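The index mapping described in the comment above can be sketched as follows. This is a minimal illustration, not the code from the commit: the function names `inputIndex` and `outputIndex` and their parameter order are hypothetical, and the input layout (`i` fastest, then `j`, then `k`) is an assumption inferred from the statement that the `j` and `k` dimensions are swapped on output. The output layout follows from each "row" having `M*L` elements.

```javascript
'use strict';

// Hypothetical helper (assumed input layout): `i` varies fastest, then `j`
// (which of the two parts), then `k` (which of the `L` transforms).
function inputIndex( i, j, k, M ) {
	return i + ( ( j + ( 2*k ) ) * M );
}

// Hypothetical helper (output layout): `i` still varies fastest, but the last
// two logical dimensions are swapped, so each `j` "row" spans `M*L` elements.
function outputIndex( i, k, j, L, M ) {
	return i + ( ( k + ( j*L ) ) * M );
}

// Example with `L = 3` transforms and sub-sequences of length `M = 4`:
var L = 3;
var M = 4;

// The first element of the second "row" (`j = 1`) starts after `M*L` elements:
console.log( outputIndex( 0, 0, 1, L, M ) ); // => 12

// The last element of the last transform in the second "row":
console.log( outputIndex( 3, 2, 1, L, M ) ); // => 23

// Under the assumed input layout, the same logical element `(i=0, j=1, k=0)`
// sits only `M` elements from the start:
console.log( inputIndex( 0, 1, 0, M ) ); // => 4
```

In both helpers, `i` has unit stride, which is what keeps the inner loop cache-friendly regardless of how `j` and `k` are ordered.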