Commit 7687188

[oneMKL, DFT] Suggested changes for oneMKL DFT APIs (typo fixes, corrections, revisions, and type-safety-motivated changes) (#593)
* [DFT] Suggested corrections and re-structuring for overall consistency of the oneMKL DFT specs. Notes:
  - removed explicit reference to "periodic" sequences in intro
  - private conditional type 'real_scalar_t' added to the descriptor class template to alleviate ambiguity in declaration of workspace-related member functions;
  - fixed typo 'oneapi::mkl::dft::*precision*::{REAL,COMPLEX}' in _onemkl_dft_descriptor_template_parameters
  - revised "syntax" parts and _onemkl_dft_descriptor_member_table's description for the constructors
  - added case of "workspace not accessible to the device" in exceptions for the set_workspace member function
  - revised "syntax" parts for the scoped enumeration types
  - fixed typo in step 1 of _onemkl_dft_typical_usage_of_workspace_external
  - unified and fixed namespace ambiguities in illustrative code snippets
  - unified and generalized the use of inline literals where relevant (e.g., referring to types, enum, class, objects, args, ..i.), throughout (several internal links removed or slightly rephrased to that end)
  - revised all parts referring to "at construction time" as they were ambiguous w.r.t. the copy and move constructors added in the meantime
  - moved WORKSPACE_EXTERNAL_BYTES to read-only items in config_param
  - completed specification for config_value in page dedicated to scoped enumeration types
* [DFT] Suggested changes to deprecate variadic member function, clarify their behavior and introduce type-safe substitute overloads
* [DFT] slight rephrasing regarding commit step in introductory page
1 parent 590a1f0 commit 7687188

8 files changed, +1366 -984 lines changed

source/elements/oneMKL/source/domains/dft/compute_backward.rst

Lines changed: 194 additions & 179 deletions
Large diffs are not rendered by default.

source/elements/oneMKL/source/domains/dft/compute_forward.rst

Lines changed: 189 additions & 175 deletions
Large diffs are not rendered by default.

source/elements/oneMKL/source/domains/dft/config_params/data_layouts.rst

Lines changed: 45 additions & 41 deletions
@@ -4,9 +4,12 @@
 
 .. _onemkl_dft_config_data_layouts:
 
-Configuration of Data Layouts
+Configuration of data layouts
 -----------------------------
 
+The usage of prepended namespace specifiers ``oneapi::mkl::dft`` is
+omitted below for conciseness.
+
 The DFT interface provides the configuration parameters
 ``config_param::FWD_STRIDES`` (resp. ``config_param::BWD_STRIDES``)
 to define the data layout locating entries of relevant data sequences in the
@@ -22,8 +25,8 @@ superscript :math:`\text{fwd}` (resp. :math:`\text{bwd}`) for data sequences
 belonging to forward (resp. backward) domain, for any :math:`m` and multi-index
 :math:`\left(k_1, k_2, \ldots, k_d\right)` within :ref:`valid
 range<onemkl_dft_elementary_range_of_indices>`, the corresponding entry
-:math:`\left(\cdot\right)^{m}_{k_{1}, k_{2}, \dots, k_d }` - or the real or
-imaginary part thereof - of the relevant data sequence is located at index
+:math:`\left(\cdot\right)^{m}_{k_{1}, k_{2}, \dots, k_d }` or the real or
+imaginary part thereof of the relevant data sequence is located at index
 
 .. math::
 s^{\text{xwd}}_0 + k_1\ s^{\text{xwd}}_1 + k_2\ s^{\text{xwd}}_2 + \dots + k_d\ s^{\text{xwd}}_d + m\ l^{\text{xwd}}
@@ -61,13 +64,13 @@ forward-domain (resp. backward-domain) data sequences and
 .. rubric:: Implicitly-assumed elementary data type
 
 When reading or writing an element at index :eq:`eq_idx_data_layout` of any
-user-provided data container used at compute time, a
-:ref:`descriptor<onemkl_dft_descriptor>` object may re-interpret the base data
-type of that data container into an implicitly-assumed elementary data type.
+user-provided data container used at compute time, a ``descriptor`` object may
+re-interpret the base data type of that data container into an
+implicitly-assumed elementary data type.
 That implicitly-assumed data type depends on the object type, *i.e.*, on the
 specialization values used for the template parameters when instantiating the
-:ref:`descriptor<onemkl_dft_descriptor>` class, and, in case of complex
-descriptors, on the configuration value set for its configuration parameter
+``descriptor`` :ref:`class template<onemkl_dft_descriptor>`, and, in case of
+complex descriptors, on the configuration value set for its configuration parameter
 ``config_param::COMPLEX_STORAGE``. The table below lists the implicitly-assumed
 data type in either domain (last 2 columns) based on the object type and
 its configuration value for ``config_param::COMPLEX_STORAGE`` (first 2 columns).
@@ -213,59 +216,59 @@ configuration parameter ``config_param::INPUT_STRIDES`` if
 The values of :math:`s^{\text{i}}_{j}` and :math:`s^{\text{o}}_{j}` are to be
 used and considered by oneMKL if and only if
 :math:`s^{\text{fwd}}_{j} = s^{\text{bwd}}_{j} = 0, \forall j \in \lbrace 0, 1, \ldots, d\rbrace`.
-(This will happen automatically if ``config_param::INPUT_STRIDES`` and ``config_param::OUTPUT_STRIDES``
-are set and ``config_param::FWD_STRIDES`` and ``config_param::BWD_STRIDES`` are not. See note below.)
-In such a case, :ref:`descriptor<onemkl_dft_descriptor>` objects must consider
-the data layouts corresponding to the two compute directions separately. As
-detailed above, relevant data sequence entries are accessed as elements of data
-containers (``sycl::buffer`` objects or device-accessible USM allocations)
-provided to the compute function, the base data type of which is (possibly
-implicitly re-interpreted) as documented in :ref:`this
-table<onemkl_dft_config_data_implicitly_assumed_elementary_data_type>`. If using
-input and output strides, for any :math:`m` and multi-index
+This will happen automatically if ``config_param::INPUT_STRIDES`` and
+``config_param::OUTPUT_STRIDES`` are set and ``config_param::FWD_STRIDES`` and
+``config_param::BWD_STRIDES`` are not (see note below).
+In such a case, ``descriptor`` objects must consider the data layouts
+corresponding to the two compute directions separately. As detailed above,
+relevant data sequence entries are accessed as elements of data containers
+(``sycl::buffer`` objects or device-accessible USM allocations) provided to the
+compute function, the base data type of which is (possibly implicitly re-interpreted)
+as documented in the above
+:ref:`table<onemkl_dft_config_data_implicitly_assumed_elementary_data_type>`. If
+using input and output strides, for any :math:`m` and multi-index
 :math:`\left(k_1, k_2, \ldots, k_d\right)` within :ref:`valid
 range<onemkl_dft_elementary_range_of_indices>`, the index to be used when
-accessing a data sequence entry - or part thereof - in forward domain is
+accessing a data sequence entry or part thereof in forward domain is
 
 .. math::
 s^{\text{x}}_0 + k_1\ s^{\text{x}}_1 + k_2\ s^{\text{x}}_2 + \dots + k_d\ s^{\text{x}}_d + m\ l^{\text{fwd}}
 
 where :math:`\text{x} = \text{i}` (resp. :math:`\text{x} = \text{o}`) for
 forward (resp. backward) DFT(s). Similarly, the index to be used when accessing
-a data sequence entry - or part thereof - in backward domain is
+a data sequence entry or part thereof in backward domain is
 
 .. math::
 s^{\text{x}}_0 + k_1\ s^{\text{x}}_1 + k_2\ s^{\text{x}}_2 + \dots + k_d\ s^{\text{x}}_d + m\ l^{\text{bwd}}
 
 where :math:`\text{x} = \text{o}` (resp. :math:`\text{x} = \text{i}`) for
 forward (resp. backward) DFT(s).
 
-As a consequence, configuring :ref:`descriptor<onemkl_dft_descriptor>` objects
-using these deprecated configuration parameters makes their configuration
-direction-dependent when different stride values are used in
-forward and backward domains. Since the intended compute direction is unknown
-to the :ref:`descriptor<onemkl_dft_descriptor>` object when
+As a consequence, configuring ``descriptor`` objects using these deprecated
+configuration parameters makes their configuration direction-dependent when
+different stride values are used in forward and backward domains. Since the
+intended compute direction is unknown to the object when
 :ref:`committing<onemkl_dft_descriptor_commit>` it, every direction that results
 in a :ref:`consistent data layout<onemkl_dft_data_layout_requirements>` in
-forward and backward domains must be supported by successfully committed
-:ref:`descriptor<onemkl_dft_descriptor>` objects.
+forward and backward domains must be supported by successfully-committed
+``descriptor`` objects.
 
 .. note::
-For :ref:`descriptor<onemkl_dft_descriptor>` objects with strides configured
-via these deprecated configuration parameters, the :ref:`consistency
-requirements<onemkl_dft_data_layout_requirements>` may be satisfied for only
-one of the two compute directions, *i.e.*, for only one of the forward or
-backward DFT(s). Such a configuration should not cause an exception to be
-thrown by the descriptor's :ref:`onemkl_dft_descriptor_commit` member
-function but the behavior of oneMKL is undefined if using that object for
-the compute direction that does not align with the :ref:`consistency
-requirements<onemkl_dft_data_layout_requirements>`.
+For ``descriptor`` objects with strides configured via these deprecated
+configuration parameters, the
+:ref:`consistency requirements<onemkl_dft_data_layout_requirements>` may be
+satisfied for only one of the two compute directions, *i.e.*, for only one
+of the forward or backward DFT(s). Such a configuration should not cause an
+exception to be thrown by the descriptor's ``commit``
+:ref:`member function<onemkl_dft_descriptor_commit>` but the behavior of
+oneMKL is undefined if using that object for the compute direction that does
+not align with the :ref:`consistency requirements<onemkl_dft_data_layout_requirements>`.
 
 .. note::
 Setting either of ``config_param::INPUT_STRIDES`` or
 ``config_param::OUTPUT_STRIDES`` triggers any default or previously-set
 values for ``config_param::FWD_STRIDES`` and ``config_param::BWD_STRIDES``
-to reset to ``std::vector<std::int64_t>(d+1, 0)`` values, and vice versa.
+to reset to ``std::vector<std::int64_t>(d+1, 0)``, and vice versa.
 This default behavior prevents mix-and-matching usage of either of
 ``config_param::INPUT_STRIDES`` or ``config_param::OUTPUT_STRIDES`` with
 either of ``config_param::FWD_STRIDES`` or ``config_param::BWD_STRIDES``,
@@ -282,14 +285,15 @@ the reverse direction as shown below.
 
 .. code-block:: cpp
 
+namespace dft = oneapi::mkl::dft;
 // ...
-desc.set_value(config_param::INPUT_STRIDES, fwd_domain_strides);
-desc.set_value(config_param::OUTPUT_STRIDES, bwd_domain_strides);
+desc.set_value(dft::config_param::INPUT_STRIDES, fwd_domain_strides);
+desc.set_value(dft::config_param::OUTPUT_STRIDES, bwd_domain_strides);
 desc.commit(queue);
 compute_forward(desc, ...);
 // ...
-desc.set_value(config_param::INPUT_STRIDES, bwd_domain_strides);
-desc.set_value(config_param::OUTPUT_STRIDES, fwd_domain_strides);
+desc.set_value(dft::config_param::INPUT_STRIDES, bwd_domain_strides);
+desc.set_value(dft::config_param::OUTPUT_STRIDES, fwd_domain_strides);
 desc.commit(queue);
 compute_backward(desc, ...);
 
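As a reading aid for the data_layouts.rst changes above, the index formula :math:`s_0 + k_1\ s_1 + \dots + k_d\ s_d + m\ l` can be sketched in plain C++. This is only an illustration and does not call oneMKL: the helper name ``flat_index`` and its parameter names are hypothetical, chosen to mirror the symbols used in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of oneMKL): evaluates
//   s_0 + k_1*s_1 + ... + k_d*s_d + m*l
// where `strides` holds {s_0, s_1, ..., s_d}, `k` holds {k_1, ..., k_d},
// `m` is the index of the data sequence within the batch and `dist` is the
// distance l between successive data sequences.
inline std::int64_t flat_index(const std::vector<std::int64_t>& strides,
                               const std::vector<std::int64_t>& k,
                               std::int64_t m, std::int64_t dist) {
    assert(strides.size() == k.size() + 1 && "expected d+1 strides for d indices");
    std::int64_t idx = strides[0];  // offset s_0
    for (std::size_t j = 0; j < k.size(); ++j) {
        idx += k[j] * strides[j + 1];  // k_{j+1} * s_{j+1}
    }
    return idx + m * dist;
}
```

For instance, with strides ``{0, 12, 4, 1}`` and a distance of 24, ``flat_index({0, 12, 4, 1}, {1, 2, 3}, 1, 24)`` evaluates to 12 + 8 + 3 + 24 = 47. The same helper covers the input/output-stride case by passing the strides and distance that apply to the domain being accessed.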

source/elements/oneMKL/source/domains/dft/config_params/storage_formats.rst

Lines changed: 47 additions & 48 deletions
@@ -7,10 +7,13 @@
 Data storage
 ============
 
-The data storage convention observed by a
-:ref:`descriptor<onemkl_dft_descriptor>` object depends on whether it is a real
-or complex descriptor and, in case of complex descriptors, on the configuration
-value associated with configuration parameter ``config_param::COMPLEX_STORAGE``.
+The usage of prepended namespace specifiers ``oneapi::mkl::dft`` is
+omitted below for conciseness.
+
+The data storage convention observed by a ``descriptor`` object depends on
+whether it is a real or complex descriptor and, in case of complex descriptors,
+on the configuration value associated with configuration parameter
+``config_param::COMPLEX_STORAGE``.
 
 .. _onemkl_dft_complex_storage:
 
@@ -24,14 +27,12 @@ associated with a configuration value ``config_value::COMPLEX_COMPLEX`` (default
 behavior), those entries are accessed and stored as ``std::complex<float>``
 (resp. ``std::complex<double>``) elements of a single data container
 (device-accessible USM allocation or ``sycl::buffer`` object) if the
-:ref:`descriptor<onemkl_dft_descriptor>` object is a single-precision (resp.
-double-precision) descriptor. If the configuration value
-``config_value::REAL_REAL`` is used instead, the real and imaginary parts of
-those entries are accessed and stored as ``float`` (resp. ``double``) elements
-of two separate, non-overlapping data containers (device-accessible USM
-allocations or ``sycl::buffer`` objects) if the
-:ref:`descriptor<onemkl_dft_descriptor>` object is a single-precision (resp.
-double-precision) descriptor.
+``descriptor`` object is a single-precision (resp. double-precision) descriptor.
+If the configuration value ``config_value::REAL_REAL`` is used instead, the real
+and imaginary parts of those entries are accessed and stored as ``float`` (resp.
+``double``) elements of two separate, non-overlapping data containers
+(device-accessible USM allocations or ``sycl::buffer`` objects) if the
+``descriptor`` object is a single-precision (resp. double-precision) descriptor.
 
 These two behaviors are further specified and illustrated below.
 
@@ -45,20 +46,19 @@ sequences must belong to a single data container (device-accessible USM
 allocation or ``sycl::buffer`` object). Any relevant entry
 :math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` is accessed/stored from/in
 a data container provided at compute time at the index value expressed in eq.
-:eq:`eq_idx_data_layout` (from :ref:`this page<onemkl_dft_config_data_layouts>`)
+:eq:`eq_idx_data_layout` (see the page dedicated to the
+:ref:`configuration of data layout<onemkl_dft_config_data_layouts>`)
 of that data container, whose elementary data type is (possibly implicitly
 re-interpreted as) ``std::complex<float>`` (resp. ``std::complex<double>``) for
 single-precision (resp. double-precision) descriptors.
 
 The same unique data container is to be used for forward- and backward-domain
-data sequences for in-place transforms (for
-:ref:`descriptor<onemkl_dft_descriptor>` objects with configuration value
-``config_value::INPLACE`` for configuration parameter
+data sequences for in-place transforms (for ``descriptor`` objects with
+configuration value ``config_value::INPLACE`` for configuration parameter
 ``config_param::PLACEMENT``). Two separate data containers sharing no common
-elements are to be used for out-of-place transforms (for
-:ref:`descriptor<onemkl_dft_descriptor>` objects with configuration value
-``config_value::NOT_INPLACE`` for configuration parameter
-``config_param::PLACEMENT``).
+elements are to be used for out-of-place transforms (for ``descriptor`` objects
+with configuration value ``config_value::NOT_INPLACE`` for configuration
+parameter ``config_param::PLACEMENT``).
 
 The following snippet illustrates the usage of ``config_value::COMPLEX_COMPLEX``
 for configuration parameter ``config_param::COMPLEX_STORAGE``, in the
@@ -84,8 +84,8 @@ USM allocations.
 
 // initialize forward-domain data such that entry {m;k1,k2,k3}
 // = Z[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
-compute_forward(desc, Z); // complex-to-complex in-place DFT
-// in backward domain: entry {m;k1,k2,k3}
+auto ev = compute_forward(desc, Z); // complex-to-complex in-place DFT
+// Upon completion of ev, in backward domain: entry {m;k1,k2,k3}
 // = Z[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
 
 .. _onemkl_dft_complex_storage_real_real:
@@ -98,21 +98,20 @@ read/stored from/in two different, non-overlapping data containers
 (device-accessible USM allocations or ``sycl::buffer`` objects) encapsulating
 the real and imaginary parts of the relevant entries separately. The real and
 imaginary parts of any relevant complex entry
-:math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` are both stored at the index value
-expressed in eq. :eq:`eq_idx_data_layout` (from :ref:`this
-page<onemkl_dft_config_data_layouts>`) of their respective data containers, whose elementary
-data type is (possibly implicitly re-interpreted as) ``float`` (resp.
-``double``) for single-precision (resp. double-precision) descriptors.
+:math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` are both stored at the
+index value expressed in eq. :eq:`eq_idx_data_layout` (see the page dedicated to
+the :ref:`configuration of data layout<onemkl_dft_config_data_layouts>`) of
+their respective data containers, whose elementary data type is (possibly
+implicitly re-interpreted as) ``float`` (resp. ``double``) for single-precision
+(resp. double-precision) descriptors.
 
 The same two data containers are to be used for real and imaginary parts of
 forward- and backward-domain data sequences for in-place transforms (for
-:ref:`descriptor<onemkl_dft_descriptor>` objects with configuration value
-``config_value::INPLACE`` for configuration parameter
-``config_param::PLACEMENT``). Four separate data containers sharing no common
-elements are to be used for out-of-place transforms (for
-:ref:`descriptor<onemkl_dft_descriptor>` objects with configuration value
-``config_value::NOT_INPLACE`` for configuration parameter
-``config_param::PLACEMENT``).
+``descriptor`` objects with configuration value ``config_value::INPLACE`` for
+configuration parameter ``config_param::PLACEMENT``). Four separate data
+containers sharing no common elements are to be used for out-of-place transforms
+(for ``descriptor`` objects with configuration value ``config_value::NOT_INPLACE``
+for configuration parameter ``config_param::PLACEMENT``).
 
 The following snippet illustrates the usage of ``config_value::REAL_REAL``
 set for configuration parameter ``config_param::COMPLEX_STORAGE``, in the
@@ -141,8 +140,8 @@ USM allocations.
 // = ZR[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
 // and the imaginary part of entry {m;k1,k2,k3}
 // = ZI[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
-compute_forward<decltype(desc), float>(desc, ZR, ZI); // complex-to-complex in-place DFT
-// in backward domain: the real part of entry {m;k1,k2,k3}
+auto ev = compute_forward<decltype(desc), float>(desc, ZR, ZI); // complex-to-complex in-place DFT
+// Upon completion of ev, in backward domain: the real part of entry {m;k1,k2,k3}
 // = ZR[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
 // and the imaginary part of entry {m;k1,k2,k3}
 // = ZI[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
@@ -156,14 +155,13 @@ Real descriptors observe only one type of data storage. Any relevant (real)
 entry :math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` of a data sequence
 in forward domain is accessed and stored as a ``float`` (resp. ``double``)
 element of a single data container (device-accessible USM allocation or
-``sycl::buffer`` object) if the :ref:`descriptor<onemkl_dft_descriptor>` object
-is a single-precision (resp. double-precision) descriptor. Any relevant
-(complex) entry :math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` of a data
-sequence in backward domain is accessed and stored as a ``std::complex<float>``
-(resp. ``std::complex<double>``) element of a single data container
-(device-accessible USM allocation or ``sycl::buffer`` object) if the
-:ref:`descriptor<onemkl_dft_descriptor>` object is a single-precision (resp.
-double-precision) descriptor.
+``sycl::buffer`` object) if the ``descriptor`` object is a single-precision
+(resp. double-precision) descriptor. Any relevant (complex) entry
+:math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` of a data sequence in
+backward domain is accessed and stored as a ``std::complex<float>`` (resp.
+``std::complex<double>``) element of a single data container (device-accessible
+USM allocation or ``sycl::buffer`` object) if the
+``descriptor`` object is a single-precision (resp. double-precision) descriptor.
 
 The following snippet illustrates the usage of a real, single-precision
 descriptor (and the corresponding data storage) for the in-place,
@@ -190,12 +188,13 @@ forward and backward domains, with USM allocations.
 
 // initialize forward-domain data such that real entry {m;k1,k2,k3}
 // = data[ fwd_strides[0] + k1*fwd_strides[1] + k2*fwd_strides[2] + k3*fwd_strides[3] + m*fwd_dist ]
-compute_forward(desc, data); // real-to-complex in-place DFT
-// in backward domain, the implicitly-assumed type is complex so, considering
+auto ev = compute_forward(desc, data); // real-to-complex in-place DFT
+// In backward domain, the implicitly-assumed type is complex so, consider
 // std::complex<float>* complex_data = static_cast<std::complex<float>*>(data);
-// we have entry {m;k1,k2,k3}
+// upon completion of ev, the backward-domain entry {m;k1,k2,k3} is
 // = complex_data[ bwd_strides[0] + k1*bwd_strides[1] + k2*bwd_strides[2] + k3*bwd_strides[3] + m*bwd_dist ]
 // for 0 <= k3 <= n3/2.
-// Note: if n3/2 < k3 < n3, entry {m;k1,k2,k3} = std::conj(entry {m;n1-k1,n2-k2,n3-k3})
+// Note: if n3/2 < k3 < n3, entry {m;k1,k2,k3} is not stored explicitly
+// since it is equal to std::conj(entry {m;n1-k1,n2-k2,n3-k3})
 
 **Parent topic** :ref:`onemkl_dft_enums`
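
The revised note in the last hunk states that backward-domain entries with ``n3/2 < k3 < n3`` are not stored explicitly because they equal the conjugate of entry ``{m;n1-k1,n2-k2,n3-k3}``. The sketch below shows one way a caller might locate such an entry. It is a plain C++ illustration under stated assumptions, not oneMKL code: the names ``stored_entry``, ``locate`` and ``resolve`` are hypothetical, and mirrored indices are assumed to be reduced modulo the lengths (so that ``n1 - 0`` maps back to ``0``), which follows from the periodicity of the DFT but is not spelled out in the snippet comment.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstdint>

// Hypothetical types and helpers (not part of oneMKL). For a real 3D DFT of
// lengths {n1, n2, n3}, only backward-domain entries with 0 <= k3 <= n3/2
// are stored explicitly.
struct stored_entry {
    std::array<std::int64_t, 3> k;  // multi-index of the entry actually stored
    bool conjugate;                 // true if the stored value must be conjugated
};

inline stored_entry locate(const std::array<std::int64_t, 3>& n,
                           const std::array<std::int64_t, 3>& k) {
    assert(k[2] >= 0 && k[2] < n[2]);
    if (k[2] <= n[2] / 2) {
        return {k, false};  // stored explicitly, no conjugation needed
    }
    // Assumption: mirrored indices wrap modulo the lengths (n_j - 0 -> 0).
    return {{(n[0] - k[0]) % n[0], (n[1] - k[1]) % n[1], (n[2] - k[2]) % n[2]},
            true};
}

// Applies the conjugation, if required, to the value read at the stored index.
inline std::complex<float> resolve(std::complex<float> stored_value,
                                   bool conjugate) {
    return conjugate ? std::conj(stored_value) : stored_value;
}
```

The multi-index returned by ``locate`` would then be converted to a position in ``complex_data`` using the backward-domain strides and distance, exactly as for any explicitly stored entry.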
