Add the capability to do adjoint transforms (#633)
* first version; adjoint still completely untested
* add Python interface; fix a few bugs
* add tests; add overlooked file
* use assertions that are more helpful in case of errors
* test commit to see if tests pass with increased tolerance for adjoint type 3
* increase overall requested plan accuracy for adjoint type 3 transforms. Perhaps 1e-6 is just a bit too close to the machine epsilon for single precision
* be more clever about memory consumption
* document new parameters for execute_internal()
* update CHANGELOG
* more comments
* small tweak
* fix typos
* add explanations for obscure tricks
* added an example
* create FFTW plans for adjoint transforms
* better variable names and debug prints
* add docstring for execute_adjoint
* comments
* corrected doc comments in example/guru2d1_adjoint.cpp
* add execute_adjoint to C/C++ guru doc strings
* execute_adjoint fully described in docs/c.rst
* execute_adjoint added to docs/cex.rst
* matlab mwrap interface add execute_adjoint
* actually add matlab execute_adjoint
* exec_adj into matlab docs and overview.src
* exec_adj matlab docs and example/guru1d1_adjoint.m
* exec_adj to matlab.rst and Contents.m
* mention additional benefits of the change
* add Fortran adjoint interface and example
* add adjoint to Fortran docs
* execute_adjoint in Fortran, with example. No docs yet
* add Martin's guru1d2_adjoint.f to makefile
* negate iflags in Martin's guru1d2_adjoint{f}.f
* both fort adj examples in doc page
* simplified guru1d2_adjoint{f}.f
* make octave runs adjoint example
* basic C++ adjointness for t1,2; in makefile and make test (not CTest yet). t3 to do
* Update CMakeLists.txt
* adjointness t3 test added, reduced prob sizes a bit; added docs trouble paragraph on adjointness
* Simplify py tests for adjoint
* bump up adjointness double allowederr to 1e-12 due to macos-13 fail
---------
Co-authored-by: Marco Barbone <[email protected]>
Co-authored-by: ahbarnett <[email protected]>
Co-authored-by: Joakim Andén <[email protected]>
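
The "adjointness" tests mentioned in the commit message check that a transform A and its adjoint A* satisfy <A c, f> = <c, A* f> for arbitrary data c and f. The following is a minimal pure-Python sketch of that identity for a 1D type 1 transform, using slow direct sums rather than FINUFFT itself. It is not the test code added by this commit: the function names are hypothetical, and the mode ordering k = -N/2, ..., N/2-1 is an assumption chosen for illustration.

```python
import cmath
import math
import random


def type1_direct(x, c, n_modes, isign=+1):
    """Direct-sum 1D type 1: f[m] = sum_j c[j] * exp(i*isign*k*x[j]),
    with k = -(n_modes // 2) + m (assumed mode ordering)."""
    k_min = -(n_modes // 2)
    return [
        sum(cj * cmath.exp(1j * isign * (k_min + m) * xj) for xj, cj in zip(x, c))
        for m in range(n_modes)
    ]


def type1_adjoint_direct(x, f, isign=+1):
    """Adjoint of type1_direct: c[j] = sum_k f[k] * exp(-i*isign*k*x[j]).
    This is a type 2 sum with the sign of the exponent flipped."""
    n_modes = len(f)
    k_min = -(n_modes // 2)
    return [
        sum(fk * cmath.exp(-1j * isign * (k_min + m) * xj) for m, fk in enumerate(f))
        for xj in x
    ]


def inner(u, v):
    """Complex inner product <u, v> = sum_i u[i] * conj(v[i])."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def adjointness_error(n_pts=40, n_modes=12, seed=0):
    """Relative mismatch between <A c, f> and <c, A* f> for random data."""
    rng = random.Random(seed)
    x = [rng.uniform(-math.pi, math.pi) for _ in range(n_pts)]
    c = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_pts)]
    f = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_modes)]
    lhs = inner(type1_direct(x, c, n_modes), f)
    rhs = inner(c, type1_adjoint_direct(x, f))
    return abs(lhs - rhs) / max(abs(lhs), abs(rhs))


if __name__ == "__main__":
    print(f"relative adjointness error: {adjointness_error():.2e}")
```

With exact direct sums the mismatch should sit near double-precision rounding error. A library transform computed to a requested tolerance would be expected to match only to roughly that tolerance, which is presumably why the commit's tests carry an allowed-error threshold.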
docs/c.rst (12 additions & 6 deletions)
@@ -53,7 +53,7 @@ with the word "many" in the function name) perform ``ntr`` transforms with the s

 .. note::

-The motivations for the vectorized interface (and guru interface, see below) are as follows. 1) It is more efficient to bin-sort the nonuniform points only once if there are not to change between transforms. 2) For small problems, certain start-up costs cause repeated calls to the simple interface to be slower than necessary. In particular, we note that FFTW takes around 0.1 ms per thread to look up stored wisdom, which for small problems (of order 10000 or less input and output data) can, sadly, dominate the runtime.
+The motivations for the vectorized interface (and guru interface, see below) include the following. 1) It is more efficient to bin-sort the nonuniform points only once if there are not to change between transforms. 2) For small problems, certain start-up costs cause repeated calls to the simple interface to be slower than necessary. In particular, we note that FFTW takes around 0.1 ms per thread to look up stored wisdom, which for small problems (of order 10000 or less input and output data) can, sadly, dominate the runtime.


 1D transforms
@@ -77,13 +77,19 @@ with the word "many" in the function name) perform ``ntr`` transforms with the s
 Guru plan interface
 -------------------

-This provides more flexibility than the simple or vectorized interfaces.
+This provides more flexibility than either simple or vectorized interfaces.
 Any transform requires (at least)
-calling the following four functions in order. However, within this
-sequence one may insert repeated ``execute`` calls, or another ``setpts``
-followed by more ``execute`` calls, as long as the transform sizes (and number of transforms ``ntr``) are
+calling four of the following five functions in order. However, within this
+sequence one may insert repeated ``execute`` and/or ``execute_adjoint`` calls,
+or another ``setpts``
+followed by more ``execute`` and/or ``execute_adjoint`` calls, as long as the transform sizes (and number of transforms ``ntr``) are
 consistent with those that have been set in the ``plan`` and in ``setpts``.
-Keep in mind that ``setpts`` retains *pointers* to the user's list of nonuniform points, rather than copying these points; thus the user must not change their nonuniform point arrays until after any ``execute`` calls that use them.
+Keep in mind that ``setpts`` retains *pointers* to the user's list of nonuniform points, rather than copying these points; thus the user must not change their nonuniform point arrays until after any ``execute`` or ``execute_adjoint`` calls that use them.
+
+The goal of the ``execute_adjoint`` feature (fully supported in v2.5.0)
+is to allow the
+common use-case of transform and adjoint transform pairs to be accessible
docs/cex.rst (7 additions & 3 deletions)
@@ -236,6 +236,8 @@ previous wisdom which would be significant when doing many small transforms.
 You may also send in a new
 set of stacked strength data (for type 1 and 3, or coefficients for type 2),
 reusing the existing FFTW plan and sorted points.
+Finally, you may execute *adjoints* of the planned transforms without
+re-planning, making forward-adjoint transform pairs very convenient.
 Now we redo the above 2D type 1 C++ example with the guru interface.

 One first makes a plan giving transform parameters, but no data:
@@ -254,6 +256,7 @@ One first makes a plan giving transform parameters, but no data:
 // step 3: do the planned transform to the c strength data, output to F...
 finufft_execute(plan, &c[0], &F[0]);
 // ... you could now send in new points, and/or do transforms with new c data
+// ... or even adjoint transforms with the same points but now mapping F to c.
 // ...
 // step 4: when done, free the memory used by the plan...
 finufft_destroy(plan);
@@ -264,14 +267,15 @@ is that the ``int64_t`` type (aka ``long long int``)
 is needed since the Fourier coefficient dimensions are passed as an array.

 .. warning::
-You must not change the nonuniform point arrays (here ``x``, ``y``) between passing them to ``finufft_setpts`` and performing ``finufft_execute``. The latter call expects these arrays to be unchanged. We chose this style of interface since it saves RAM and time (by avoiding unnecessary duplication), allowing the largest possible problems to be solved.
+You must not change the nonuniform point arrays (here ``x``, ``y``) between passing them to ``finufft_setpts`` and performing ``finufft_execute`` or ``finufft_execute_adjoint``. The last two calls expect these arrays to be unchanged. We chose this style of interface since it saves RAM and time (by avoiding unnecessary duplication), allowing the largest possible problems to be solved.

 .. warning::
 You must destroy a plan before making a new plan using the same
 plan object, otherwise a memory leak results.

-The complete code with a math test is in ``examples/guru2d1.cpp``, and for
-more examples see ``examples/guru1d1*.c*``
+The complete code with a math test is in ``examples/guru2d1.cpp``,
+the demo of an adjoint execution is in ``examples/guru2d1_adjoint.cpp``,
+and for more examples see ``examples/guru1d1*.c*``

 Using the guru interface to perform a vectorized transform (multiple 1D type 1
 transforms each with the same nonuniform points) is demonstrated in
docs/matlab.rst (5 additions & 3 deletions)
@@ -54,12 +54,14 @@ interface. For smaller transform sizes the acceleration factor of this vectorize

 If you want yet more control, consider using the "guru" interface.
 This can be faster than fresh calls to the simple or vectorized interfaces
-for the same number of transforms, for reasons such as this:
+for the same number of transforms, since
 the nonuniform points can be changed between transforms, without forcing
 FFTW to look up a previously stored plan.
 Usually, such an acceleration is only important when doing
 repeated small transforms, where "small" means each transform takes of
 order 0.01 sec or less.
+The guru interface is also very convenient for applying forward-adjoint
+transform pairs, common in imaging or optimization applications.
 Here we use the guru interface to repeat the first demo above:

 .. code-block:: matlab
@@ -74,12 +76,12 @@ Here we use the guru interface to repeat the first demo above:
 c = randn(M,1)+1i*randn(M,1); % iid random complex data (row or col vec)
 f = plan.execute(c); % do the transform (0.008 sec, ie, faster)
 % ...one could now change the points with setpts, and/or do new transforms
-% with new c data...
+% ...with new c data, and/or do adjoint transforms with new data...
 delete(plan); % don't forget to clean up

 .. warning::

-If an existing array is passed to ``setpts``, then this array must not be altered before ``execute`` is called! This is because, in order to save RAM (allowing larger problems to be solved), internally FINUFFT stores only *pointers* to ``x`` (etc), rather than unnecessarily duplicating this data. This is not true if an *expression* such as ``-x`` or ``2*pi*rand(M,1)`` is passed to ``setpts``, since in those cases the ``plan`` object does make internal copies, as per MATLAB's usual shallow-copy argument passing.
+If an existing array is passed to ``setpts``, then this array must not be altered before ``execute`` or ``execute_adjoint`` is called! This is because, in order to save RAM (allowing larger problems to be solved), internally FINUFFT stores only *pointers* to ``x`` (etc), rather than unnecessarily duplicating this data. This is not true if an *expression* such as ``-x`` or ``2*pi*rand(M,1)`` is passed to ``setpts``, since in those cases the ``plan`` object does make internal copies, as per MATLAB's usual shallow-copy argument passing.

 Finally, we demo a 2D type 1 transform using the simple interface. Let's
 request a rectangular Fourier mode array of 1000 modes in the x direction but 500 in the