| 1 | +.. _nfft_migr: |
| 2 | + |
| 3 | +Migration guide from NFFT3 |
| 4 | +========================== |
| 5 | + |
| 6 | +Here we outline how the C/C++ user can replace NUFFT calls to the popular |
| 7 | +`Chemnitz NFFT3 library <https://www-user.tu-chemnitz.de/~potts/nfft/>`_ with |
| 8 | +FINUFFT CPU calls to achieve the same effect, possibly with more performance |
| 9 | +or less RAM usage. |
| 10 | +See [KKP] in the :ref:`references<refs>` for more about NFFT3, and [FIN] for |
| 11 | +some performance and RAM comparisons performed using the codes available in 2018. |
| 12 | +We use the `NFFT source on GitHub <https://github.com/NFFT/nfft>`_, version 3.5.4alpha. |
| 13 | +So far we only discuss: |
| 14 | + |
| 15 | + * the adjoint NFFT (a.k.a. type 1) |
| 16 | + |
| 17 | +Also of interest (but not yet demonstrated below) are: |
| 18 | + |
| 19 | + * the forward NFFT transform (a.k.a. type 2) |
| 20 | + * the nonuniform to nonuniform NNFFT (a.k.a. type 3) |
| 21 | + |
| 22 | + .. note:: The NFFT3 library can do more things---real-valued data, sphere, rotation group, hyperbolic cross, inverse transforms---none of which FINUFFT can yet do directly (although our three transforms can be used as components in such tasks). We do not address those here. |
| 23 | + |
| 24 | +Migrating a 2D adjoint transform (type 1) in C from NFFT3 to FINUFFT |
| 25 | +-------------------------------------------------------------------- |
| 26 | + |
| 27 | +We start with the simplest example of using NFFT3 on "user data" generated |
| 28 | +using plain, transparent, C commands (rather than relying on NFFT3-supplied |
| 29 | +data-generation, direct transform, and printing utilities as in the |
| 30 | +NFFT example :file:`examples/nfft/simple_test.c`, or even its simplest |
| 31 | +version |
| 32 | +at https://www-user.tu-chemnitz.de/~potts/nfft/download/nfft_simple_test.tar.gz ). |
| 33 | +We choose 2D since it is the simplest rectangular |
| 34 | +case that illustrates how to get the transposed |
| 35 | +coordinates (or mode array) ordering correct. |
| 36 | +After installing NFFT3 one should be able to compile and run the following: |
| 37 | + |
| 38 | +.. literalinclude:: ../tutorial/nfft2d1_test.c |
| 39 | + :language: c |
| 40 | + |
| 41 | +This is a basic example, running single-threaded, at the highest precision |
| 42 | +(using ``nfft_init_guru`` would allow more control). |
| 43 | +It demonstrates: i) the NFFT3 user must write their data into arrays allocated |
| 44 | +by ``nfft_plan``, |
| 45 | +ii) the single nonuniform point coordinate array |
| 46 | +is interleaved ($x_1, y_1, x_2, y_2, \dots, x_M, y_M$), |
| 47 | +iii) the output mode ordering is C (row-major) rather than Fortran (column-major; |
| 48 | +this affects how to convert frequency indices into the output array index), |
| 49 | +and iv) there is an extra factor of $2\pi$ in the exponent relative |
| 50 | +to the FINUFFT definition, because NFFT3 assumes a 1-periodic input domain. |
| 51 | +The code is found in our :file:`tutorial/nfft2d1_test.c`. Running the executable gives: |
| 52 | + |
| 53 | +:: |
| 54 | + |
| 55 | + 2D type 1 (NFFT3) done in 0.589 s: f_hat[-17,33]=86.0632804289+-350.023846367i, rel err 9.93e-14 |
| 56 | + |
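Point iii) above can be made concrete with a small sketch of how a frequency pair $(k_1,k_2)$ maps to a position in the output array under each ordering. This is only an illustration, not code from either library: the helper names ``rowmajor_index`` and ``colmajor_index`` are hypothetical, and we assume even mode counts $N_1, N_2$ with frequencies running over $-N/2 \le k < N/2$ in each dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (not part of NFFT3 or FINUFFT).
   Assumes even N1, N2 and frequencies -N/2 <= k < N/2 in each dimension. */

/* Row-major (C) ordering, as used for the NFFT3 f_hat array:
   the last frequency index k2 varies fastest in memory. */
size_t rowmajor_index(int k1, int k2, int N1, int N2) {
    assert(N1 % 2 == 0 && N2 % 2 == 0);
    assert(k1 >= -N1 / 2 && k1 < N1 / 2);
    assert(k2 >= -N2 / 2 && k2 < N2 / 2);
    return (size_t)(k1 + N1 / 2) * (size_t)N2 + (size_t)(k2 + N2 / 2);
}

/* Column-major (Fortran) ordering, as used for the FINUFFT mode array:
   the first frequency index k1 varies fastest in memory. */
size_t colmajor_index(int k1, int k2, int N1, int N2) {
    assert(N1 % 2 == 0 && N2 % 2 == 0);
    assert(k1 >= -N1 / 2 && k1 < N1 / 2);
    assert(k2 >= -N2 / 2 && k2 < N2 / 2);
    return (size_t)(k1 + N1 / 2) + (size_t)N1 * (size_t)(k2 + N2 / 2);
}
```

For any valid pair, ``rowmajor_index(k1, k2, N1, N2)`` equals ``colmajor_index(k2, k1, N2, N1)``: a column-major array with its two dimensions swapped has the same memory layout as the row-major one.
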
| 57 | +To show how to migrate this, we write a self-contained code that generates exactly |
| 58 | +the same "user data" (same random seed), then uses FINUFFT to do the transform |
| 59 | +to achieve exactly the same ``f_hat`` output array (in row-major C ordering). |
| 60 | +This entails scaling and swapping the nonequispaced coordinates just before sending |
| 61 | +to FINUFFT. Here is the corresponding C code (compare to the above): |
| 62 | + |
| 63 | +.. literalinclude:: ../tutorial/migrate2d1_test.c |
| 64 | + :language: c |
| 65 | + |
| 66 | +The fact that NFFT3 uses row-major mode arrays whereas FINUFFT uses column-major has |
| 67 | +been handled here by swapping the input $x$ and $y$ coordinates and array sizes in the |
| 68 | +FINUFFT call. (Equivalently, this could have been achieved by transposing the ``f_hat`` |
| 69 | +output array. We recommend the former route since it saves memory.) Running the |
| 70 | +executable gives: |
| 71 | + |
| 72 | +:: |
| 73 | + |
| 74 | + 2D type 1 (FINUFFT) in 0.0787 s: f_hat[-17,33]=86.0632804289+-350.023846367i, rel err 9.58e-14 |
| 75 | + |
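The scaling and swapping of coordinates described above can be sketched in isolation as follows. This is a minimal illustration rather than the tutorial code itself: the function name ``nfft_to_finufft_coords`` and its argument names are hypothetical, and we assume the NFFT3 points are stored interleaved as $x_1, y_1, \dots, x_M, y_M$ on a 1-periodic domain, while FINUFFT takes two separate arrays on a $2\pi$-periodic domain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of NFFT3 or FINUFFT).
   x_nfft : interleaved NFFT3-style points (x_1, y_1, ..., x_M, y_M), 1-periodic.
   xf, yf : caller-allocated length-M arrays for the FINUFFT-style first and
            second coordinates, 2*pi-periodic. */
void nfft_to_finufft_coords(size_t M, const double *x_nfft,
                            double *xf, double *yf) {
    const double twopi = 2.0 * 3.14159265358979323846;
    assert(x_nfft != NULL && xf != NULL && yf != NULL);
    for (size_t j = 0; j < M; ++j) {
        /* Swap: the NFFT3 y coordinate becomes the first (fast) dimension,
           and the NFFT3 x coordinate becomes the second. */
        xf[j] = twopi * x_nfft[2 * j + 1];
        yf[j] = twopi * x_nfft[2 * j];
    }
}
```

The arrays ``xf`` and ``yf`` would then be passed to the FINUFFT 2D type 1 call with the mode counts also swapped (the NFFT3 second-dimension size first), so that the column-major output matches the row-major NFFT3 ``f_hat`` entry for entry.
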
| 76 | +Comparing to the above, we see the same answer to all shown digits, a similar error for this tested output entry, plus a 7.5$\times$ speed-up. (Both use a single thread, tested on the same AMD 5700U laptop.) The user may of course now set a coarser (larger) value for ``tol`` and see a further speed-up. |
| 77 | + |
| 78 | +We believe that the above gives the essentials of how to convert your code from using NFFT3 to FINUFFT. Please read our documentation, especially the guru interface if multiple related transforms are required, then post a GitHub Issue if you are still stuck. |