Commit d6246e1

Add the api_fft
1 parent a128bb9 commit d6246e1

1 file changed

+113
-0
lines changed

docs/source/pages/api_fft.rst

Lines changed: 113 additions & 0 deletions
@@ -1,3 +1,116 @@
==============================
API for Three-dimensional FFTs
==============================

To use the FFT programming interface, one additional Fortran module has to be used:

::

  use decomp_2d_fft


The FFT interface is built on top of the 2D decomposition library, which
needs to be initialised first:

::

  call decomp_2d_init(nx, ny, nz, p_row, p_col)


where :math:`nx\times ny\times nz` is the 3D domain size and :math:`p\_row \times p\_col`
is the 2D processor grid.
The FFT interface is then initialised with:

::

  call decomp_2d_fft_init


The initialisation routine handles planning for the underlying FFT engine (if supported)
and defines global data structures (such as temporary work spaces) for the computations.
By default, it assumes that physical-space data is distributed in X-pencil format (``PHYSICAL_IN_X``).
The corresponding spectral-space data is stored in transposed Z-pencil format after the FFT.
To give applications more flexibility, the library also supports the opposite orientation,
with physical-space data in Z-pencil format, selected by passing the optional parameter ``PHYSICAL_IN_Z``:

::

  call decomp_2d_fft_init(PHYSICAL_IN_Z)


Physical-space data in Y-pencil format is not supported, since it would require additional expensive transpositions
that do not make economic sense.
There is a third, most flexible form of the initialisation routine:

::

  call decomp_2d_fft_init(pencil, n1, n2, n3)


where ``pencil`` is ``PHYSICAL_IN_X`` or ``PHYSICAL_IN_Z`` and ``n1, n2, n3`` is an arbitrary problem size,
which may differ from :math:`nx\times ny\times nz`.
A call to ``decomp_2d_fft_init`` creates two new objects of type ``DECOMP_INFO``:

#. ``ph`` - structure with the default size :math:`nx\times ny\times nz`, or size :math:`n1\times n2\times n3`
   for an arbitrarily defined problem

#. ``sp`` - structure for the ``r2c/c2r`` transforms, with dimensions:

   * ``PHYSICAL_IN_X`` - :math:`(nx/2+1)\times ny\times nz` (default) or :math:`(n1/2+1)\times n2\times n3` (customised)

   * ``PHYSICAL_IN_Z`` - :math:`nx\times ny\times (nz/2+1)` (default) or :math:`n1\times n2\times (n3/2+1)` (customised)

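
The initialisation sequence described above can be sketched as a complete program.
This is only a minimal sketch under stated assumptions: the grid size, the :math:`2\times 2` processor grid
and the program name are illustrative, and the MPI calls and ``decomp_2d_finalize`` come from MPI and the
main decomposition library rather than from the FFT interface documented here.
Depending on the library version, constants such as ``PHYSICAL_IN_Z`` may need an additional ``use`` statement.

::

  program fft_init_sketch
    use mpi
    use decomp_2d
    use decomp_2d_fft
    implicit none

    ! Illustrative values only; run with p_row*p_col = 4 MPI ranks
    integer :: nx, ny, nz, p_row, p_col, ierror

    nx = 64; ny = 64; nz = 64
    p_row = 2; p_col = 2

    call MPI_INIT(ierror)

    ! The 2D decomposition library must be initialised before the FFT interface
    call decomp_2d_init(nx, ny, nz, p_row, p_col)

    ! Physical-space data in Z-pencils; omit the argument for the X-pencil default
    call decomp_2d_fft_init(PHYSICAL_IN_Z)

    ! ... transforms would be computed here ...

    ! Release the FFT work spaces, then the decomposition library and MPI
    call decomp_2d_fft_finalize
    call decomp_2d_finalize
    call MPI_FINALIZE(ierror)
  end program fft_init_sketch
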

**Complex-to-complex Transforms**

The library supports three-dimensional FFTs whose data is distributed as 2D pencils and stored in ordinary ijk-ordered 3D arrays across processors.
For complex-to-complex (c2c) FFTs, the user interface is:

::

  call decomp_2d_fft_3d(input, output, direction)


where ``direction`` is either ``DECOMP_2D_FFT_FORWARD == -1`` for forward transforms or ``DECOMP_2D_FFT_BACKWARD == 1`` for backward transforms.
The ``input`` and ``output`` arrays are both complex
and have to be either an X-pencil/Z-pencil combination or vice versa, depending on the direction of the FFT and
how the FFT interface is initialised (``PHYSICAL_IN_X``, the default, or the optional ``PHYSICAL_IN_Z``).

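
A forward and backward c2c transform with the default ``PHYSICAL_IN_X`` layout might look as follows.
This is a hedged sketch rather than a reference implementation: the array names ``u`` and ``u_hat``, the grid size
and the processor grid are illustrative, and ``mytype``, ``xsize`` and ``zsize`` are assumed to be exported
by the main decomposition library (in some library versions they may live in a separate module).
Because the transforms are not scaled, the sketch divides by the number of grid points after the round trip.

::

  program fft_c2c_sketch
    use mpi
    use decomp_2d        ! assumed to export mytype, xsize and zsize
    use decomp_2d_fft
    implicit none

    integer :: nx, ny, nz, p_row, p_col, ierror
    complex(mytype), allocatable :: u(:,:,:), u_hat(:,:,:)
    real(mytype) :: err

    ! Illustrative values only; run with p_row*p_col = 4 MPI ranks
    nx = 32; ny = 32; nz = 32
    p_row = 2; p_col = 2

    call MPI_INIT(ierror)
    call decomp_2d_init(nx, ny, nz, p_row, p_col)
    call decomp_2d_fft_init                  ! default: PHYSICAL_IN_X

    ! Physical space is held in X-pencils, spectral space in Z-pencils
    allocate(u(xsize(1), xsize(2), xsize(3)))
    allocate(u_hat(zsize(1), zsize(2), zsize(3)))
    u = cmplx(1.0_mytype, 0.0_mytype, kind=mytype)

    call decomp_2d_fft_3d(u, u_hat, DECOMP_2D_FFT_FORWARD)
    call decomp_2d_fft_3d(u_hat, u, DECOMP_2D_FFT_BACKWARD)

    ! The transforms are unscaled: divide by the total number of points
    u = u / (real(nx, mytype) * real(ny, mytype) * real(nz, mytype))

    ! Round-trip error on this rank only
    err = maxval(abs(u - cmplx(1.0_mytype, 0.0_mytype, kind=mytype)))
    print *, 'local round-trip error:', err

    deallocate(u, u_hat)
    call decomp_2d_fft_finalize
    call decomp_2d_finalize
    call MPI_FINALIZE(ierror)
  end program fft_c2c_sketch

With ``PHYSICAL_IN_Z`` the roles of the two arrays would swap: the physical-space array would use ``zsize`` and the spectral-space array ``xsize``.
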

**Real-to-complex & Complex-to-Real Transforms**

The interface for the real-to-complex and complex-to-real transforms is:

::

  call decomp_2d_fft_3d(input, output)


If ``input`` is a real array and ``output`` a complex array, a forward transform is computed.
Similarly, a backward FFT is computed if ``input`` is a complex array and ``output`` a real array.
When real input is involved, the corresponding complex output satisfies the so-called *Hermitian redundancy*,
i.e. some output values are complex conjugates of others.
Taking advantage of this, FFT algorithms can normally compute r2c and c2r transforms twice as fast as c2c transforms
while only using about half of the memory.
Unfortunately, the price to pay is that the application's data structures have to become slightly more complex.
For a 3D real input data set of size :math:`nx\times ny\times nz` in an X-pencil decomposition,
the complex output can be held in an array of size :math:`(nx/2+1)\times ny\times nz`, with the first dimension being cut roughly in half.
This change in size is reflected in the dimensions assigned to the ``sp`` structure described previously.
The size of ``sp`` can also be recovered using the following routine:

::

  call decomp_2d_fft_get_size(start,end,size)


Here all three arguments are 1D arrays of three elements, returning to the caller the starting index,
ending index and size of the sub-domain held by the current processor.
This information is very similar to the ``start/end/size`` variables defined in the main decomposition library.

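
The r2c/c2r interface and ``decomp_2d_fft_get_size`` can be combined roughly as in the sketch below.
The names ``u``, ``u_hat``, ``fft_start``, ``fft_end`` and ``fft_size`` are illustrative, as are the grid size and processor grid;
``mytype`` and ``xsize`` are assumed to be exported by the main decomposition library.
The complex array is allocated from the index ranges returned by ``decomp_2d_fft_get_size``, since its global size differs from that of the real array.

::

  program fft_r2c_sketch
    use mpi
    use decomp_2d        ! assumed to export mytype and xsize
    use decomp_2d_fft
    implicit none

    integer :: nx, ny, nz, p_row, p_col, ierror
    integer, dimension(3) :: fft_start, fft_end, fft_size
    real(mytype), allocatable :: u(:,:,:)
    complex(mytype), allocatable :: u_hat(:,:,:)

    ! Illustrative values only; run with p_row*p_col = 4 MPI ranks
    nx = 32; ny = 32; nz = 32
    p_row = 2; p_col = 2

    call MPI_INIT(ierror)
    call decomp_2d_init(nx, ny, nz, p_row, p_col)
    call decomp_2d_fft_init(PHYSICAL_IN_X)

    ! Real physical-space data in X-pencils
    allocate(u(xsize(1), xsize(2), xsize(3)))
    u = 1.0_mytype

    ! Complex spectral-space data: ask the library for the local index ranges
    call decomp_2d_fft_get_size(fft_start, fft_end, fft_size)
    allocate(u_hat(fft_start(1):fft_end(1), &
                   fft_start(2):fft_end(2), &
                   fft_start(3):fft_end(3)))

    call decomp_2d_fft_3d(u, u_hat)   ! real input, complex output: forward r2c
    call decomp_2d_fft_3d(u_hat, u)   ! complex input, real output: backward c2r

    ! The transforms are unscaled: divide by the total number of points
    u = u / (real(nx, mytype) * real(ny, mytype) * real(nz, mytype))

    print *, 'local spectral size:', fft_size
    print *, 'local round-trip error:', maxval(abs(u - 1.0_mytype))

    deallocate(u, u_hat)
    call decomp_2d_fft_finalize
    call decomp_2d_finalize
    call MPI_FINALIZE(ierror)
  end program fft_r2c_sketch
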

Please note that the complex output arrays obtained from X-pencil and Z-pencil input do not contain identical information.
However, if *Hermitian redundancy* is taken into account, no physical information is lost and the real input can be fully recovered
through the corresponding inverse FFT from either complex array.

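
To make the redundancy concrete: for a real input :math:`u` of size :math:`nx\times ny\times nz`, with wavenumber indices
:math:`k_1, k_2, k_3` counted from zero (the symbols are introduced here for illustration only), the full discrete Fourier transform :math:`\hat{u}` satisfies

.. math::

   \hat{u}(k_1, k_2, k_3) = \overline{\hat{u}\big((nx-k_1) \bmod nx,\ (ny-k_2) \bmod ny,\ (nz-k_3) \bmod nz\big)}

X-pencil input keeps only :math:`k_1 = 0, \ldots, nx/2` (integer division), whereas Z-pencil input keeps only :math:`k_3 = 0, \ldots, nz/2`.
The two arrays therefore store different halves of the same spectrum, and the omitted half follows from the relation above in either case.
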

Please also note that ``2decomp&FFT`` does not scale the transforms, so a forward transform followed by a backward transform
will not recover the input unless the application normalises the result by the size of the transform.


**Finalisation**

Finally, to release the memory used by the FFT interface:

::

  call decomp_2d_fft_finalize

It is possible to re-initialise the FFT interface at a later stage in the same application, after it has been finalised, if this becomes necessary.
