API for Three-dimensional FFTs¶
To use the FFT programming interface, first of all, one additional Fortran module has to be used:
use decomp_2d_fft
Initialisation of the FFT module¶
The FFT interface is built on top of the 2D decomposition library, which, naturally, needs to be initialised first:
call decomp_2d_init(nx, ny, nz, p_row, p_col)
where \(nx\times ny\times nz\) is the 3D domain size and \(p\_row \times p\_col\) is the 2D processor grid. Then one needs to initialise the FFT interface by:
call decomp_2d_fft_init
The initialisation routine handles planing for the underlying FFT engine (if supported)
and defines global data structures (such as temporary work spaces) for the computations.
By default, it assumes that physical-space data is distributed in X-pencil format PHYSICAL_IN_X
.
The corresponding spectral-space data is stored in transposed Z-pencil format after the FFT.
To give applications more flexibility, the library also supports the opposite direction Z-pensil,
passing the optional parameter PHYSICAL_IN_Z
:
call decomp_2d_fft_init(PHYSICAL_IN_Z)
Physical-space data in Y-pencil is not supported since it requires additional expensive transpositions which does not make economical sense. There is a third and the most flexible form of the initialisation routine:
call decomp_2d_fft_init(pencil, n1, n2, n3)
where pencil=PHYSICAL_IN_X
or PHYSICAL_IN_Z
and n1, n2, n3
is an arbitrary problem size
different from \(nx\times ny\times nz\).
The result of the decomp_2d_fft_init
operation is to create two new objects of type DECOMP_INFO
:
ph
- structure with default size \(nx\times ny\times nz\) or size \(n1\times n2\times n3\) in case of arbitrary defined problemsh
- structure for ther2c/c2r
transform which with dimensions:PHYSICAL_IN_X
- \(nx/2+1\times ny\times nz\) (default) or \(n1/2+1\times n2\times n3\) (customized)PHYSICAL_IN_Z
- \(nx\times ny\times nz/2+1\) (default) or \(n1\times n2\times n3/2+1\) (customized)
Complex-to-complex Transforms¶
The library supports three-dimensional FFTs whose data is distributed as 2D pencils and stored in ordinary ijk-ordered 3D arrays across processors. For complex-to-complex (c2c) FFTs, the user interface is:
call decomp_2d_fft_3d(input, output, direction)
where direction
can be either DECOMP_2D_FFT_FORWARD == -1
for forward transforms, or DECOMP_2D_FFT_BACKWARD == 1
for backward transforms.
The input array input
and output
array out are both complex and
have to be either a X-pencil/Z-pencil combination or vice versa, depending on the direction of FFT and
how the FFT interface is initialised (PHYSICAL_IN_X
, the default, or PHYSICAL_IN_Z
the optional).
Real-to-complex & Complex-to-Real Transforms¶
The interface for the the real-to-complex and complex-to-real transform is
call decomp_2d_fft_3d(input, output)
If the input
data are real type a forward transform is assumed obtaining a complex output
.
Similarly a backward FFT is computed if input
is a complex array and output
a real array.
When real input is involved, the corresponding complex output satisfies so-called Hermitian redundancy -
i.e. some output values are complex conjugates of others.
Taking advantage of this, FFT algorithms can normally compute r2c and c2r transforms twice as fast as c2c transforms
while only using about half of the memory.
Unfortunately, the price to pay is that application’s data structures have to become slightly more complex.
For a 3D real input data set of size \(nx\times ny\times nz\) in a X-pencil deposition,
the complex output can be held in an array of size \(nx/2+1\times ny\times nz\), with the first dimension being cut roughly in half.
This change in size is reflected in the dimension assigned to the sp
structure previously described
The size of the sp
can also be recovered using the following routine:
call decomp_2d_fft_get_size(start,end,size)
Here all three arguments are 1D array of three elements, returning to the caller the starting index,
ending index and size of the sub-domain held by the current processor -
information very similar to the start/end/size
variables defined in the main decomposition library.
Please note that the complex output arrays obtained from X-pencil and Z-pencil input do not contain identical information. However, if Hermitian redundancy is taken into account, no physical information is lost and the real input can be fully recovered through the corresponding inverse FFT from either complex array.
Please also note that 2DECOMP&FFT does not scale the transforms. So a forward transform followed by a backward transform will not recover the input unless applications normalise the result by the size of the transforms.
Deallocation of the FFT module¶
Finally, to release the memory used by the FFT interface:
call decomp_2d_fft_finalize
It is possible to re-initialise the FFT interface in the same application at the later stage after it has been finalised, if this becomes necessary.