Installation

Building

The build system is driven by CMake. It is good practice to point directly to the MPI Fortran wrapper that you would like to use, to guarantee consistency between the Fortran compiler and MPI. This can be done by setting the default Fortran environment variable

export FC=my_mpif90
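Before configuring, it can be worth confirming that the wrapper named in FC is actually reachable. The following is a minimal sketch; the function name check_fc is hypothetical and not part of 2DECOMP&FFT.

```shell
# Hypothetical helper: succeed only if the compiler wrapper named in FC
# is set and can be found on PATH.
check_fc() {
    if [ -z "${FC:-}" ]; then
        echo "FC is not set" >&2
        return 1
    fi
    if ! command -v "$FC" >/dev/null 2>&1; then
        echo "FC=$FC not found on PATH" >&2
        return 1
    fi
}
```

To see which Fortran compiler a wrapper invokes, Open MPI wrappers accept --showme and MPICH wrappers accept -show.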

To generate the build system run

cmake -S $path_to_sources -B $path_to_build_directory -DOPTION1 -DOPTION2 ...

If the build directory does not exist, it will be created and will contain the configuration files. The configuration can be edited further with the ccmake utility as

ccmake $path_to_build_directory

and editing as desired. Variables that are likely to be of interest are CMAKE_BUILD_TYPE and FFT_Choice; additional variables can be shown by pressing t to enter advanced mode.

By default a RELEASE build will be built. Other options for CMAKE_BUILD_TYPE are DEBUG, which turns on debugging flags, and DEV, which additionally tries to catch coding errors at compile time.

The behaviour of debug and development versions of the library can be changed before the initialization using the variable decomp_debug or the environment variable DECOMP_2D_DEBUG. The value provided with the environment variable must be a positive integer below 9999.
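As a minimal sketch of the constraint on the environment variable, the helper below validates a value before exporting DECOMP_2D_DEBUG. The function name set_decomp_debug is hypothetical, and reading "positive" as "at least 1" is an assumption.

```shell
# Hypothetical helper: export DECOMP_2D_DEBUG only if the value is a
# positive integer below 9999 (assumes "positive" means >= 1).
set_decomp_debug() {
    case "$1" in
        ''|*[!0-9]*)
            echo "DECOMP_2D_DEBUG must be a positive integer" >&2
            return 1 ;;
    esac
    if [ "$1" -ge 1 ] && [ "$1" -lt 9999 ]; then
        export DECOMP_2D_DEBUG="$1"
    else
        echo "DECOMP_2D_DEBUG must be between 1 and 9998" >&2
        return 1
    fi
}
```

The variable has to be set in the environment of the program before the library is initialised, for example in the job script that launches the run.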

Two values of BUILD_TARGET are available, namely mpi and gpu. For the mpi target no additional options should be required, whereas for gpu extra options are necessary at the configure stage. Please see the GPU compilation section.

Once the build system has been configured, you can build 2DECOMP&FFT by running

cmake --build $path_to_build_directory -j <nproc>

Appending -v will display additional information about the build, such as compiler flags.

After building, the library can be tested using ctest; options can be added to change the level of verbosity. Please see the Testing and examples section.

Finally, the built library can be installed by running

cmake --install $path_to_build_directory

The default location for libdecomp2d.a is

  • $path_to_build_directory/opt/lib or

  • $path_to_build_directory/opt/lib64

unless the variable CMAKE_INSTALL_PREFIX is modified.

The module files generated by the build process will similarly be installed to $path_to_build_directory/opt/include. Users of the library should add this to the include paths for their program.
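Because the library may end up in either lib or lib64, a small helper can resolve the right directory and print matching compile and link flags. This is a sketch only: the function name decomp_flags is hypothetical, and it assumes the module files sit in include/ under the install prefix, as in the Makefile example in the section on linking from external codes.

```shell
# Hypothetical helper: print compile/link flags for an installed 2DECOMP&FFT.
# Assumes the install prefix holds include/ and either lib64/ or lib/.
decomp_flags() {
    prefix="$1"
    for libdir in "$prefix/lib64" "$prefix/lib"; do
        if [ -f "$libdir/libdecomp2d.a" ] || [ -f "$libdir/libdecomp2d.so" ]; then
            echo "-I$prefix/include -L$libdir -ldecomp2d"
            return 0
        fi
    done
    echo "libdecomp2d not found under $prefix" >&2
    return 1
}
```

For a default install the prefix would be $path_to_build_directory/opt.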

As indicated above, a static libdecomp2d.a is compiled by default. If desired, a shared library can be built by setting BUILD_SHARED_LIBS=ON, either on the command line:

cmake -S $path_to_sources -B $path_to_build_directory -DBUILD_SHARED_LIBS=ON

or by editing the configuration using ccmake.

This might be useful for a centralised install supporting multiple users that is upgraded over time.

Occasionally a clean build is required; this can be performed by running

cmake --build $path_to_build_directory --target clean
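The configure, build and install steps of this section can be chained in a small script. The sketch below uses only the cmake invocations shown above; the names run and build_decomp and the DRY_RUN switch are hypothetical conveniences. With DRY_RUN=1 the commands are printed instead of executed.

```shell
# Sketch: chain configure, build and install.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: source dir, build dir, build type (default Release), jobs (default 4)
build_decomp() {
    src="$1"; bld="$2"; build_type="${3:-Release}"; jobs="${4:-4}"
    run cmake -S "$src" -B "$bld" -DCMAKE_BUILD_TYPE="$build_type" &&
    run cmake --build "$bld" -j "$jobs" &&
    run cmake --install "$bld"
}
```

For example, build_decomp /path/to/2decomp-fft /path/to/2decomp-fft/build Debug 8 prints (or runs) the three steps in order and stops at the first failure.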

Testing and examples

By default, building of the tests is deactivated. To activate testing, the option -DBUILD_TESTING=ON can be added or, alternatively, the option can be activated in the ccmake interface. After building, the library can be tested by running

ctest --test-dir $path_to_build_directory

which uses the ctest utility. By default tests are performed in serial, but more than 1 rank can be used by setting MPIEXEC_MAX_NUMPROCS with the ccmake utility. It is also possible to specify the decomposition by setting the prow and pcol parameters at the configure stage or using ccmake. During the configure stage users should ensure that the number of MPI tasks MPIEXEC_MAX_NUMPROCS is equal to the product of prow and pcol. Mesh resolution can also be imposed using the parameters NX, NY and NZ.
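The consistency requirement between the number of ranks and the decomposition can be checked before configuring. The following is a minimal sketch; the function name check_decomposition is hypothetical.

```shell
# Hypothetical check: succeed only if the number of MPI ranks equals prow * pcol.
# Arguments: nproc prow pcol
check_decomposition() {
    nproc="$1"; prow="$2"; pcol="$3"
    if [ "$((prow * pcol))" -eq "$nproc" ]; then
        echo "ok: $prow x $pcol = $nproc ranks"
    else
        echo "mismatch: $prow x $pcol != $nproc ranks" >&2
        return 1
    fi
}
```

For instance, 4 ranks are consistent with a 2 x 2 decomposition but not with 2 x 3.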

For the GPU implementation please be aware that it is based on a single MPI rank per GPU. Therefore, to test multiple GPUs, use the maximum number of available GPUs on the system/node and not the maximum number of MPI tasks.

GPU compilation

The library can perform multi-GPU offloading using the NVHPC compiler suite for NVIDIA hardware. The implementation is based on CUDA-aware MPI and the NVIDIA Collective Communication Library (NCCL). The FFT is based on cuFFT.

To configure a GPU build correctly, the following needs to be used

cmake -S $path_to_sources -B $path_to_build_directory -DBUILD_TARGET=gpu

Note that further configuration can be performed using ccmake; however, the initial configuration of GPU builds must include the -DBUILD_TARGET=gpu flag as shown above.

By default CUDA-aware MPI will be used together with cuFFT for the FFT library. The configure step will automatically look for the GPU architecture available on the system. If you are building on an HPC system, please use a compute node for the installation. Useful variables to be added are

  • -DENABLE_NCCL=yes to activate the NCCL collectives

  • -DENABLE_MANAGED=yes to activate the automatic memory management from the NVHPC compiler

If you are getting the following error

-- The CUDA compiler identification is unknown
   CMake Error at /usr/share/cmake/Modules/CMakeDetermineCUDACompiler.cmake:633 (message):
   Failed to detect a default CUDA architecture

it is possible that your default C compiler is too recent and not supported by nvcc. You might be able to solve the issue by adding

-DCMAKE_CUDA_HOST_COMPILER=$supported_gcc

At the moment, gcc11 and earlier are supported as CUDA host compilers.
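To find out whether the default compiler is likely to trigger this error, its major version can be compared against the limit stated above. This is a sketch only: the function name gcc_supported_by_nvcc is hypothetical, and the threshold of 11 is taken from this document rather than queried from nvcc.

```shell
# Hypothetical check: is a GCC version string (e.g. from "gcc -dumpversion")
# within the "gcc11 and earlier" limit stated in this document?
gcc_supported_by_nvcc() {
    major="${1%%.*}"              # keep the part before the first dot
    case "$major" in
        ''|*[!0-9]*) return 2 ;;  # not a version number
    esac
    [ "$major" -le 11 ]
}
```

A typical call would pass the output of gcc -dumpversion; if the check fails, point -DCMAKE_CUDA_HOST_COMPILER at an older installed gcc.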

Linking from external codes

Codes using Makefiles

When building a code that links 2DECOMP&FFT using a Makefile, you will need to add the include and link paths as appropriate (include/ and lib/ under the installation directory, respectively).

DECOMP_ROOT = /path/to/2decomp-fft
DECOMP_BUILD_DIR = $(DECOMP_ROOT)/build
# Use default unless set by user (a trailing comment on the same line would leave a space in the value)
DECOMP_INSTALL_DIR ?= $(DECOMP_BUILD_DIR)/opt

INC += -I$(DECOMP_INSTALL_DIR)/include

# Users build/link targets
LIBS = -L$(DECOMP_INSTALL_DIR)/lib64 -L$(DECOMP_INSTALL_DIR)/lib -ldecomp2d

OBJ = my_exec.o

my_exec: $(OBJ)
	$(F90) -o $@ $(OBJ) $(LIBS)

In case 2DECOMP&FFT has been compiled with an external FFT, such as FFTW3, LIBS should also contain the following

FFTW3_PATH=/my_path_to_FFTW/lib
LIBFFT=-L$(FFTW3_PATH) -lfftw3 -lfftw3f
LIBS += $(LIBFFT)

In case of 2DECOMP&FFT compiled for GPU with NVHPC, linking against cuFFT is mandatory

LIBS += -cudalib=cufft

In case of NCCL the following is required

LIBS += -cudalib=cufft,nccl
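The extra link flags listed above depend on how the library was built. As a sketch, the helper below returns the flags quoted in this section for each case; the function name extra_link_flags and the backend labels are hypothetical.

```shell
# Hypothetical helper: extra link flags for a given 2DECOMP&FFT build.
# $1: generic | fftw | gpu | gpu_nccl    $2: FFTW lib directory (fftw only)
extra_link_flags() {
    case "$1" in
        generic)  echo "" ;;
        fftw)     echo "-L$2 -lfftw3 -lfftw3f" ;;
        gpu)      echo "-cudalib=cufft" ;;
        gpu_nccl) echo "-cudalib=cufft,nccl" ;;
        *)        echo "unknown backend: $1" >&2; return 1 ;;
    esac
}
```

The output can be appended to LIBS, for example by passing it on the make command line.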

It is also possible to drive the build and installation of 2DECOMP&FFT from a Makefile such as in the following example code

FC = mpif90
BUILD = Release

DECOMP_ROOT = /path/to/2decomp-fft
DECOMP_BUILD_DIR = $(DECOMP_ROOT)/build
# Use default unless set by user (a trailing comment on the same line would leave a space in the value)
DECOMP_INSTALL_DIR ?= $(DECOMP_BUILD_DIR)/opt

INC += -I$(DECOMP_INSTALL_DIR)/include

# Users build/link targets
LIBS = -L$(DECOMP_INSTALL_DIR)/lib64 -L$(DECOMP_INSTALL_DIR)/lib -ldecomp2d

# Building libdecomp2d.a
$(DECOMP_INSTALL_DIR)/lib/libdecomp2d.a:
	FC=$(FC) cmake -S $(DECOMP_ROOT) -B $(DECOMP_BUILD_DIR) -DCMAKE_BUILD_TYPE=$(BUILD) -DCMAKE_INSTALL_PREFIX=$(DECOMP_INSTALL_DIR)
	cmake --build $(DECOMP_BUILD_DIR) --target decomp2d
	cmake --build $(DECOMP_BUILD_DIR) --target install

# Clean libdecomp2d.a
clean-decomp:
	cmake --build $(DECOMP_BUILD_DIR) --target clean
	rm -f $(DECOMP_INSTALL_DIR)/lib/libdecomp2d.a

Profiling

Profiling can be activated via the CMake configuration. The recommended approach is to run the initial configuration as follows:

export caliper_DIR=/path/to/caliper/install/share/cmake/caliper
export FC=mpif90
export CXX=mpicxx
cmake -S $path_to_sources -B $path_to_build_directory -DENABLE_PROFILER=caliper

where ENABLE_PROFILER is set to the desired profiling tool. The only value currently supported is caliper. Note that when using caliper a C++ compiler is required, as indicated in the above command line.

Miscellaneous

List of preprocessor variables

  • DEBUG - This variable is automatically added in debug and dev builds. Extra information is printed when it is present.

  • DOUBLE_PREC - When this variable is not present, the library uses single precision. When it is present, the library uses double precision. This preprocessor variable is driven by the CMake on/off variable DOUBLE_PRECISION.

  • SAVE_SINGLE - This variable is valid for double precision builds only. When it is present, snapshots are written in single precision. This preprocessor variable is driven by the CMake on/off variable SINGLE_PRECISION_OUTPUT.

  • PROFILER - This variable is automatically added when selecting the profiler. It activates the profiling sections of the code.

  • EVEN - This preprocessor variable is not valid for GPU builds. It leads to padded alltoall operations. This preprocessor variable is driven by the CMake on/off variable EVEN.

  • OVERWRITE - This variable allows the input array to be overwritten when computing FFTs. The support of this flag does not always correspond to in-place transforms, depending on the FFT backend selected, as described above. This preprocessor variable is driven by the CMake on/off variable ENABLE_INPLACE.

  • HALO_DEBUG - This variable is used to debug the halo operations. This preprocessor variable is driven by the CMake on/off variable HALO_DEBUG.

  • _GPU - This variable is automatically added in GPU builds.

  • _NCCL - This variable is valid only for GPU builds. The NVIDIA Collective Communication Library (NCCL) implements multi-GPU and multi-node communication primitives optimized for NVIDIA GPUs and Networking.
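Several of the preprocessor variables above are controlled through CMake on/off variables. As a sketch, the helper below turns NAME=ON|OFF pairs into -D options for the configure command, accepting only the CMake variable names given in this list; the function name decomp_cmake_options is hypothetical.

```shell
# Hypothetical helper: build -D options from NAME=ON|OFF pairs, restricted
# to the CMake on/off variables named in the list above.
decomp_cmake_options() {
    opts=""
    for pair in "$@"; do
        name="${pair%%=*}"; value="${pair#*=}"
        case "$name" in
            DOUBLE_PRECISION|SINGLE_PRECISION_OUTPUT|EVEN|ENABLE_INPLACE|HALO_DEBUG) ;;
            *) echo "unknown option: $name" >&2; return 1 ;;
        esac
        case "$value" in
            ON|OFF) ;;
            *) echo "value for $name must be ON or OFF" >&2; return 1 ;;
        esac
        opts="$opts${opts:+ }-D$name=$value"
    done
    echo "$opts"
}
```

The printed options can be appended to the cmake -S ... -B ... configure command.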

Optional dependencies

FFTW

The library FFTW can be used as a backend for the FFT engine. Version 3.3.10 has been tested and is supported; it can be downloaded from the FFTW download page. Please note that FFTW and decomp2d should be built with the same compilers. For build instructions, please check the FFTW installation instructions. Below is a suggestion for the compilation of the library in double precision (add --enable-single for a single precision build):

wget http://www.fftw.org/fftw-3.3.10.tar.gz
tar xzf fftw-3.3.10.tar.gz
mkdir fftw-3.3.10_tmp && cd fftw-3.3.10_tmp
../fftw-3.3.10/configure --prefix=xxxxxxx/fftw3/fftw-3.3.10_bld --enable-shared
make -j
make -j check
make install

Please note that the resulting build is not compatible with CMake (see here). As a workaround, one can open the file /path/to/fftw3/install/lib/cmake/fftw3/FFTW3Config.cmake and comment out the line

include ("${CMAKE_CURRENT_LIST_DIR}/FFTW3LibraryDepends.cmake")

To build 2DECOMP&FFT against fftw3, one can provide the package configuration for fftw3 in the PKG_CONFIG_PATH environment variable; it should be found under /path/to/fftw3/install/lib/pkgconfig. One can also provide the option -DFFTW_ROOT=/path/to/fftw3/install. Then either specify on the command line when configuring the build

cmake -S . -B build -DFFT_Choice=<fftw|fftw_f03> -DFFTW_ROOT=/path/to/fftw3/install

or modify the build configuration using ccmake.
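Setting PKG_CONFIG_PATH for FFTW can be sketched as follows. The function name add_fftw_pkgconfig is hypothetical; it prepends the lib/pkgconfig directory of a given FFTW install prefix and keeps any existing entries.

```shell
# Hypothetical helper: prepend <fftw prefix>/lib/pkgconfig to PKG_CONFIG_PATH,
# preserving whatever was already there.
add_fftw_pkgconfig() {
    dir="$1/lib/pkgconfig"
    export PKG_CONFIG_PATH="$dir${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
}
```

After calling add_fftw_pkgconfig /path/to/fftw3/install, the configure command can be run with -DFFT_Choice set as above.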

Note that the legacy fftw interface lacks interface definitions and will fail when stricter compilation flags are used (e.g. with -DCMAKE_BUILD_TYPE=Dev). In this case it is recommended to use fftw_f03, which provides proper interfaces.

Caliper

The library caliper can be used to profile the execution of the code. Version 2.9.1 has been tested and is supported; version 2.8.0 has also been tested and is still expected to work. Please note that caliper and decomp2d must be built with the same C/C++/Fortran compilers and MPI library. For build instructions, please check the Caliper GitHub page. Below is a suggestion for the compilation of the library using the GNU compilers:

git clone https://github.com/LLNL/Caliper.git caliper_github
cd caliper_github
git checkout v2.9.1
mkdir build && cd build
cmake -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DCMAKE_Fortran_COMPILER=gfortran -DCMAKE_INSTALL_PREFIX=../../caliper_build_2.9.1 -DWITH_FORTRAN=yes -DWITH_MPI=yes -DBUILD_TESTING=yes ../
make -j
make test
make install

After installing Caliper, make sure to set caliper_DIR=/path/to/caliper/install/share/cmake/caliper. Following this, the 2DECOMP&FFT build can be configured to use Caliper profiling as

cmake -S . -B build -DENABLE_PROFILER=caliper

or by modifying the configuration to set ENABLE_PROFILER=caliper via ccmake.
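Since the profiling build depends on caliper_DIR, a quick check before configuring can catch a missing or mistyped path. The following is a minimal sketch; the function name check_caliper_dir is hypothetical.

```shell
# Hypothetical check: succeed only if caliper_DIR is set and is a directory.
check_caliper_dir() {
    if [ -z "${caliper_DIR:-}" ]; then
        echo "caliper_DIR is not set" >&2
        return 1
    fi
    if [ ! -d "$caliper_DIR" ]; then
        echo "caliper_DIR=$caliper_DIR is not a directory" >&2
        return 1
    fi
    echo "using Caliper configuration in $caliper_DIR"
}
```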