Building against HPC SDK
HPC SDK 22.3 ships with both cuFFT and cuFFTMp. The two cannot be used simultaneously. However, since cuFFTMp is a superset of cuFFT, it can be used in place of cuFFT.
The cuFFT headers are located in
.../math_libs/X.Y/include/ while the cuFFTMp headers are located in
When compiling an application against cuFFTMp, ensure that either the cuFFT headers are not included at compile time, or the cuFFTMp headers are included before the cuFFT headers.
An application cannot link against both cuFFT (libcufft.so) and cuFFTMp (libcufftMp.so). Doing so will lead to runtime errors.
Both of these requirements are satisfied automatically when building with the nvc -cudalib=cufftmp flag.
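The link-time requirement above can be checked on a built binary. The following is a minimal sketch, not part of the HPC SDK: check_cufft_link is a hypothetical helper that reads ldd-style output on standard input and reports whether both libcufft.so and libcufftMp.so appear among the dependencies.

```shell
# Hypothetical helper (not shipped with the HPC SDK).
# Reads `ldd`-style output on stdin and prints "conflict" (exit status 1)
# if both libcufft.so and libcufftMp.so are listed, otherwise "ok" (exit status 0).
check_cufft_link() {
    deps=$(cat)
    has_cufft=0
    has_cufftmp=0
    # 'libcufft\.so' does not match 'libcufftMp.so', so the two tests are independent.
    if printf '%s\n' "$deps" | grep -q 'libcufft\.so'; then has_cufft=1; fi
    if printf '%s\n' "$deps" | grep -q 'libcufftMp\.so'; then has_cufftmp=1; fi
    if [ "$has_cufft" -eq 1 ] && [ "$has_cufftmp" -eq 1 ]; then
        echo "conflict"
        return 1
    fi
    echo "ok"
    return 0
}

# Example usage, assuming an executable named ./app (hypothetical):
#   ldd ./app | check_cufft_link
```

A "conflict" result suggests that the build pulls in both libraries, for example through an extra link flag in addition to -cudalib=cufftmp.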
Building and running on Summit
cuFFTMp requires CUDA 11.4. On Summit, the required modules can be loaded with ml cuda/11.4 nvhpc/X.Y spectrum-mpi/10.4.0.3-20210112.
cuFFTMp also requires CUDA_VISIBLE_DEVICES to be identical on every process. This means the proper usage of
jsrun to run cuFFTMp on two nodes with 6 processes (each with 1 GPU and 4 cores) per node is jsrun -n 2 -a 6 -c 24 -g 6 ....
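The arithmetic behind these flags can be sketched as follows. This assumes one resource set per node, so that all processes on a node share the same set of GPUs. Here -n is the number of resource sets, -a the number of tasks per resource set, and -c and -g the cores and GPUs per resource set. The jsrun_flags function is a hypothetical helper for illustration, not part of jsrun.

```shell
# Hypothetical helper: derive jsrun flags with one resource set per node.
# Arguments: nodes, processes per node, cores per process, GPUs per process.
jsrun_flags() {
    nodes=$1
    procs=$2
    cores=$3
    gpus=$4
    # Cores and GPUs are requested per resource set, so multiply by the
    # number of processes placed in each resource set.
    echo "-n $nodes -a $procs -c $((procs * cores)) -g $((procs * gpus))"
}

jsrun_flags 2 6 4 1   # prints: -n 2 -a 6 -c 24 -g 6
```

Requesting one resource set per process instead (for example -n 12 -a 1 -g 1) would typically give each process a different CUDA_VISIBLE_DEVICES value, which is what the requirement above rules out.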
Since CUDA-aware Spectrum-MPI is not compatible with CUDA 11.4, the CUDA-aware features of Spectrum-MPI cannot be used with cuFFTMp.