Usage tips
Building against HPC SDK
HPC SDK 22.3 ships with both cuFFT and cuFFTMp. The two cannot be used simultaneously. However, since cuFFTMp is a superset of cuFFT, it can be used in place of cuFFT.
The cuFFT headers are located in .../math_libs/X.Y/include/ while the cuFFTMp headers are located in .../math_libs/X.Y/include/cufftmp/.
When compiling an application against cuFFTMp, ensure that either the cuFFT headers are not included at compile time, or the cuFFTMp headers are included before the cuFFT headers.
An application cannot link against both cuFFT (libcufft.so) and cuFFTMp (libcufftMp.so). Doing so will lead to runtime errors.
Both of these requirements are automatically satisfied when building with nvc -cudalib=cufftmp.
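One way to catch the double-linking problem before it surfaces as a runtime error is to inspect the shared-library dependencies of the built binary. The following is a minimal sketch, not part of the HPC SDK: check_cufft_linkage is a hypothetical helper that reads ldd-style output on standard input and reports whether both libcufft.so and libcufftMp.so appear. It assumes the dependency list names the libraries as the text above does, optionally followed by a version suffix.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the HPC SDK): reads `ldd`-style
# output on stdin and fails if both cuFFT and cuFFTMp are listed.
# Intended use, with ./my_app standing in for your own binary:
#   ldd ./my_app | check_cufft_linkage
check_cufft_linkage() {
    deps=$(cat)
    has_cufft=no
    has_cufftmp=no
    # Matches libcufft.so and versioned names such as libcufft.so.10,
    # but not libcufftMp.so (the match is case-sensitive and literal).
    if printf '%s\n' "$deps" | grep -Eq '(^|[[:space:]/])libcufft\.so'; then
        has_cufft=yes
    fi
    if printf '%s\n' "$deps" | grep -Eq '(^|[[:space:]/])libcufftMp\.so'; then
        has_cufftmp=yes
    fi
    if [ "$has_cufft" = yes ] && [ "$has_cufftmp" = yes ]; then
        echo "conflict: both libcufft.so and libcufftMp.so are linked"
        return 1
    fi
    echo "ok"
    return 0
}
```

A non-zero exit status makes the check easy to wire into a build script, so that a stray cuFFT link flag fails the build rather than the job.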
Building and running on Summit
cuFFTMp requires CUDA 11.4. This can be achieved by running ml cuda/11.4 nvhpc/X.Y spectrum-mpi/10.4.0.3-20210112.
cuFFTMp requires CUDA_VISIBLE_DEVICES to be identical on every process. This means the proper usage of jsrun to run cuFFTMp on two nodes with 6 processes per node (each with 1 GPU and 4 cores) is jsrun -n 2 -a 6 -c 24 -g 6 ...
Since CUDA-aware Spectrum-MPI is not compatible with CUDA 11.4, the CUDA-aware features of Spectrum-MPI cannot be used with cuFFTMp.
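The jsrun flags in the example above follow from simple arithmetic: one resource set per node (-n), all of that node's processes in the set (-a), and the cores (-c) and GPUs (-g) of those processes summed per set, so that every process in a set sees the same GPUs. The sketch below makes that derivation explicit. jsrun_flags is a hypothetical helper written for illustration, and the one-resource-set-per-node layout is an assumption taken from the example, not the only possible configuration.

```shell
#!/bin/sh
# Hypothetical helper: derive jsrun resource-set flags for a layout with
# one resource set per node, so all processes on a node share the same
# set of GPUs.
# Arguments: nodes, processes per node, cores per process, GPUs per process.
jsrun_flags() {
    nodes=$1
    procs_per_node=$2
    cores_per_proc=$3
    gpus_per_proc=$4
    # Cores and GPUs are counted per resource set, i.e. per node here.
    cores=$((procs_per_node * cores_per_proc))
    gpus=$((procs_per_node * gpus_per_proc))
    echo "-n $nodes -a $procs_per_node -c $cores -g $gpus"
}

# Two nodes, 6 processes per node, each with 4 cores and 1 GPU.
jsrun_flags 2 6 4 1   # prints: -n 2 -a 6 -c 24 -g 6
```

Requesting one resource set per process instead would typically give each process a different GPU list, which is the situation the CUDA_VISIBLE_DEVICES requirement rules out.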