NVIDIA cuFFTMp documentation

Welcome to the cuFFTMp (cuFFT Multi-process) library.

cuFFTMp is distributed as part of the NVIDIA HPC-SDK. Its key features include:

  • 2D and 3D distributed-memory FFTs

  • Slab (1D) and pencil (2D) data decompositions, with arbitrary block sizes

  • MPI interface

  • Low-latency implementation using NVSHMEM, optimized for single-node and multi-node FFTs

  • x86_64 and aarch64 support (see Hardware and software requirements)
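
To make the slab and pencil decompositions listed above concrete, here is a minimal sketch of the index arithmetic involved. It is not cuFFTMp code and does not call the cuFFTMp API. The function names `slab_bounds` and `pencil_bounds` are hypothetical, and the near-even split with the remainder given to the lowest ranks is an assumed convention chosen for illustration, not necessarily the layout cuFFTMp uses by default.

```python
def slab_bounds(n, num_ranks, rank):
    """Half-open index range [start, stop) owned by `rank` along an axis of length n.

    Assumed convention (illustrative only): split as evenly as possible and
    hand the remainder to the lowest-numbered ranks, one extra index each.
    """
    base, extra = divmod(n, num_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def pencil_bounds(n0, n1, grid_rows, grid_cols, rank):
    """Index ranges owned by `rank` along two axes of lengths n0 and n1.

    Ranks are laid out row-major on a hypothetical grid_rows x grid_cols
    process grid; each axis is then split like a slab.
    """
    row, col = divmod(rank, grid_cols)
    return slab_bounds(n0, grid_rows, row), slab_bounds(n1, grid_cols, col)


if __name__ == "__main__":
    # Slab (1D) decomposition: 10 planes over 4 processes.
    for r in range(4):
        print("slab rank", r, slab_bounds(10, 4, r))
    # Pencil (2D) decomposition: an 8 x 6 cross-section over a 2 x 3 grid.
    for r in range(6):
        print("pencil rank", r, pencil_bounds(8, 6, 2, 3, r))
```

A slab decomposition splits the global array along a single axis, so each process holds a contiguous set of full planes. A pencil decomposition splits along two axes, which allows more processes to be used than there are planes along any one axis.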