Release Notes
cuFFTMp 10.8.1 EA (HPC-SDK 22.5)
cuFFTMp 10.8.1 integrates NVSHMEM 2.5.0 and fixes a few issues as indicated below.
New features
N/A
Deprecations
N/A
Known / resolved issues
Resolved: single-node, single-precision, 3D, complex-to-complex power-of-2 transforms with Z > 8192 no longer produce incorrect results.
cuFFTMp’s versioning has been corrected. Going forward, cuFFTMp will be versioned similarly to cuFFT. See Versioning.
cuFFTMp 0.0.2 EA (HPC-SDK 22.3)
New features
Improved performance of cufftXtSetDistribution and distributed descriptors. This effectively gives full support to Pencil data decompositions.
Improved performance of the Reshape API.
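As an illustration of what a Pencil decomposition looks like in terms of per-rank boxes, the sketch below computes the lower and upper corners of the sub-box owned by one rank on a p x q process grid. It does not call cuFFTMp; `Box3D`, `split_offset` and `pencil_box` are hypothetical helper names, and the half-open `[lower, upper)` convention and the choice to split X and Y while keeping Z whole are assumptions that should be checked against the cufftXtSetDistribution documentation before the corners are passed to the library.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper type (not a cuFFTMp type): a half-open box
// [lower, upper) in (x, y, z) index space.
struct Box3D {
    std::array<long long, 3> lower;
    std::array<long long, 3> upper;
};

// Start offset of piece i when n elements are split into `parts` pieces.
// The first n % parts pieces receive one extra element; i == parts yields n.
long long split_offset(long long n, int parts, int i) {
    long long base = n / parts;
    long long rem = n % parts;
    return i * base + (i < rem ? i : rem);
}

// Box owned by `rank` on a p x q process grid: X is split over the p rows,
// Y over the q columns, and Z is kept whole (an assumed layout).
Box3D pencil_box(long long nx, long long ny, long long nz,
                 int p, int q, int rank) {
    assert(p > 0 && q > 0 && rank >= 0 && rank < p * q);
    int i = rank / q;  // row index in the process grid
    int j = rank % q;  // column index in the process grid
    return Box3D{{split_offset(nx, p, i), split_offset(ny, q, j), 0},
                 {split_offset(nx, p, i + 1), split_offset(ny, q, j + 1), nz}};
}
```

For example, on a 2 x 3 grid and an 8 x 6 x 4 transform, rank 4 would own x in [4, 8), y in [2, 4) and all of z under these assumptions.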
Deprecations
N/A
Known / resolved issues
Single-node, single-precision, 3D, complex-to-complex power-of-2 transforms in which Z > 8192 (e.g. a transform of size 2x2x16384) produce incorrect results when using the built-in Slab decompositions (i.e. CUFFT_XT_FORMAT_INPLACE and CUFFT_XT_FORMAT_INPLACE_SHUFFLED). This will be fixed in a future release of cuFFTMp. cufftXtSetDistribution can be used as a workaround.
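One way to approach the workaround is to describe the same slab layout explicitly as per-rank boxes. The sketch below only computes those boxes and does not call cuFFTMp. `Box3D` and `slab_box` are hypothetical names. It assumes half-open `[lower, upper)` corners, that the first `n % size` ranks receive one extra plane, and that the input slabs split X (axis 0) while the shuffled output slabs split Y (axis 1). All of these assumptions should be verified against the cuFFTMp documentation.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper type (not a cuFFTMp type): a half-open box
// [lower, upper) in (x, y, z) index space.
struct Box3D {
    std::array<long long, 3> lower;
    std::array<long long, 3> upper;
};

// Slab owned by `rank` when `dims` is split along `axis` across `size` ranks.
// The first dims[axis] % size ranks get one extra plane (assumed layout).
Box3D slab_box(std::array<long long, 3> dims, int axis, int size, int rank) {
    assert(axis >= 0 && axis < 3 && size > 0 && rank >= 0 && rank < size);
    long long base = dims[axis] / size;
    long long rem = dims[axis] % size;
    Box3D box;
    box.lower = {0, 0, 0};
    box.upper = dims;  // full extent on the axes that are not split
    box.lower[axis] = rank * base + (rank < rem ? rank : rem);
    box.upper[axis] = box.lower[axis] + base + (rank < rem ? 1 : 0);
    return box;
}
```

For the 2x2x16384 transform mentioned above on two ranks, `slab_box({2, 2, 16384}, 0, 2, rank)` would give the assumed input box and `slab_box({2, 2, 16384}, 1, 2, rank)` the assumed output box.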
Standalone EA (November 2021)
New features
New multi-process API interoperable with MPI.
Built-in Slab decompositions (using CUFFT_XT_FORMAT_INPLACE and CUFFT_XT_FORMAT_INPLACE_SHUFFLED descriptors) using cufftMpAttachComm
Custom data decomposition (using CUFFT_XT_FORMAT_DISTRIBUTED_INPUT and CUFFT_XT_FORMAT_DISTRIBUTED_OUTPUT descriptors) using cufftXtSetDistribution and cufftMpAttachComm
cufftXtMalloc, cufftXtFree and cufftXtMemcpy are fully compatible with the above
Standalone distributed reshape API with cufftReshapeHandle and associated APIs
In addition, the following limitations have been lifted:
C2R/Z2D now support CUFFT_XT_FORMAT_INPLACE in 3D
R2C/D2Z now support CUFFT_XT_FORMAT_INPLACE_SHUFFLED in 3D
The following restrictions have been lifted for CUFFT_XT_FORMAT_INPLACE and CUFFT_XT_FORMAT_INPLACE_SHUFFLED:
“Dimension must factor into primes less than or equal to 127”
“Maximum dimension size is 4096 for single precision”
“Maximum dimension size is 2048 for double precision”
The following restrictions have been lifted for R2C/D2Z/C2R/Z2D with CUFFT_XT_FORMAT_INPLACE and CUFFT_XT_FORMAT_INPLACE_SHUFFLED:
“Fastest changing dimension size needs to be even”
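To make the first lifted restriction concrete, the sketch below tests whether a dimension size factors entirely into primes less than or equal to 127. `factors_into_small_primes` is a hypothetical helper, not part of cuFFT or cuFFTMp. It is shown only to illustrate which sizes the old rule excluded.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a cuFFT API): returns true when n factors
// entirely into primes <= max_prime.
bool factors_into_small_primes(std::int64_t n, std::int64_t max_prime = 127) {
    assert(max_prime >= 2);
    if (n < 1) return false;
    // Trial division: composite p never divides n here, because all of its
    // prime factors were already divided out in earlier iterations.
    for (std::int64_t p = 2; p <= max_prime && n > 1; ++p) {
        while (n % p == 0) n /= p;
    }
    return n == 1;
}
```

Under the old rule, a size such as 4096 would have been accepted, while a size with a prime factor of 131 would have been rejected.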
Deprecations
N/A
Known / resolved issues
cufftXtMemcpy with CUFFT_COPY_DEVICE_TO_DEVICE was returning wrong results for 2D and 3D transforms in all previous versions of cuFFT. This has been fixed.