Release Notes#
This section covers significant changes, new features, performance improvements, and known and resolved issues. Unless noted otherwise, the listed issues should not impact functionality. When functionality is impacted, a workaround is provided where one is available.
NVPL FFT 0.4.2 EA (nvpl-25.5-beta)#
New features#
Improved single- and multi-threaded performance of complex-to-complex, complex-to-real and real-to-complex transforms for sizes from 2 to 1024.
| type | size | complex-to-complex | real-to-complex | complex-to-real |
|---|---|---|---|---|
| FP32 | [2, 32) | 1.09 | 1.02 | 1.01 |
| FP32 | [32, 512] | 1.06 | 1.05 | 1.05 |
| FP32 | (512, 1024] | 2.51 | 6.72 | 7.78 |
| FP64 | [2, 32) | 1.03 | 1.02 | 1.02 |
| FP64 | [32, 512] | 1.06 | 1.06 | 1.05 |
| FP64 | (512, 1024] | 2.80 | 4.94 | 5.55 |
Known issues#
For in-place real-to-complex and complex-to-real transforms of rank 2 and higher, there are additional constraints on the data layout compared to FFTW.
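These notes do not list the NVPL-specific constraints. As a baseline for comparison, the following is a minimal sketch of the standard FFTW in-place layout for real-to-complex data, in which the last dimension is padded so that the same buffer can hold the complex output. The function names are hypothetical helpers for illustration only; they are not part of NVPL FFT or FFTW.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of complex outputs along the last dimension
 * of a real-to-complex transform with logical last-dimension size n_last
 * (FFTW convention: n_last/2 + 1, using integer division). */
size_t r2c_complex_last_dim(size_t n_last)
{
    return n_last / 2 + 1;
}

/* Hypothetical helper: physical row length, in real elements, of the padded
 * last dimension in FFTW's in-place real-to-complex layout. Each row must be
 * able to hold n_last/2 + 1 complex values, i.e. twice that many reals. */
size_t r2c_inplace_padded_last_dim(size_t n_last)
{
    return 2 * r2c_complex_last_dim(n_last);
}

/* Hypothetical helper: total number of real elements to allocate for an
 * in-place real-to-complex transform with the given rank and logical
 * dimensions n[0..rank-1] (row-major, contiguous FFTW layout). */
size_t r2c_inplace_real_count(int rank, const size_t *n)
{
    assert(rank >= 1 && n != NULL);
    size_t count = 1;
    /* All dimensions except the last keep their logical size. */
    for (int i = 0; i < rank - 1; ++i)
        count *= n[i];
    /* Only the last dimension is padded. */
    return count * r2c_inplace_padded_last_dim(n[rank - 1]);
}
```

For a 4 x 8 real array, for example, this layout uses 4 rows of 10 real elements (40 in total) instead of the 32 logical elements. Any further NVPL FFT requirements would apply on top of this baseline.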
NVPL FFT 0.4.1 EA (nvpl-25.1.1-beta)#
Minor bug-fix release of the NVPL FFT library.
New features#
N/A
Resolved issues#
Fixed a bug where a size-1 kernel was not selected during planning for the innermost dimension of a complex-to-real transform.
Known issues#
For in-place real-to-complex and complex-to-real transforms of rank 2 and higher, there are additional constraints on the data layout compared to FFTW.
NVPL FFT 0.4.0 EA (nvpl-25.1-beta)#
The 4th early access release of the NVPL FFT library.
New features#
Added support for modern and legacy FFTW Fortran interfaces.
nvpl_fftw.h can now also be found in include/nvpl_fftw/ under the name fftw3.h.
Improved single- and multi-threaded performance of complex-to-complex, complex-to-real and real-to-complex transforms for sizes ranging from 2 to 512.
| type | complex-to-complex | real-to-complex | complex-to-real |
|---|---|---|---|
| FP32 | 1.04 | 1.20 | 1.20 |
| FP64 | 1.05 | 5.65 | 6.58 |

Known issues#
For in-place real-to-complex and complex-to-real transforms of rank 2 and higher, there are additional constraints on the data layout compared to FFTW.
NVPL FFT 0.3.0 EA (nvpl-24.7-beta)#
The 3rd early access release of the NVPL FFT library.
New features#
Improved single- and multi-threaded performance of complex-to-complex transforms in double precision for sizes ranging from 2 to 512.
Improved single- and multi-threaded performance of complex-to-real and real-to-complex transforms in single precision for sizes ranging from 2 to 512.
Known issues#
N/A
NVPL FFT 0.2.0 EA (nvpl-24.03-beta)#
The 2nd early access release of the NVPL FFT library.
New features#
Improved single- and multi-threaded performance of complex-to-complex transforms in both single and double precision.
Improved scalability of the multi-threaded NVPL FFT.
Known issues#
N/A
Resolved issues#
NVPL FFT adopts a different threading implementation (see OpenMP-based Threading). Setting the OMP_PROC_BIND environment variable (or OMP_PLACES) no longer negatively impacts multi-threaded performance.
NVPL FFT 0.1.0 EA (nvpl-23.11-beta)#
The first early access release of the NVPL FFT library.
New features#
Supports computation of one-, two-, and three-dimensional complex-to-complex, real-to-complex, and complex-to-real DFTs in single and double precision with arbitrary sizes and strides using FFTW APIs.
Supports single- and multi-threaded FFT computation.
Known issues#
Some of the supported FFT sizes, including composite sizes and sizes greater than 50K elements, are not yet fully optimized.
NVPL FFT respects the original thread affinity mask. For applications built with an OpenMP runtime, thread affinity controls (via either the OMP_PROC_BIND or the OMP_PLACES environment variable) could negatively impact multi-threaded performance.
Resolved issues#
N/A