Release Notes
This section lists significant changes, new features, performance improvements, and known issues for each release. Unless noted otherwise, the listed issues should not impact functionality. When functionality is impacted, a workaround is provided if one is available.
NVPL FFT 0.3.0 EA (nvpl-24.07-beta)
The third early access release of the NVPL FFT library.
New features
Improved single- and multi-threaded performance of complex-to-complex transforms in double precision for sizes ranging from 2 to 512.
Improved single- and multi-threaded performance of complex-to-real and real-to-complex transforms in single precision for sizes ranging from 2 to 512.
Known issues
N/A
NVPL FFT 0.2.0 EA (nvpl-24.03-beta)
The second early access release of the NVPL FFT library.
New features
Improved single- and multi-threaded performance of complex-to-complex transforms in both single and double precision.
Improved scalability of the multi-threaded NVPL FFT.
Known issues
N/A
Resolved issues
NVPL FFT adopts a different threading implementation (see OpenMP-based Threading). Setting the OMP_PROC_BIND environment variable (or OMP_PLACES) will no longer negatively impact the multi-threaded performance.
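As an illustration of the affinity controls mentioned above, the following minimal sketch builds an environment with OMP_PROC_BIND and OMP_PLACES set for a child process. The helper name openmp_affinity_env and the binary ./my_fft_app are hypothetical and not part of NVPL FFT; the values "close" and "cores" are standard OpenMP settings chosen only as an example.

```python
import os


def openmp_affinity_env(proc_bind="close", places="cores", base=None):
    """Return a copy of an environment with OpenMP affinity controls set.

    Hypothetical helper for illustration; not part of NVPL FFT.
    """
    env = dict(os.environ if base is None else base)
    env["OMP_PROC_BIND"] = proc_bind  # standard OpenMP variable
    env["OMP_PLACES"] = places        # standard OpenMP variable
    return env


env = openmp_affinity_env("close", "cores", base={"PATH": "/usr/bin"})
print(env["OMP_PROC_BIND"], env["OMP_PLACES"])

# To launch an application with these settings (hypothetical binary):
# import subprocess
# subprocess.run(["./my_fft_app"], env=env, check=True)
```

Passing the environment explicitly to the child process keeps the affinity settings scoped to that one run instead of changing them for the whole shell session.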
NVPL FFT 0.1.0 EA (nvpl-23.11-beta)
The first early access release of the NVPL FFT library.
New features
Supports computation of one-, two-, and three-dimensional complex-to-complex, real-to-complex, and complex-to-real DFTs in single and double precision with arbitrary sizes and strides using FFTW APIs.
Supports single- and multi-threaded FFT computation.
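To make "arbitrary sizes and strides" concrete, here is a minimal reference sketch of what a strided one-dimensional complex-to-complex forward DFT computes. It is a naive O(n^2) pure-Python illustration and does not call NVPL FFT or FFTW; the function name strided_dft and its parameters are assumptions made for this example. It follows the FFTW forward convention: a negative exponent sign and no normalization.

```python
import cmath


def strided_dft(data, n, stride=1, offset=0):
    """Naive forward DFT of n elements read from data[offset], data[offset + stride], ...

    Hypothetical reference helper; not an NVPL FFT or FFTW call.
    """
    # Gather the strided input into a contiguous list.
    x = [data[offset + j * stride] for j in range(n)]
    # X[k] = sum_j x[j] * exp(-2*pi*i*j*k/n), unnormalized.
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


# Two interleaved length-4 signals stored in one buffer (stride 2):
# signal A = buf[0::2] = [1, 1, 1, 1], signal B = buf[1::2] = [0, 1, 0, -1].
buf = [1, 0, 1, 1, 1, 0, 1, -1]
spectrum_a = strided_dft(buf, 4, stride=2, offset=0)
spectrum_b = strided_dft(buf, 4, stride=2, offset=1)
print([complex(round(v.real, 6), round(v.imag, 6)) for v in spectrum_a])
print([complex(round(v.real, 6), round(v.imag, 6)) for v in spectrum_b])
```

A stride lets several signals share one buffer without copying. An optimized library would perform the same computation with a fast algorithm rather than the direct sum used here.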
Known issues
Some of the supported FFT sizes, including composite sizes and sizes greater than 50K elements, are not yet fully optimized.
NVPL FFT respects the original thread affinity mask. For applications built with an OpenMP runtime, thread affinity controls (via either the OMP_PROC_BIND or the OMP_PLACES environment variable) could negatively impact the multi-threaded performance.
Resolved issues
N/A