Release Notes

This section lists significant changes, new features, performance improvements, and known and resolved issues for each release. Unless noted otherwise, the listed issues should not impact functionality. When functionality is impacted, a workaround is provided if one is available.

NVPL FFT 0.3.0 EA (nvpl-24.07-beta)

The third early access release of the NVPL FFT library.

New features

  • Improved single- and multi-threaded performance of complex-to-complex transforms in double precision for sizes ranging from 2 to 512.

  • Improved single- and multi-threaded performance of complex-to-real and real-to-complex transforms in single precision for sizes ranging from 2 to 512.

Known issues

  • N/A

NVPL FFT 0.2.0 EA (nvpl-24.03-beta)

The second early access release of the NVPL FFT library.

New features

  • Improved single- and multi-threaded performance of complex-to-complex transforms in both single and double precision.

  • Improved scalability of the multi-threaded NVPL FFT.

Known issues

  • N/A

Resolved issues

  • NVPL FFT now uses a different threading implementation (see OpenMP-based Threading). Setting the OMP_PROC_BIND or OMP_PLACES environment variable no longer degrades multi-threaded performance.

NVPL FFT 0.1.0 EA (nvpl-23.11-beta)

The first early access release of the NVPL FFT library.

New features

  • Supports computation of one-, two-, and three-dimensional complex-to-complex, real-to-complex, and complex-to-real DFTs in single and double precision, with arbitrary sizes and strides, through the FFTW APIs.

  • Supports single- and multi-threaded FFT computation.
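
As a point of reference for the transform types listed above, the following plain-Python sketch evaluates the unnormalized forward DFT directly from its definition, using the sign convention of the FFTW interface (exponent sign -1 for the forward direction). It is not NVPL FFT code and makes no claim about how the library computes its results; the function names `naive_dft_c2c` and `naive_dft_r2c` are hypothetical, chosen only for this illustration. The real-to-complex variant keeps the first n/2 + 1 outputs, since the remaining ones follow from Hermitian symmetry.

```python
import cmath


def naive_dft_c2c(x):
    """Unnormalized forward DFT: Y[k] = sum_j x[j] * exp(-2*pi*i*j*k/n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def naive_dft_r2c(x):
    """Real-input DFT that keeps only the n//2 + 1 non-redundant outputs."""
    return naive_dft_c2c(x)[: len(x) // 2 + 1]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    print(naive_dft_c2c(signal))
    print(naive_dft_r2c(signal))
```

The direct evaluation costs O(n^2) operations, so it is useful only as a small correctness reference for results obtained through an FFT library.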

Known issues

  • Some of the supported FFT sizes, including composite sizes and sizes greater than 50K elements, are not yet fully optimized.

  • NVPL FFT respects the original thread affinity mask. For applications built with an OpenMP runtime, thread affinity controls (set through either the OMP_PROC_BIND or the OMP_PLACES environment variable) can degrade multi-threaded performance.
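
To check whether a process is subject to the affinity controls described above, one can print the relevant OpenMP environment variables alongside the CPU set the operating system currently allows. The sketch below is a diagnostic helper in plain Python, not part of NVPL FFT; the function name `report_affinity_settings` is hypothetical, and `os.sched_getaffinity` is available only on some platforms (such as Linux), so the code falls back to `None` elsewhere.

```python
import os


def report_affinity_settings(environ=None):
    """Collect OpenMP affinity variables and the current CPU affinity mask."""
    env = os.environ if environ is None else environ
    report = {
        # None means the variable is unset, so the OpenMP runtime default applies.
        "OMP_PROC_BIND": env.get("OMP_PROC_BIND"),
        "OMP_PLACES": env.get("OMP_PLACES"),
    }
    # sched_getaffinity is not available on every platform.
    if hasattr(os, "sched_getaffinity"):
        report["allowed_cpus"] = sorted(os.sched_getaffinity(0))
    else:
        report["allowed_cpus"] = None
    return report


if __name__ == "__main__":
    for key, value in report_affinity_settings().items():
        print(f"{key}: {value}")
```

Running it in the same environment as a multi-threaded FFT workload shows which affinity settings the OpenMP runtime will see.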

Resolved issues

  • N/A