NVIDIA MathDx Package
The MathDx package is a collection of NVIDIA device extension libraries that let CUDA developers run advanced mathematical operations directly inside their GPU kernels. Because the routines are called from device code, they can be fused with the surrounding kernel logic, which avoids unnecessary host-device data transfers. The libraries are designed to work together and cover high-performance computation, data processing, and random number generation. MathDx also targets performance portability across hardware generations: it abstracts low-level GPU architecture details so developers can focus on algorithms rather than hardware-specific tuning.
cuBLASDx: Device-side extensions for selected linear algebra routines, including efficient General Matrix Multiplication (GEMM) performed within kernels.
cuFFTDx: Device-side Fast Fourier Transform library, enabling in-kernel FFT calculations for signal processing and scientific computation.
cuSolverDx: Device-side matrix factorization, linear solve, least squares, eigenvalue solver, and singular value decomposition routines, supporting scientific and engineering workflows within a kernel.
cuRANDDx: A random number generation library serving as a modern replacement for cuRAND device APIs.
nvCOMPDx: Compression and decompression capabilities built into device code, essential for high-throughput streaming and storage applications.
All device extension libraries ship in a single package; see the Installation Guide for details.
Libraries Documentation
Components
The latest release: 26.03.0
| MathDx | cuBLASDx | cuFFTDx | cuSolverDx | cuRANDDx | nvCOMPDx |
|---|---|---|---|---|---|
| 26.03.0 | 0.6.0 | 1.7.0 | 0.4.0 | 0.2.3 | 0.1.3 |
| 25.12.1 | 0.5.1 | 1.6.1 | 0.3.0 | 0.2.2 | 0.1.2 |
| 25.12.0 | 0.5.0 | 1.6.0 | 0.3.0 | 0.2.2 | 0.1.2 |
| 25.6.1 | 0.4.1 | 1.5.1 | 0.2.1 | 0.2.1 | 0.1.1 |
| 25.6.0 | 0.4.0 | 1.5.0 | 0.2.0 | 0.2.0 | 0.1.0 |
| 25.1.1 | 0.3.1 | 1.3.1 | 0.1.1 | 0.1.1 | |
| 25.1.0 | 0.3.0 | 1.3.0 | 0.1.0 | 0.1.0 | |
| 24.8.0 | 0.2.0 | 1.2.1 | | | |
| 24.4.0 | 0.1.1 | 1.2.0 | | | |
| 24.1.0 | 0.1.0 | 1.1.1 | | | |
| 22.11.0 | | 1.1.0 | | | |
| 22.02.0 | | 1.0.0 | | | |
Requirements
General requirements for the components in the latest release:
Supported CUDA compiler (C++17 support required)
Supported host compiler (C++17 support required)
(Optional) CMake (version 3.26 or greater)
NVIDIA GPU: Turing (SM75) or newer
CUDA Toolkit support:
Always use the latest patch version of the CUDA Toolkit (for example, for 13.0 use 13.0.2).
| | cuBLASDx | cuFFTDx | cuSolverDx | cuRANDDx | nvCOMPDx |
|---|---|---|---|---|---|
| CUDA | 13.0.2+ | 13.0.2+ | 13.0.2+ | 13.0.2+ | 13.0.2+ |
CUDA compilers: NVCC and NVRTC from the supported CUDA Toolkits.
Host compilers:
| | cuBLASDx | cuFFTDx | cuSolverDx | cuRANDDx | nvCOMPDx |
|---|---|---|---|---|---|
| GCC | 7+ | 7+ | 7+ | 7+ | 7+ |
| Clang (Linux/WSL2) | 9+ | 9+ | 9+ | 9+ | 9+ |
Note
We recommend using GCC 10+ as the host compiler, and NVCC shipped with the latest CUDA Toolkit as the CUDA compiler.
Note
For detailed requirements, check the documentation of each library, as requirements may differ slightly between libraries.
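The minimum versions listed above can be expressed as a few compile-time checks. The following is a minimal sketch in plain C++, not part of MathDx: `Version`, `HostCompiler`, `at_least`, `cuda_toolkit_supported`, `gpu_supported`, and `host_compiler_supported` are hypothetical names introduced only for illustration, and the thresholds are copied from the general requirements for the latest release (CUDA Toolkit 13.0.2+, SM75 or newer, GCC 7+, Clang 9+).

```cpp
// Hypothetical helpers for illustration; none of these names come from MathDx.

// A (major, minor, patch) version triple, for example CUDA Toolkit 13.0.2.
struct Version {
    int major_num;
    int minor_num;
    int patch_num;
};

// True when `v` is greater than or equal to `minimum`, compared field by field.
constexpr bool at_least(Version v, Version minimum) {
    return v.major_num != minimum.major_num ? v.major_num > minimum.major_num
         : v.minor_num != minimum.minor_num ? v.minor_num > minimum.minor_num
         : v.patch_num >= minimum.patch_num;
}

// Host compilers listed in the requirements table.
enum class HostCompiler { Gcc, Clang };

// Thresholds copied from the requirements above (latest release only).
constexpr Version kMinCudaToolkit = {13, 0, 2};
constexpr int kMinSm = 75;      // Turing
constexpr int kMinGccMajor = 7;
constexpr int kMinClangMajor = 9;

// CUDA Toolkit check: 13.0.2 or newer.
constexpr bool cuda_toolkit_supported(Version toolkit) {
    return at_least(toolkit, kMinCudaToolkit);
}

// GPU check: `sm` is the compute capability as major * 10 + minor (75 = SM75).
constexpr bool gpu_supported(int sm) {
    return sm >= kMinSm;
}

// Host compiler check against the GCC and Clang minimums.
constexpr bool host_compiler_supported(HostCompiler compiler, int major_version) {
    return compiler == HostCompiler::Gcc ? major_version >= kMinGccMajor
                                         : major_version >= kMinClangMajor;
}
```

In a real build, the inputs would come from the toolchain rather than from literals. For example, the compute capability could be derived from the `major` and `minor` fields of the `cudaDeviceProp` structure filled in by `cudaGetDeviceProperties`. Because per-library requirements may differ slightly, treat these thresholds as the general baseline only.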