Python#

While device extension (Dx) libraries are primarily CUDA/C++ libraries, many of them either provide Python bindings through nvmath-python or expose their functionality through NVIDIA Warp.

Note

Python bindings for device extension libraries are designed to match the performance of the equivalent C++/CUDA code. The Python integrations may also offer additional features, such as autotuning.

Note

The integration of the Dx libraries into NVIDIA Warp and nvmath-python is still in progress.

For more details about Python support for device extension libraries, refer to the documentation of the corresponding library.

NVIDIA Warp#

NVIDIA Warp is a Python library for writing high-performance simulation and graphics code that runs efficiently on both CPUs and NVIDIA GPUs. It uses just-in-time (JIT) compilation to transform Python functions into fast, parallel kernels, making it ideal for physics simulation, robotics, and geometry processing. Warp supports differentiable programming for integration with machine learning frameworks, enabling gradient-based optimization while maintaining Python’s simplicity.
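As a rough illustration of the JIT model described above, the following minimal sketch defines a Warp kernel from a plain Python function and launches it over an array. The kernel name `saxpy` and the helper `run_saxpy` are hypothetical names chosen for this example, and the import guard is only there so that the sketch degrades to a no-op where Warp or NumPy is not installed.

```python
try:
    import numpy as np
    import warp as wp
except ImportError:
    # Assumption for this sketch: without Warp/NumPy the example is skipped.
    wp = None

if wp is not None:

    @wp.kernel
    def saxpy(a: float, x: wp.array(dtype=float), y: wp.array(dtype=float)):
        # Each thread handles one element; wp.tid() is the thread index.
        i = wp.tid()
        y[i] = a * x[i] + y[i]


def run_saxpy(n=1024):
    """Compute y = 2*x + y with x = [0, 1, ...] and y = [1, 1, ...]."""
    if wp is None:
        return None
    # Arrays are created on Warp's default device (a GPU if one is available,
    # otherwise the CPU).
    x = wp.array(np.arange(n, dtype=np.float32), dtype=float)
    y = wp.array(np.ones(n, dtype=np.float32), dtype=float)
    # The kernel is compiled on first launch and then run with n threads.
    wp.launch(saxpy, dim=n, inputs=[2.0, x, y])
    return y.numpy()


if __name__ == "__main__":
    print(run_saxpy(8))
```

The first launch triggers compilation, so it is typically slower than subsequent launches of the same kernel.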

Warp uses cuFFTDx, cuSolverDx, and cuBLASDx in its Tile mode.
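To give a feel for the tile programming model mentioned above, here is a hedged sketch of a tiled matrix multiply. It assumes Warp 1.5 or newer (the tile API is recent and its signatures have changed between releases) and a CUDA-capable GPU; the tile sizes, the kernel name `tile_gemm`, and the helper `run_tile_gemm` are illustrative choices, not values taken from this page. The `wp.tile_matmul` call is the kind of operation for which Warp can rely on a Dx library such as cuBLASDx.

```python
try:
    import numpy as np
    import warp as wp
except ImportError:
    # Assumption for this sketch: without Warp/NumPy the example is skipped.
    wp = None

if wp is not None:
    # Illustrative tile sizes and threads per tile (block).
    TILE_M = wp.constant(8)
    TILE_N = wp.constant(4)
    TILE_K = wp.constant(8)
    TILE_THREADS = 64

    @wp.kernel
    def tile_gemm(A: wp.array2d(dtype=float),
                  B: wp.array2d(dtype=float),
                  C: wp.array2d(dtype=float)):
        # (i, j) identifies the output tile computed by this thread block.
        i, j = wp.tid()
        acc = wp.tile_zeros(shape=(TILE_M, TILE_N), dtype=wp.float32)
        count = int(A.shape[1] / TILE_K)
        for k in range(0, count):
            a = wp.tile_load(A, shape=(TILE_M, TILE_K), offset=(i * TILE_M, k * TILE_K))
            b = wp.tile_load(B, shape=(TILE_K, TILE_N), offset=(k * TILE_K, j * TILE_N))
            # acc += a @ b, executed cooperatively by the whole block.
            wp.tile_matmul(a, b, acc)
        wp.tile_store(C, acc, offset=(i * TILE_M, j * TILE_N))


def run_tile_gemm():
    """Return the max abs error against NumPy, or None if skipped."""
    if wp is None or not wp.is_cuda_available():
        return None
    # Tile-aligned matrix dimensions keep the sketch free of edge handling.
    M, K, N = TILE_M * 7, TILE_K * 6, TILE_N * 5
    rng = np.random.default_rng(42)
    A = rng.random((M, K), dtype=np.float32)
    B = rng.random((K, N), dtype=np.float32)
    A_wp = wp.array(A, dtype=float, device="cuda:0")
    B_wp = wp.array(B, dtype=float, device="cuda:0")
    C_wp = wp.zeros((M, N), dtype=float, device="cuda:0")
    # One thread block per output tile.
    wp.launch_tiled(tile_gemm, dim=(M // TILE_M, N // TILE_N),
                    inputs=[A_wp, B_wp, C_wp],
                    block_dim=TILE_THREADS, device="cuda:0")
    return float(np.max(np.abs(C_wp.numpy() - A @ B)))


if __name__ == "__main__":
    print(run_tile_gemm())
```

Unlike a regular launch, where each thread works on its own element, a tiled launch assigns a whole thread block to each tile, which is what allows block-level library routines to be used inside the kernel.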

nvmath-python#

nvmath-python is an open-source library that bridges the gap between the Python scientific community and the NVIDIA CUDA-X™ math libraries by reimagining Python’s performance-oriented APIs. It interoperates with and complements existing array libraries such as NumPy, CuPy, and PyTorch, pushing performance further through capabilities such as stateful APIs, just-in-time kernel fusion, custom callbacks, and scaling to many GPUs.

nvmath-python provides Python bindings for cuBLASDx and cuFFTDx in its device APIs.