nvmath-python Device APIs

Overview

The device module of nvmath-python, nvmath.device, integrates with NVIDIA’s high-performance computing libraries through device APIs for cuFFTDx and cuBLASDx. Detailed documentation for these libraries is available in the cuFFTDx and cuBLASDx documentation.

The device module can be used in two ways:

  • Numba Extensions: The device APIs can be called from Numba through extensions that simplify defining functions, querying device traits, and calling device functions.

  • Third-party JIT Compilers: The APIs are also available through low-level interfaces in other JIT compilers, allowing advanced users to work directly with the raw device code.

Note

The device module nvmath.device currently supports cuFFTDx 1.2.0 and cuBLASDx 0.1.1, which are also available as part of MathDx 24.04. All functionality from the C++ libraries is supported, except for cuFFTDx C++ APIs that take a workspace argument, which are currently not available in nvmath-python.

API Reference

Utility APIs (nvmath.device)

current_device_lto()

A helper function that returns the default code type for link-time optimization (LTO) on the current device.

float16x2(x, y)

Create a Numba-compliant vector object for float16 with vector length 2.

float16x4(x, y, z, w)

Create a Numba-compliant vector object for float16 with vector length 4.

float32x2(x, y)

Create a Numba-compliant vector object for float32 with vector length 2.

float64x2(x, y)

Create a Numba-compliant vector object for float64 with vector length 2.

float16x2_type

A Numba-compliant vector type object for float16 with vector length 2.

float16x4_type

A Numba-compliant vector type object for float16 with vector length 4.

float32x2_type

A Numba-compliant vector type object for float32 with vector length 2.

float64x2_type

A Numba-compliant vector type object for float64 with vector length 2.

ISAVersion(major, minor)

A namedtuple class that encapsulates the code version.

Code(code_type, isa_version, data)

A namedtuple class that encapsulates code type, version, and buffer.

CodeType(kind, cc)

A namedtuple class that encapsulates code kind and compute capability.

ComputeCapability(major, minor)

A namedtuple class that encapsulates the major and minor compute capability.

Symbol(variant, name)

A namedtuple class that encapsulates a device function symbol and which API it maps to.

Dim3([x, y, z])

A namedtuple class that encapsulates the dimensions for grids and blocks.
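The namedtuple helpers listed above are plain records that nest inside one another. The sketch below uses stand-in definitions built with collections.namedtuple to show how the fields fit together. The field names follow the signatures listed above; the concrete values (the "lto" kind label, the version numbers, the empty data buffer) and the default of 1 for omitted Dim3 dimensions are assumptions for illustration only. In real code, import the classes from nvmath.device rather than redefining them.

```python
from collections import namedtuple

# Stand-in definitions mirroring the signatures listed above.
# These are NOT the nvmath.device classes; they only illustrate the field layout.
ISAVersion = namedtuple("ISAVersion", ["major", "minor"])
ComputeCapability = namedtuple("ComputeCapability", ["major", "minor"])
CodeType = namedtuple("CodeType", ["kind", "cc"])
Code = namedtuple("Code", ["code_type", "isa_version", "data"])

# Assumption: omitted dimensions default to 1, as with CUDA's dim3.
Dim3 = namedtuple("Dim3", ["x", "y", "z"], defaults=(1, 1, 1))

# A code type pairs a kind with a compute capability.
cc = ComputeCapability(major=8, minor=0)
code_type = CodeType(kind="lto", cc=cc)  # "lto" is an assumed kind label

# A code object bundles the code type, an ISA version, and the raw buffer.
code = Code(code_type=code_type, isa_version=ISAVersion(12, 3), data=b"")

# Fields are reachable by name through the nested records.
target_major = code.code_type.cc.major

# A one-dimensional block: only x is given, y and z fall back to the default.
block = Dim3(128)
threads_per_block = block.x * block.y * block.z
```

Because these are namedtuples, they also unpack and compare like ordinary tuples, which is convenient when passing grid and block dimensions to a kernel launch.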

cuBLASDx APIs (nvmath.device)

matmul(*[, compiler])

Create a BlasOptions object that encapsulates a compiled and ready-to-use device function for matrix multiplication.

BlasOptions(size, precision, data_type, *[, ...])

A class that encapsulates a partial BLAS device function.

LeadingDimension(a, b, c)

A namedtuple class that encapsulates the three leading dimensions in matrix multiplication \(C = \alpha Op(A) Op(B) + \beta C\).

TransposeMode(a, b)

A namedtuple class that encapsulates the transpose mode for input matrices A and B in matrix multiplication.
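To make the formula \(C = \alpha Op(A) Op(B) + \beta C\) concrete, here is a minimal host-side reference in pure Python. It is not the cuBLASDx device function and does not use nvmath.device: TransposeMode is a stand-in namedtuple with the same fields as the signature above, the mode strings "non_transposed" and "transposed" are assumed labels, gemm_reference is a hypothetical helper, and leading dimensions and the conjugate-transpose case are ignored for simplicity.

```python
from collections import namedtuple

# Stand-in mirroring the TransposeMode(a, b) signature; not the nvmath.device class.
TransposeMode = namedtuple("TransposeMode", ["a", "b"])


def op(matrix, mode):
    """Apply Op(): return a copy of the matrix, transposed if requested."""
    if mode == "non_transposed":
        return [row[:] for row in matrix]
    if mode == "transposed":
        return [list(col) for col in zip(*matrix)]
    raise ValueError(f"unsupported mode: {mode}")


def gemm_reference(alpha, a, b, beta, c, transpose_mode):
    """Return alpha * Op(A) @ Op(B) + beta * C as a new list of lists."""
    op_a = op(a, transpose_mode.a)
    op_b = op(b, transpose_mode.b)
    m, k, n = len(op_a), len(op_b), len(op_b[0])
    return [
        [
            alpha * sum(op_a[i][p] * op_b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 1], [1, 1]]

# A is used as-is, B is transposed before the product.
mode = TransposeMode(a="non_transposed", b="transposed")
result = gemm_reference(2, A, B, 3, C, mode)
print(result)  # [[37, 49], [81, 109]]
```

A reference like this is mainly useful for checking the output of a device kernel on small inputs, where the transpose modes are easy to get wrong.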

cuFFTDx APIs (nvmath.device)

fft(*[, compiler])

Create an FFTOptions object that encapsulates a compiled and ready-to-use FFT device function.

FFTOptions(size, precision, fft_type, ...[, ...])

A class that encapsulates a partial FFT device function.