Linear Algebra

Overview

The Linear Algebra module nvmath.linalg in nvmath-python leverages various NVIDIA math libraries to support dense [1] linear algebra computations. As of version 0.7.0, we offer both a generic matrix multiplication API based on the cuBLAS and NVPL libraries and a specialized matrix multiplication API (nvmath.linalg.advanced) based on the cuBLASLt library. See Generic and Specialized APIs for motivation.

At a high level, if your use case is predominantly GEMM and requires flexibility in matrix data layouts, input and/or compute types, or the choice of algorithmic implementation, use the specialized APIs. Otherwise, use the generic APIs.

API Reference

Generic Linear Algebra APIs (nvmath.linalg)

The generic linear algebra module includes matrix multiplication APIs which accept structured matrices as input, but do not allow for control over computational precision or algorithm selection and planning.

matmul(a, b, /[, c, alpha, beta, ...])

Perform the specified matrix multiplication computation \(\alpha a @ b + \beta c\).

Matmul(a, b, /[, c, alpha, beta, ...])

Create a stateful object encapsulating the specified matrix multiplication computation \(\alpha a @ b + \beta c\) and the required resources to perform the operation.

matrix_qualifiers_dtype

A NumPy custom dtype which describes a structured matrix.

ComputeType(value)

See cublasComputeType_t.

DiagonalMatrixQualifier()

A class which constructs and validates matrix_qualifiers_dtype for a diagonal matrix.

GeneralMatrixQualifier()

A class which constructs and validates matrix_qualifiers_dtype for a general rectangular matrix.

HermitianMatrixQualifier()

A class which constructs and validates matrix_qualifiers_dtype for a Hermitian matrix.

InvalidMatmulState

SymmetricMatrixQualifier()

A class which constructs and validates matrix_qualifiers_dtype for a symmetric matrix.

TriangularMatrixQualifier()

A class which constructs and validates matrix_qualifiers_dtype for a triangular matrix.

SideMode(value)

See cublasSideMode_t.

FillMode(value)

See cublasFillMode_t.

DiagType(value)

See cublasDiagType_t.

ExecutionCPU(*[, num_threads])

A data class for providing CPU execution options.

ExecutionCUDA(*[, device_id])

A data class for providing GPU execution options.

MatmulOptions(*[, allocator, blocking, ...])

A data class for providing options to a Matmul object.
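The computation that matmul and Matmul above describe, \(\alpha a @ b + \beta c\), can be sketched as a plain NumPy reference. This is only an illustration of the arithmetic, not the nvmath implementation: the function name reference_matmul and the default values for alpha and beta are assumptions made for this sketch. Only the operand names a, b, c, alpha, and beta come from the signature listed above.

```python
import numpy as np


def reference_matmul(a, b, c=None, alpha=1.0, beta=0.0):
    """NumPy reference for alpha * (a @ b) + beta * c (hypothetical helper)."""
    result = alpha * (a @ b)
    if c is not None:
        # The beta * c term only contributes when an addend c is supplied.
        result = result + beta * c
    return result


rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))
b = rng.standard_normal((3, 5))
c = rng.standard_normal((4, 5))

# d has shape (4, 5): (4, 3) @ (3, 5), plus an addend of the same shape.
d = reference_matmul(a, b, c, alpha=2.0, beta=0.5)
```

A reference like this can be handy for checking results from a library call on small inputs, within the tolerance expected for the chosen data type.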

Specialized Linear Algebra APIs (nvmath.linalg.advanced)

The specialized linear algebra module includes a matrix multiplication API which only accepts general matrices, but provides extra functionality such as epilog functions, more options and controls over computational precision, and control over algorithm selection and planning.
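To make the idea of an epilog function concrete: an epilog is a function \(F\) applied to the matrix multiplication result, so that the overall computation is \(F(\alpha a @ b + \beta c)\). The sketch below shows that arithmetic in plain NumPy, using ReLU as an assumed example of \(F\). The names relu and reference_matmul_with_epilog are hypothetical and are not part of nvmath.

```python
import numpy as np


def relu(x):
    """Example epilog F: elementwise max(x, 0). Chosen only for illustration."""
    return np.maximum(x, 0)


def reference_matmul_with_epilog(a, b, c=None, alpha=1.0, beta=0.0, epilog=None):
    """NumPy reference for F(alpha * (a @ b) + beta * c) (hypothetical helper)."""
    result = alpha * (a @ b)
    if c is not None:
        result = result + beta * c
    # Apply the epilog last, to the full result of the multiplication.
    return epilog(result) if epilog is not None else result


a = np.array([[1.0, -2.0], [3.0, 4.0]])
b = np.array([[1.0, 0.0], [0.0, -1.0]])

# a @ b is [[1, 2], [3, -4]]; ReLU clamps the negative entry to 0.
d = reference_matmul_with_epilog(a, b, epilog=relu)
```

In this reference the epilog is a separate pass over the result. The appeal of a library-provided epilog is typically that it is fused with the multiplication rather than run as a separate step.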

matmul(a, b, /[, c, alpha, beta, epilog, ...])

Perform the specified matrix multiplication computation \(F(\alpha a @ b + \beta c)\), where \(F\) is the epilog.

matrix_qualifiers_dtype

NumPy dtype object that encapsulates the matrix qualifiers in linalg.advanced.

Algorithm(algorithm)

An interface class to query algorithm capabilities and configure the algorithm.

Matmul(a, b, /[, c, alpha, beta, ...])

Create a stateful object encapsulating the specified matrix multiplication computation \(\alpha a @ b + \beta c\) and the required resources to perform the operation.

MatmulComputeType

alias of ComputeType

MatmulEpilog

alias of Epilogue

MatmulInnerShape(value)

See cublasLtMatmulInnerShape_t.

MatmulNumericalImplFlags(value)

These flags can be combined with the | operator: OP_TYPE_FMA | OP_TYPE_TENSOR_HMMA ...

MatmulReductionScheme

alias of ReductionScheme

MatmulEpilogPreferences([aux_type, aux_amax])

A data class for providing epilog options as part of preferences to the Matmul.plan() method and the wrapper function matmul().

MatmulOptions([inplace, compute_type, ...])

A data class for providing options to the Matmul object and the wrapper function matmul().

MatmulPlanPreferences([...])

A data class for providing options to the Matmul.plan() method and the wrapper function matmul().

MatmulQuantizationScales([a, b, c, d])

A data class for providing quantization_scales to the Matmul constructor and the wrapper function matmul().
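The note above that MatmulNumericalImplFlags values can be combined with the | operator follows the usual Python flag-enumeration pattern. The sketch below shows that pattern with a stand-in enum from the standard library. ToyImplFlags and its bit values are hypothetical and chosen only for illustration; only the member names OP_TYPE_FMA and OP_TYPE_TENSOR_HMMA are taken from the description above.

```python
from enum import IntFlag


class ToyImplFlags(IntFlag):
    """Stand-in flag enum. The bit values are hypothetical, not nvmath's."""

    OP_TYPE_FMA = 0x1
    OP_TYPE_TENSOR_HMMA = 0x2


# Combining flags with | yields a value that carries both bits.
combined = ToyImplFlags.OP_TYPE_FMA | ToyImplFlags.OP_TYPE_TENSOR_HMMA

# Membership tests check whether a given bit is set in the combination.
has_fma = ToyImplFlags.OP_TYPE_FMA in combined
has_hmma = ToyImplFlags.OP_TYPE_TENSOR_HMMA in combined
```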

Helpers

The Specialized Linear Algebra helpers module nvmath.linalg.advanced.helpers provides helper functions to facilitate working with some of the complex features of the nvmath.linalg.advanced module.

Matmul helpers (nvmath.linalg.advanced.helpers.matmul)

BlockScalingFormat(value)

Block scaling format for microscaling data types.

create_mxfp8_scale(x, exponent[, stream])

invert_mxfp8_scale(mx_scales)

apply_mxfp8_scale(x, scales_1d[, output_dtype])

quantize_to_fp4(x, axis)

unpack_fp4(fp4_tensor, axis)

get_block_scale_offset(index, ...[, axis])

get_mxfp8_scale_offset(operand_or_shape, index)

Computes offset of a scale in the 1D interleaved scales tensor, applied to element operand[index].

to_block_scale(scale_tensor, ...[, axis, out])

expand_block_scale(scales_1d, ...[, axis, ...])
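As background for the block scaling helpers above, the general idea of block scaling can be sketched in pure Python: a tensor is split into fixed-size blocks, and each block shares one power-of-two scale chosen so that the scaled values fit the range of a narrow data type. This is a simplified illustration under stated assumptions, not the algorithm or memory layout used by the nvmath helpers. The block size of 32, the target maximum of 448 (the largest finite FP8 E4M3 magnitude), the rounding rule, and the function names are all assumptions of this sketch, and no actual FP8 rounding is performed.

```python
import math

BLOCK_SIZE = 32     # assumed number of elements sharing one scale
TARGET_MAX = 448.0  # assumed representable maximum (FP8 E4M3)


def block_scale_exponents(values, block_size=BLOCK_SIZE, target_max=TARGET_MAX):
    """Return one power-of-two exponent per block (hypothetical helper)."""
    exponents = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block)
        # Smallest integer e such that amax / 2**e <= target_max.
        e = math.ceil(math.log2(amax / target_max)) if amax > 0 else 0
        exponents.append(e)
    return exponents


def scale_down(values, exponents, block_size=BLOCK_SIZE):
    """Divide each element by its block's scale 2**e (hypothetical helper)."""
    return [v / 2.0 ** exponents[i // block_size] for i, v in enumerate(values)]


values = [float(i) * 100.0 for i in range(64)]  # two blocks of 32 elements
exps = block_scale_exponents(values)
scaled = scale_down(values, exps)
```

Multiplying each scaled element by its block's 2**e recovers the original values in this sketch, because no precision is dropped. With a real narrow data type, the scaled values would additionally be rounded.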

Footnotes