nvmath.device.matmul
- nvmath.device.matmul(*, compiler=None, **kwargs)
Create a BlasOptions object that encapsulates a compiled and ready-to-use device function for matrix multiplication.
- Parameters:
  - size – A sequence of three integers (m, n, k) giving the dimensions of the matrix multiplication problem.
  - precision – The computation precision, specified as a numpy float dtype. Currently supports numpy.float16, numpy.float32 and numpy.float64.
  - data_type – The data type of the input matrices, either 'real' or 'complex'.
  - compiler – A string specifying the compiler for the device code. Currently supports None (default) and 'Numba'.
  - code_type (CodeType) – The target GPU code and compute capability.
  - block_size (int) – The total block size, optional. If not provided or set to 'suggested', a suggested value for a 1D block dimension is used.
  - block_dim (Dim3) – The block dimension for launching the CUDA kernel, optional. If not provided or set to 'suggested', a suggested value is used. Cannot be used when block_size is explicitly specified.
  - leading_dimension (LeadingDimension) – The leading dimensions for the input matrices, optional. If not provided, they are set to match the matrix row/column dimension. If provided as 'suggested', they are set to a suggested value for optimal performance.
  - transpose_mode (TransposeMode) – The transpose mode for all input matrices. If not provided, no matrix is transposed.
  - function (str) – A string specifying the name of the function. Currently supports 'MM' (default) for matrix multiplication.
  - execution (str) – A string specifying the execution method, either 'Block' or 'Thread'.
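
The constraints in the parameter list above can be checked before any device code is compiled. The following is a minimal sketch, not part of nvmath: matmul_kwargs and build_matmul are hypothetical helper names introduced here for illustration. It assumes that an "explicitly specified" block_size means any value other than None or 'suggested', and it passes precision through unchecked. Only the final call inside build_matmul touches the real API, and it uses only the keyword arguments documented on this page.

```python
def matmul_kwargs(size, precision, data_type="real", execution="Block",
                  block_size=None, block_dim=None, **extra):
    """Hypothetical helper: check a few documented constraints and
    return a keyword dictionary suitable for nvmath.device.matmul."""
    size = tuple(size)
    if len(size) != 3 or not all(isinstance(d, int) for d in size):
        raise ValueError("size must be three integers (m, n, k)")
    if data_type not in ("real", "complex"):
        raise ValueError("data_type must be 'real' or 'complex'")
    if execution not in ("Block", "Thread"):
        raise ValueError("execution must be 'Block' or 'Thread'")

    # Assumption: "explicitly specified" means anything other than
    # None or the string 'suggested'.
    explicit_block_size = block_size is not None and block_size != "suggested"
    if block_dim is not None and explicit_block_size:
        raise ValueError("block_dim cannot be combined with an explicit block_size")

    # precision is passed through unchecked; other documented options
    # (compiler, code_type, transpose_mode, ...) travel in **extra.
    kwargs = dict(size=size, precision=precision, data_type=data_type,
                  execution=execution, **extra)
    if block_size is not None:
        kwargs["block_size"] = block_size
    if block_dim is not None:
        kwargs["block_dim"] = block_dim
    return kwargs


def build_matmul(**kwargs):
    """Forward the checked arguments to nvmath.device.matmul.
    Requires nvmath-python and a CUDA-capable environment."""
    from nvmath.device import matmul  # lazy import: the checks above run without it
    return matmul(**kwargs)
```

For example, matmul_kwargs((32, 16, 64), precision=numpy.float32, block_size=256) yields a dictionary that could be handed to build_matmul(**...) on a machine with nvmath-python installed, while adding block_dim to the same call raises ValueError before anything is compiled.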