Matmul
class nvmath.device.Matmul(size, precision, data_type, *, sm=None, block_size=None, block_dim=None, leading_dimension=None, transpose_mode=None, arrangement=None, alignment=None, function='MM', static_block_dim=False, execution='Block')
A class that encapsulates a partial Matmul device function. A partial device function can be queried for available or optimal values for some knobs (such as leading_dimension or block_dim).
Changed in version 0.7.0: Matmul has replaced BlasOptions and BlasOptionsComplete.
Parameters:
- size – A sequence of integers denoting the three dimensions (m, n, k) for the matrix multiplication problem.
- precision – The computation precision, specified as a NumPy float dtype. Currently supports numpy.float16, numpy.float32, and numpy.float64.
- data_type – The data type of the input matrices, either 'real' or 'complex'.
- sm (ComputeCapability) – Target mathdx compute capability.
- block_size (int) – The total block size, optional. If not provided or set to 'suggested', it will be set to a suggested value for a 1D block dimension.
- block_dim (Dim3) – The block dimension for launching the CUDA kernel, optional. If not provided or set to 'suggested', it will be set to a suggested value. Cannot be used when block_size is explicitly specified.
- leading_dimension (LeadingDimension) – The leading dimensions for the input matrices, optional. If not provided, they will be set to match the matrix row/column dimension. Alternatively, if provided as 'suggested', they will be set to a suggested value for optimal performance.
- transpose_mode (TransposeMode) – The transpose mode for all input matrices; transpose_mode or arrangement must be provided.
- arrangement (Arrangement) – The arrangement for all input matrices; transpose_mode or arrangement must be provided.
- alignment (Alignment) – The alignment for the input matrices in shared memory. Defines the alignments (in bytes) of the input matrices A, B, and C (either arrays or wrapped in opaque tensors) that are passed to the execute(…) method. The default alignment equals the element size of the matrix unless a suggested layout is used, in which case the alignment is greater than or equal to the element size.
- function (str) – A string specifying the name of the function. Currently supports 'MM' (default) for matrix multiplication.
- execution (str) – A string specifying the execution method, either 'Block' or 'Thread'.
- execute_api (str) – A string specifying the signature of the function that handles problems with default or custom/dynamic leading dimensions. Can be 'static_leading_dimensions' or 'dynamic_leading_dimensions'.
Changed in version 0.5.0: execute_api is not part of the Matmul (ex. Blas) type. Pass this argument to nvmath.device.matmul() instead.
- tensor_types (Sequence[str]) – A list of strings specifying the tensors used in the execute signature.
Changed in version 0.5.0: tensor_types is not part of the Matmul (ex. Blas) type. Pass this argument to nvmath.device.matmul() instead.
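As a rough illustration of how size, arrangement, and the default leading_dimension relate, the sketch below computes operand shapes and unpadded leading dimensions in plain Python. The helper names (operand_shapes, default_leading_dimensions) and the 'col_major' / 'row_major' strings are assumptions made for this example and are not part of nvmath.device. It follows the usual GEMM convention that A is m×k, B is k×n, and C is m×n.

```python
def operand_shapes(size):
    """Return the (rows, cols) shapes of A, B, and C for C = A @ B."""
    m, n, k = size
    return (m, k), (k, n), (m, n)


def default_leading_dimensions(size, arrangement):
    """Unpadded leading dimension of each operand (hypothetical helper).

    For a column-major matrix the leading dimension is its row count;
    for a row-major matrix it is its column count.
    """
    lds = []
    for (rows, cols), order in zip(operand_shapes(size), arrangement):
        if order == "col_major":
            lds.append(rows)
        elif order == "row_major":
            lds.append(cols)
        else:
            raise ValueError(f"unknown arrangement: {order!r}")
    return tuple(lds)


size = (32, 16, 8)  # (m, n, k)
print(operand_shapes(size))
print(default_leading_dimensions(size, ("col_major", "col_major", "col_major")))
print(default_leading_dimensions(size, ("row_major", "col_major", "col_major")))
```

This covers only the default case described above; a 'suggested' leading dimension may differ from these values.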
See also
The attributes of this class provide a 1:1 mapping with the CUDA C++ cuBLASDx APIs. For further details, please refer to the cuBLASDx documentation.
Methods
- __init__(size, precision, data_type, *, sm=None, block_size=None, block_dim=None, leading_dimension=None, transpose_mode=None, arrangement=None, alignment=None, function='MM', static_block_dim=False, execution='Block')
- create(code_type=None, compiler=None, execute_api=None, tensor_types=None, global_memory_alignment=None, **kwargs)
Creates a copy of the instance with provided arguments updated.
Deprecated since version 0.7.0: Please use functools.partial() instead.
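A minimal sketch of the functools.partial() pattern that the deprecation note points to. To stay runnable without a GPU, it uses a hypothetical stand-in function, make_matmul, that only records its arguments; the stand-in and its argument values are assumptions for illustration, and a real constructor or factory would take its place.

```python
from functools import partial


def make_matmul(size, precision, data_type, **options):
    """Hypothetical stand-in that returns its configuration as a dict."""
    return {"size": size, "precision": precision, "data_type": data_type, **options}


# Fix the arguments shared by every variant once.
base = partial(make_matmul, size=(32, 16, 8), precision="float32", data_type="real")

# Derive variants by supplying or overriding keyword arguments at call time.
block_variant = base(execution="Block")
complex_variant = base(data_type="complex", execution="Block")

print(block_variant)
print(complex_variant)
```

Keyword arguments passed at call time override those stored in the partial object, which gives the same "copy with updated arguments" effect that create() describes.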
- suggest_partitioner() -> Partitioner
Attributes
- a_dim
- a_size
- a_value_type
- alignment
- arrangement
- b_dim
- b_size
- b_value_type
- block_dim
- block_size
- c_dim
- c_size
- c_value_type
- data_type
- execution
- files
The list of binary files for the lto functions.
- function
- input_type
- leading_dimension
- max_threads_per_block
- output_type
- precision
- size
- sm
- static_block_dim
- transpose_mode
- value_type