BlasOptions
class nvmath.device.BlasOptions(
    size,
    precision,
    data_type,
    *,
    code_type=None,
    block_size=None,
    block_dim=None,
    leading_dimension=None,
    transpose_mode=None,
    arrangement=None,
    alignment=None,
    global_memory_alignment=None,
    function='MM',
    static_block_dim=False,
    execution='Block',
    execute_api='static_leading_dimensions',
    tensor_types=None,
)
A class that encapsulates a partial BLAS device function. A partial device function can be queried for available or optimal values for some knobs (such as leading_dimension or block_dim). It does not contain a compiled, ready-to-use device function until finalized using create().

Parameters:
- size – A sequence of integers denoting the three dimensions (m, n, k) of the matrix multiplication problem.
- precision – The computation precision, specified as a NumPy float dtype. Currently supports numpy.float16, numpy.float32, and numpy.float64.
- data_type – The data type of the input matrices; either 'real' or 'complex'.
- code_type (CodeType) – The target GPU code and compute capability.
- block_size (int) – The total block size, optional. If not provided or set to 'suggested', it is set to a suggested value for a 1D block dimension.
- block_dim (Dim3) – The block dimension for launching the CUDA kernel, optional. If not provided or set to 'suggested', it is set to a suggested value. Cannot be used when block_size is explicitly specified.
- leading_dimension (LeadingDimension) – The leading dimensions for the input matrices, optional. If not provided, they are set to match the matrix row/column dimension. If provided as 'suggested', they are set to suggested values for optimal performance.
- transpose_mode (TransposeMode) – The transpose mode for all input matrices. Either transpose_mode or arrangement must be provided.
- arrangement (Arrangement) – The arrangement for all input matrices. Either transpose_mode or arrangement must be provided.
- alignment (Alignment) – The alignment for the input matrices in shared memory. Defines the alignments (in bytes) of the input matrices A, B, and C (either arrays or wrapped in opaque tensors) that are passed to the execute(…) method. The default alignment equals the element size of the matrix, unless a suggested layout is used, in which case the alignment is greater than or equal to the element size.
- function (str) – A string specifying the name of the function. Currently supports 'MM' (default) for matrix multiplication.
- execution (str) – A string specifying the execution method; either 'Block' or 'Thread'.
- execute_api (str) – A string specifying the signature of the function that handles problems with default or custom/dynamic leading dimensions. Can be 'static_leading_dimensions' or 'dynamic_leading_dimensions'.
- global_memory_alignment (Alignment) – Same as alignment, but for global memory. Used to optimize copying between shared and global memory.
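The constraints stated in the parameter descriptions above can be collected into a small pre-flight check. The following is a minimal sketch in plain Python and is not part of nvmath: the helper name `check_blas_options` and its messages are hypothetical, and the precision is passed as a string name (such as `'float32'`) only to keep the sketch free of dependencies. It mirrors the rules documented here and does not reproduce the library's own validation.

```python
# Hypothetical helper, NOT part of nvmath: it only mirrors the documented rules.
SUPPORTED_PRECISIONS = {"float16", "float32", "float64"}
SUPPORTED_DATA_TYPES = {"real", "complex"}
SUPPORTED_FUNCTIONS = {"MM"}
SUPPORTED_EXECUTIONS = {"Block", "Thread"}
SUPPORTED_EXECUTE_APIS = {"static_leading_dimensions", "dynamic_leading_dimensions"}


def check_blas_options(size, precision, data_type, *, block_size=None,
                       block_dim=None, transpose_mode=None, arrangement=None,
                       function="MM", execution="Block",
                       execute_api="static_leading_dimensions"):
    """Return a list of problems; an empty list means no documented rule is violated."""
    problems = []
    # size is a sequence of three integers (m, n, k).
    if len(size) != 3 or not all(isinstance(d, int) for d in size):
        problems.append("size must be a sequence of three integers (m, n, k)")
    if precision not in SUPPORTED_PRECISIONS:
        problems.append("precision must be float16, float32, or float64")
    if data_type not in SUPPORTED_DATA_TYPES:
        problems.append("data_type must be 'real' or 'complex'")
    if function not in SUPPORTED_FUNCTIONS:
        problems.append("function must be 'MM'")
    if execution not in SUPPORTED_EXECUTIONS:
        problems.append("execution must be 'Block' or 'Thread'")
    if execute_api not in SUPPORTED_EXECUTE_APIS:
        problems.append("execute_api must name a static or dynamic leading-dimension API")
    # 'suggested' is treated here as "not explicitly specified" for block_size.
    if block_dim is not None and block_size not in (None, "suggested"):
        problems.append("block_dim cannot be combined with an explicit block_size")
    # At least one of the two layout descriptions has to be present.
    if transpose_mode is None and arrangement is None:
        problems.append("transpose_mode or arrangement must be provided")
    return problems


# The arrangement value below is an illustrative placeholder, not a verified nvmath value.
print(check_blas_options((32, 16, 64), "float32", "real",
                         arrangement=("col_major", "col_major", "col_major")))
```

Returning a list of problems rather than raising on the first one makes it easier to report every violated rule at once; raising an exception instead would be an equally reasonable choice.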
 
See also

The attributes of this class provide a 1:1 mapping with the CUDA C++ cuBLASDx APIs. For further details, please refer to the cuBLASDx documentation.

Methods

__init__(
    size,
    precision,
    data_type,
    *,
    code_type=None,
    block_size=None,
    block_dim=None,
    leading_dimension=None,
    transpose_mode=None,
    arrangement=None,
    alignment=None,
    global_memory_alignment=None,
    function='MM',
    static_block_dim=False,
    execution='Block',
    execute_api='static_leading_dimensions',
    tensor_types=None,
)
Attributes

- alignment
- arrangement
- block_dim
- block_size
- code_type
- data_type
- execute_api
- execution
- function
- leading_dimension
- precision
- size
- static_block_dim
- transpose_mode
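The alignment parameter description states that the default alignment equals the element size of the matrix. A minimal sketch of that arithmetic follows, in plain Python. The helper names `element_size_bytes` and `default_alignment` are hypothetical and not part of nvmath. The sketch assumes that A, B, and C all share a single precision and data type, and that a complex element stores two real values of the given precision.

```python
# Hypothetical helpers, NOT part of nvmath: they illustrate the documented default only.
REAL_ELEMENT_BYTES = {"float16": 2, "float32": 4, "float64": 8}


def element_size_bytes(precision, data_type):
    """Size in bytes of one matrix element for a precision name and data type."""
    real_bytes = REAL_ELEMENT_BYTES[precision]
    if data_type == "real":
        return real_bytes
    if data_type == "complex":
        # A complex element holds a real and an imaginary part.
        return 2 * real_bytes
    raise ValueError("data_type must be 'real' or 'complex'")


def default_alignment(precision, data_type):
    """Default (A, B, C) alignments in bytes, assuming one shared element type."""
    size = element_size_bytes(precision, data_type)
    return (size, size, size)


print(default_alignment("float32", "complex"))
```

When a suggested layout is used, the documented alignment may be larger than these values, so the result is best read as a lower bound in that case.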