Solver#
-
class nvmath.
device. Solver( - function: str,
- size: Sequence[int],
- precision: type[floating],
- execution: str,
- *,
- sm=None,
- arrangement: Sequence[str] | None = None,
- transpose_mode: str | None = None,
- side: str | None = None,
- diag: str | None = None,
- fill_mode: str | None = None,
- batches_per_block: int | Literal['suggested'] | None = None,
- data_type: str | None = None,
- leading_dimensions: Sequence[int] | None = None,
- block_dim: Sequence[int] | Literal['suggested'] | None = None,
- job: str | None = None,
A class that encapsulates a partial dense matrix factorization and solve device function.
Memory Layout Requirements:
Matrices must be stored in shared memory according to their arrangement and leading dimension (ld):
- Column-major arrangement: Matrix shape
(batches_per_block, rows, cols) with strides
(ld * cols, 1, ld)
- Column-major arrangement: Matrix shape
- Row-major arrangement: Matrix shape
(batches_per_block, rows, cols) with strides
(ld * rows, ld, 1)
- Row-major arrangement: Matrix shape
- Parameters:
function (str) – Solver function to be executed on execute() method. List of available options:
'potrf'(Cholesky factorization),'potrs'(Linear system solve after Cholesky factorization),'posv'(Fused Cholesky factorization with solve),'getrf_no_pivot'(LU factorization without pivoting),'getrs_no_pivot'(LU solve without pivoting),'gesv_no_pivot'(Fused LU without pivoting factorization with solve),'getrf_partial_pivot'(LU factorization with partial pivoting),'getrs_partial_pivot'(LU solve with partial pivoting),'gesv_partial_pivot'(Fused LU without pivoting factorization with solve),'geqrf'(QR factorization),'unmqr'(Multiplication of Q from QR factorization),'gelqf'(LQ factorization),'unmlq'(Multiplication of Q from LQ factorization),'trsm'(Triangular matrix-matrix solve),'gels'(Overdetermined or underdetermined least square problems),'ungqr'(Q matrix generation from geqrf factorization),'unglq'(Q matrix generation from gelqf factorization),'htev'(Eigenvalue solver for hermitian tridiagonal matrices),'heev'(Eigenvalue solver for hermitian dense matrices),'gtsv_no_pivot'(General tridiagonal matrix solve). Functions'ungqr','unglq','htev','heev','gtsv_no_pivot'require libmathdx 0.3.2 or later.size (Sequence[int]) – Problem size specified as a sequence of 1 to 3 elements:
(M,)(treated as(M, M, 1)),(M, N)(treated as(M, N, 1)), or(M, N, K). Please refer to cuSOLVERDx functionalities for detailed meaning.precision (type[np.floating]) – The computation precision specified as a numpy float dtype. Currently supports:
numpy.float32,numpy.float64.execution (str) – A string specifying the execution method. Supported values:
'Block'.sm (ComputeCapability) – Target mathdx compute-capability.
arrangement (Sequence[str], optional) – Storage layout for matrices A and B, specified as a sequence of 2 elements
(arr_A, arr_B). Each element can be one of:'col_major','row_major'. Defaults to("col_major", "col_major").transpose_mode (str, optional) – Transpose mode of matrix A. Refer to cuSOLVERDx documentation, as some functions do not support this option. Can be one of:
'non_transposed','transposed','conj_transposed'. Defaults to'non_transposed'.side (str, optional) – Side of matrix A in a multiplication operation. Required and supported only by functions:
'unmqr','unmlq','trsm'. Can be one of:'left','right'.diag (str, optional) – Indicates whether the diagonal elements of matrix A are ones or not. Required and supported only by
'trsm'functions. Can be one of:'unit','non_unit'.fill_mode (str, optional) – Indicates which part of matrix A is filled and should be used by function. Required and supported only by functions:
'potrf','potrs','posv','trsm','heev'. Can be one of:'upper','lower'.batches_per_block (int | Literal["suggested"], optional) – Number of batches to compute in parallel in a single CUDA block. Can be a non-zero integer or the string
'suggested'for automatic selection of an optimal value. We recommend using 1 for matrix A size larger than or equal to 16 x 16, and using'suggested'for smaller sizes to achieve optimal performance. Defaults to 1.data_type (str, optional) – The data type of the input matrices, can be one of:
'real','complex'. Defaults to'real'.leading_dimensions (Sequence[int], optional) – The leading dimensions for input matrices A and B, specified as a sequence of 2 elements (
lda,ldb) orNone. If not provided, it will be automatically deduced fromsizeandarrangement.block_dim (Sequence[int] | Literal["suggested"], optional) – The block dimension for launching the CUDA kernel, specified as a 1 to 3 integer sequence (x, y, z) where missing dimensions are assumed to be 1. Can be a sequence of 1 to 3 positive integers, the string
'suggested'for optimal value selection, orNonefor the default value.job (str, optional) – Job type for eigenvalue computation. Required and supported only by functions:
'htev','heev'. For'htev':'no_vectors','all_vectors','multiply_vectors'. For'heev':'no_vectors','overwrite_vectors'. Requires libmathdx 0.3.2 or later.
See also
The attributes of this class provide a 1:1 mapping with the CUDA C++ cuSOLVERDx APIs. For further details, please refer to cuSOLVERDx documentation.
Attributes
- a_arrangement#
- arrangement#
- b_arrangement#
- batches_per_block#
- block_dim#
- block_size#
- data_type#
- diag#
- execution#
- fill_mode#
- function#
- info_type#
- ipiv_type#
- job#
- k#
- lda#
- ldb#
- leading_dimensions#
- m#
- n#
- precision#
- side#
- size#
- sm#
- tau_type#
- transpose_mode#
- value_type#
- workspace_size#
Methods