QRFactorize

class nvmath.device.QRFactorize(
    size: Sequence[int],
    precision: type[floating],
    execution: str,
    *,
    sm=None,
    arrangement: str | None = None,
    batches_per_block: int | Literal['suggested'] | None = None,
    data_type: str | None = None,
    leading_dimension: int | None = None,
    block_dim: Sequence[int] | Literal['suggested'] | None = None,
)
A class that encapsulates the QR factorization device function for general matrices, using Householder reflections.
Available operation:
factorize: Computes the QR factorization A = Q @ R, where Q is a unitary M x M matrix and R is an upper triangular matrix (if M >= N) or upper trapezoidal matrix (if M < N).
The factorization uses Householder reflections and does not explicitly form the unitary matrix Q. Instead, Q is represented as a product of Householder reflectors, whose vectors are stored in the input matrix A and whose scalar factors are stored in the tau array.
Memory Layout Requirements:
Matrices must be stored in shared memory according to their arrangement and leading dimension (ld):
For matrix A (M x N):
Column-major arrangement: Matrix shape (batches_per_block, M, N) with strides (lda * N, 1, lda)
Row-major arrangement: Matrix shape (batches_per_block, M, N) with strides (lda * M, lda, 1)
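The stride rules above can be sketched in plain Python to show where a given element lands in the flat shared-memory buffer. The helper names `matrix_strides` and `flat_offset` are hypothetical and not part of nvmath; the checks `lda >= M` (column-major) and `lda >= N` (row-major) are an assumption based on the usual LAPACK convention.

```python
def matrix_strides(m, n, lda, arrangement="col_major"):
    """Return element strides (batch, row, column) for a batched M x N matrix."""
    if arrangement == "col_major":
        if lda < m:  # assumption: usual LAPACK-style constraint
            raise ValueError("column-major layout needs lda >= M")
        return (lda * n, 1, lda)
    if arrangement == "row_major":
        if lda < n:  # assumption: usual LAPACK-style constraint
            raise ValueError("row-major layout needs lda >= N")
        return (lda * m, lda, 1)
    raise ValueError(f"unknown arrangement: {arrangement!r}")


def flat_offset(batch, row, col, strides):
    """Element offset of A[batch, row, col] in the flat buffer."""
    sb, sr, sc = strides
    return batch * sb + row * sr + col * sc


# A 4 x 3 matrix padded to lda = 5; look up element (2, 1) of batch 1.
col = matrix_strides(4, 3, lda=5, arrangement="col_major")
row = matrix_strides(4, 3, lda=5, arrangement="row_major")
print(col, flat_offset(1, 2, 1, col))  # (15, 1, 5) 22
print(row, flat_offset(1, 2, 1, row))  # (20, 5, 1) 31
```

With padding (lda larger than the matrix extent), each batch occupies lda * N elements in column-major order and lda * M elements in row-major order, so the shared-memory buffer must be sized accordingly.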
- Parameters:
size (Sequence[int]) – Problem size specified as a sequence of 1 to 3 elements: (M,) (treated as (M, M, 1)), (M, N) (treated as (M, N, 1)), or (M, N, K). M and N represent the dimensions of the matrix A used in factorization. K is ignored if specified.
precision (type[np.floating]) – The computation precision specified as a numpy float dtype. Currently supports: numpy.float32, numpy.float64.
execution (str) – A string specifying the execution method. Supported values: 'Block'.
sm (ComputeCapability) – Target mathdx compute-capability.
arrangement (str, optional) – Storage layout for matrix A. Can be one of: 'col_major', 'row_major'. Defaults to 'col_major'. Note: When provided in the constructor, leading dimensions are set at compile-time. To use runtime leading dimensions (avoiding recompilation for different leading dimensions), provide the leading dimension parameters directly to the device methods instead.
batches_per_block (int | Literal["suggested"], optional) – Number of batches to compute in parallel in a single CUDA block. Can be a non-zero integer or the string 'suggested' for automatic selection of an optimal value. We recommend using 1 for matrix A size larger than or equal to 16 x 16, and using 'suggested' for smaller sizes to achieve optimal performance. Defaults to 1.
data_type (str, optional) – The data type of the input matrices, can be one of: 'real', 'complex'. Defaults to 'real'.
leading_dimension (int, optional) – The leading dimension for input matrix A, or None. If not provided, it will be automatically deduced from size and arrangement. Note: When provided in the constructor, leading dimensions are set at compile-time. To use runtime leading dimensions (avoiding recompilation for different leading dimensions), provide the leading dimension parameters directly to the device methods instead.
block_dim (Sequence[int] | Literal["suggested"], optional) – The block dimension for launching the CUDA kernel, specified as a 1 to 3 integer sequence (x, y, z) where missing dimensions are assumed to be 1. Can be a sequence of 1 to 3 positive integers, the string 'suggested' for optimal value selection, or None for the default value.
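The size-expansion rule for the size parameter can be illustrated with a small stand-alone sketch. Both helpers are hypothetical and do not call nvmath. In particular, `default_leading_dimension` assumes that the automatically deduced leading dimension is the unpadded extent (M for column-major, N for row-major), which the text above does not state explicitly.

```python
def normalize_size(size):
    """Expand a 1- to 3-element size into (M, N, K) as described above."""
    dims = tuple(size)
    if len(dims) == 1:
        return (dims[0], dims[0], 1)
    if len(dims) == 2:
        return (dims[0], dims[1], 1)
    if len(dims) == 3:
        return dims  # K is carried along but ignored by the factorization
    raise ValueError("size must have 1 to 3 elements")


def default_leading_dimension(size, arrangement="col_major"):
    # Assumption: the deduced leading dimension is the unpadded extent,
    # M for column-major storage and N for row-major storage.
    m, n, _ = normalize_size(size)
    return m if arrangement == "col_major" else n


print(normalize_size((8,)))      # (8, 8, 1)
print(normalize_size((8, 4)))    # (8, 4, 1)
print(default_leading_dimension((8, 4), "row_major"))  # 4
```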
Attributes
- a_arrangement
- a_shape
- batches_per_block
- block_dim
- block_size
- data_type
- execution
- lda
- m
- n
- precision
- size
- sm
- tau_shape
- tau_size
- tau_strides
- tau_type
- value_type
Methods
- factorize(a, tau, lda=None) -> None
Computes the QR factorization of a general matrix A using Householder reflections.
This device function computes A = Q @ R, where Q is a unitary M x M matrix and R is an upper triangular matrix (if M >= N) or upper trapezoidal matrix (if M < N). Uses cuSOLVERDx 'geqrf'.
If lda is provided, the runtime version is used with the specified leading dimension. If lda is not provided (None), the compile-time version is used with the default or constructor-provided leading dimension.
Matrix Q is not explicitly formed. Instead, Q is represented as a product of min(M, N) Householder reflectors: Q = H(0) * H(1) * … * H(min(M, N) - 1).
Each Householder reflector has the form H(i) = I - tau[i] * v * v^H, where:
v is a vector of size M for each batch
v[k] = 0 for k < i, and v[i] = 1
v[i+1:M] is stored on exit in A[i+1:M, i]
For more details, see: get_started/functions/geqrf.html
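The packed representation described above can be made concrete with a host-side reference in plain Python. This is a minimal sketch for real matrices only, assuming the LAPACK geqrf sign and scaling conventions; it is not the cuSOLVERDx implementation. The function names `geqrf_reference` and `form_q` are hypothetical.

```python
import math


def geqrf_reference(a):
    """Householder QR of a real M x N matrix (list of rows), in place.

    Returns tau. On exit the upper part of `a` holds R and the entries
    below the diagonal hold the Householder vectors.
    """
    m, n = len(a), len(a[0])
    tau = [0.0] * min(m, n)
    for i in range(min(m, n)):
        alpha = a[i][i]
        xnorm = math.sqrt(sum(a[k][i] ** 2 for k in range(i + 1, m)))
        if xnorm == 0.0:
            continue  # H(i) is the identity, tau[i] stays 0
        beta = -math.copysign(math.hypot(alpha, xnorm), alpha)
        tau[i] = (beta - alpha) / beta
        for k in range(i + 1, m):
            a[k][i] /= alpha - beta  # store v[i+1:M]; v[i] = 1 is implicit
        a[i][i] = beta
        # Apply H(i) = I - tau[i] * v * v^T to the trailing columns.
        for j in range(i + 1, n):
            w = a[i][j] + sum(a[k][i] * a[k][j] for k in range(i + 1, m))
            a[i][j] -= tau[i] * w
            for k in range(i + 1, m):
                a[k][j] -= tau[i] * w * a[k][i]
    return tau


def form_q(a, tau):
    """Build the explicit M x M matrix Q = H(0) * H(1) * ... from packed output."""
    m = len(a)
    q = [[float(r == c) for c in range(m)] for r in range(m)]
    for i, t in enumerate(tau):
        v = [0.0] * i + [1.0] + [a[k][i] for k in range(i + 1, m)]
        for row in q:  # row <- row - t * (row . v) * v, i.e. Q <- Q * H(i)
            dot = sum(x * y for x, y in zip(row, v))
            for c in range(m):
                row[c] -= t * dot * v[c]
    return q


original = [[2.0, -1.0], [1.0, 3.0], [2.0, 0.5]]
m, n = len(original), len(original[0])
packed = [row[:] for row in original]
tau = geqrf_reference(packed)
q = form_q(packed, tau)
r = [[packed[i][j] if j >= i else 0.0 for j in range(n)] for i in range(m)]
qr = [[sum(q[i][k] * r[k][j] for k in range(m)) for j in range(n)] for i in range(m)]
max_error = max(abs(qr[i][j] - original[i][j]) for i in range(m) for j in range(n))
print(max_error)  # reconstruction error, expected to be near machine precision
```

For the first column (2, 1, 2) this gives beta = -3 and tau[0] = 5/3, and the stored part of the vector is (1/5, 2/5), which matches the form H(0) = I - tau[0] * v * v^T with v = (1, 1/5, 2/5).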
- Parameters:
a – Pointer to an array in shared memory, storing the batched matrix according to the specified arrangement and leading dimension (see __init__()). The matrix is overwritten in place. On exit, the upper triangular or upper trapezoidal part (including diagonal) contains the matrix R. The elements below the diagonal, with the array tau, represent the unitary matrix Q as a product of Householder reflectors.
tau – Pointer to a 1D array of size min(M, N) for each batch. Contains the scalar factors of the Householder reflections. The tau array, together with the Householder vectors stored in A, defines the unitary matrix Q.
lda – Optional runtime leading dimension of matrix A. If not specified, the compile-time lda is used.
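To illustrate how the overwritten matrix a is read after the call, the following sketch splits one packed M x N result into R and the explicit Householder vectors. It runs on the host on a plain list of rows; the helper `unpack_factorization` is hypothetical and the packed values are illustrative only, not the output of an actual factorization.

```python
def unpack_factorization(packed):
    """Split a packed M x N result into R and the explicit Householder vectors."""
    m, n = len(packed), len(packed[0])
    # R: diagonal and above; everything below the diagonal is zero.
    r = [[packed[i][j] if j >= i else 0.0 for j in range(n)] for i in range(m)]
    # v for H(i): zeros above position i, an implicit 1 at i, stored tail below.
    vectors = [
        [0.0] * i + [1.0] + [packed[k][i] for k in range(i + 1, m)]
        for i in range(min(m, n))
    ]
    return r, vectors


packed = [[-3.0, 0.5], [0.2, -2.0], [0.4, 0.1]]  # illustrative values only
r, vectors = unpack_factorization(packed)
print(r)        # [[-3.0, 0.5], [0.0, -2.0], [0.0, 0.0]]
print(vectors)  # [[1.0, 0.2, 0.4], [0.0, 1.0, 0.1]]
```

There are min(M, N) vectors, one per entry of tau, which is why tau holds min(M, N) scalars for each batch.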