LU Factorization#

GETRF (GEneral TRiangular Factorization) function computes batched LU factorization of a general matrix:

\[P * A = L * U,\]

where:

  • A is a batched M x N matrix, with leading dimension lda >= M if matrix A is in column-major layout, or lda >= N if matrix A is row-major.

  • P is a permutation matrix if pivoting is performed.

  • L is a lower triangular matrix with unit diagonal (lower trapezoidal if M > N).

  • U is an upper triangular matrix (upper trapezoidal if M < N).

cuSolverDx provides separate function operators depending on whether pivoting is to be performed:

  • cusolverdx::function::getrf_no_pivot: LU factorization with no pivoting.

  • cusolverdx::function::getrf_partial_pivot: LU factorization with partial pivoting.

Note

If a nonsingular matrix A is diagonal dominant, then it is safe to compute \(A = L * U\) without pivoting. If a matrix is not diagonal dominant, then pivoting is usually required to ensure numerical stability.

cuSolverDx getrf_no_pivot device functions are (see Execution Methods):

__device__ void execute(data_type* A, status_type* info);
// with runtime lda
__device__ void execute(data_type* A, const unsigned int lda, status_type* info);

cuSolverDx getrf_partial_pivot device functions are:

__device__ void execute(data_type* A, int* ipiv, status_type* info);
// with runtime lda
__device__ void execute(data_type* A, const unsigned int lda, int* ipiv, status_type* info);

After the function, input A is replaced by the lower triangular LU factor L, and the upper triangular LU factor U.

The output status parameter, info, is an array of batch size. If LU factorization is successful, every element of info is zero. If LU factorization failed for any batches, i.e. A (U) is singular, info[batch_id] = i indicates U(i,i) = 0.

For getrf_partial_pivot functions, the output ipiv is an array of size min(M, N) for each batch, and ipiv[batch_id * min(M, N) + i] indicates the row i interchanges with row ipiv[batch_id * min(M, N) + i] on the batch_id-th batch of A.

The functions support A being either column- or row-major memory layout, see Arrangement operator.