LU Factorization#
GETRF (GEneral TRiangular Factorization) function computes batched LU factorization of a general matrix:
where:
A
is a batchedM x N
matrix, with leading dimensionlda >= M
if matrixA
is in column-major layout, orlda >= N
if matrixA
is row-major.P
is a permutation matrix if pivoting is performed.L
is a lower triangular matrix with unit diagonal (lower trapezoidal ifM > N
).U
is an upper triangular matrix (upper trapezoidal ifM < N
).
cuSolverDx provides separate function operators depending on whether pivoting is to be performed:
cusolverdx::function::getrf_no_pivot
: LU factorization with no pivoting.cusolverdx::function::getrf_partial_pivot
: LU factorization with partial pivoting.
Note
If a nonsingular matrix A
is diagonal dominant, then it is safe to compute \(A = L * U\) without pivoting. If a matrix is not diagonal dominant, then pivoting is usually required to ensure numerical stability.
cuSolverDx getrf_no_pivot
device functions are (see Execution Methods):
__device__ void execute(data_type* A, status_type* info);
// with runtime lda
__device__ void execute(data_type* A, const unsigned int lda, status_type* info);
cuSolverDx getrf_partial_pivot
device functions are:
__device__ void execute(data_type* A, int* ipiv, status_type* info);
// with runtime lda
__device__ void execute(data_type* A, const unsigned int lda, int* ipiv, status_type* info);
After the function, input A
is replaced by the lower triangular LU factor L
, and the upper triangular LU factor U
.
The output status parameter, info
, is an array of batch size. If LU factorization is successful, every element of info
is zero.
If LU factorization failed for any batches, i.e. A
(U
) is singular, info[batch_id] = i
indicates U(i,i) = 0
.
For getrf_partial_pivot
functions, the output ipiv
is an array of size min(M, N)
for each batch, and ipiv[batch_id * min(M, N) + i]
indicates the row i
interchanges with row ipiv[batch_id * min(M, N) + i]
on the batch_id
-th batch of A
.
The functions support A
being either column- or row-major memory layout, see Arrangement operator.