Generalized Linear System Solver
The GELS (GEneral Least Squares) function solves overdetermined or underdetermined least squares problems:
\(\min_X \| op(A) X - B \|_2\)
using the QR or LQ factorization of A, and overwriting B with the solution X.
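When validating results, a small host-side reference can be handy. The sketch below is not part of cuSolverDx: the function name `reference_least_squares` is hypothetical, it handles only the real, non-transposed, full-rank overdetermined case with a single right-hand side, and it uses the normal equations rather than the QR factorization that GELS uses, so it is only suitable for small, well-conditioned test problems.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side reference, not a cuSolverDx API.
// Solves min_x ||A x - b||_2 for a full-rank M x N matrix A (M >= N) stored
// column-major with leading dimension M, via the normal equations
// (A^T A) x = A^T b. This is NOT the QR-based method used by GELS.
std::vector<double> reference_least_squares(const std::vector<double>& A,
                                            const std::vector<double>& b,
                                            std::size_t M, std::size_t N) {
    assert(M >= N && A.size() == M * N && b.size() == M);

    // Form G = A^T A (N x N, stored row-major) and r = A^T b.
    std::vector<double> G(N * N, 0.0), r(N, 0.0);
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < M; ++k)
                G[i * N + j] += A[k + i * M] * A[k + j * M];
        for (std::size_t k = 0; k < M; ++k)
            r[i] += A[k + i * M] * b[k];
    }

    // Gaussian elimination with partial pivoting on the system [G | r].
    for (std::size_t c = 0; c < N; ++c) {
        std::size_t p = c;
        for (std::size_t i = c + 1; i < N; ++i)
            if (std::fabs(G[i * N + c]) > std::fabs(G[p * N + c])) p = i;
        for (std::size_t j = 0; j < N; ++j) std::swap(G[c * N + j], G[p * N + j]);
        std::swap(r[c], r[p]);
        for (std::size_t i = c + 1; i < N; ++i) {
            const double f = G[i * N + c] / G[c * N + c];
            for (std::size_t j = c; j < N; ++j) G[i * N + j] -= f * G[c * N + j];
            r[i] -= f * r[c];
        }
    }

    // Back substitution on the resulting upper-triangular system.
    std::vector<double> x(N);
    for (std::size_t i = N; i-- > 0;) {
        double s = r[i];
        for (std::size_t j = i + 1; j < N; ++j) s -= G[i * N + j] * x[j];
        x[i] = s / G[i * N + i];
    }
    return x;
}
```

For example, fitting a line through the points (0, 1), (1, 3), (2, 5) uses the column-major matrix `{1, 1, 1, 0, 1, 2}` with `b = {1, 3, 5}`, and the returned coefficients are the intercept 1 and slope 2.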
The configurations supported by GELS are:
If op(A) is non_transposed and M >= N, find the least squares solution of an overdetermined system using the QR factorization of A;
If op(A) is non_transposed and M < N, find the minimum norm solution of an underdetermined system using the LQ factorization of A;
If op(A) is transposed or conj_transposed and M >= N, find the minimum norm solution of an underdetermined system using the LQ factorization of A;
If op(A) is transposed or conj_transposed and M < N, find the least squares solution of an overdetermined system using the QR factorization of A.
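The four cases above can be summarized as a small lookup. The following host-side sketch only restates the list; the enum and function names (`Transpose`, `Problem`, `Factorization`, `select_gels_config`) are hypothetical and are not cuSolverDx identifiers.

```cpp
#include <cassert>

// Hypothetical names for illustration; these are not cuSolverDx identifiers.
enum class Transpose { non_transposed, transposed, conj_transposed };
enum class Problem { least_squares, minimum_norm };
enum class Factorization { qr, lq };

struct GelsConfig {
    Problem problem;              // which solution is computed
    Factorization factorization;  // factorization of A named in the list
};

// Maps (op(A), M, N) for an M x N matrix A to the case listed above.
GelsConfig select_gels_config(Transpose op, unsigned m, unsigned n) {
    assert(m > 0 && n > 0);
    const bool m_ge_n = (m >= n);
    if (op == Transpose::non_transposed) {
        return m_ge_n ? GelsConfig{Problem::least_squares, Factorization::qr}
                      : GelsConfig{Problem::minimum_norm, Factorization::lq};
    }
    // transposed (real data) or conj_transposed (complex data)
    return m_ge_n ? GelsConfig{Problem::minimum_norm, Factorization::lq}
                  : GelsConfig{Problem::least_squares, Factorization::qr};
}
```

A helper like this can be useful in host-side tests to decide which reference result to compare against for a given problem shape.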
The cuSolverDx gels device functions are (see Execution Methods):
__device__ void execute(data_type* A, data_type* tau, data_type* B);
// with runtime leading dimensions
__device__ void execute(data_type* A, data_type* tau,
                        data_type* B, const unsigned int ldb);
__device__ void execute(data_type* A, const unsigned int lda, data_type* tau,
                        data_type* B);
__device__ void execute(data_type* A, const unsigned int lda, data_type* tau,
                        data_type* B, const unsigned int ldb);
A is a batched M x N matrix, with leading dimension lda >= M if A is in column-major layout, or lda >= N if A is in row-major layout. After the function returns, A is overwritten by the QR or LQ factorization of the input matrix.
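To make the role of the leading dimension concrete, the sketch below computes the offset of element (i, j) within one batch for either layout. The `Layout` enum and `element_offset` function are hypothetical helper names, not part of cuSolverDx; the asserts encode the lda constraints stated above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout tag for illustration; not a cuSolverDx identifier.
enum class Layout { col_major, row_major };

// Offset of element (i, j) of an M x N matrix inside one batch, given the
// leading dimension ld (ld >= M for column-major, ld >= N for row-major).
std::size_t element_offset(Layout layout, std::size_t i, std::size_t j,
                           std::size_t m, std::size_t n, std::size_t ld) {
    assert(i < m && j < n);
    if (layout == Layout::col_major) {
        assert(ld >= m);
        return i + j * ld;  // consecutive rows of a column are adjacent
    }
    assert(ld >= n);
    return i * ld + j;      // consecutive columns of a row are adjacent
}
```

A leading dimension larger than the minimum simply leaves unused padding between consecutive columns (column-major) or rows (row-major).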
The input B is a batched N x K right-hand side matrix, and the result X is a batched M x K matrix.
Note
GELS is an in-place function, i.e., B is overwritten by the solution X after the function returns. While mathematically the dimensions of X and B are different, in practice the storage for B/X is max(M, N) x K per batch.
tau is an array of size min(M, N) for each batch, and holds the scalar factors of the Householder reflectors of the QR or LQ factorization of A.
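The per-batch sizes described above can be collected in one place when sizing buffers. This is a minimal sketch assuming tightly packed storage (no padding in the leading dimensions); `GelsStorage` and `gels_elements_per_batch` are hypothetical names, not cuSolverDx APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Element counts per batch; hypothetical helper type, not a cuSolverDx API.
struct GelsStorage {
    std::size_t a;    // M x N matrix A
    std::size_t bx;   // shared B/X storage, max(M, N) x K
    std::size_t tau;  // min(M, N) entries
};

// Assumes tightly packed matrices, i.e. no padding in lda or ldb.
GelsStorage gels_elements_per_batch(std::size_t m, std::size_t n, std::size_t k) {
    assert(m > 0 && n > 0 && k > 0);
    return GelsStorage{m * n, std::max(m, n) * k, std::min(m, n)};
}
```

For instance, with M = 8, N = 4, K = 2 this gives 32 elements for A, 16 for B/X, and 4 for tau per batch; multiply by the element size and the batch count to get a total byte count.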
The functions support:
A and B in either column- or row-major memory layout, see Arrangement Operator,
\(op(A)\) being either non_transposed, transposed for real data type, or conj_transposed for complex data type, see TransposeMode Operator, and
M >= N for an overdetermined system, or M < N for an underdetermined system.