cuDSS Functions#
Library Management Functions#
cudssCreate#
-
cudssStatus_t cudssCreate(cudssHandle_t *handle)#
- The function initializes the cuDSS library handle (cudssHandle_t), which holds the cuDSS library context. It allocates light hardware resources on the host and must be called before any other cuDSS library call. Calling any cuDSS function which uses cudssHandle_t without a previous call to cudssCreate() will return an error. The cuDSS library context is tied to the current CUDA device. To use the library on multiple devices, one cuDSS handle should be created for each device.
- Parameters:
handle – [out] [host] cuDSS library handle
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssCreateMg#
- cudssStatus_t cudssCreateMg(
- cudssHandle_t *handle,
- int device_count,
- const int *device_indices
- The function initializes the cuDSS library handle (cudssHandle_t), which holds the cuDSS library context for multiple devices. Calling any cuDSS function which uses cudssHandle_t without a previous call to cudssCreateMg() will return an error. The cuDSS library context is tied to the CUDA devices defined by device_count and the device_indices array. If device_indices is NULL, cuDSS uses devices 0 to device_count - 1. The calling device index must be equal to the first device number in device_indices, or to 0 if device_indices is NULL.
Note: The same device_count and device_indices must be passed to the cudssConfig_t object by calling cudssConfigSet() with CUDSS_CONFIG_DEVICE_COUNT and CUDSS_CONFIG_DEVICE_INDICES.
Limitations: Some features are not supported; see MG mode for details.
- Parameters:
handle – [out] [host] cuDSS library handle
device_count – [in] [host] Number of devices
device_indices – [in] [host] Integer array of size device_count which stores device indices. If set to NULL, device indices from 0 to device_count - 1 are used.
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
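The NULL convention for device_indices can be sketched in plain C. The helper below is hypothetical (resolve_mg_devices is not a cuDSS function) and only mirrors the rule stated above: a NULL array stands for devices 0 to device_count - 1, and the first resolved entry is the device the caller is expected to have current.

```c
#include <stddef.h>

/* Hypothetical helper (not part of cuDSS): writes into `out` the device list
 * implied by (device_count, device_indices), following the NULL convention
 * described for cudssCreateMg(). Returns the first resolved device index,
 * or -1 if device_count is not positive. */
int resolve_mg_devices(int device_count, const int *device_indices, int *out)
{
    if (device_count <= 0) return -1;
    for (int i = 0; i < device_count; ++i) {
        /* NULL means "devices 0 .. device_count - 1" */
        out[i] = (device_indices == NULL) ? i : device_indices[i];
    }
    return out[0];
}
```

The same resolved list is what would then be passed to cudssConfigSet() with CUDSS_CONFIG_DEVICE_COUNT and CUDSS_CONFIG_DEVICE_INDICES, so that the handle and the config object agree.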
cudssDestroy#
-
cudssStatus_t cudssDestroy(cudssHandle_t handle)#
- The function releases hardware resources used by the cuDSS library. It must be the last call made with a particular cuDSS library handle. Calling any cuDSS function which uses cudssHandle_t after cudssDestroy() will return an error.
- Parameters:
handle – [in] [host] cuDSS library handle
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssGetProperty#
- cudssStatus_t cudssGetProperty(
- libraryPropertyType propertyType,
- int *value
- The function returns the value of the requested property. Refer to libraryPropertyType for supported types.
- Parameters:
propertyType – [in] [host] Requested property
value – [out] [host] Value of the requested property
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
The enum libraryPropertyType is defined in library_types.h (standard CUDA header file). The supported subset of values for cuDSS is:
cudssSetStream#
- cudssStatus_t cudssSetStream(
- cudssHandle_t handle,
- cudaStream_t stream
- The function sets the stream to be used by the cuDSS library to execute its routines.
- Parameters:
handle – [inout] [host] cuDSS library handle
stream – [in] [host] The stream used by the library
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssSetMgStreams#
- cudssStatus_t cudssSetMgStreams(
- cudssHandle_t handle,
- const cudaStream_t *streams,
- int stream_count
- The function sets per-device streams to be used by the cuDSS library in multi-GPU mode. This function should be used when the library handle was created with cudssCreateMg().
Each stream in the array corresponds to a device specified in cudssCreateMg(). Specifically, streams[i] will be used for operations on device_indices[i].
Note: The user is responsible for creating and destroying these streams. The streams must remain valid for the lifetime of the cuDSS handle. cuDSS will not destroy user-provided streams.
Note: This function is optional. If it is not called, cuDSS will create and manage its own internal per-device streams for multi-GPU operations.
- Parameters:
handle – [inout] [host] cuDSS library handle created with cudssCreateMg()
streams – [in] [host] Array of CUDA streams, one per device
stream_count – [in] [host] Number of streams in the array (must match device_count from cudssCreateMg())
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise. Returns an error if stream_count does not match the number of devices, if streams is NULL, or if the handle was not created with cudssCreateMg().
cudssSetDeviceMemHandler#
- cudssStatus_t cudssSetDeviceMemHandler(
- cudssHandle_t handle,
- const cudssDeviceMemHandler_t *handler
- Set the current device memory handler inside the library handle.
If the handler argument is set to NULL, the library handle will detach its existing memory handler. If the device memory handler needs to be changed after it has been set for the first time, the previously set device memory handler must be detached first.
If a cuDSS API which needs to allocate device memory (cudssExecute()) is called and there is no device memory handler attached to the library handle at the moment of the call, cuDSS will allocate device memory internally using a default memory handler.
When the device memory handler is set, during calls to the main cuDSS routine cudssExecute() the library will allocate the necessary device memory using the device_alloc() member of the device memory handler struct. The allocated memory remains a part of the cudssData_t object used in the call until it is deallocated using the device_free() member of the struct when the corresponding cudssDataDestroy() is called. Consequently, erroneous behavior is likely to occur if the device memory handler is changed during the lifespan of the cudssData_t objects which have used the library handle. See cudssDeviceMemHandler_t for further details about device_alloc() and device_free().
The internal stream order is established using the user-provided stream set via cudssSetStream().
Note: It is undefined behavior if the library handle is bound to a memory handler and subsequently to another handler (without detaching), if the library handle outlives the attached memory pool, or if the memory pool is not stream-ordered.
- Parameters:
handle – [inout] [host] cuDSS library handle
handler – [in] [host] The device memory handler that encapsulates the user’s mempool. The struct content is copied internally
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssGetDeviceMemHandler#
- cudssStatus_t cudssGetDeviceMemHandler(
- const cudssHandle_t handle,
- cudssDeviceMemHandler_t *handler
- Get the current device memory handler.
- Parameters:
handle – [in] [host] cuDSS library handle
handler – [out] [host] A (deep) copy of the device memory handler that encapsulates the user’s mempool, if one was previously set by calling cudssSetDeviceMemHandler()
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssSetCommLayer#
- cudssStatus_t cudssSetCommLayer(
- cudssHandle_t handle,
- const char *commLibFileName
- The function sets the communication layer to be used in MGMN mode of cuDSS. The communication layer that is set will be used for all MGMN mode operations which involve the modified library handle.
The communication layer must provide communication backends for both device and host memory buffers.
After setting the communication layer, users should set the device communicator via cudssDataSet() with CUDSS_DATA_COMM_DEVICE and the host communicator with CUDSS_DATA_COMM_HOST.
For more details of when and how this function should be used, see
- Parameters:
handle – [in] [host] cuDSS library handle
commLibFileName – [in] [host] Library name or path (resolved by the platform dynamic loader, e.g. dlopen/LoadLibrary). If NULL, the name is read from CUDSS_COMM_LIB
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssSetThreadingLayer#
- cudssStatus_t cudssSetThreadingLayer(
- cudssHandle_t handle,
- const char *thrLibFileName
- Parameters:
handle – [in] [host] cuDSS library handle
thrLibFileName – [in] [host] Library name or path (resolved by the platform dynamic loader, e.g. dlopen/LoadLibrary). If NULL, the name is read from CUDSS_THREADING_LIB
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
Logger API Functions#
Refer to cuDSS Logging Features for overview and environment variables.
Note
Logging API functions are not thread-safe and must not be called concurrently from multiple threads.
cudssLoggerSetCallback#
-
cudssStatus_t cudssLoggerSetCallback(cudssLoggerCallback_t callback)#
- Sets a custom callback function to receive log messages. The callback is invoked synchronously from the calling thread and must not call any cuDSS API functions, as doing so may cause deadlocks.
- Parameters:
callback – [in] [host] Pointer to callback function (see cudssLoggerCallback_t), or NULL to disable
- Returns:
[out] CUDSS_STATUS_SUCCESS on success
Note
In MGMN mode, each process has its own independent logger state and callback. The callback must be set in each process if needed.
cudssLoggerSetFile#
-
cudssStatus_t cudssLoggerSetFile(FILE *file)#
- Redirects log output to an already-opened file handle. Pass NULL to disable log output entirely.
- Parameters:
file – [in] [host] Pointer to open file with write permission, or NULL to disable logging
- Returns:
[out] CUDSS_STATUS_SUCCESS on success
Note
The file handle must remain open until another file is set or logging is disabled. Passing NULL disables all log output until a valid file is set.
Note
In MGMN mode, each process must use a different file handle to avoid corrupted or interleaved output. Multiple processes writing to the same file concurrently is not safe.
cudssLoggerOpenFile#
-
cudssStatus_t cudssLoggerOpenFile(const char *logFile)#
- Opens a file and redirects log output to it. The file is opened in write mode (“w”), truncating any existing content. If the function is called more than once, the previously opened file is closed automatically. Parent directories must exist beforehand.
- Parameters:
logFile – [in] [host] Path to the log file
- Returns:
[out] CUDSS_STATUS_SUCCESS on success, CUDSS_STATUS_INVALID_VALUE if the file cannot be opened (permission denied, path not found, etc. are not distinguished)
Note
In MGMN mode, each process must open a different file to avoid corrupted or interleaved output. Multiple processes writing to the same file concurrently is not safe.
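One simple way to give every process its own log file is to embed the process rank in the file name. The sketch below is an assumption about how an application might do this; make_rank_log_path is a hypothetical helper, not a cuDSS function, and the rank is assumed to come from the application's own communicator.

```c
#include <stdio.h>

/* Hypothetical helper (not part of cuDSS): formats a per-process log path
 * such as "cudss_rank3.log" into buf. Returns 0 on success, -1 if the
 * arguments are invalid or the path does not fit into buf. */
int make_rank_log_path(char *buf, size_t len, const char *prefix, int rank)
{
    if (buf == NULL || prefix == NULL || rank < 0) return -1;
    int n = snprintf(buf, len, "%s_rank%d.log", prefix, rank);
    /* snprintf reports the length it wanted to write; reject truncation */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting path would then be passed to cudssLoggerOpenFile() by each process. Since the file is opened in “w” mode, any earlier log with the same name is truncated.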
cudssLoggerSetLevel#
-
cudssStatus_t cudssLoggerSetLevel(int level)#
- Sets the logging level to control which messages are output.
- Parameters:
level – [in] [host] Log level. Refer to cuDSS Logging Features for specific logging level values.
- Returns:
[out] CUDSS_STATUS_SUCCESS on success, CUDSS_STATUS_INVALID_VALUE if level is invalid
cudssLoggerSetMask#
-
cudssStatus_t cudssLoggerSetMask(int mask)#
- Sets the logging mask for fine-grained control over which log types are output.
- Parameters:
mask – [in] [host] Bitwise OR of masks. Refer to cuDSS Logging Features for specific logging mask values.
- Returns:
[out] CUDSS_STATUS_SUCCESS on success
cudssLoggerForceDisable#
-
cudssStatus_t cudssLoggerForceDisable(void)#
- Permanently and irreversibly disables all logging for the entire process. Once called, logging cannot be re-enabled for the lifetime of the process.
- Returns:
[out] CUDSS_STATUS_SUCCESS on success
Note
Once called, this API immediately disables all existing file sinks and registered callbacks. It takes precedence over all logging configurations (environment variables, cudssLoggerSetLevel(), cudssLoggerSetMask(), etc.).
Config and Data Object Functions#
cudssConfigCreate#
-
cudssStatus_t cudssConfigCreate(cudssConfig_t *config)#
- The function initializes the cuDSS config object (cudssConfig_t) which holds the settings of the solver related to solving a specific linear system. It allocates light resources on the host. To release the allocated memory, cudssConfigDestroy() must be called.
- Parameters:
config – [out] [host] cuDSS config object
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssConfigDestroy#
-
cudssStatus_t cudssConfigDestroy(cudssConfig_t config)#
- The function releases the host resources used by the cuDSS config object. Using the config object after this function call can lead to undefined behavior.
- Parameters:
config – [in] [host] cuDSS config object to be destroyed
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssConfigSet#
- cudssStatus_t cudssConfigSet(
- cudssConfig_t config,
- cudssConfigParam_t param,
- const void *value,
- size_t sizeInBytes
- The function sets a parameter (cudssConfigParam_t) to the specified value passed by the pointer.
- Parameters:
config – [inout] [host] cuDSS config object
param – [in] [host] Parameter to be set
value – [in] [host] A pointer to the value to be set
sizeInBytes – [in] [host] Number of bytes to be read from the pointer
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssConfigGet#
- cudssStatus_t cudssConfigGet(
- const cudssConfig_t config,
- cudssConfigParam_t param,
- void *value,
- size_t sizeInBytes,
- size_t *sizeWritten
- The function retrieves the value of a parameter (cudssConfigParam_t) and saves it to the specified memory location.
- Parameters:
config – [in] [host] cuDSS config object
param – [in] [host] Parameter to be retrieved from the config
value – [out] [host] A pointer to the output memory
sizeInBytes – [in] [host] Number of bytes to be written (for verification)
sizeWritten – [out] [host] Valid only when the return value is CUDSS_STATUS_SUCCESS. If sizeInBytes is non-zero, then sizeWritten is the number of bytes actually written; if sizeInBytes is zero, sizeWritten is the number of bytes needed to write the full contents
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
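The sizeInBytes / sizeWritten contract suggests a two-step calling pattern: query the required size with sizeInBytes set to zero, then fetch the value into a buffer of that size. The sketch below illustrates the pattern against a mock getter so that it runs without a GPU. Both mock_param_get and fetch_ints are hypothetical, and the mock's behavior for an undersized buffer is an assumption rather than documented cuDSS behavior.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a getter with the sizeInBytes/sizeWritten
 * contract described above. It is NOT a cuDSS function; it serves a fixed
 * 3-int payload. Returns 0 on success and non-zero on error. */
int mock_param_get(void *value, size_t sizeInBytes, size_t *sizeWritten)
{
    static const int payload[3] = {4, 5, 6};
    if (sizeWritten == NULL) return 1;
    if (sizeInBytes == 0) {            /* size query: report bytes needed */
        *sizeWritten = sizeof payload;
        return 0;
    }
    /* Assumption: an undersized buffer is reported as an error */
    if (value == NULL || sizeInBytes < sizeof payload) return 1;
    memcpy(value, payload, sizeof payload);
    *sizeWritten = sizeof payload;     /* bytes actually written */
    return 0;
}

/* Two-step pattern: ask for the size first, then fetch exactly that many
 * bytes. Returns the number of ints read, or -1 on failure. */
int fetch_ints(int *dst, size_t dst_capacity_bytes)
{
    size_t needed = 0, written = 0;
    if (mock_param_get(NULL, 0, &needed) != 0) return -1;
    if (needed > dst_capacity_bytes) return -1;
    if (mock_param_get(dst, needed, &written) != 0) return -1;
    return (int)(written / sizeof(int));
}
```

With the real API, the mock call would be replaced by cudssConfigGet() or cudssDataGet() for a parameter whose size is not known in advance; for fixed-size parameters a single call with the known size is enough.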
cudssDataCreate#
-
cudssStatus_t cudssDataCreate(cudssHandle_t handle, cudssData_t *data)#
- The function initializes the cuDSS data object (cudssData_t) which holds the internal data (e.g., LU factors arrays) as well as pointers to user-provided data related to solving a specific linear system. To release the allocated memory, cudssDataDestroy() must be called.
- Parameters:
handle – [in] [host] cuDSS library handle
data – [out] [host] cuDSS data object
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssDataDestroy#
-
cudssStatus_t cudssDataDestroy(cudssHandle_t handle, cudssData_t data)#
- The function releases the hardware resources used by the cuDSS data object. Using the data object after this function call can lead to undefined behavior.
- Parameters:
handle – [in] [host] cuDSS library handle
data – [in] [host] cuDSS data object to be destroyed
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssDataSet#
- cudssStatus_t cudssDataSet(
- const cudssHandle_t handle,
- cudssData_t data,
- cudssDataParam_t param,
- const void *value,
- size_t sizeInBytes
- The function sets a parameter (cudssDataParam_t) to the specified value passed by the pointer.
- Parameters:
handle – [in] [host] cuDSS library handle
data – [inout] [host] cuDSS data object
param – [in] [host] Parameter to be set
value – [in] [host] A pointer to the value to be set
sizeInBytes – [in] [host] Number of bytes to be read from the pointer
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssDataGet#
- cudssStatus_t cudssDataGet(
- const cudssHandle_t handle,
- const cudssData_t data,
- cudssDataParam_t param,
- void *value,
- size_t sizeInBytes,
- size_t *sizeWritten
- The function retrieves the value of a parameter (cudssDataParam_t) and saves it to the specified memory location. The output memory buffer can be either on the device or on the host; a memory copy will be done if necessary.
- Parameters:
handle – [in] [host] cuDSS library handle
data – [in] [host] cuDSS data object
param – [in] [host] Parameter to be retrieved from the data object
value – [out] [host/device] A pointer to the output memory
sizeInBytes – [in] [host] Number of bytes to be written (for verification)
sizeWritten – [out] [host] Valid only when the return value is CUDSS_STATUS_SUCCESS. If sizeInBytes is non-zero, then sizeWritten is the number of bytes actually written; if sizeInBytes is zero, sizeWritten is the number of bytes needed to write the full contents
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
Main cuDSS Function#
cudssExecute#
- cudssStatus_t cudssExecute(
- cudssHandle_t handle,
- int phase,
- const cudssConfig_t config,
- cudssData_t data,
- const cudssMatrix_t matrix,
- cudssMatrix_t solution,
- const cudssMatrix_t rhs
- The function executes a phase of the solution process. Prior to calling cudssExecute(), all objects passed as parameters must already be created and properly initialized.
The simplest possible solution process consists of three main phases which follow one another: analysis, factorization, and solve. During the analysis phase, reordering and symbolic factorization (preparing the internal data structures) are done. During the factorization phase, numerical factorization is performed, and during the solve phase, the factorization is used to find the solution to the linear system.
The phases must always happen in the following order: CUDSS_PHASE_REORDERING -> CUDSS_PHASE_SYMBOLIC_FACTORIZATION -> CUDSS_PHASE_FACTORIZATION -> (optional) CUDSS_PHASE_REFACTORIZATION -> CUDSS_PHASE_SOLVE. The optional refactorization is usually skipped before the first solve. Re-using the analysis results is supported: users can change the matrix values and then only need to run the (re-)factorization and solve phases.
Note: Combining phases is supported as long as the order is followed. As an example, combining CUDSS_PHASE_REORDERING | CUDSS_PHASE_SOLVE will result in an error, while CUDSS_PHASE_ANALYSIS | CUDSS_PHASE_FACTORIZATION is allowed. Please review cudssPhase_t for the full list of phase parameters.
During the execution, the solver configuration properties are read from config, of type cudssConfig_t. The internal data structures necessary to keep all data required for solving the system (incl. the factors) are kept as a part of the data object of type cudssData_t. Users can change the configuration settings, provide additional data parameters (e.g. a user permutation), or query extra information (like memory estimates or the number of pivots) before/after the phases of the solution process via cudssConfigSet(), cudssDataSet(), or cudssDataGet(), respectively.
The data buffers in the matrix objects for the input matrix, solution and right-hand side matrices must hold device-visible data, unless the hybrid host/device execution mode is enabled. In the hybrid execution mode, the matrix, solution or right-hand side can be passed with host memory pointers.
Note: The function has the following limitations on the cudssMatrix_t objects which can be used as call arguments, in addition to the limitations of the corresponding matrix creation routines (e.g., cudssMatrixCreateCsr()):
The system matrix must be sparse (currently, only 3-array CSR format is supported).
Right-hand side and solution matrices must be dense, with column-major layout.
The input sparse matrix (in the batch case, each matrix in the batch) must have data consistent with its description (incl. offsets and indices, indexing base, matrix type).
The input sparse matrix (in the batch case, each matrix in the batch) may have unsorted column indices but must not have repeating entries.
The input sparse matrix in CSR format must have its row offsets array start with the indexing base (0 or 1).
The input sparse matrix may change for the SOLVE phase (except when hybrid execution mode is used) but for distributed matrices, the range of rows (row distribution) must remain the same.
The input sparse matrix, right-hand side and solution must have the same datatypes for values and indices (if applicable).
For non-uniform input batches, batches with varying shapes are supported. E.g., nrows, ncols, nnz can be different for each batch instance. However, the aggregate number of rows and non-zeroes must not exceed the INT_MAX limit.
In MGMN mode all processes must have valid global nrows, ncols, nnz data for sparse matrices and valid global nrows, ncols and local ld for the dense matrices. In the batch case, all processes must additionally have a valid batchCount.
If cudssMatrixSetDistributionRow1d() is not used, then in MGMN mode the full matrix data (for the system’s matrix, solution and right-hand side) must be present on the root process, i.e. the process with rank = 0 in the provided communicator. The other processes may have the data pointers set to NULL. Otherwise, the system’s matrix, solution and right-hand side must be distributed according to the corresponding call to cudssMatrixSetDistributionRow1d().
Note: Using the cudssMatrixViewType_t parameter when creating the sparse input matrix (or a batch of those), one can pass only a triangular portion of the matrix without the need to explicitly change the underlying matrix data. E.g., if mview is set to CUDSS_MVIEW_UPPER, then during the analysis phase, cudssExecute() will ignore all the indices in the lower part of the matrix, even if the underlying matrix storage represents a full matrix.
- Parameters:
handle – [in] [host] cuDSS library handle
phase – [in] [host] Execution phase(s)
config – [in] [host] Solver config object
data – [inout] [host] Solver data object
matrix – [in] [host] Input sparse matrix
solution – [inout] [host] Solution matrix
rhs – [in] [host] Right-hand side matrix
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
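Because the CSR creation routines do not check data consistency, it can be useful to validate host-side CSR arrays before handing them to the solver. The following is a minimal sketch of such a check for the structural limitations listed for cudssExecute(): offsets starting at the indexing base, column indices in range, and no repeated entries within a row. The function csr_check and its return codes are hypothetical, not part of cuDSS, and the sketch assumes int offsets and indices held in host memory.

```c
#include <stddef.h>

/* Hypothetical checker (not part of cuDSS) for a 3-array CSR matrix with
 * int offsets and indices. `base` is the indexing base (0 or 1).
 * Returns 0 if all checks pass, otherwise a positive code identifying the
 * first violated rule. */
int csr_check(int nrows, int ncols, int nnz, int base,
              const int *row_offsets, const int *col_indices)
{
    if (base != 0 && base != 1) return 1;
    if (row_offsets[0] != base) return 2;            /* must start at the base */
    if (row_offsets[nrows] != base + nnz) return 3;  /* must end at base + nnz */
    for (int i = 0; i < nrows; ++i) {
        int begin = row_offsets[i] - base;
        int end = row_offsets[i + 1] - base;
        if (begin > end || end > nnz) return 4;      /* offsets must be monotone and bounded */
        for (int p = begin; p < end; ++p) {
            int c = col_indices[p] - base;
            if (c < 0 || c >= ncols) return 5;       /* column out of range */
            /* Columns may be unsorted, so compare against every earlier
             * entry of the same row to detect repeats. */
            for (int q = begin; q < p; ++q)
                if (col_indices[q] == col_indices[p]) return 6;
        }
    }
    return 0;
}
```

The duplicate search is quadratic in the row length, which keeps the sketch short; for long rows, sorting a copy of each row or using a marker array would be the usual alternative.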
Matrix Object Functions#
cudssMatrixCreateDn#
- cudssStatus_t cudssMatrixCreateDn(
- cudssMatrix_t *matrix,
- int64_t nrows,
- int64_t ncols,
- int64_t ld,
- const void *values,
- cudssDataType_t valueType,
- cudssLayout_t layout
- The function creates a matrix object wrapped around dense matrix data. The provided data buffer for the matrix values must hold device-visible data.
Note: In MGMN mode all processes must have valid nrows, ncols and ld.
See more limitations for using cudssMatrix_t objects in the documentation for the main routine, cudssExecute().
- Parameters:
matrix – [out] [host] Created matrix object
nrows – [in] [host] Number of rows
ncols – [in] [host] Number of columns
ld – [in] [host] Leading dimension
values – [in] [device/host] Values of the dense matrix
valueType – [in] [host] Data type of the matrix
layout – [in] [host] Memory layout
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
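For the column-major layout required of the right-hand side and solution matrices, element (i, j) is conventionally stored at offset i + j * ld. The sketch below shows that indexing and the resulting minimum buffer size. The helpers dn_index and dn_min_elems are hypothetical, and the requirement ld >= nrows is the usual column-major convention, assumed here rather than quoted from the cuDSS documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset (in elements) of entry (i, j) in a column-major buffer with
 * leading dimension ld. Hypothetical helper, not part of cuDSS. */
size_t dn_index(int64_t i, int64_t j, int64_t ld)
{
    return (size_t)(i + j * ld);
}

/* Minimum number of elements the values buffer must hold, or 0 if the
 * matrix is empty or the shape is inconsistent (assumes ld >= nrows). */
size_t dn_min_elems(int64_t nrows, int64_t ncols, int64_t ld)
{
    if (nrows <= 0 || ncols <= 0 || ld < nrows) return 0;
    /* The last column only needs nrows entries, not a full ld stride */
    return (size_t)(ld * (ncols - 1) + nrows);
}
```

For example, a 3 x 2 matrix stored with ld = 5 places entry (2, 1) at offset 7 and needs at least 8 elements in the buffer.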
cudssMatrixCreateBatchDn#
- cudssStatus_t cudssMatrixCreateBatchDn(
- cudssMatrix_t *matrix,
- int64_t batchCount,
- const void *nrows,
- const void *ncols,
- const void *ld,
- const void *const *values,
- cudssDataType_t integerType,
- cudssDataType_t valueType,
- cudssLayout_t layout
- The function creates a matrix object wrapped around a batch of dense matrices. The provided data buffer for the matrix values must contain device-visible pointers to device-visible data.
Note: cuDSS supports non-uniform batches with varying shapes, e.g., nrows, ncols, ld can be different for each batch instance.
Note: MGMN mode does not support matrix batches.
See more limitations for using cudssMatrix_t objects in the documentation for the main routine, cudssExecute().
- Parameters:
matrix – [out] [host] Created matrix object
batchCount – [in] [host] Size of the batch
nrows – [in] [host] Number of rows for each matrix in the batch
ncols – [in] [host] Number of columns for each matrix in the batch
ld – [in] [host] Leading dimension for each matrix in the batch
values – [in] [device] Pointer to values of each dense matrix in the batch
integerType – [in] [host] Integer type for scalar arrays (nrows, ncols, ld)
valueType – [in] [host] Data type of the matrix
layout – [in] [host] Memory layout
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixCreateCsr#
- cudssStatus_t cudssMatrixCreateCsr(
- cudssMatrix_t *matrix,
- int64_t nrows,
- int64_t ncols,
- int64_t nnz,
- const void *rowStart,
- const void *rowEnd,
- const void *colIndices,
- const void *values,
- cudssDataType_t offsetType,
- cudssDataType_t indexType,
- cudssDataType_t valueType,
- cudssMatrixType_t mtype,
- cudssMatrixViewType_t mview,
- cudssIndexBase_t indexBase
- The function creates a matrix object wrapped around sparse matrix data. The provided data buffers for rowStart, rowEnd, colIndices and values can hold either device or host memory pointers. Passing host memory pointers avoids redundant device-to-host data movements when the data resides on the host.
Note: Host memory input for the input matrix is not supported when batchCount > 1, when uniform batch options are enabled, or when CUDSS_REORDERING_ALG_BTF_COLAMD or CUDSS_REORDERING_ALG_COLAMD is used for reordering.
Note: Creating a cudssMatrix_t with CSR format does not perform any data consistency checks, so it is currently the caller’s responsibility to ensure that the data description parameters match the data.
Note: In MGMN mode all processes must have valid nrows, ncols, nnz data (in case of distributed matrices, these should correspond to the global matrix).
See more limitations for using cudssMatrix_t objects in the documentation for the main routine, cudssExecute().
- Parameters:
matrix – [out] [host] Created matrix object
nrows – [in] [host] Number of rows
ncols – [in] [host] Number of columns
nnz – [in] [host] Number of non-zeroes
rowStart – [in] [device/host] Row start offsets
rowEnd – [in] [device/host] Row end offsets
colIndices – [in] [device/host] Column indices of the matrix
values – [in] [device/host] Values of the matrix
offsetType – [in] [host] Offset type (rowStart and rowEnd) of the matrix
indexType – [in] [host] Index type of the matrix
valueType – [in] [host] Data type of the matrix
mtype – [in] [host] Matrix type of the matrix
mview – [in] [host] Matrix view of the matrix
indexBase – [in] [host] Indexing base
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
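To make the relationship between nrows, nnz, the offsets and the indexing base concrete, the sketch below builds 3-array CSR data from a small dense matrix on the host. The helper dense_to_csr is hypothetical and is not part of cuDSS; it assumes int offsets and indices and double values. The arrays it produces correspond to the rowStart, colIndices and values arguments; what to pass as rowEnd for the 3-array format is not covered here.

```c
#include <stddef.h>

/* Hypothetical helper (not part of cuDSS): converts a dense row-major
 * nrows x ncols array into 3-array CSR with the given indexing base.
 * row_offsets needs nrows + 1 entries; col_indices and values need room
 * for every non-zero. Returns nnz. */
int dense_to_csr(int nrows, int ncols, const double *dense, int base,
                 int *row_offsets, int *col_indices, double *values)
{
    int nnz = 0;
    for (int i = 0; i < nrows; ++i) {
        row_offsets[i] = nnz + base;          /* first entry of row i */
        for (int j = 0; j < ncols; ++j) {
            double v = dense[(size_t)i * ncols + j];
            if (v != 0.0) {
                col_indices[nnz] = j + base;  /* indices share the same base */
                values[nnz] = v;
                ++nnz;
            }
        }
    }
    row_offsets[nrows] = nnz + base;          /* one past the last entry */
    return nnz;
}
```

Note that the first offset equals the indexing base and the last equals base + nnz, which matches the requirement that the row offsets array start with the indexing base.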
cudssMatrixCreateBatchCsr#
- cudssStatus_t cudssMatrixCreateBatchCsr(
- cudssMatrix_t *matrix,
- int64_t batchCount,
- const void *nrows,
- const void *ncols,
- const void *nnz,
- const void *const *rowStart,
- const void *const *rowEnd,
- const void *const *colIndices,
- const void *const *values,
- cudssDataType_t offsetType,
- cudssDataType_t indexType,
- cudssDataType_t valueType,
- cudssMatrixType_t mtype,
- cudssMatrixViewType_t mview,
- cudssIndexBase_t indexBase
- The function creates a matrix object wrapped around a batch of sparse matrices (CSR format). The provided data buffers for rowStart, rowEnd, colIndices and values must contain device-visible pointers to device-visible data.
Note: cuDSS supports non-uniform batches with varying shapes, e.g., nrows, ncols, nnz can be different for each batch instance.
Note: Creating a cudssMatrix_t with a batch of matrices in CSR format does not perform any data consistency checks, so it is currently the caller’s responsibility to ensure that the data description parameters match the data.
See more limitations for using cudssMatrix_t objects in the documentation for the main routine, cudssExecute().
- Parameters:
matrix – [out] [host] Created matrix object
batchCount – [in] [host] Size of the batch
nrows – [in] [host] Number of rows for each matrix in the batch
ncols – [in] [host] Number of columns for each matrix in the batch
nnz – [in] [host] Number of non-zeroes for each matrix in the batch
rowStart – [in] [device] Pointer to row start offsets for each matrix in the batch
rowEnd – [in] [device] Pointer to row end offsets for each matrix in the batch
colIndices – [in] [device] Pointer to column indices for each matrix in the batch
values – [in] [device] Pointer to values of each CSR matrix in the batch
offsetType – [in] [host] Offset type (rowStart and rowEnd) of the matrices in the batch
indexType – [in] [host] Index type of the matrices in the batch
valueType – [in] [host] Data type of the matrices in the batch
mtype – [in] [host] Matrix type of the matrix
mview – [in] [host] Matrix view of the matrix
indexBase – [in] [host] Indexing base
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
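The limitations listed for cudssExecute() state that, for non-uniform batches, the aggregate number of rows and non-zeroes must not exceed INT_MAX. A host-side pre-check along those lines might look like the sketch below. The function batch_fits_int_max is hypothetical, and it assumes the per-matrix nrows and nnz arrays are stored as int.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical pre-check (not part of cuDSS): returns 1 if the summed row
 * counts and the summed non-zero counts of a batch both fit in INT_MAX,
 * and 0 otherwise (including when any count is negative). */
int batch_fits_int_max(int64_t batchCount, const int *nrows, const int *nnz)
{
    int64_t total_rows = 0, total_nnz = 0;
    for (int64_t b = 0; b < batchCount; ++b) {
        if (nrows[b] < 0 || nnz[b] < 0) return 0;
        /* 64-bit accumulators: each step adds at most INT_MAX to a total
         * that is itself at most INT_MAX, so the sums cannot wrap. */
        total_rows += nrows[b];
        total_nnz += nnz[b];
        if (total_rows > INT_MAX || total_nnz > INT_MAX) return 0;
    }
    return 1;
}
```

Running such a check before creating the batch matrix object lets an application split an oversized batch into smaller ones instead of failing later in the solver.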
cudssMatrixDestroy#
-
cudssStatus_t cudssMatrixDestroy(cudssMatrix_t matrix)#
- The function releases memory associated with the matrix wrapper. As cuDSS matrix objects are only lightweight wrappers around the user data, the user data remains untouched.
- Parameters:
matrix – [in] [host] cuDSS matrix object
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixSetValues#
- cudssStatus_t cudssMatrixSetValues(
- cudssMatrix_t matrix,
- const void *values
- The function resets the pointer to values inside the cuDSS matrix object to the provided buffer. The provided data buffer must hold device-visible data.
- Parameters:
matrix – [inout] [host] cuDSS matrix object
values – [in] [device/host] Buffer with the new matrix values
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixSetBatchValues#
- cudssStatus_t cudssMatrixSetBatchValues(
- cudssMatrix_t matrix,
- const void *const *values
- The function resets the pointer to values inside the cuDSS matrix object to the provided buffer. The provided data buffer for the matrix values must contain device-visible pointers to device-visible data.
- Parameters:
matrix – [inout] [host] cuDSS matrix object
values – [in] [device/host] Pointer to the new values for each dense matrix in the batch
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixSetCsrPointers#
- cudssStatus_t cudssMatrixSetCsrPointers(
- cudssMatrix_t matrix,
- const void *rowStart,
- const void *rowEnd,
- const void *colIndices,
- const void *values
- The function resets the CSR pointers inside the cuDSS matrix object to the provided buffers. The provided data buffers must hold device-visible data.
- Parameters:
matrix – [inout] [host] cuDSS matrix object
rowStart – [in] [device/host] Buffer with the new row start offsets
rowEnd – [in] [device/host] Buffer with the new row end offsets
colIndices – [in] [device/host] Buffer with the new column indices
values – [in] [device/host] Buffer with the new matrix values
- Returns:
[out] The error status of the invocation. Must return CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixSetBatchCsrPointers#
- cudssStatus_t cudssMatrixSetBatchCsrPointers(
- cudssMatrix_t matrix,
- const void *const *rowStart,
- const void *const *rowEnd,
- const void *const *colIndices,
- const void *const *values
- The function resets the CSR pointers inside the cuDSS matrix object to the provided buffers. The provided data buffers for rowStart, rowEnd, colIndices and values must contain device-visible pointers to device-visible data.
- Parameters:
matrix – [inout] [host] cuDSS matrix object
rowStart – [in] [device/host] Pointer to the new row start offsets for each CSR matrix in the batch
rowEnd – [in] [device/host] Pointer to the new row end offsets for each CSR matrix in the batch
colIndices – [in] [device/host] Pointer to the new column indices for each CSR matrix in the batch
values – [in] [device/host] Pointer to the new values for each CSR matrix in the batch
- Returns:
[out] The error status of the invocation. Must return
CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixSetDistributionRow1d#
- cudssStatus_t cudssMatrixSetDistributionRow1d(
- cudssMatrix_t matrix,
- int64_t first_row,
- int64_t last_row
- The function sets the 1D row distribution for the matrix (CSR or dense) in MGMN mode. The provided first_row and last_row must always be 0-based and specify the first and the last (inclusive) row indices of the local matrix on the calling process. Setting first_row > last_row means that the local matrix is empty on the calling process.
Note: The input sparse matrix, right-hand side and solution can each have a different distribution. For example, only the sparse matrix may be distributed while the right-hand side and the solution are not.
Note: If the sparse matrix or the right-hand side is distributed with an overlap between processes, the overlapped part is summed up, that is, it receives a contribution from every process that holds it. If the solution is distributed with an overlap between processes, the overlapped part has the same values on all corresponding processes.
- Parameters:
matrix – [inout] [host] cuDSS matrix object
first_row – [in] [host] First row index (0-based) of the local matrix on the calling process
last_row – [in] [host] Last row index (0-based, inclusive) of the local matrix on the calling process
- Returns:
[out] The error status of the invocation. Must return
CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
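One way to obtain first_row and last_row is an even, non-overlapping block split of the rows across processes. The sketch below is only an illustration: `row_range_for_rank` is a hypothetical helper, not part of cuDSS, and the even split is an assumption, since the description above allows any 0-based inclusive range, including overlapping ones. A process that receives no rows gets first_row > last_row, which matches the empty-matrix convention described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of cuDSS): computes the 0-based, inclusive
 * row range owned by `rank` when `nrows` rows are split as evenly as
 * possible over `nprocs` processes. The first (nrows % nprocs) ranks get
 * one extra row. An empty share is reported as first_row > last_row. */
static void row_range_for_rank(int64_t nrows, int nprocs, int rank,
                               int64_t *first_row, int64_t *last_row)
{
    assert(nrows >= 0 && nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(first_row != NULL && last_row != NULL);

    int64_t base  = nrows / nprocs;
    int64_t rem   = nrows % nprocs;
    int64_t count = base + (rank < rem ? 1 : 0);
    int64_t start = (int64_t)rank * base + (rank < rem ? rank : rem);

    *first_row = start;
    *last_row  = start + count - 1;   /* start - 1 when count == 0 */
}
```

Each process would then pass its own pair to cudssMatrixSetDistributionRow1d(matrix, first_row, last_row).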
cudssMatrixGetDn#
- cudssStatus_t cudssMatrixGetDn(
- const cudssMatrix_t matrix,
- int64_t *nrows,
- int64_t *ncols,
- int64_t *ld,
- void **values,
- cudssDataType_t *valueType,
- cudssLayout_t *layout
- The function retrieves dense matrix properties and data from a cuDSS matrix object which holds a dense matrix. If on input any of the pointers is NULL, it is ignored and the corresponding value is not returned.
- Parameters:
matrix – [in] [host] cuDSS matrix object
nrows – [out] [host] Buffer for the number of rows
ncols – [out] [host] Buffer for the number of columns
ld – [out] [host] Buffer for the leading dimension
values – [out] [device/host] Buffer for the values of the matrix
valueType – [out] [host] Buffer for the data type of the matrix
layout – [out] [host] Buffer for the memory layout
- Returns:
[out] The error status of the invocation. Must return
CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
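The "NULL output pointers are ignored" convention lets a caller request only the properties it needs. The sketch below imitates that convention with a stand-in type and getter so that it runs without the library. `demo_dense_t` and `demo_get_dense` are hypothetical and say nothing about how cuDSS is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dense matrix object (not a cuDSS type). */
typedef struct {
    int64_t nrows, ncols, ld;
    void   *values;
} demo_dense_t;

/* Hypothetical getter: writes each property only if its output pointer is
 * non-NULL, mirroring the convention described above. Returns 0 on success. */
static int demo_get_dense(const demo_dense_t *m, int64_t *nrows,
                          int64_t *ncols, int64_t *ld, void **values)
{
    assert(m != NULL);
    if (nrows  != NULL) *nrows  = m->nrows;
    if (ncols  != NULL) *ncols  = m->ncols;
    if (ld     != NULL) *ld     = m->ld;
    if (values != NULL) *values = m->values;
    return 0;
}
```

With the real API the same pattern would mean passing NULL for every output of cudssMatrixGetDn that is not of interest.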
cudssMatrixGetBatchDn#
- cudssStatus_t cudssMatrixGetBatchDn(
- const cudssMatrix_t matrix,
- int64_t *batchCount,
- void **nrows,
- void **ncols,
- void **ld,
- void ***values,
- cudssDataType_t *indexType,
- cudssDataType_t *valueType,
- cudssLayout_t *layout
- The function retrieves dense matrix properties and data from a cuDSS matrix object which holds a batch of dense matrices. If on input any of the pointers is NULL, it is ignored and the corresponding value is not returned.
- Parameters:
matrix – [in] [host] cuDSS matrix object
batchCount – [out] [host] Buffer for the size of the batch
nrows – [out] [host] Pointer to the number of rows for each dense matrix in the batch
ncols – [out] [host] Pointer to the number of columns for each dense matrix in the batch
ld – [out] [host] Pointer to the leading dimension for each dense matrix in the batch
values – [out] [device] Pointer to the values of each dense matrix in the batch
indexType – [out] [host] Buffer for the index type of the matrix
valueType – [out] [host] Buffer for the data type of the matrix
layout – [out] [host] Buffer for the memory layout
- Returns:
[out] The error status of the invocation. Must return
CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixGetCsr#
- cudssStatus_t cudssMatrixGetCsr(
- const cudssMatrix_t matrix,
- int64_t *nrows,
- int64_t *ncols,
- int64_t *nnz,
- void **rowStart,
- void **rowEnd,
- void **colIndices,
- void **values,
- cudssDataType_t *offsetType,
- cudssDataType_t *indexType,
- cudssDataType_t *valueType,
- cudssMatrixType_t *mtype,
- cudssMatrixViewType_t *mview,
- cudssIndexBase_t *indexBase
- The function retrieves sparse matrix properties and data from a cuDSS matrix object which holds a CSR sparse matrix. If on input any of the pointers is NULL, it is ignored and the corresponding value is not returned.
- Parameters:
matrix – [in] [host] cuDSS matrix object
nrows – [out] [host] Buffer for the number of rows
ncols – [out] [host] Buffer for the number of columns
nnz – [out] [host] Buffer for the number of non-zeroes
rowStart – [out] [device/host] Buffer for the row start offsets
rowEnd – [out] [device/host] Buffer for the row end offsets
colIndices – [out] [device/host] Buffer for the column indices of the matrix
values – [out] [device/host] Buffer for the values of the CSR matrix
offsetType – [out] [host] Buffer for the offset type (rowStart and rowEnd) of the matrix
indexType – [out] [host] Buffer for the index type of the matrix
valueType – [out] [host] Buffer for the data type of the matrix
mtype – [out] [host] Buffer for the matrix type
mview – [out] [host] Buffer for the matrix view
indexBase – [out] [host] Buffer for the indexing base
- Returns:
[out] The error status of the invocation. Must return
CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixGetBatchCsr#
- cudssStatus_t cudssMatrixGetBatchCsr(
- const cudssMatrix_t matrix,
- int64_t *batchCount,
- void **nrows,
- void **ncols,
- void **nnz,
- void ***rowStart,
- void ***rowEnd,
- void ***colIndices,
- void ***values,
- cudssDataType_t *offsetType,
- cudssDataType_t *indexType,
- cudssDataType_t *valueType,
- cudssMatrixType_t *mtype,
- cudssMatrixViewType_t *mview,
- cudssIndexBase_t *indexBase
- The function retrieves sparse matrix properties and data from a cuDSS matrix object which holds a batch of CSR sparse matrices. If on input any of the pointers is NULL, it is ignored and the corresponding value is not returned.
- Parameters:
matrix – [in] [host] cuDSS matrix object
batchCount – [out] [host] Buffer for the size of the batch
nrows – [out] [host] Pointer to the number of rows for each CSR matrix in the batch
ncols – [out] [host] Pointer to the number of columns for each CSR matrix in the batch
nnz – [out] [host] Pointer to the number of non-zeroes for each CSR matrix in the batch
rowStart – [out] [device] Pointer to the row start offsets for each CSR matrix in the batch
rowEnd – [out] [device] Pointer to the row end offsets for each CSR matrix in the batch
colIndices – [out] [device] Pointer to the column indices for each CSR matrix in the batch
values – [out] [device] Pointer to the values for each CSR matrix in the batch
offsetType – [out] [host] Buffer for the offset type (rowStart and rowEnd) of the matrices in the batch
indexType – [out] [host] Buffer for the index type of the matrices in the batch
valueType – [out] [host] Buffer for the data type of the matrices in the batch
mtype – [out] [host] Buffer for the matrix type of the matrices in the batch
mview – [out] [host] Buffer for the matrix view of the matrices in the batch
indexBase – [out] [host] Buffer for the indexing base
- Returns:
[out] The error status of the invocation. Must return
CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
cudssMatrixGetDistributionRow1d#
- cudssStatus_t cudssMatrixGetDistributionRow1d(
- const cudssMatrix_t matrix,
- int64_t *first_row,
- int64_t *last_row
- The function retrieves the 1D distribution boundaries (first and last row indices) from a cuDSS matrix object. In non-MGMN mode the function returns first_row equal to 0 and last_row equal to nrows - 1.
- Parameters:
matrix – [in] [host] cuDSS matrix object
first_row – [out] [host] First row index of the local matrix on the calling process
last_row – [out] [host] Last row index of the local matrix on the calling process
- Returns:
[out] The error status of the invocation. Must return
CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
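Since the returned boundaries are inclusive, the number of local rows follows directly from them. The helper below is a hypothetical convenience function, not part of cuDSS. It assumes that an empty local matrix is reported as first_row > last_row, which is the convention documented for cudssMatrixSetDistributionRow1d.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of cuDSS): number of rows in the 0-based,
 * inclusive range [first_row, last_row]; 0 if the range is empty. */
static int64_t local_row_count(int64_t first_row, int64_t last_row)
{
    assert(first_row >= 0);   /* indices are documented as 0-based */
    return first_row > last_row ? 0 : last_row - first_row + 1;
}
```

For the non-MGMN case described above (first_row equal to 0, last_row equal to nrows - 1), the helper yields nrows.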
cudssMatrixGetFormat#
- cudssStatus_t cudssMatrixGetFormat(
- const cudssMatrix_t matrix,
- int *format
- The function returns the matrix format of the cuDSS matrix object into the provided buffer, as an int which can be a combination of the bit flags defined in cudssMatrixFormat_t.
- Parameters:
matrix – [in] [host] cuDSS matrix object
format – [out] [host] Buffer for the returned matrix format
- Returns:
[out] The error status of the invocation. Must return
CUDSS_STATUS_SUCCESS on success and some other status code otherwise.
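Because the returned int may combine several flags, it should be tested with a bit mask rather than compared for equality. The sketch below shows the pattern with stand-in constants. The `DEMO_FMT_*` names and their numeric values are placeholders invented for illustration; real code should use the enumerators of cudssMatrixFormat_t from the cuDSS header.

```c
#include <assert.h>

/* Placeholder flags for illustration only; these are NOT the cuDSS
 * enumerators and the values are assumptions. */
enum demo_format_flag {
    DEMO_FMT_DENSE = 1,
    DEMO_FMT_CSR   = 2,
    DEMO_FMT_BATCH = 4
};

/* Returns non-zero if every bit of `flag` is set in `format`. */
static int format_has(int format, int flag)
{
    assert(flag != 0);
    return (format & flag) == flag;
}
```

A format value describing a batch of CSR matrices would then satisfy both `format_has(format, DEMO_FMT_CSR)` and `format_has(format, DEMO_FMT_BATCH)`, while a plain equality test against either flag would fail.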