Generic API Functions
nvpl_sparse_spmv()
nvpl_sparse_status_t
nvpl_sparse_spmv_create_descr(nvpl_sparse_spmv_descr_t* descr)
nvpl_sparse_status_t
nvpl_sparse_spmv_destroy_descr(nvpl_sparse_spmv_descr_t descr)
nvpl_sparse_status_t
nvpl_sparse_spmv_buffer_size(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_vec_descr_t vec_X,
const void* beta,
nvpl_sparse_dn_vec_descr_t vec_Y,
nvpl_sparse_dn_vec_descr_t vec_Z,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spmv_alg_t alg,
nvpl_sparse_spmv_descr_t spmv_descr,
size_t* buffer_size)
nvpl_sparse_status_t
nvpl_sparse_spmv_analysis(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_vec_descr_t vec_X,
const void* beta,
nvpl_sparse_dn_vec_descr_t vec_Y,
nvpl_sparse_dn_vec_descr_t vec_Z,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spmv_alg_t alg,
nvpl_sparse_spmv_descr_t spmv_descr,
void* external_buffer)
nvpl_sparse_status_t
nvpl_sparse_spmv(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_vec_descr_t vec_X,
const void* beta,
nvpl_sparse_dn_vec_descr_t vec_Y,
nvpl_sparse_dn_vec_descr_t vec_Z,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spmv_alg_t alg,
nvpl_sparse_spmv_descr_t spmv_descr)
This function performs the multiplication of a sparse matrix mat_A and a dense vector vec_X:
\(\mathbf{Z} = \alpha op\left( \mathbf{A} \right) \cdot \mathbf{X} + \beta\mathbf{Y}\)
where
op(A) is a sparse matrix of size \(m \times k\)
X is a dense vector of size \(k\)
Y is a dense vector of size \(m\)
Z is a dense vector of size \(m\)
\(\alpha\) and \(\beta\) are scalars
Also, for matrix A:
\(\text{op}(A) = \begin{cases} A & \text{if}\; op = {\small{\texttt{NVPL_SPARSE_OPERATION_NON_TRANSPOSE}}} \\ A^{T} & \text{if}\; op = {\small{\texttt{NVPL_SPARSE_OPERATION_TRANSPOSE}}} \\ A^{H} & \text{if}\; op = {\small{\texttt{NVPL_SPARSE_OPERATION_CONJUGATE_TRANSPOSE}}} \\ \end{cases}\)
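To make the formula concrete, the following plain-C sketch computes the same expression for a CSR matrix with a straightforward loop. It is a reference illustration only, not the library's implementation: the function name `ref_csr_spmv`, the `transpose` flag, and the use of `double` values with `int` indices are assumptions made for this example, and the conjugate-transpose case is omitted.

```c
#include <assert.h>

/* Reference (unoptimized) CSR SpMV: z = alpha * op(A) * x + beta * y.
 * A has rows x cols entries in CSR form (row_offsets has rows + 1 entries).
 * transpose == 0: op(A) = A,   x has cols entries, y and z have rows entries.
 * transpose != 0: op(A) = A^T, x has rows entries, y and z have cols entries.
 * Column indices within a row may appear in any order. z must not alias x. */
void ref_csr_spmv(int transpose, int rows, int cols, double alpha,
                  const int *row_offsets, const int *col_idx,
                  const double *values, const double *x,
                  double beta, const double *y, double *z)
{
    assert(rows >= 0 && cols >= 0);
    int out_len = transpose ? cols : rows;

    /* Start from the beta * y term. */
    for (int i = 0; i < out_len; ++i) {
        z[i] = beta * y[i];
    }

    /* Add alpha * op(A) * x one stored entry at a time. */
    for (int i = 0; i < rows; ++i) {
        for (int p = row_offsets[i]; p < row_offsets[i + 1]; ++p) {
            int j = col_idx[p];
            assert(j >= 0 && j < cols);
            if (transpose) {
                z[j] += alpha * values[p] * x[i]; /* A^T: scatter entry (i, j) */
            } else {
                z[i] += alpha * values[p] * x[j]; /* A: gather along row i */
            }
        }
    }
}
```

With A = [[1, 0, 2], [0, 3, 0]], X = (1, 2, 3), Y = (10, 20), alpha = 2 and beta = 1, the non-transpose call yields Z = (24, 32). A loop like this is mainly useful for checking library results on small inputs.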
Routine usage
To use this routine, you should:
Create a descriptor using nvpl_sparse_spmv_create_descr(). The opaque data structure spmv_descr is used to share information among all functions.
Call nvpl_sparse_spmv_buffer_size() to get the size of the workspace needed by nvpl_sparse_spmv_analysis().
Allocate a workspace buffer of at least buffer_size bytes. The buffer must remain valid until the execution (nvpl_sparse_spmv()) is complete, and should not be modified between the analysis and execution steps.
Call nvpl_sparse_spmv_analysis() to perform the analysis.
Call nvpl_sparse_spmv() to perform the multiplication. This step can be performed multiple times with different right-hand-side vectors vec_Y and output vectors vec_Z.
Destroy the descriptor using nvpl_sparse_spmv_destroy_descr().
Optionally, you can omit the analysis step and call nvpl_sparse_spmv() directly after creating the descriptor.
In this case, the performance might be degraded.
It is recommended not to omit the analysis step.
Note
If nvpl_sparse_spmv_analysis() is called, the operation type (op_A) and the number of OpenMP
threads used in that call must match exactly those used when nvpl_sparse_spmv() is later called
with the same descriptor. Calling nvpl_sparse_spmv() with a different op_A or a different thread
count than was used during analysis returns NVPL_SPARSE_STATUS_INVALID_VALUE.
Parameters
| Param. | In/out | Meaning |
|---|---|---|
| handle | IN | Handle to the NVPL Sparse library context |
| op_A | IN | Operation op(A) |
| alpha | IN | \(\alpha\) scalar used for multiplication of type compute_type |
| mat_A | IN | Sparse matrix A |
| vec_X | IN | Dense vector X |
| beta | IN | \(\beta\) scalar used for multiplication of type compute_type |
| vec_Y | IN | Dense vector Y |
| vec_Z | OUT | Dense vector Z |
| compute_type | IN | Datatype in which the computation is executed |
| alg | IN | Algorithm for the computation |
| buffer_size | OUT | Number of bytes of workspace needed by nvpl_sparse_spmv_analysis() |
| external_buffer | IN/OUT | Pointer to a workspace buffer of at least buffer_size bytes |
| spmv_descr | IN/OUT | Opaque descriptor for storing internal data used across the three steps |
Supported types and formats
The sparse matrix formats currently supported are listed below:
NVPL_SPARSE_FORMAT_COO
NVPL_SPARSE_FORMAT_CSR
NVPL_SPARSE_FORMAT_CSC
NVPL_SPARSE_FORMAT_SLICED_ELL
nvpl_sparse_spmv() supports the following index types for representing the sparse matrix mat_A:
32-bit indices (NVPL_SPARSE_INDEX_32I)
64-bit indices (NVPL_SPARSE_INDEX_64I)
nvpl_sparse_spmv() supports the following data types:
Uniform-precision computation:
|
|---|
|
|
|
|
nvpl_sparse_spmv() supports the following algorithms:
| Algorithm | Notes |
|---|---|
| | Default algorithm for any sparse matrix format. |
| | Default algorithm for COO sparse matrix format. May produce slightly different results during different runs with the same input parameters. |
| | Algorithm for CSR/CSC sparse matrix format with fast analysis step. Provides deterministic (bit-wise) results for each run. |
| | Algorithm for CSR/CSC sparse matrix format with more extensive analysis step. Can yield better performance than |
| | Default algorithm for Sliced Ellpack sparse matrix format. Provides deterministic (bit-wise) results for each run. |
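The remark about run-to-run differences follows from floating-point addition not being associative: when partial sums are accumulated in a different order, the rounded result can change. The sketch below illustrates only that effect with two hypothetical helpers, `sum_forward` and `sum_backward`. It does not reproduce any NVPL Sparse algorithm, and it assumes IEEE 754 double arithmetic without excess intermediate precision.

```c
#include <assert.h>

/* Sum n doubles from the first element to the last. */
double sum_forward(const double *v, int n)
{
    assert(n >= 0);
    double s = 0.0;
    for (int i = 0; i < n; ++i) {
        s += v[i];
    }
    return s;
}

/* Sum the same n doubles from the last element to the first. */
double sum_backward(const double *v, int n)
{
    assert(n >= 0);
    double s = 0.0;
    for (int i = n - 1; i >= 0; --i) {
        s += v[i];
    }
    return s;
}
```

For the values 1e16, -1e16, 1.0 the forward sum is 1.0 and the backward sum is 0.0 under those assumptions, because 1.0 is lost when it is added to -1e16 first.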
Performance
NVPL_SPARSE_SPMV_CSR_ALG1 provides higher performance than NVPL_SPARSE_SPMV_COO_ALG1.
If the analysis step is omitted, the performance might be degraded.
nvpl_sparse_spmv_create_descr() allocates a small amount of memory for the descriptor but does not perform any expensive computations.
Notes
The routine allows the CSR column indices (or CSC row indices) of mat_A to be unsorted.
See nvpl_sparse_status_t for the description of the return status.
nvpl_sparse_spsv()
nvpl_sparse_status_t
nvpl_sparse_spsv_create_descr(nvpl_sparse_spsv_descr_t* descr)
nvpl_sparse_status_t
nvpl_sparse_spsv_destroy_descr(nvpl_sparse_spsv_descr_t descr)
nvpl_sparse_status_t
nvpl_sparse_spsv_buffer_size(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_vec_descr_t vec_X,
nvpl_sparse_dn_vec_descr_t vec_Y,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spsv_alg_t alg,
nvpl_sparse_spsv_descr_t spsv_descr,
size_t* buffer_size)
nvpl_sparse_status_t
nvpl_sparse_spsv_analysis(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_vec_descr_t vec_X,
nvpl_sparse_dn_vec_descr_t vec_Y,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spsv_alg_t alg,
nvpl_sparse_spsv_descr_t spsv_descr,
void* external_buffer)
nvpl_sparse_status_t
nvpl_sparse_spsv_solve(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_vec_descr_t vec_X,
nvpl_sparse_dn_vec_descr_t vec_Y,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spsv_alg_t alg,
nvpl_sparse_spsv_descr_t spsv_descr)
The function solves a system of linear equations whose coefficients are represented in a sparse triangular matrix:
\(op\left( \mathbf{A} \right) \cdot \mathbf{Y} = \alpha\mathbf{X}\)
where
op(A) is a sparse square matrix of size \(m \times m\)
X is a dense vector of size \(m\)
Y is a dense vector of size \(m\)
\(\alpha\) is a scalar
Also, for matrix A:
\(\text{op}(A) = \begin{cases} A & \text{if}\; op = {\small{\texttt{NVPL_SPARSE_OPERATION_NON_TRANSPOSE}}} \\ A^{T} & \text{if}\; op = {\small{\texttt{NVPL_SPARSE_OPERATION_TRANSPOSE}}} \\ A^{H} & \text{if}\; op = {\small{\texttt{NVPL_SPARSE_OPERATION_CONJUGATE_TRANSPOSE}}} \\ \end{cases}\)
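For op(A) = A with a lower-triangular matrix, the system above can be solved row by row with forward substitution. The sketch below is a plain-C reference under stated assumptions, not the library's algorithm: `ref_csr_lower_solve` is a hypothetical name, the matrix is CSR with `int` indices and `double` values, entries above the diagonal are ignored, and a unit diagonal is modeled by ignoring any stored diagonal entries.

```c
#include <assert.h>

/* Reference forward substitution for a lower-triangular CSR matrix:
 * solves A * y = alpha * x for y (only op(A) = A is handled).
 * unit_diag != 0 treats every diagonal entry as 1 and skips stored ones.
 * Column indices within a row may be unsorted.
 * Returns 0 on success, -1 if a required diagonal entry is missing or zero. */
int ref_csr_lower_solve(int m, int unit_diag, double alpha,
                        const int *row_offsets, const int *col_idx,
                        const double *values, const double *x, double *y)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        double sum = alpha * x[i];            /* read x[i] before y[i] is written */
        double diag = unit_diag ? 1.0 : 0.0;
        for (int p = row_offsets[i]; p < row_offsets[i + 1]; ++p) {
            int j = col_idx[p];
            if (j < i) {
                sum -= values[p] * y[j];      /* y[j] was solved in an earlier row */
            } else if (j == i && !unit_diag) {
                diag = values[p];
            }
        }
        if (diag == 0.0) {
            return -1;
        }
        y[i] = sum / diag;
    }
    return 0;
}
```

Because row i reads x[i] before writing y[i], this sketch also works when x and y point to the same array.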
Routine usage
To use this routine, you should:
Create a descriptor using nvpl_sparse_spsv_create_descr(). The opaque data structure spsv_descr is used to share information among all functions.
Call nvpl_sparse_spsv_buffer_size() to get the size of the workspace needed by nvpl_sparse_spsv_analysis().
Allocate a workspace buffer of at least buffer_size bytes. The buffer must remain valid until the execution (nvpl_sparse_spsv_solve()) is complete, and should not be modified between the analysis and execution steps.
Call nvpl_sparse_spsv_analysis() to perform the analysis.
Call nvpl_sparse_spsv_solve() to execute the solve phase. This step can be performed multiple times with different right-hand-side vectors.
Destroy the descriptor using nvpl_sparse_spsv_destroy_descr().
The analysis step is mandatory for this routine and cannot be omitted.
The function nvpl_sparse_spsv_update_matrix() can be used to update spsv_descr with new matrix values.
All parameters must be consistent across nvpl_sparse_spsv API calls and the matrix descriptors.
Parameters
| Param. | In/out | Meaning |
|---|---|---|
| handle | IN | Handle to the NVPL Sparse library context |
| op_A | IN | Operation op(A) |
| alpha | IN | \(\alpha\) scalar used for multiplication of type compute_type |
| mat_A | IN | Sparse matrix A |
| vec_X | IN | Dense vector X |
| vec_Y | IN/OUT | Dense vector Y |
| compute_type | IN | Datatype in which the computation is executed |
| alg | IN | Algorithm for the computation |
| buffer_size | OUT | Number of bytes of workspace needed by nvpl_sparse_spsv_analysis() |
| external_buffer | IN/OUT | Pointer to a workspace buffer of at least buffer_size bytes |
| spsv_descr | IN/OUT | Opaque descriptor for storing internal data used across the three steps |
Matrix update
nvpl_sparse_status_t
nvpl_sparse_spsv_update_matrix(nvpl_sparse_handle_t handle,
nvpl_sparse_spsv_descr_t spsv_descr,
void* new_values,
nvpl_sparse_spsv_update_t update_part)
nvpl_sparse_spsv_update_matrix() updates the sparse matrix after the analysis phase has been called. This function supports the following update strategies (update_part):
| Strategy | Notes |
|---|---|
| | Updates the sparse matrix values with values of new_values |
| | Updates the diagonal part of the matrix with diagonal values stored in new_values |
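As an array-level illustration of what a diagonal-only update means, the sketch below overwrites the stored diagonal entries of a CSR matrix with one new value per row. `csr_update_diagonal` is a hypothetical helper written under the assumption of CSR storage with `int` indices and `double` values; it does not show how nvpl_sparse_spsv_update_matrix() works internally.

```c
#include <assert.h>

/* Overwrite the stored diagonal entry of each row i with new_diag[i].
 * Rows without a stored diagonal entry are left unchanged.
 * Returns the number of diagonal entries that were updated. */
int csr_update_diagonal(int m, const int *row_offsets, const int *col_idx,
                        double *values, const double *new_diag)
{
    assert(m >= 0);
    int updated = 0;
    for (int i = 0; i < m; ++i) {
        for (int p = row_offsets[i]; p < row_offsets[i + 1]; ++p) {
            if (col_idx[p] == i) {
                values[p] = new_diag[i];
                ++updated;
                break; /* at most one diagonal entry per row is expected */
            }
        }
    }
    return updated;
}
```

The sparsity pattern (row_offsets and col_idx) is untouched; only numerical values change.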
Supported types and formats
The sparse matrix formats currently supported are listed below:
NVPL_SPARSE_FORMAT_CSR
NVPL_SPARSE_FORMAT_COO
NVPL_SPARSE_FORMAT_SLICED_ELL
nvpl_sparse_spsv supports the following shapes and properties:
NVPL_SPARSE_FILL_MODE_LOWER and NVPL_SPARSE_FILL_MODE_UPPER fill modes
NVPL_SPARSE_DIAG_TYPE_NON_UNIT and NVPL_SPARSE_DIAG_TYPE_UNIT diagonal types
nvpl_sparse_spsv supports the following index types for representing the sparse matrix mat_A:
32-bit indices (NVPL_SPARSE_INDEX_32I)
64-bit indices (NVPL_SPARSE_INDEX_64I)
nvpl_sparse_spsv supports the following data types:
Uniform-precision computation:
|
|---|
|
|
|
|
nvpl_sparse_spsv supports the following algorithms:
| Algorithm | Notes |
|---|---|
| | Default algorithm |
Performance
nvpl_sparse_spsv_create_descr() allocates a small amount of memory for the descriptor but does not perform any expensive computations.
Notes
The routine requires extra storage memory (see nvpl_sparse_spsv_buffer_size()) for the analysis phase, which is proportional to the number of non-zero entries of the sparse matrix.
Provides deterministic (bit-wise) results for each run for the solving phase nvpl_sparse_spsv_solve().
The routine supports in-place operation.
The nvpl_sparse_spsv_buffer_size() and nvpl_sparse_spsv_analysis() routines accept NULL for vec_X and vec_Y.
The routine allows the CSR column indices (or CSC row indices) of mat_A to be unsorted.
See nvpl_sparse_status_t for the description of the return status.
nvpl_sparse_spmm()
nvpl_sparse_status_t
nvpl_sparse_spmm_create_descr(nvpl_sparse_spmm_descr_t* descr)
nvpl_sparse_status_t
nvpl_sparse_spmm_destroy_descr(nvpl_sparse_spmm_descr_t descr)
nvpl_sparse_status_t
nvpl_sparse_spmm_buffer_size(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
nvpl_sparse_operation_t op_B,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_mat_descr_t mat_B,
const void* beta,
nvpl_sparse_const_dn_mat_descr_t mat_C,
nvpl_sparse_dn_mat_descr_t mat_D,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spmm_alg_t alg,
nvpl_sparse_spmm_descr_t spmm_descr,
size_t* buffer_size)
nvpl_sparse_status_t
nvpl_sparse_spmm_analysis(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
nvpl_sparse_operation_t op_B,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_mat_descr_t mat_B,
const void* beta,
nvpl_sparse_const_dn_mat_descr_t mat_C,
nvpl_sparse_dn_mat_descr_t mat_D,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spmm_alg_t alg,
nvpl_sparse_spmm_descr_t spmm_descr,
void* external_buffer)
nvpl_sparse_status_t
nvpl_sparse_spmm(nvpl_sparse_handle_t handle,
nvpl_sparse_operation_t op_A,
nvpl_sparse_operation_t op_B,
const void* alpha,
nvpl_sparse_const_sp_mat_descr_t mat_A,
nvpl_sparse_const_dn_mat_descr_t mat_B,
const void* beta,
nvpl_sparse_const_dn_mat_descr_t mat_C,
nvpl_sparse_dn_mat_descr_t mat_D,
nvpl_sparse_data_type_t compute_type,
nvpl_sparse_spmm_alg_t alg,
nvpl_sparse_spmm_descr_t spmm_descr)
This function performs the multiplication of a sparse matrix mat_A and a dense matrix mat_B:
\(\mathbf{D} = \alpha op_A\left( \mathbf{A} \right) \cdot op_B\left( \mathbf{B} \right) + \beta\mathbf{C}\)
where
\(op_A(A)\) is a sparse matrix of size \(m \times k\)
\(op_B(B)\) is a dense matrix of size \(k \times n\)
\(C\) is a dense matrix of size \(m \times n\)
\(D\) is a dense matrix of size \(m \times n\)
\(\alpha\) and \(\beta\) are scalars
Also, for matrix A:
\(op_A(A) = \begin{cases} A & \text{if}\; op_A = {\small{\texttt{NVPL_SPARSE_OPERATION_NON_TRANSPOSE}}} \\ A^{T} & \text{if}\; op_A = {\small{\texttt{NVPL_SPARSE_OPERATION_TRANSPOSE}}} \\ A^{H} & \text{if}\; op_A = {\small{\texttt{NVPL_SPARSE_OPERATION_CONJUGATE_TRANSPOSE}}} \\ \end{cases}\)
and similarly for matrix B:
\(op_B(B) = \begin{cases} B & \text{if}\; op_B = {\small{\texttt{NVPL_SPARSE_OPERATION_NON_TRANSPOSE}}} \\ B^{T} & \text{if}\; op_B = {\small{\texttt{NVPL_SPARSE_OPERATION_TRANSPOSE}}} \\ B^{H} & \text{if}\; op_B = {\small{\texttt{NVPL_SPARSE_OPERATION_CONJUGATE_TRANSPOSE}}} \\ \end{cases}\)
The function nvpl_sparse_spmm_buffer_size() computes the size of the workspace needed by nvpl_sparse_spmm_analysis() and nvpl_sparse_spmm().
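The formula for D given above can be checked on small inputs with a direct loop. The following plain-C sketch covers only the case where both op_A and op_B are non-transpose, and it assumes CSR storage for A and row-major dense matrices with explicit leading dimensions. `ref_csr_spmm` and its parameter names are made up for this example and are not part of the library.

```c
#include <assert.h>

/* Reference CSR SpMM for op_A = op_B = non-transpose:
 * D = alpha * A * B + beta * C, where A is m x k in CSR form and
 * B (k x n), C (m x n), D (m x n) are dense, stored in row-major order
 * with leading dimensions ldb, ldc, ldd. D must not alias B. */
void ref_csr_spmm(int m, int k, int n, double alpha,
                  const int *row_offsets, const int *col_idx,
                  const double *values,
                  const double *B, int ldb, double beta,
                  const double *C, int ldc, double *D, int ldd)
{
    assert(m >= 0 && k >= 0 && n >= 0);
    assert(ldb >= n && ldc >= n && ldd >= n);
    for (int i = 0; i < m; ++i) {
        /* Row i of D starts as beta * (row i of C). */
        for (int c = 0; c < n; ++c) {
            D[i * ldd + c] = beta * C[i * ldc + c];
        }
        /* Each stored entry A(i, j) adds alpha * A(i, j) * (row j of B). */
        for (int p = row_offsets[i]; p < row_offsets[i + 1]; ++p) {
            int j = col_idx[p];
            assert(j >= 0 && j < k);
            double a = alpha * values[p];
            for (int c = 0; c < n; ++c) {
                D[i * ldd + c] += a * B[j * ldb + c];
            }
        }
    }
}
```

With A = [[1, 0, 2], [0, 3, 0]], B = [[1, 2], [3, 4], [5, 6]], C filled with ones, alpha = 1 and beta = 2, the result is D = [[13, 16], [11, 14]].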
Routine usage
To use this routine, you should:
Create a descriptor using nvpl_sparse_spmm_create_descr(). The opaque data structure spmm_descr is used to share information among all functions.
Call nvpl_sparse_spmm_buffer_size() to get the size of the workspace needed by nvpl_sparse_spmm_analysis().
Allocate a workspace buffer of at least buffer_size bytes. The buffer must remain valid until the execution (nvpl_sparse_spmm()) is complete, and should not be modified between the analysis and execution steps.
Call nvpl_sparse_spmm_analysis() to perform the analysis.
Call nvpl_sparse_spmm() to perform the multiplication. This step can be performed multiple times with different dense matrices mat_B, mat_C and output dense matrix mat_D.
Destroy the descriptor using nvpl_sparse_spmm_destroy_descr().
The analysis step is mandatory for this routine and cannot be omitted.
Parameters
| Param. | In/out | Meaning |
|---|---|---|
| handle | IN | Handle to the NVPL Sparse library context |
| op_A | IN | Operation op(A) |
| op_B | IN | Operation op(B) |
| alpha | IN | \(\alpha\) scalar used for multiplication of type compute_type |
| mat_A | IN | Sparse matrix A |
| mat_B | IN | Dense matrix B |
| beta | IN | \(\beta\) scalar used for multiplication of type compute_type |
| mat_C | IN | Dense matrix C |
| mat_D | OUT | Dense matrix D |
| compute_type | IN | Datatype in which the computation is executed |
| alg | IN | Algorithm for the computation |
| buffer_size | OUT | Number of bytes of workspace needed by nvpl_sparse_spmm_analysis() and nvpl_sparse_spmm() |
| external_buffer | IN/OUT | Pointer to a workspace buffer of at least buffer_size bytes |
| spmm_descr | IN/OUT | Opaque descriptor for storing internal data used across the three steps |
Supported types and formats
The sparse matrix formats currently supported are listed below:
NVPL_SPARSE_FORMAT_COO
NVPL_SPARSE_FORMAT_CSR
NVPL_SPARSE_FORMAT_CSC
nvpl_sparse_spmm() supports the following index types for representing the sparse matrix mat_A:
32-bit indices (NVPL_SPARSE_INDEX_32I)
64-bit indices (NVPL_SPARSE_INDEX_64I)
nvpl_sparse_spmm() supports the following data types in uniform-precision computation:
|
|---|
|
|
|
|
nvpl_sparse_spmm() supports the following algorithms:
| Algorithm | Notes |
|---|---|
| | Default algorithm for any sparse matrix format. |
| | Default algorithm for COO sparse matrix format. |
| | Provides the best performance for CSR when |
| | Provides the best performance for CSR when |
Performance
nvpl_sparse_spmm_create_descr() allocates a small amount of memory for the descriptor but does not perform any expensive computations.
Notes
nvpl_sparse_spmm() has the following properties:
Calling nvpl_sparse_spmm_buffer_size() and nvpl_sparse_spmm_analysis() is required before calling nvpl_sparse_spmm(). Otherwise, the routine returns NVPL_SPARSE_STATUS_NOT_SUPPORTED.
For COO format, the routine requires the entries of mat_A to be sorted by row index.
The routine allows the CSR column indices (or CSC row indices) of mat_A to be unsorted.
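Since the COO path expects entries ordered by row, it can help to verify or establish that ordering before building the matrix descriptor. The sketch below shows one way to do so in plain C, using a stable counting sort. The helper names `coo_rows_sorted` and `coo_sort_by_row`, the `int` index type, and the caller-provided scratch array are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the COO row indices are in non-decreasing order, 0 otherwise. */
int coo_rows_sorted(const int *row_idx, size_t nnz)
{
    for (size_t p = 1; p < nnz; ++p) {
        if (row_idx[p - 1] > row_idx[p]) {
            return 0;
        }
    }
    return 1;
}

/* Stable counting sort of COO triplets by row index into the out_* arrays.
 * offsets is caller-provided scratch space with rows + 1 entries.
 * Entries that share a row keep their original relative order. */
void coo_sort_by_row(int rows, size_t nnz,
                     const int *row_idx, const int *col_idx,
                     const double *values,
                     int *out_row, int *out_col, double *out_val,
                     size_t *offsets)
{
    assert(rows >= 0);
    for (int i = 0; i <= rows; ++i) {
        offsets[i] = 0;
    }
    /* Count the entries of each row, shifted by one slot. */
    for (size_t p = 0; p < nnz; ++p) {
        assert(row_idx[p] >= 0 && row_idx[p] < rows);
        offsets[row_idx[p] + 1]++;
    }
    /* Prefix sum: offsets[r] becomes the first output slot of row r. */
    for (int i = 0; i < rows; ++i) {
        offsets[i + 1] += offsets[i];
    }
    /* Scatter each triplet to the next free slot of its row. */
    for (size_t p = 0; p < nnz; ++p) {
        size_t dst = offsets[row_idx[p]]++;
        out_row[dst] = row_idx[p];
        out_col[dst] = col_idx[p];
        out_val[dst] = values[p];
    }
}
```

The sort runs in time proportional to rows + nnz and leaves column indices within a row in their input order.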
See nvpl_sparse_status_t for the description of the return status.