DirectSolver

class nvmath.sparse.advanced.DirectSolver(a, b, *, options: DirectSolverOptions | None = None, execution: ExecutionCUDA | ExecutionHybrid | None = None, stream: AnyStream | int | None = None)
Create a stateful object that encapsulates the specified sparse direct solver computations and required resources. This object ensures the validity of resources during use and releases them when they are no longer needed to prevent misuse.
This object encompasses all functionalities of the function-form API direct_solver(), which is a convenience wrapper around it. The stateful object also allows the amortization of preparatory costs when the same solve operation is to be performed with different left-hand sides (LHS) and right-hand sides (RHS) that share the same problem specification (see reset_operands() for more details).

Using the stateful object typically involves the following steps (a minimal sketch follows the list):

1. Problem Specification: Initialize the object with a defined operation and options.
2. Preparation: Use plan() to reorder the system to minimize fill-in and to perform the symbolic factorization for this specific sparse direct solver operation.
3. Execution: Factorize and solve the system.
4. Resource Management: Ensure all resources are released, either by explicitly calling free() or by managing the stateful object within a context manager.
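A minimal sketch of this workflow, assuming operands a and b constructed as in the examples below:

>>> solver = nvmath.sparse.advanced.DirectSolver(a, b)
>>> plan_info = solver.plan()  # Reordering and symbolic factorization.
>>> fac_info = solver.factorize()  # Numerical factorization.
>>> x = solver.solve()  # Solve the factorized system.
>>> solver.free()  # Release resources (or manage the object with a context manager).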
Detailed information on each step described above can be obtained by passing in a logging.Logger object to DirectSolverOptions or by setting the appropriate options in the root logger object, which is used by default:

>>> import logging
>>> logging.basicConfig(
...     level=logging.INFO,
...     format="%(asctime)s %(levelname)-8s %(message)s",
...     datefmt="%m-%d %H:%M:%S",
... )
- Parameters:
a – The sparse operand (or sequence of operands) representing the left-hand side (LHS) of the system of equations. The LHS operand may be a (sequence of) scipy.sparse.csr_matrix, scipy.sparse.csr_array, cupyx.scipy.sparse.csr_matrix, or torch.sparse_csr_tensor(). That is, the LHS is a sparse matrix or tensor in Compressed Sparse Row (CSR) format from one of the supported packages: SciPy, CuPy, or PyTorch. Refer to the semantics section for details.

b – The ndarray/tensor (or sequence of ndarrays/tensors) representing the dense right-hand side (RHS) of the system of equations. The RHS operand may be a (sequence of) numpy.ndarray, cupy.ndarray, or torch.Tensor. Refer to the semantics section for details.

options – Specify options for the sparse direct solver as a DirectSolverOptions object. Alternatively, a dict containing the parameters for the DirectSolverOptions constructor can also be provided. If not specified, the value will be set to a default-constructed DirectSolverOptions object.

execution – Specify execution space options for the direct solver as an ExecutionCUDA or ExecutionHybrid object. Alternatively, a string ('cuda' or 'hybrid'), or a dict with the 'name' key set to 'cuda' or 'hybrid' and optional parameters relevant to the given execution space, can be provided. The default execution space is 'cuda', and the corresponding ExecutionCUDA object will be default-constructed.

stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.
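For example, the execution space can be specified in several equivalent ways (a minimal sketch, assuming operands a and b as described above):

>>> # Execution space as a string (a default-constructed ExecutionHybrid is used).
>>> solver = nvmath.sparse.advanced.DirectSolver(a, b, execution="hybrid")
>>> # Equivalently, as a dict with the 'name' key.
>>> solver = nvmath.sparse.advanced.DirectSolver(a, b, execution={"name": "hybrid"})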
See also

DirectSolverPlanConfig, DirectSolverFactorizationConfig, DirectSolverSolutionConfig, DirectSolverPlanInfo, DirectSolverFactorizationInfo, DirectSolverOptions, ExecutionCUDA, ExecutionHybrid, plan(), reset_operands(), factorize(), solve().

Examples
>>> import numpy as np
>>> import scipy.sparse as sp
>>> import nvmath
Create a sparse float64 ndarray in CSR format on the CPU for the LHS.
>>> n = 16
>>> a = sp.random_array((n, n), density=0.5, format="csr", dtype="float64")
Ensure that the randomly generated LHS is not singular.
>>> a += sp.diags_array([2.0] * n, format="csr", dtype="float64")
The RHS can be a vector or matrix. Here we create a random vector.
>>> b = np.random.rand(n).astype(dtype="float64")
We will define a sparse direct solver operation for solving the system a @ x = b using the specialized sparse direct solver interface.
>>> solver = nvmath.sparse.advanced.DirectSolver(a, b)
Options can be provided above to control the behavior of the operation using the options argument (see DirectSolverOptions).

Next, plan the operation. The planning operation can be configured through the DirectSolverPlanConfig interface, which is accessed through plan_config.

>>> plan_config = solver.plan_config
Here we set the reordering algorithm to ALG_1.
>>> AlgType = nvmath.sparse.advanced.DirectSolverAlgType
>>> plan_config.reordering_algorithm = AlgType.ALG_1
Plan the operation, which reorders the system to minimize fill-in and performs the symbolic factorization. Planning returns a DirectSolverPlanInfo object, whose attributes (such as the row or column permutation) can be queried.

>>> plan_info = solver.plan()
>>> plan_info.col_permutation
array([ 0, 1, 8, 9, 2, 3, 4, 11, 15, 5, 10, 6, 12, 13, 7, 14], dtype=int32)
The next step is to perform the numerical factorization of the system using factorize(). Similar to planning, the numerical factorization step can also be configured if desired.

>>> fac_config = solver.factorization_config
Here we set the pivot epsilon value to 1e-14, instead of the default 1e-13.

>>> fac_config.pivot_eps = 1e-14
Factorize the system, which returns a DirectSolverFactorizationInfo object, whose attributes can be inspected. We print the Sylvester inertia here.

>>> fac_info = solver.factorize()
>>> fac_info.inertia
array([0, 0], dtype=int32)
Now solve the factorized system, and obtain the result x as a NumPy ndarray on the CPU.

>>> x = solver.solve()
Finally, free the object's resources. To avoid having to make this call explicitly, it's recommended to use the DirectSolver object as a context manager as shown below, if possible.
>>> solver.free()
Note

All DirectSolver methods execute on the operand package's current stream by default. Alternatively, the stream argument can be used to run a method on a specified stream.
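For example, to run a method on a dedicated stream (a minimal sketch, assuming CuPy-based GPU operands and import cupy as cp):

>>> s = cp.cuda.Stream()
>>> x = solver.solve(stream=s)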
Let's now look at a batched solve with PyTorch operands on the GPU.
Create a 3D complex128 PyTorch sparse tensor on the GPU representing the LHS, along with the corresponding RHS:
>>> import torch
>>> n = 8
>>> batch = 2
>>> device_id = 0
Prepare sample input data. Create a diagonally dominant random CSR matrix.
>>> a = torch.rand(n, n, dtype=torch.complex128) + torch.diag(torch.tensor([2.0] * n))
>>> a = torch.stack([a] * batch, dim=0)
>>> a = a.to_sparse_csr()
Important
PyTorch uses int64 for index buffers, whereas cuDSS currently requires int32. So we’ll have to convert the indices.
>>> a = torch.sparse_csr_tensor(
...     a.crow_indices().to(dtype=torch.int32),
...     a.col_indices().to(dtype=torch.int32),
...     a.values(),
...     size=a.size(),
...     device=device_id,
... )
Create the RHS, which can be a matrix or vector in column-major layout.
>>> b = torch.ones(batch, 3, n, dtype=torch.complex128, device=device_id)
>>> b = b.permute(0, 2, 1)
Create a DirectSolver object encapsulating the problem specification described earlier and use it as a context manager.

>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     plan_info = solver.plan()
...
...     # Factorize the system.
...     fac_info = solver.factorize()
...
...     # Solve the factorized system.
...     x1 = solver.solve()
...
...     # Update the RHS b in-place (see reset_operands() for an alternative).
...     b[...] = torch.rand(*b.shape, dtype=torch.complex128, device=device_id)
...
...     # Solve again to get the new result.
...     x2 = solver.solve()
All the resources used by the object are released at the end of the block.
Batching can be specified implicitly as shown above, or explicitly as a sequence of operands for both the LHS and the RHS (see the sketch below). This, as well as other options and usage patterns, is illustrated in the examples found in the nvmath/examples/sparse/advanced/direct_solver directory.
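A minimal sketch of explicit batching; a0, a1 and b0, b1 are hypothetical CSR matrices and matching right-hand sides constructed as in the examples above:

>>> # Each batch entry is an element of the LHS and RHS sequences.
>>> solver = nvmath.sparse.advanced.DirectSolver([a0, a1], [b0, b1])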
Methods
- __init__(a, b, *, options: DirectSolverOptions | None = None, execution: ExecutionCUDA | ExecutionHybrid | None = None, stream: AnyStream | int | None = None)
- factorize(*, stream: AnyStream | None = None)
Factorize the system of equations. Numerical factorization is required each time the values in the LHS change, unless (for example) iterative refinement is used during the solve and it converges (see solution_config).

This phase can be optionally configured through the property factorization_config (an object of type DirectSolverFactorizationConfig). Factorization returns a DirectSolverFactorizationInfo object, which can also be accessed through the property factorization_info.

- Parameters:
stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.

- Returns:
A DirectSolverFactorizationInfo object, whose attributes can be queried for information regarding the numerical factorization.
Examples
>>> import cupy as cp
>>> import cupyx.scipy.sparse as sp
>>> import nvmath
Prepare sample input data.
>>> n = 8
>>> a = sp.random(n, n, density=0.25, format="csr", dtype="float64")
>>> a += sp.diags([2.0] * n, format="csr", dtype="float64")
Create the RHS, which can be a matrix or vector in column-major layout.
>>> b = cp.ones((n, 2), dtype="float64", order="F")
Specify, plan, and factorize a @ x = b. Use the stateful object as a context manager to automatically release resources.
>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     # Plan the operation (configure if desired).
...     plan_info = solver.plan()
...     # (Optionally) configure the factorization.
...     fac_config = solver.factorization_config
...     fac_config.pivot_eps = 1e-14
...     # Factorize using the specified factorization configuration, which
...     # returns a DirectSolverFactorizationInfo object.
...     fac_info = solver.factorize()
...     # Query the number of non-zeros, inertia, ...
...     fac_info.lu_nnz
40
Further examples can be found in the nvmath/examples/sparse/advanced/direct_solver directory.
- free()
Free DirectSolver resources.
It is recommended that the DirectSolver object be used within a context manager, but if that is not possible then this method must be called explicitly to ensure that the sparse direct solver resources (especially internal library objects) are properly cleaned up.
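A minimal sketch of explicit cleanup when a context manager cannot be used, assuming operands a and b as in the examples above:

>>> solver = nvmath.sparse.advanced.DirectSolver(a, b)
>>> try:
...     plan_info = solver.plan()
...     fac_info = solver.factorize()
...     x = solver.solve()
... finally:
...     # Ensure library resources are released even if an error occurs.
...     solver.free()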
- plan(*, stream: AnyStream | None = None)
Plan the sparse direct solve (reordering to minimize fill-in, and symbolic factorization). The planning phase can be optionally configured through the property plan_config (an object of type DirectSolverPlanConfig). Planning returns a DirectSolverPlanInfo object, which can also be accessed through the property plan_info.

- Parameters:
stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.

- Returns:
A DirectSolverPlanInfo object, whose attributes can be queried for information regarding the plan.
Examples
>>> import cupy as cp
>>> import cupyx.scipy.sparse as sp
>>> import nvmath
Prepare sample input data.
>>> n = 8
>>> a = sp.random(n, n, density=0.5, format="csr", dtype="float64")
>>> a += sp.diags([2.0] * n, format="csr", dtype="float64")
Create the RHS, which can be a matrix or vector in column-major layout.
>>> b = cp.ones((n,), dtype="float64")
Specify, configure, and plan a @ x = b. Use the stateful object as a context manager to automatically release resources.
>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     # Configure the reordering algorithm for the plan.
...     plan_config = solver.plan_config
...     plan_config.algorithm = nvmath.sparse.advanced.DirectSolverAlgType.ALG_1
...     # Plan the operation using the specified plan configuration, which
...     # returns a DirectSolverPlanInfo object.
...     plan_info = solver.plan()
...     # Query the column permutation, memory estimates, ...
...     plan_info.col_permutation
array([6, 1, 4, 0, 3, 2, 5, 7], dtype=int32)
Further examples can be found in the nvmath/examples/sparse/advanced/direct_solver directory.
- reset_operands()
Reset the operands held by this DirectSolver instance.

This method has two use cases:
- it can be used to provide new operands for execution when the original operands are on the CPU
- it can be used to release the internal reference to the previous operands and make their memory available for other use by passing None for all arguments. In this case, this method must be called again to provide the desired operands before another call to planning and execution APIs like plan(), factorize(), or solve().
This method will perform various checks on the new operands to make sure:
- The shapes, index and data types match those of the old ones.
- The packages that the operands belong to match those of the old ones.
- If the input tensors are on the GPU, the device must match.
- Parameters:
a – The sparse operand (or sequence of operands) representing the left-hand side (LHS) of the system of equations. The LHS operand may be a (sequence of) scipy.sparse.csr_matrix, scipy.sparse.csr_array, cupyx.scipy.sparse.csr_matrix, or torch.sparse_csr_tensor(). That is, the LHS is a sparse matrix or tensor in Compressed Sparse Row (CSR) format from one of the supported packages: SciPy, CuPy, or PyTorch. Refer to the semantics section for details.

b – The ndarray/tensor (or sequence of ndarrays/tensors) representing the dense right-hand side (RHS) of the system of equations. The RHS operand may be a (sequence of) numpy.ndarray, cupy.ndarray, or torch.Tensor. Refer to the semantics section for details.

stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.
Examples
>>> import cupy as cp
>>> import cupyx.scipy.sparse as sp
>>> import nvmath
Prepare sample input data.
>>> n = 8
>>> a = sp.random(n, n, density=0.15, format="csr", dtype="float64")
>>> a += sp.diags([2.0] * n, format="csr", dtype="float64")
Create the RHS, which can be a matrix or vector in column-major layout.
>>> b = cp.ones((n,), dtype="float64")
Specify, plan, factorize and solve a @ x = b. Use the stateful object as a context manager to automatically release resources.
>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     # Plan the operation.
...     plan_info = solver.plan()
...
...     # Factorize the system.
...     fac_info = solver.factorize()
...
...     # Solve the factorized system for the first result.
...     x1 = solver.solve()
...
...     # Reset the RHS to a new CuPy ndarray.
...     c = cp.random.rand(n, dtype="float64")
...     solver.reset_operands(b=c)
...
...     # Solve for the second result corresponding to the updated operands.
...     x2 = solver.solve()
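The second use case, releasing the internal references to the operands, can be sketched as follows (a minimal sketch; new operands must be provided before planning or execution APIs are called again):

>>> # Release the internal references to the previous operands.
>>> solver.reset_operands(a=None, b=None)
>>> # The operand memory is now available for other use.
>>> # Provide new operands before calling factorize() or solve() again.
>>> solver.reset_operands(a, b)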
Tip
If only a subset of operands are reset, the operands that are not reset hold their original values.
With reset_operands(), minimal overhead is achieved as the problem specification and planning are only performed once.

For the particular example above, explicitly calling reset_operands() is equivalent to updating the operand b in-place, i.e., replacing solver.reset_operands(b=c) with b[:] = c.

Danger
Updating an operand in-place can only yield the expected result under the additional constraints below:

- The operand is on the GPU (more precisely, the operand memory space should be accessible from the execution space).
- The user has called factorize() if needed (that is, they are not relying on iterative refinement, for example).
Assuming that the constraints above are satisfied, updating an operand in-place is preferred to avoid the extra checks incurred in reset_operands().

For more details, please refer to the reset operand example.
- solve(*, stream: AnyStream | None = None)
Solve the factorized system of equations.
This phase can be optionally configured through the property solution_config (an object of type DirectSolverSolutionConfig).

- Parameters:
stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.

- Returns:
The result of the specified sparse direct solve, which has the same shape, remains on the same device, and belongs to the same package as the RHS b. If b is a sequence, the result x is also a sequence of ndarrays/tensors, each of which has the same shape as the corresponding tensor in b.
Examples
>>> import cupy as cp
>>> import cupyx.scipy.sparse as sp
>>> import nvmath
Prepare sample input data.
>>> n = 8
>>> a = sp.random(n, n, density=0.25, format="csr", dtype="float64")
>>> a += sp.diags([2.0] * n, format="csr", dtype="float64")
Create the RHS, which can be a matrix or vector in column-major layout.
>>> b = cp.ones((n, 2), dtype="float64", order="F")
Specify, plan, factorize, and solve a @ x = b for x. Use the stateful object as a context manager to automatically release resources.
>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     # Plan the operation (configure if desired).
...     plan_info = solver.plan()
...     # Factorize the system (configure if desired).
...     fac_info = solver.factorize()
...     # (Optionally) configure the solve.
...     solution_config = solver.solution_config
...     solution_config.ir_num_steps = 10
...     # Solve the system based on the solution configuration set above.
...     x = solver.solve()
Further examples can be found in the nvmath/examples/sparse/advanced/direct_solver directory.
Attributes
- factorization_config
An accessor to configure or query the solver factorization phase attributes.
- Returns:
A DirectSolverFactorizationConfig object, whose attributes can be set (or queried) to configure the factorization phase.
- factorization_info

An accessor to get information about the solver factorization phase (see nvmath.sparse.advanced.DirectSolverFactorizationInfo).

- Returns:
A DirectSolverFactorizationInfo object, whose attributes can be queried for information regarding the factorization phase.
- plan_config
An accessor to configure or query the solver planning phase attributes.
- Returns:
A DirectSolverPlanConfig object, whose attributes can be set (or queried) to configure the planning phase.
- plan_info
An accessor to get information about the solver planning phase.
- Returns:
A DirectSolverPlanInfo object, whose attributes can be queried for information regarding the planning phase.
- solution_config
An accessor to configure or query the solver solution phase attributes.
- Returns:
A DirectSolverSolutionConfig object, whose attributes can be set (or queried) to configure the solution phase.
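A minimal sketch of the accessor pattern, assuming a solver constructed as in the examples above (the attribute names shown are taken from the earlier examples):

>>> # Configure phases through their accessors before running them.
>>> solver.factorization_config.pivot_eps = 1e-14
>>> solver.solution_config.ir_num_steps = 10
>>> # Query phase information after the corresponding phase has run.
>>> perm = solver.plan_info.col_permutation  # Available after solver.plan().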