DirectSolver

class nvmath.sparse.advanced.DirectSolver(
a,
b,
*,
options: DirectSolverOptions | None = None,
execution: ExecutionCUDA | ExecutionHybrid | None = None,
stream: AnyStream | int | None = None,
)

Create a stateful object that encapsulates the specified sparse direct solver computations and required resources. This object ensures the validity of resources during use and releases them when they are no longer needed, preventing misuse.

This object encompasses all functionalities of the function-form API direct_solver(), which is a convenience wrapper around it. The stateful object also allows the preparatory costs to be amortized when the same solve operation is performed with different left-hand side (LHS) and right-hand side (RHS) operands that share the same problem specification (see reset_operands() for more details).

Using the stateful object typically involves the following steps (a code sketch follows the list):

  1. Problem Specification: Initialize the object with the defined operation and options.

  2. Preparation: Use plan() to reorder the system to minimize fill-in and to perform the symbolic factorization for this specific sparse direct solver operation.

  3. Execution: Factorize and solve the system.

  4. Resource Management: Ensure all resources are released either by explicitly calling free() or by managing the stateful object within a context manager.
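A minimal sketch of these four steps, assuming a and b are a supported CSR LHS and dense RHS as described under Parameters below (the Examples section gives a complete walkthrough):

>>> solver = nvmath.sparse.advanced.DirectSolver(a, b)  # 1. Problem specification.
>>> plan_info = solver.plan()  # 2. Preparation: reordering + symbolic factorization.
>>> fac_info = solver.factorize()  # 3. Execution: numerical factorization ...
>>> x = solver.solve()  # ... and solve.
>>> solver.free()  # 4. Resource management (or use a context manager).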

Detailed information on each step described above can be obtained by passing in a logging.Logger object to DirectSolverOptions or by setting the appropriate options in the root logger object, which is used by default:

>>> import logging
>>> logging.basicConfig(
...     level=logging.INFO,
...     format="%(asctime)s %(levelname)-8s %(message)s",
...     datefmt="%m-%d %H:%M:%S",
... )
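The snippet above configures the root logger. Alternatively, a user-provided logging.Logger can be scoped to a single solver instance via the options; a minimal sketch, assuming the DirectSolverOptions keyword is named logger (as for other nvmath options objects) and operands a and b as in the Examples below:

>>> logger = logging.getLogger("DirectSolver")
>>> logger.setLevel(logging.DEBUG)
>>> options = nvmath.sparse.advanced.DirectSolverOptions(logger=logger)
>>> solver = nvmath.sparse.advanced.DirectSolver(a, b, options=options)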
Parameters:
  • a – The sparse operand (or sequence of operands) representing the left-hand side (LHS) of the system of equations. The LHS operand may be a (sequence of) scipy.sparse.csr_matrix, scipy.sparse.csr_array, cupyx.scipy.sparse.csr_matrix, or torch.sparse_csr_tensor(). That is, the LHS is a sparse matrix or tensor in Compressed Sparse Row (CSR) format from one of the supported packages: SciPy, CuPy, PyTorch. Refer to the semantics section for details.

  • b – The ndarray/tensor (or sequence of ndarrays/tensors) representing the dense right-hand side (RHS) of the system of equations. The RHS operand may be a (sequence of) numpy.ndarray, cupy.ndarray, or torch.Tensor. Refer to the semantics section for details.

  • options – Specify options for the sparse direct solver as a DirectSolverOptions object. Alternatively, a dict containing the parameters for the DirectSolverOptions constructor can also be provided. If not specified, the value will be set to a default-constructed DirectSolverOptions object.

  • execution – Specify execution space options for the direct solver as an ExecutionCUDA or ExecutionHybrid object. Alternatively, a string (‘cuda’ or ‘hybrid’), or a dict with the ‘name’ key set to ‘cuda’ or ‘hybrid’ and optional parameters relevant to the given execution space (see the sketch after this list). The default execution space is ‘cuda’, and the corresponding ExecutionCUDA object will be default-constructed.

  • stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.
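For instance, the execution space can be given in any of the equivalent forms described above (a sketch, with a and b as in the Examples below):

>>> advanced = nvmath.sparse.advanced
>>> solver = advanced.DirectSolver(a, b, execution="hybrid")
>>> solver = advanced.DirectSolver(a, b, execution={"name": "hybrid"})
>>> solver = advanced.DirectSolver(a, b, execution=advanced.ExecutionHybrid())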

Examples

>>> import numpy as np
>>> import scipy.sparse as sp
>>> import nvmath

Create a sparse float64 array in CSR format on the CPU for the LHS.

>>> n = 16
>>> a = sp.random_array((n, n), density=0.5, format="csr", dtype="float64")

Ensure that the randomly-generated LHS is not singular.

>>> a += sp.diags_array([2.0] * n, format="csr", dtype="float64")

The RHS can be a vector or matrix. Here we create a random vector.

>>> b = np.random.rand(n).astype(dtype="float64")

We will define a sparse direct solver operation for solving the system a @ x = b using the stateful API.

>>> solver = nvmath.sparse.advanced.DirectSolver(a, b)

Options can be provided above to control the behavior of the operation using the options argument (see DirectSolverOptions).

Next, plan the operation. The planning operation can be configured through the DirectSolverPlanConfig interface, which is accessed through plan_config.

>>> plan_config = solver.plan_config

Here we set the reordering algorithm to ALG_1.

>>> AlgType = nvmath.sparse.advanced.DirectSolverAlgType
>>> plan_config.reordering_algorithm = AlgType.ALG_1

Plan the operation, which reorders the system to minimize fill-in and performs the symbolic factorization. Planning returns a DirectSolverPlanInfo object, whose attributes (such as row or column permutation) can be queried.

>>> plan_info = solver.plan()
>>> plan_info.col_permutation
array([ 0,  1,  8,  9,  2,  3,  4, 11, 15,  5, 10,  6, 12, 13,  7, 14],
      dtype=int32)

The next step is to perform the numerical factorization of the system using factorize(). Similar to planning, the numerical factorization step can also be configured if desired.

>>> fac_config = solver.factorization_config

Here we set the pivot epsilon value to 1e-14, instead of the default 1e-13.

>>> fac_config.pivot_eps = 1e-14

Factorize the system, which returns a DirectSolverFactorizationInfo object, whose attributes can be inspected. We print the Sylvester inertia here.

>>> fac_info = solver.factorize()
>>> fac_info.inertia
array([0, 0], dtype=int32)

Now solve the factorized system, and obtain the result x as a NumPy ndarray on the CPU.

>>> x = solver.solve()

Finally, free the object’s resources. To avoid having to make this call explicitly, it’s recommended to use the DirectSolver object as a context manager, as shown below, if possible.

>>> solver.free()

Note

All DirectSolver methods execute on the operand package’s current stream by default. Alternatively, the stream argument can be used to run a method on a specified stream.
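For example, each phase can be launched on an explicitly created stream (a sketch, assuming GPU operands from CuPy):

>>> import cupy as cp
>>> stream = cp.cuda.Stream()
>>> plan_info = solver.plan(stream=stream)
>>> fac_info = solver.factorize(stream=stream)
>>> x = solver.solve(stream=stream)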

Let’s now look at a batched solve with PyTorch operands on the GPU.

Create a 3D complex128 PyTorch sparse tensor on the GPU representing the LHS, along with the corresponding RHS:

>>> import torch
>>> n = 8
>>> batch = 2
>>> device_id = 0

Prepare sample input data. Create a random matrix, add 2 to the diagonal to keep it safely nonsingular, and convert it to CSR format.

>>> a = torch.rand(n, n, dtype=torch.complex128) + torch.diag(torch.tensor([2.0] * n))
>>> a = torch.stack([a] * batch, dim=0)
>>> a = a.to_sparse_csr()

Important

PyTorch uses int64 for index buffers, whereas cuDSS currently requires int32. So we’ll have to convert the indices.

>>> a = torch.sparse_csr_tensor(
...     a.crow_indices().to(dtype=torch.int32),
...     a.col_indices().to(dtype=torch.int32),
...     a.values(),
...     size=a.size(),
...     device=device_id,
... )

Create the RHS, which can be a matrix or vector in column-major layout.

>>> b = torch.ones(batch, 3, n, dtype=torch.complex128, device=device_id)
>>> b = b.permute(0, 2, 1)

Create a DirectSolver object encapsulating the problem specification described earlier and use it as a context manager.

>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     plan_info = solver.plan()
...
...     # Factorize the system.
...     fac_info = solver.factorize()
...
...     # Solve the factorized system.
...     x1 = solver.solve()
...
...     # Update the RHS b in-place (see reset_operands() for an alternative).
...     b[...] = torch.rand(*b.shape, dtype=torch.complex128, device=device_id)
...
...     # Solve again to get the new result.
...     x2 = solver.solve()

All the resources used by the object are released at the end of the block.

Batching can be implicitly specified, as shown above, or explicitly specified by passing a sequence for both the LHS and the RHS, as sketched below. This, along with other options and usage patterns, is illustrated in the examples found in the nvmath/examples/sparse/advanced/direct_solver directory.
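A minimal sketch of explicit batching, where a0, a1 are assumed to be CSR matrices and b0, b1 the matching dense right-hand sides from the same package:

>>> with nvmath.sparse.advanced.DirectSolver([a0, a1], [b0, b1]) as solver:
...     plan_info = solver.plan()
...     fac_info = solver.factorize()
...     # solve() returns a sequence of results, one per system.
...     x0, x1 = solver.solve()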

Methods

__init__(
a,
b,
*,
options: DirectSolverOptions | None = None,
execution: ExecutionCUDA | ExecutionHybrid | None = None,
stream: AnyStream | int | None = None,
)
factorize(
*,
stream: AnyStream | None = None,
)

Factorize the system of equations. Numerical factorization is required each time the values in the LHS change, unless (for example) iterative refinement is used during the solve and it converges (see solution_config).

This phase can be optionally configured through the property factorization_config (an object of type DirectSolverFactorizationConfig). Factorization returns a DirectSolverFactorizationInfo object, which can also be accessed through the property factorization_info.

Parameters:

stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.

Returns:

A DirectSolverFactorizationInfo object, whose attributes can be queried for information regarding the numerical factorization.

Examples

>>> import cupy as cp
>>> import cupyx.scipy.sparse as sp
>>> import nvmath

Prepare sample input data.

>>> n = 8
>>> a = sp.random(n, n, density=0.25, format="csr", dtype="float64")
>>> a += sp.diags([2.0] * n, format="csr", dtype="float64")

Create the RHS, which can be a matrix or vector in column-major layout.

>>> b = cp.ones((n, 2), dtype="float64", order="F")

Specify, plan, and factorize a @ x = b. Use the stateful object as a context manager to automatically release resources.

>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     # Plan the operation (configure if desired).
...     plan_info = solver.plan()
...     # (Optionally) configure the factorization.
...     fac_config = solver.factorization_config
...     fac_config.pivot_eps = 1e-14
...     # Factorize using the specified factorization configuration, which
...     # returns a DirectSolverFactorizationInfo object.
...     fac_info = solver.factorize()
...     # Query the number of non-zeros, inertia, ...
...     fac_info.lu_nnz
40
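If the LHS values later change while the sparsity structure stays the same, only re-factorization and a new solve are required; the existing plan is reused. A minimal sketch of the follow-up steps (run before the solver is freed), where new_values is a hypothetical array holding the updated nonzero values:

>>> a.data[:] = new_values  # Update the LHS values in-place on the GPU.
>>> fac_info = solver.factorize()  # Re-factorize; no new plan() is needed.
>>> x = solver.solve()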

Further examples can be found in the nvmath/examples/sparse/advanced/direct_solver directory.

free()

Free DirectSolver resources.

It is recommended that the DirectSolver object be used as a context manager, but if that is not possible, this method must be called explicitly to ensure that the sparse direct solver resources (especially internal library objects) are properly cleaned up.
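A minimal sketch of explicit cleanup when a context manager cannot be used:

>>> solver = nvmath.sparse.advanced.DirectSolver(a, b)
>>> try:
...     plan_info = solver.plan()
...     fac_info = solver.factorize()
...     x = solver.solve()
... finally:
...     solver.free()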

plan(
*,
stream: AnyStream | None = None,
)

Plan the sparse direct solve (reordering to minimize fill-in, and symbolic factorization). The planning phase can be optionally configured through the property plan_config (an object of type DirectSolverPlanConfig). Planning returns a DirectSolverPlanInfo object, which can also be accessed through the property plan_info.

Parameters:

stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.

Returns:

A DirectSolverPlanInfo object, whose attributes can be queried for information regarding the plan.

Examples

>>> import cupy as cp
>>> import cupyx.scipy.sparse as sp
>>> import nvmath

Prepare sample input data.

>>> n = 8
>>> a = sp.random(n, n, density=0.5, format="csr", dtype="float64")
>>> a += sp.diags([2.0] * n, format="csr", dtype="float64")

Create the RHS, which can be a matrix or vector in column-major layout.

>>> b = cp.ones((n,), dtype="float64")

Specify, configure, and plan a @ x = b. Use the stateful object as a context manager to automatically release resources.

>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     # Configure the reordering algorithm for the plan.
...     plan_config = solver.plan_config
...     plan_config.reordering_algorithm = nvmath.sparse.advanced.DirectSolverAlgType.ALG_1
...     # Plan the operation using the specified plan configuration, which
...     # returns a DirectSolverPlanInfo object.
...     plan_info = solver.plan()
...     # Query the column permutation, memory estimates, ...
...     plan_info.col_permutation
array([6, 1, 4, 0, 3, 2, 5, 7], dtype=int32)

Further examples can be found in the nvmath/examples/sparse/advanced/direct_solver directory.

reset_operands(
a=None,
b=None,
*,
stream: AnyStream | int | None = None,
)

Reset the operands held by this DirectSolver instance.

This method has two use cases:

  1. It can be used to provide new operands for execution when the original operands are on the CPU.

  2. It can be used to release the internal reference to the previous operands and make their memory available for other use by passing None for all arguments (see the sketch after this list). In this case, this method must be called again to provide the desired operands before another call to planning and execution APIs like plan(), factorize(), or solve().
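A minimal sketch of the second use case, where a_new and b_new are assumed to be operands matching the original problem specification:

>>> # Release the internal reference to the previous operands.
>>> solver.reset_operands()
>>> # ... the memory backing the original operands can now be reused ...
>>> # Provide operands again before calling planning or execution APIs.
>>> solver.reset_operands(a_new, b_new)
>>> fac_info = solver.factorize()  # Re-factorize, since the LHS values may differ.
>>> x = solver.solve()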

This method will perform various checks on the new operands to make sure:

  • The shapes, index types, and data types match those of the old operands.

  • The packages that the operands belong to match those of the old ones.

  • If input tensors are on GPU, the device must match.

Parameters:
  • a – The sparse operand (or sequence of operands) representing the left-hand side (LHS) of the system of equations. The LHS operand may be a (sequence of) scipy.sparse.csr_matrix, scipy.sparse.csr_array, cupyx.scipy.sparse.csr_matrix, or torch.sparse_csr_tensor(). That is, the LHS is a sparse matrix or tensor in Compressed Sparse Row (CSR) format from one of the supported packages: SciPy, CuPy, PyTorch. Refer to the semantics section for details.

  • b – The ndarray/tensor (or sequence of ndarrays/tensors) representing the dense right-hand side (RHS) of the system of equations. The RHS operand may be a (sequence of) numpy.ndarray, cupy.ndarray, or torch.Tensor. Refer to the semantics section for details.

  • stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.

Examples

>>> import cupy as cp
>>> import cupyx.scipy.sparse as sp
>>> import nvmath

Prepare sample input data.

>>> n = 8
>>> a = sp.random(n, n, density=0.15, format="csr", dtype="float64")
>>> a += sp.diags([2.0] * n, format="csr", dtype="float64")

Create the RHS, which can be a matrix or vector in column-major layout.

>>> b = cp.ones((n,), dtype="float64")

Specify, plan, factorize and solve a @ x = b. Use the stateful object as a context manager to automatically release resources.

>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     # Plan the operation.
...     plan_info = solver.plan()
...
...     # Factorize the system.
...     fac_info = solver.factorize()
...
...     # Solve the factorized system for the first result.
...     x1 = solver.solve()
...
...     # Reset the RHS to a new CuPy ndarray.
...     c = cp.random.rand(n, dtype="float64")
...     solver.reset_operands(b=c)
...
...     # Solve for the second result corresponding to the updated operands.
...     x2 = solver.solve()

Tip

If only a subset of the operands is reset, the operands that are not reset retain their original values.

With reset_operands(), minimal overhead is achieved as problem specification and planning are only performed once.

For the particular example above, explicitly calling reset_operands() is equivalent to updating the operand b in-place, i.e., replacing solver.reset_operands(b=c) with b[:] = c.

Danger

Updating the operand in-place can only yield the expected result under the additional constraints below:

  • The operand is on the GPU (more precisely, the operand memory space should be accessible from the execution space).

  • The user has called factorize() if needed (for example, they are not relying on iterative refinement).

Assuming that the constraints above are satisfied, updating an operand in-place is preferred to avoid the extra checks incurred in reset_operands().

For more details, please refer to the reset operand example.

solve(
*,
stream: AnyStream | None = None,
)

Solve the factorized system of equations.

This phase can be optionally configured through the property solution_config (an object of type DirectSolverSolutionConfig).

Parameters:

stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.

Returns:

The result of the specified sparse direct solve, which has the same shape, remains on the same device, and belongs to the same package as the RHS b. If b is a sequence, the result x is also a sequence of ndarray/tensor, each of which has the same shape as the corresponding tensor in b.

Examples

>>> import cupy as cp
>>> import cupyx.scipy.sparse as sp
>>> import nvmath

Prepare sample input data.

>>> n = 8
>>> a = sp.random(n, n, density=0.25, format="csr", dtype="float64")
>>> a += sp.diags([2.0] * n, format="csr", dtype="float64")

Create the RHS, which can be a matrix or vector in column-major layout.

>>> b = cp.ones((n, 2), dtype="float64", order="F")

Specify, plan, factorize, and solve a @ x = b for x. Use the stateful object as a context manager to automatically release resources.

>>> with nvmath.sparse.advanced.DirectSolver(a, b) as solver:
...     # Plan the operation (configure if desired).
...     plan_info = solver.plan()
...     # Factorize the system (configure if desired).
...     fac_info = solver.factorize()
...     # (Optionally) configure the solve.
...     solution_config = solver.solution_config
...     solution_config.ir_num_steps = 10
...     # Solve the system based on the solution configuration set above.
...     x = solver.solve()

Further examples can be found in the nvmath/examples/sparse/advanced/direct_solver directory.

Attributes

factorization_config

An accessor to configure or query the solver factorization phase attributes.

Returns:

A DirectSolverFactorizationConfig object, whose attributes can be set (or queried) to configure the factorization phase.

factorization_info

An accessor to get information about the solver factorization phase (see nvmath.sparse.advanced.DirectSolverFactorizationInfo).

Returns:

A DirectSolverFactorizationInfo object, whose attributes can be queried for information regarding the factorization phase.

plan_config

An accessor to configure or query the solver planning phase attributes.

Returns:

A DirectSolverPlanConfig object, whose attributes can be set (or queried) to configure the planning phase.

plan_info

An accessor to get information about the solver planning phase.

Returns:

A DirectSolverPlanInfo object, whose attributes can be queried for information regarding the planning phase.

solution_config

An accessor to configure or query the solver solution phase attributes.

Returns:

A DirectSolverSolutionConfig object, whose attributes can be set (or queried) to configure the solution phase.