.. _cuDensityMat C++ examples: ******** Examples ******** In this section, we show basic examples on how to define a quantum operator, quantum state, and then compute the action of the quantum operator on a quantum state, and, optionally, backward-differentiate the operator action (compute gradients) with respect to user-provided real parameters parameterizing the operator. For clarity, the quantum operator for each example is defined inside a separate C++ header, specifically ``transverse_ising_full_fused_noisy.h`` and ``transverse_ising_full_fused_noisy_grad.h``, where it is wrapped in a helper C++ class ``UserDefinedLiouvillian``. We also provide a utility header ``helpers.h`` containing convenient GPU array instantiation, initialization, and printing functions. ============== Compiling code ============== Assuming **cuQuantum** has been extracted in ``CUQUANTUM_ROOT`` and **cuTENSOR** is in ``CUTENSOR_ROOT``, we update the library path as follows: .. code-block:: bash export LD_LIBRARY_PATH=${CUQUANTUM_ROOT}/lib:${CUTENSOR_ROOT}/lib/12:${LD_LIBRARY_PATH} Depending on your CUDA Toolkit, you might have to choose a different library version (e.g., ``${CUTENSOR_ROOT}/lib/11``). A serial sample code discussed below (``operator_action_example.cpp``) can be compiled via the following command: .. code-block:: bash nvcc operator_action_example.cpp -I${CUQUANTUM_ROOT}/include -L${CUQUANTUM_ROOT}/lib -L${CUTENSOR_ROOT}/lib/12 -lcudensitymat -lcutensornet -lcutensor -lcublas -o operator_action_example For static linking against the **cuDensityMat** library, use the following command: .. code-block:: bash nvcc operator_action_example.cpp -I${CUQUANTUM_ROOT}/include ${CUQUANTUM_ROOT}/lib/libcudensitymat_static.a ${CUQUANTUM_ROOT}/lib/libcutensornet_static.a -L${CUTENSOR_ROOT}/lib/12 -lcutensor -lcublas -o operator_action_example In order to build a parallel (MPI) version of the example ``operator_action_mpi_example.cpp``, one will need to have a *CUDA-aware* MPI library installed (e.g., recent OpenMPI, MPICH or MVAPICH) and then set the environment variable ``$CUDENSITYMAT_COMM_LIB`` to the path to the MPI interface wrapper shared library ``libcudensitymat_distributed_interface_mpi.so``. The MPI interface wrapper shared library ``libcudensitymat_distributed_interface_mpi.so`` can be built inside the ``${CUQUANTUM_ROOT}/distributed_interfaces`` folder by calling the provided build script there. In order to link the executable to a CUDA-aware MPI library, one will need to add ``-I${MPI_PATH}/include`` and ``-L${MPI_PATH}/lib -lmpi`` to the build command: .. code-block:: bash nvcc operator_action_mpi_example.cpp -I${CUQUANTUM_ROOT}/include -I${MPI_PATH}/include -L${CUQUANTUM_ROOT}/lib -L${CUTENSOR_ROOT}/lib/12 -L${MPI_PATH}/lib -lcudensitymat -lcutensornet -lcutensor -lcublas -lmpi -o operator_action_mpi_example .. warning:: When running ``operator_action_mpi_example.cpp`` with a non-CUDA-aware MPI library, the program will crash. .. note:: Depending on the installation of the **cuQuantum SDK** package, you may need to replace ``lib`` above by ``lib64``, depending which folder name is used inside your **cuQuantum SDK** package. =============================================== Code example (serial execution on a single GPU) =============================================== The following code example illustrates the common steps necessary to use the **cuDensityMat** library to compute the action of a quantum many-body operator on a quantum state. The full sample code can be found in the `NVIDIA/cuQuantum `_ repository (`main serial code `_ and `operator definition `_ as well as the `utility code `_). First let's introduce a helper class to construct a specific quantum many-body operator, for example, the transverse field Ising Hamiltonian with fused ``ZZ`` terms and an additional noise term. Here we choose to make the ``f(t)`` coefficient depend on time and a single user-provided real parameter `Omega`. We use a CPU-side user-defined scalar callback function to define the dependence of the ``f(t)`` coefficient on time and the user-provided real parameter ``Omega``. Note that inside the callback function definition, we explicitly expect the data type to be ``CUDA_C_64F`` (double-precision complex numbers), which applies to the scalar coefficient ``f(t)`` set by the callback function in-place. .. literalinclude:: ../../../density_matrix/examples/transverse_ising_full_fused_noisy.h :language: c++ :linenos: :lineno-match: Now we can use this parameterized quantum many-body operator in our main code to compute the action of the operator on a mixed quantum state. .. literalinclude:: ../../../density_matrix/examples/operator_action_example.cpp :language: c++ :linenos: :lineno-match: ================================================== Code example (parallel execution on multiple GPUs) ================================================== It is straightforward to adapt the `main serial code`_ and enable parallel execution across multiple/many GPU devices (across multiple/many nodes). We will illustrate this with an example using the Message Passing Interface (MPI) as the communication layer. Below we show the minor additions that need to be made in order to enable distributed parallel execution without making any changes to the original serial source code. The full sample code can be found in the `NVIDIA/cuQuantum `_ repository (`main MPI code `_ and `operator definition`_ `as well as the utility code `_). Here is the updated main code for multi-GPU runs. .. literalinclude:: ../../../density_matrix/examples/operator_action_mpi_example.cpp :language: c++ :linenos: :lineno-match: ============================================================= Code example (serial execution with backward differentiation) ============================================================= The following code example illustrates how to use the **cuDensityMat** library to not only compute the action of a quantum many-body operator on a quantum state, but also backward-differentiate it (compute gradients) with respect to user-provided real parameters parameterizing the operator (one real parameter Omega in this example). The full sample code can be found in the `NVIDIA/cuQuantum `_ repository (`main serial gradient code `_ and `operator definition for gradient `_ as well as the `utility code`_). First let's construct a specific quantum many-body operator which, in this case, is a slightly modified version of the quantum many-body operator used in `main serial code`_. Here we make both the ``h(t)`` and ``f(t)`` scalar coefficients depend on time and a single user-provided real parameter ``Omega`` via different (made-up) functional forms. In order to backward-differentiate the operator action with respect to this single user-provided real parameter ``Omega``, we need to manually define a gradient callback function for each regular callback function we have (for ``h(t)`` and ``f(t)`` in this example). In our example, we define two CPU-side scalar gradient callback functions which compute the vector-jacobian product (VJP) of the scalar adjoint of ``h(t)`` and ``f(t)`` with respect to the user-provided real parameter ``Omega``, respectively. A gradient callback function is expected to accumulate the VJP result into the ``paramsGrad`` output array, the final value of which will be the gradient(s) of the user-defined cost function with respect to the user-provided real parameters parameterizing the operator. As before, all regular and gradient callback functions used in our example explicitly expect the data type to be ``CUDA_C_64F`` (double-precision complex numbers). Updating the data type of the operator and quantum states would require updating the callback functions accordingly (``CUDA_C_32F``). .. literalinclude:: ../../../density_matrix/examples/transverse_ising_full_fused_noisy_grad.h :language: c++ :linenos: :lineno-match: Now we can use the defined quantum many-body operator in our main code to compute its action on a mixed quantum state and then backward-differentiate it (compute gradients) with respect to the user-provided real parameter ``Omega``. For the sake of simplicity, we pass a made-up adjoint of the output quantum state to the `cudensitymatOperatorComputeActionBackwardDiff` call, which is just the output quantum state itself (in real scenarios, the adjoint of the output quantum state will depend on the user-chosen cost function and will be provided by the user). Upon completion of the `cudensitymatOperatorComputeActionBackwardDiff` call, the ``paramsGrad`` output argument will contain the gradient of the user-defined cost function with respect to the user-provided real parameter ``Omega``. Additionally, the backward-differentiation API call will also return the adjoint of the input quantum state for cases where the input quantum state implicitly depends on the user-provided real parameters (for example, cases where the input quantum state comes from a previous operator action step, which is typical for time-integration of quantum dynamics master equations). Note that both output arguments, namely ``paramsGrad`` and ``stateInAdj``, are accumulative, i.e., they will be accumulated into (it is user's responsibility to zero them out before the first call). .. literalinclude:: ../../../density_matrix/examples/operator_action_gradient_example.cpp :language: c++ :linenos: :lineno-match: =========== Useful tips =========== * For debugging, one can set the environment variable ``CUDENSITYMAT_LOG_LEVEL=n``. The level ``n`` = 0, 1, ..., 5 corresponds to the logger level as described in the table below. The environment variable ``CUDENSITYMAT_LOG_FILE=`` can be used to redirect the log output to a custom file at ```` instead of ``stdout``. .. list-table:: :widths: 10 25 65 * - **Level** - **Summary** - **Long Description** * - 0 - Off - Logging is disabled (default) * - 1 - Errors - Only errors will be logged * - 2 - Performance Trace - API calls that launch CUDA kernels will log their parameters and important information * - 3 - Performance Hints - Hints that can potentially improve the application's performance * - 4 - Heuristics Trace - Provides general information about the library execution, may contain details about heuristic status * - 5 - API Trace - API calls will log their parameter and important information