cuStateVec Functions¶
Library Management¶
Handle Management API¶
custatevecCreate
¶
-
custatevecStatus_t custatevecCreate(custatevecHandle_t *handle)¶
This function initializes the cuStateVec library and creates a handle on the cuStateVec context. It must be called prior to any other cuStateVec API functions.
- Parameters
handle – [in] the pointer to the handle to the cuStateVec context
custatevecDestroy
¶
-
custatevecStatus_t custatevecDestroy(custatevecHandle_t handle)¶
This function releases resources used by the cuStateVec library.
- Parameters
handle – [in] the handle to the cuStateVec context
custatevecGetDefaultWorkspaceSize
¶
-
custatevecStatus_t custatevecGetDefaultWorkspaceSize(custatevecHandle_t handle, size_t *workspaceSizeInBytes)¶
This function returns the default workspace size defined by the cuStateVec library.
This function returns the default size used for the workspace.
- Parameters
handle – [in] the handle to the cuStateVec context
workspaceSizeInBytes – [out] default workspace size
custatevecSetWorkspace
¶
-
custatevecStatus_t custatevecSetWorkspace(custatevecHandle_t handle, void *workspace, size_t workspaceSizeInBytes)¶
This function sets the workspace used by the cuStateVec library.
This function sets the workspace attached to the handle. The required size of the workspace is obtained by custatevecGetDefaultWorkspaceSize().
By setting a larger workspace, users are able to execute functions without allocating the extra workspace in some functions.
If a device memory handler is set, the
workspace
can be set to null and the workspace is allocated using the user-defined memory pool.- Parameters
handle – [in] the handle to the cuStateVec context
workspace – [in] device pointer to workspace
workspaceSizeInBytes – [in] workspace size
CUDA Stream Management API¶
custatevecSetStream
¶
-
custatevecStatus_t custatevecSetStream(custatevecHandle_t handle, cudaStream_t streamId)¶
This function sets the stream to be used by the cuStateVec library to execute its routine.
- Parameters
handle – [in] the handle to the cuStateVec context
streamId – [in] the stream to be used by the library
custatevecGetStream
¶
-
custatevecStatus_t custatevecGetStream(custatevecHandle_t handle, cudaStream_t *streamId)¶
This function gets the cuStateVec library stream used to execute all calls from the cuStateVec library functions.
- Parameters
handle – [in] the handle to the cuStateVec context
streamId – [out] the stream to be used by the library
Error Management API¶
custatevecGetErrorName
¶
-
const char *custatevecGetErrorName(custatevecStatus_t status)¶
This function returns the name string for the input error code. If the error code is not recognized, “unrecognized error code” is returned.
- Parameters
status – [in] Error code to convert to string
custatevecGetErrorString
¶
-
const char *custatevecGetErrorString(custatevecStatus_t status)¶
This function returns the description string for an error code. If the error code is not recognized, “unrecognized error code” is returned.
- Parameters
status – [in] Error code to convert to string
Logger API¶
custatevecLoggerSetCallback
¶
-
custatevecStatus_t custatevecLoggerSetCallback(custatevecLoggerCallback_t callback)¶
Experimental: This function sets the logging callback function.
- Parameters
callback – [in] Pointer to a callback function. See custatevecLoggerCallback_t.
custatevecLoggerSetCallbackData
¶
-
custatevecStatus_t custatevecLoggerSetCallbackData(custatevecLoggerCallbackData_t callback, void *userData)¶
Experimental: This function sets the logging callback function with user data.
- Parameters
callback – [in] Pointer to a callback function. See custatevecLoggerCallbackData_t.
userData – [in] Pointer to user-provided data.
custatevecLoggerSetFile
¶
-
custatevecStatus_t custatevecLoggerSetFile(FILE *file)¶
Experimental: This function sets the logging output file.
Note
Once registered using this function call, the provided file handle must not be closed unless the function is called again to switch to a different file handle.
- Parameters
file – [in] Pointer to an open file. File should have write permission.
custatevecLoggerOpenFile
¶
-
custatevecStatus_t custatevecLoggerOpenFile(const char *logFile)¶
Experimental: This function opens a logging output file in the given path.
- Parameters
logFile – [in] Path of the logging output file.
custatevecLoggerSetLevel
¶
-
custatevecStatus_t custatevecLoggerSetLevel(int32_t level)¶
Experimental: This function sets the value of the logging level.
Levels are defined as follows:
Level
Summary
Long Description
“0”
Off
logging is disabled (default)
“1”
Errors
only errors will be logged
“2”
Performance Trace
API calls that launch CUDA kernels will log their parameters and important information
“3”
Performance Hints
hints that can potentially improve the application’s performance
“4”
Heuristics Trace
provides general information about the library execution, may contain details about heuristic status
“5”
API Trace
API Trace - API calls will log their parameter and important information
- Parameters
level – [in] Value of the logging level.
custatevecLoggerSetMask
¶
-
custatevecStatus_t custatevecLoggerSetMask(int32_t mask)¶
Experimental: This function sets the value of the logging mask. Masks are defined as a combination of the following masks:
Level
Description
“0”
Off
“1”
Errors
“2”
Performance Trace
“4”
Performance Hints
“8”
Heuristics Trace
“16”
API Trace
- Parameters
mask – [in] Value of the logging mask.
custatevecLoggerForceDisable
¶
-
custatevecStatus_t custatevecLoggerForceDisable()¶
Experimental: This function disables logging for the entire run.
Versioning API¶
custatevecGetProperty
¶
-
custatevecStatus_t custatevecGetProperty(libraryPropertyType type, int32_t *value)¶
This function returns the version information of the cuStateVec library.
- Parameters
type – [in] requested property (
MAJOR_VERSION
,MINOR_VERSION
, orPATCH_LEVEL
).value – [out] value of the requested property.
Memory Management API¶
A stream-ordered memory allocator (or mempool for short) allocates/deallocates memory asynchronously from/to a mempool in a
stream-ordered fashion, meaning memory operations and computations enqueued on the streams have a well-defined inter- and intra-
stream dependency. There are several well-implemented stream-ordered mempools available, such as cudaMemPool_t
that is built-in
at the CUDA driver level since CUDA 11.2 (so that all CUDA applications in the same process can easily share the same pool,
see here) and the RAPIDS
Memory Manager (RMM). For a detailed introduction, see the NVIDIA Developer Blog.
The new device memory handler APIs allow users to bind a stream-ordered mempool to the library handle, such that cuStateVec can take care of most of the memory management for users. Below is an illustration of what can be done:
MyMemPool pool = MyMemPool(); // kept alive for the entire process in real apps
int my_alloc(void* ctx, void** ptr, size_t size, cudaStream_t stream) {
return reinterpret_cast<MyMemPool*>(ctx)->alloc(ptr, size, stream);
}
int my_dealloc(void* ctx, void* ptr, size_t size, cudaStream_t stream) {
return reinterpret_cast<MyMemPool*>(ctx)->dealloc(ptr, size, stream);
}
// create a mem handler and fill in the required members for the library to use
custatevecDeviceMemHandler_t handler;
handler.ctx = reinterpret_cast<void*>(&pool);
handler.device_alloc = my_alloc;
handler.device_free = my_dealloc;
memcpy(handler.name, std::string("my pool").c_str(), CUSTATEVEC_ALLOCATOR_NAME_LEN);
// bind the handler to the library handle
custatevecSetDeviceMemHandler(handle, &handler);
/* ... use gate application as usual ... */
// User doesn’t compute the required sizes
// User doesn’t query the workspace size (but one can if desired)
// User doesn’t allocate memory!
// User sets null pointer to indicate the library should draw memory from the user's pool;
void* extraWorkspace = nullptr;
size_t extraWorkspaceInBytes = 0;
custatevecApplyMatrix(
handle, sv, svDataType, nIndexBits, matrix, matrixDataType, layout,
adjoint, targets, nTargets, controls, nControls, controlBitValues,
computeType, extraWorkspace, extraWorkspaceSizeInBytes);
// User doesn’t deallocate memory!
As shown above, several calls to the workspace-related APIs can be skipped. Moreover, allowing the library to share your memory pool not only can alleviate potential memory conflicts, but also enable possible optimizations.
In the current release, only a device mempool can be bound.
custatevecSetDeviceMemHandler
¶
-
custatevecStatus_t custatevecSetDeviceMemHandler(custatevecHandle_t handle, const custatevecDeviceMemHandler_t *handler)¶
Set the current device memory handler.
Once set, when cuStateVec needs device memory in various API calls it will allocate from the user-provided memory pool and deallocate at completion. See custatevecDeviceMemHandler_t and APIs that require extra workspace for further detail.
The internal stream order is established using the user-provided stream set via custatevecSetStream().
If
handler
argument is set to nullptr, the library handle will detach its existing memory handler.Warning
It is undefined behavior for the following scenarios:
the library handle is bound to a memory handler and subsequently to another handler
the library handle outlives the attached memory pool
the memory pool is not stream-ordered
- Parameters
handle – [in] Opaque handle holding cuStateVec’s library context.
handler – [in] the device memory handler that encapsulates the user’s mempool. The struct content is copied internally.
custatevecGetDeviceMemHandler
¶
-
custatevecStatus_t custatevecGetDeviceMemHandler(custatevecHandle_t handle, custatevecDeviceMemHandler_t *handler)¶
Get the current device memory handler.
- Parameters
handle – [in] Opaque handle holding cuStateVec’s library context.
handler – [out] If previously set, the struct pointed to by
handler
is filled in, otherwise CUSTATEVEC_STATUS_NO_DEVICE_ALLOCATOR is returned.
Gate Application¶
General Matrices¶
cuStateVec API custatevecApplyMatrix()
can apply a matrix representing a gate to a state vector.
The API may require external workspace for large matrices,
and custatevecApplyMatrixGetWorkspaceSize()
provides the size of workspace.
If a device memory handler is set, custatevecApplyMatrixGetWorkspaceSize()
can be skipped.
Use case¶
// check the size of external workspace
custatevecApplyMatrixGetWorkspaceSize(
handle, svDataType, nIndexBits, matrix, matrixDataType, layout, adjoint, nTargets,
nControls, computeType, &extraWorkspaceSizeInBytes);
// allocate external workspace if necessary
void* extraWorkspace = nullptr;
if (extraWorkspaceSizeInBytes > 0)
cudaMalloc(&extraWorkspace, extraWorkspaceSizeInBytes);
// apply gate
custatevecApplyMatrix(
handle, sv, svDataType, nIndexBits, matrix, matrixDataType, layout,
adjoint, targets, nTargets, controls, controlBitValues, nControls,
computeType, extraWorkspace, extraWorkspaceSizeInBytes);
API reference¶
custatevecApplyMatrixGetWorkspaceSize
¶
-
custatevecStatus_t custatevecApplyMatrixGetWorkspaceSize(custatevecHandle_t handle, cudaDataType_t svDataType, const uint32_t nIndexBits, const void *matrix, cudaDataType_t matrixDataType, custatevecMatrixLayout_t layout, const int32_t adjoint, const uint32_t nTargets, const uint32_t nControls, custatevecComputeType_t computeType, size_t *extraWorkspaceSizeInBytes)¶
This function gets the required workspace size for custatevecApplyMatrix().
This function returns the required extra workspace size to execute custatevecApplyMatrix().
extraWorkspaceSizeInBytes
will be set to 0 if no extra buffer is required for a given set of arguments.- Parameters
handle – [in] the handle to the cuStateVec context
svDataType – [in] Data type of the state vector
nIndexBits – [in] the number of index bits of the state vector
matrix – [in] host or device pointer to a matrix
matrixDataType – [in] data type of matrix
layout – [in] enumerator specifying the memory layout of matrix
adjoint – [in] apply adjoint of matrix
nTargets – [in] the number of target bits
nControls – [in] the number of control bits
computeType – [in] computeType of matrix multiplication
extraWorkspaceSizeInBytes – [out] workspace size
custatevecApplyMatrix
¶
-
custatevecStatus_t custatevecApplyMatrix(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, const void *matrix, cudaDataType_t matrixDataType, custatevecMatrixLayout_t layout, const int32_t adjoint, const int32_t *targets, const uint32_t nTargets, const int32_t *controls, const int32_t *controlBitValues, const uint32_t nControls, custatevecComputeType_t computeType, void *extraWorkspace, size_t extraWorkspaceSizeInBytes)¶
Apply gate matrix.
Apply gate matrix to a state vector. The state vector size is \(2^\text{nIndexBits}\).
The matrix argument is a host or device pointer of a 2-dimensional array for a square matrix. The size of matrix is ( \(2^\text{nTargets} \times 2^\text{nTargets}\) ) and the value type is specified by the
matrixDataType
argument. Thelayout
argument specifies the matrix layout which can be in either the row-major or column-major order. Thetargets
andcontrols
arguments specify target and control bit positions in the state vector index.The
controlBitValues
argument specifies bit values of control bits. The ordering ofcontrolBitValues
is specified by thecontrols
argument. If a null pointer is specified to this argument, all control bit values are set to 1.By definition, bit positions in
targets
andcontrols
arguments should not overlap.This function may return CUSTATEVEC_STATUS_INSUFFICIENT_WORKSPACE for large
nTargets
. In such cases, theextraWorkspace
andextraWorkspaceSizeInBytes
arguments should be specified to provide extra workspace. The size of required extra workspace is obtained by calling custatevecApplyMatrixGetWorkspaceSize(). A null pointer can be passed to theextraWorkspace
argument if no extra workspace is required. Also, if a device memory handler is set, theextraWorkspace
can be set to null, and theextraWorkspaceSizeInBytes
can be set to 0.- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] state vector
svDataType – [in] data type of the state vector
nIndexBits – [in] the number of index bits of the state vector
matrix – [in] host or device pointer to a square matrix
matrixDataType – [in] data type of matrix
layout – [in] enumerator specifying the memory layout of matrix
adjoint – [in] apply adjoint of matrix
targets – [in] pointer to a host array of target bits
nTargets – [in] the number of target bits
controls – [in] pointer to a host array of control bits
controlBitValues – [in] pointer to a host array of control bit values
nControls – [in] the number of control bits
computeType – [in] computeType of matrix multiplication
extraWorkspace – [in] extra workspace
extraWorkspaceSizeInBytes – [in] extra workspace size
Pauli Matrices¶
Exponential of a tensor product of Pauli matrices can be expressed as follows:
Matrix \(P_{target[i]}\) can be either of Pauli matrices \(I\), \(X\), \(Y\), and \(Z\), which are corresponding to the custatevecPauli_t
enums CUSTATEVEC_PAULI_I
, CUSTATEVEC_PAULI_X
, CUSTATEVEC_PAULI_Y
, and CUSTATEVEC_PAULI_Z
, respectively.
Also refer to custatevecPauli_t for details.
Use case¶
// apply exponential
custatevecApplyPauliRotation(
handle, sv, svDataType, nIndexBits, theta, paulis, targets, nTargets,
controls, controlBitValues, nControls);
API reference¶
custatevecApplyPauliRotation
¶
-
custatevecStatus_t custatevecApplyPauliRotation(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, double theta, const custatevecPauli_t *paulis, const int32_t *targets, const uint32_t nTargets, const int32_t *controls, const int32_t *controlBitValues, const uint32_t nControls)¶
Apply the exponential of a multi-qubit Pauli operator.
Apply exponential of a tensor product of Pauli bases specified by bases, \( e^{i \theta P} \), where \(P\) is the product of Pauli bases. The
paulis
,targets
, andnTargets
arguments specify Pauli bases and their bit positions in the state vector index.At least one target and a corresponding Pauli basis should be specified.
The
controls
andnControls
arguments specifies the control bit positions in the state vector index.The
controlBitValues
argument specifies bit values of control bits. The ordering ofcontrolBitValues
is specified by thecontrols
argument. If a null pointer is specified to this argument, all control bit values are set to 1.- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] state vector
svDataType – [in] data type of the state vector
nIndexBits – [in] the number of bits in the state vector index
theta – [in] theta
paulis – [in] host pointer to custatevecPauli_t array
targets – [in] pointer to a host array of target bits
nTargets – [in] the number of target bits
controls – [in] pointer to a host array of control bits
controlBitValues – [in] pointer to a host array of control bit values
nControls – [in] the number of control bits
Generalized Permutation Matrices¶
A generalized permutation matrix can be expressed as the multiplication of a permutation matrix \(P\) and a diagonal matrix \(D\). For instance, we can decompose a 4 \(\times\) 4 generalized permutation matrix \(A\) as follows:
, where
When \(P\) is diagonal, the generalized permutation matrix is also diagonal. Similarly, when \(D\) is the identity matrix, the generalized permutation matrix becomes a permutation matrix.
The cuStateVec API custatevecApplyGeneralizedPermutationMatrix()
applies a generalized permutation matrix like \(A\) to a state vector.
The API may require extra workspace for large matrices,
whose size can be queried using custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize()
.
If a device memory handler is set, custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize()
can be skipped.
Use case¶
// check the size of external workspace
custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize(
handle, CUDA_C_64F, nIndexBits, permutation, diagonals, CUDA_C_64F, targets,
nTargets, nControls, &extraWorkspaceSizeInBytes);
// allocate external workspace if necessary
void* extraWorkspace = nullptr;
if (extraWorkspaceSizeInBytes > 0)
cudaMalloc(&extraWorkspace, extraWorkspaceSizeInBytes);
// apply a generalized permutation matrix
custatevecApplyGeneralizedPermutationMatrix(
handle, d_sv, CUDA_C_64F, nIndexBits, permutation, diagonals, CUDA_C_64F,
adjoint, targets, nTargets, controls, controlBitValues, nControls,
extraWorkspace, extraWorkspaceSizeInBytes);
The operation is equivalent to the following:
// sv, sv_temp: the state vector and temporary buffer.
int64_t sv_size = int64_t{1} << nIndexBits;
for (int64_t sv_idx = 0; sv_idx < sv_size; sv_idx++) {
// The basis of sv_idx is converted to permutation basis to obtain perm_idx
auto perm_idx = convertToPermutationBasis(sv_idx);
// apply generalized permutation matrix
if (adjoint == 0)
sv_temp[sv_idx] = sv[permutation[perm_idx]] * diagonals[perm_idx];
else
sv_temp[permutation[perm_idx]] = sv[sv_idx] * conj(diagonals[perm_idx]);
}
for (int64_t sv_idx = 0; sv_idx < sv_size; sv_idx++)
sv[sv_idx] = sv_temp[sv_idx];
API reference¶
custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize
¶
-
custatevecStatus_t custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize(custatevecHandle_t handle, cudaDataType_t svDataType, const uint32_t nIndexBits, const custatevecIndex_t *permutation, const void *diagonals, cudaDataType_t diagonalsDataType, const int32_t *targets, const uint32_t nTargets, const uint32_t nControls, size_t *extraWorkspaceSizeInBytes)¶
Get the extra workspace size required by custatevecApplyGeneralizedPermutationMatrix().
This function gets the size of extra workspace size required to execute custatevecApplyGeneralizedPermutationMatrix().
extraWorkspaceSizeInBytes
will be set to 0 if no extra buffer is required for a given set of arguments.- Parameters
handle – [in] the handle to the cuStateVec library
svDataType – [in] data type of the state vector
nIndexBits – [in] the number of index bits of the state vector
permutation – [in] host or device pointer to a permutation table
diagonals – [in] host or device pointer to diagonal elements
diagonalsDataType – [in] data type of diagonals
targets – [in] pointer to a host array of target bits
nTargets – [in] the number of target bits
nControls – [in] the number of control bits
extraWorkspaceSizeInBytes – [out] extra workspace size
custatevecApplyGeneralizedPermutationMatrix
¶
-
custatevecStatus_t custatevecApplyGeneralizedPermutationMatrix(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, custatevecIndex_t *permutation, const void *diagonals, cudaDataType_t diagonalsDataType, const int32_t adjoint, const int32_t *targets, const uint32_t nTargets, const int32_t *controls, const int32_t *controlBitValues, const uint32_t nControls, void *extraWorkspace, size_t extraWorkspaceSizeInBytes)¶
Apply generalized permutation matrix.
This function applies the generalized permutation matrix.
The generalized permutation matrix, \(A\), is expressed as \(A = DP\), where \(D\) and \(P\) are diagonal and permutation matrices, respectively.
The permutation matrix, \(P\), is specified as a permutation table which is an array of custatevecIndex_t and passed to the
permutation
argument.The diagonal matrix, \(D\), is specified as an array of diagonal elements. The length of both arrays is \( 2^{{\text nTargets}} \). The
diagonalsDataType
argument specifies the type of diagonal elements.Below is the table of combinations of
svDataType
anddiagonalsDataType
arguments available in this version.svDataType
diagonalsDataType
CUDA_C_F64
CUDA_C_F64
CUDA_C_F32
CUDA_C_F64
CUDA_C_F32
CUDA_C_F32
This function can also be used to only apply either the diagonal or the permutation matrix. By passing a null pointer to the
permutation
argument, \(P\) is treated as an identity matrix, thus, only the diagonal matrix \(D\) is applied. Likewise, if a null pointer is passed to thediagonals
argument, \(D\) is treated as an identity matrix, and only the permutation matrix \(P\) is applied.The permutation argument should hold integers in [0, \( 2^{nTargets} \)). An integer should appear only once, otherwise the behavior of this function is undefined.
The
permutation
anddiagonals
arguments should not be null at the same time. In this case, this function returns CUSTATEVEC_STATUS_INVALID_VALUE.This function may return CUSTATEVEC_STATUS_INSUFFICIENT_WORKSPACE for large
nTargets
ornIndexBits
. In such cases, theextraWorkspace
andextraWorkspaceSizeInBytes
arguments should be specified to provide extra workspace. The size of required extra workspace is obtained by calling custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize().A null pointer can be passed to the
extraWorkspace
argument if no extra workspace is required. Also, if a device memory handler is set, theextraWorkspace
can be set to null, and theextraWorkspaceSizeInBytes
can be set to 0.Note
In this version, custatevecApplyGeneralizedPermutationMatrix() does not return error if an invalid
permutation
argument is specified.- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] state vector
svDataType – [in] data type of the state vector
nIndexBits – [in] the number of index bits of the state vector
permutation – [in] host or device pointer to a permutation table
diagonals – [in] host or device pointer to diagonal elements
diagonalsDataType – [in] data type of diagonals
adjoint – [in] apply adjoint of generalized permutation matrix
targets – [in] pointer to a host array of target bits
nTargets – [in] the number of target bits
controls – [in] pointer to a host array of control bits
controlBitValues – [in] pointer to a host array of control bit values
nControls – [in] the number of control bits
extraWorkspace – [in] extra workspace
extraWorkspaceSizeInBytes – [in] extra workspace size
Measurement¶
Measurement on Z-bases¶
Let us consider the measurement of an \(nIndexBits\)-qubit state vector \(sv\) on an \(nBasisBits\)-bit Z product basis \(basisBits\).
The sums of squared absolute values of state vector elements on the Z product basis, \(abs2sum0\) and \(abs2sum1\), are obtained by the followings:
Therefore, probabilities to obtain parity 0 and 1 are expressed in the following expression:
Depending on the measurement result, the state vector is collapsed. If parity is equal to 0, we obtain the following vector:
and if parity is equal to 1, we obtain the following vector:
where \(norm\) is the normalization factor.
Use case¶
We can measure by custatevecMeasureOnZBasis()
as follows:
// measure on a Z basis
custatevecMeasureOnZBasis(
handle, sv, svDataType, nIndexBits, &parity, basisBits, nBasisBits,
randnum, collapse);
The operation is equivalent to the following:
// compute the sums of squared absolute values of state vector elements
// on a Z product basis
double abs2sum0, abs2sum1;
custatevecAbs2SumOnZBasis(
handle, sv, svDataType, nIndexBits, &abs2sum0, &abs2sum1, basisBits,
nBasisBits);
// [User] compute parity and norm
double abs2sum = abs2sum0 + abs2sum1;
int parity = (randnum * abs2sum < abs2sum0) ? 0 : 1;
double norm = (parity == 0) ? abs2sum0 : abs2sum1;
// collapse if necessary
switch (collapse) {
case CUSTATEVEC_COLLAPSE_NORMALIZE_AND_ZERO:
custatevecCollapseOnZBasis(
handle, sv, svDataType, nIndexBits, parity, basisBits, nBasisBits,
norm);
break; /* collapse */
case CUSTATEVEC_COLLAPSE_NONE:
break; /* Do nothing */
API reference¶
custatevecAbs2SumOnZBasis
¶
-
custatevecStatus_t custatevecAbs2SumOnZBasis(custatevecHandle_t handle, const void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, double *abs2sum0, double *abs2sum1, const int32_t *basisBits, const uint32_t nBasisBits)¶
Calculates the sum of squared absolute values on a given Z product basis.
This function calculates sums of squared absolute values on a given Z product basis. If a null pointer is specified to
abs2sum0
orabs2sum1
, the sum for the corresponding value is not calculated. Since the sum of (abs2sum0
+abs2sum1
) is identical to the norm of the state vector, one can calculate the probability where parity == 0 as (abs2sum0
/ (abs2sum0
+abs2sum1
)).- Parameters
handle – [in] the handle to the cuStateVec library
sv – [in] state vector
svDataType – [in] data type of state vector
nIndexBits – [in] the number of index bits
abs2sum0 – [out] pointer to a host or device variable to store the sum of squared absolute values for parity == 0
abs2sum1 – [out] pointer to a host or device variable to store the sum of squared absolute values for parity == 1
basisBits – [in] pointer to a host array of Z-basis index bits
nBasisBits – [in] the number of basisBits
custatevecCollapseOnZBasis
¶
-
custatevecStatus_t custatevecCollapseOnZBasis(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, const int32_t parity, const int32_t *basisBits, const uint32_t nBasisBits, double norm)¶
Collapse state vector on a given Z product basis.
This function collapses state vector on a given Z product basis. The state elements that match the parity argument are scaled by a factor specified in the
norm
argument. Other elements are set to zero.- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] state vector
svDataType – [in] data type of state vector
nIndexBits – [in] the number of index bits
parity – [in] parity, 0 or 1
basisBits – [in] pointer to a host array of Z-basis index bits
nBasisBits – [in] the number of Z basis bits
norm – [in] normalization factor
custatevecMeasureOnZBasis
¶
-
custatevecStatus_t custatevecMeasureOnZBasis(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, int32_t *parity, const int32_t *basisBits, const uint32_t nBasisBits, const double randnum, enum custatevecCollapseOp_t collapse)¶
Measurement on a given Z-product basis.
This function does measurement on a given Z product basis. The measurement result is the parity of the specified Z product basis. At least one basis bit should be specified, otherwise this function fails.
If CUSTATEVEC_COLLAPSE_NONE is specified for the collapse argument, this function only returns the measurement result without collapsing the state vector. If CUSTATEVEC_COLLAPSE_NORMALIZE_AND_ZERO is specified, this function collapses the state vector as custatevecCollapseOnZBasis() does.
If a random number is not in [0, 1), this function returns CUSTATEVEC_STATUS_INVALID_VALUE. At least one basis bit should be specified, otherwise this function returns CUSTATEVEC_STATUS_INVALID_VALUE.
- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] state vector
svDataType – [in] data type of state vector
nIndexBits – [in] the number of index bits
parity – [out] parity, 0 or 1
basisBits – [in] pointer to a host array of Z basis bits
nBasisBits – [in] the number of Z basis bits
randnum – [in] random number, [0, 1).
collapse – [in] Collapse operation
Batched Single Qubit Measurement¶
Assume that we measure an \(nIndexBits\)-qubits state vector \(sv\) with a \(bitOrderingLen\)-bits bit string \(bitOrdering\).
The sums of squared absolute values of state vector elements are obtained by the following:
where \(idx = b_{BitOrderingLen-1}\cdots b_1 b_0\), \(i = b_{bitOrdering[BitOrderingLen-1]} \cdots b_{bitOrdering[1]} b_{bitOrdering[0]}\), \(b_p \in \{0, 1\}\).
Therefore, probability to obtain the \(idx\)-th pattern of bits are expressed in the following expression:
Depending on the measurement result, the state vector is collapsed.
If \(idx\) satisfies \((idx \ \& \ bitString) = idx\), we obtain \(sv[idx] = \dfrac{1}{\sqrt{norm}} sv[idx]\). Otherwise, \(sv[idx] = 0\), where \(norm\) is the normalization factor.
Use case¶
We can measure by custatevecBatchMeasure()
as follows:
// measure with a bit string
custatevecBatchMeasure(
handle, sv, svDataType, nIndexBits, bitString, bitOrdering, bitStringLen,
randnum, collapse);
The operation is equivalent to the following:
// compute the sums of squared absolute values of state vector elements
int maskLen = 0;
int* maskBitString = nullptr;
int* maskOrdering = nullptr;
custatevecAbs2SumArray(
handle, sv, svDataType, nIndexBits, abs2Sum, bitOrdering, bitOrderingLen,
maskBitString, maskOrdering, maskLen);
// [User] compute a cumulative sum and choose bitString by a random number
// collapse if necessary
switch (collapse) {
case CUSTATEVEC_COLLAPSE_NORMALIZE_AND_ZERO:
custatevecCollapseByBitString(
handle, sv, svDataType, nIndexBits, bitString, bitOrdering,
bitStringLen, norm);
break; /* collapse */
case CUSTATEVEC_COLLAPSE_NONE:
break; /* Do nothing */
For multi-GPU computations, custatevecBatchMeasureWithOffset()
is available.
This function works on one device, and users are required to compute
the cumulative array of squared absolute values of state vector elements beforehand.
// The state vector is divided to nSubSvs sub state vectors.
// each state vector has its own ordinal and nLocalBits index bits.
// The ordinals of sub state vectors correspond to the extended index bits.
// In this example, all the local qubits are measured and collapsed.
// get abs2sum for each sub state vector
double abs2SumArray[nSubSvs];
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
custatevecAbs2SumArray(
handle[iSv], d_sv[iSv], CUDA_C_64F, nLocalBits, &abs2SumArray[iSv], nullptr,
0, nullptr, nullptr, 0);
}
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
cudaDeviceSynchronize();
}
// get cumulative array
double cumulativeArray[nSubSvs + 1];
cumulativeArray[0] = 0.0;
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cumulativeArray[iSv + 1] = cumulativeArray[iSv] + abs2SumArray[iSv];
}
// measurement
for (int iSv = 0; iSv < nSubSvs; iSv++) {
// detect which sub state vector will be used for measurement.
if (cumulativeArray[iSv] <= randnum && randnum < cumulativeArray[iSv + 1]) {
double norm = cumulativeArray[nSubSvs];
double offset = cumulativeArray[iSv];
cudaSetDevice(devices[iSv]);
// measure local qubits. Here the state vector will not be collapsed.
// Only local qubits can be included in bitOrdering and bitString arguments.
// That is, bitOrdering = {0, 1, 2, ..., nLocalBits - 1} and
// bitString will store values of local qubits as an array of integers.
custatevecBatchMeasureWithOffset(
handle[iSv], d_sv[iSv], CUDA_C_64F, nLocalBits, bitString, bitOrdering,
bitStringLen, randnum, CUSTATEVEC_COLLAPSE_NONE, offset, norm);
}
}
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
cudaDeviceSynchronize();
}
// get abs2Sum after collapse
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
custatevecAbs2SumArray(
handle[iSv], d_sv[iSv], CUDA_C_64F, nLocalBits, &abs2SumArray[iSv], nullptr,
0, bitString, bitOrdering, bitStringLen);
}
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
cudaDeviceSynchronize();
}
// get norm after collapse
double norm = 0.0;
for (int iSv = 0; iSv < nSubSvs; iSv++) {
norm += abs2SumArray[iSv];
}
// collapse sub state vectors
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
custatevecCollapseByBitString(
handle[iSv], d_sv[iSv], CUDA_C_64F, nLocalBits, bitString, bitOrdering,
bitStringLen, norm);
}
// destroy handle
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
custatevecDestroy(handle[iSv]);
}
Please refer to NVIDIA/cuQuantum repository for further detail.
API reference¶
custatevecAbs2SumArray
¶
-
custatevecStatus_t custatevecAbs2SumArray(custatevecHandle_t handle, const void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, double *abs2sum, const int32_t *bitOrdering, const uint32_t bitOrderingLen, const int32_t *maskBitString, const int32_t *maskOrdering, const uint32_t maskLen)¶
Calculate abs2sum array for a given set of index bits.
Calculates an array of sums of squared absolute values of state vector elements. The abs2sum array can be on host or device. The index bit ordering abs2sum array is specified by the
bitOrdering
andbitOrderingLen
arguments. Unspecified bits are folded (summed up).The
maskBitString
,maskOrdering
andmaskLen
arguments set bit mask in the state vector index. The abs2sum array is calculated by using state vector elements whose indices match the mask bit string. If themaskLen
argument is 0, null pointers can be specified to themaskBitString
andmaskOrdering
arguments, and all state vector elements are used for calculation.By definition, bit positions in
bitOrdering
andmaskOrdering
arguments should not overlap.The empty
bitOrdering
can be specified to calculate the norm of state vector. In this case, 0 is passed to thebitOrderingLen
argument and thebitOrdering
argument can be a null pointer.Note
Since the size of abs2sum array is proportional to \( 2^{bitOrderingLen} \) , the max length of
bitOrdering
depends on the amount of available memory andmaskLen
.- Parameters
handle – [in] the handle to the cuStateVec library
sv – [in] state vector
svDataType – [in] data type of state vector
nIndexBits – [in] the number of index bits
abs2sum – [out] pointer to a host or device array of sums of squared absolute values
bitOrdering – [in] pointer to a host array of index bit ordering
bitOrderingLen – [in] the length of bitOrdering
maskBitString – [in] pointer to a host array for a bit string to specify mask
maskOrdering – [in] pointer to a host array for the mask ordering
maskLen – [in] the length of mask
custatevecCollapseByBitString
¶
-
custatevecStatus_t custatevecCollapseByBitString(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, const int32_t *bitString, const int32_t *bitOrdering, const uint32_t bitStringLen, double norm)¶
Collapse state vector to the state specified by a given bit string.
This function collapses state vector to the state specified by a given bit string. The state vector elements specified by the
bitString
,bitOrdering
andbitStringLen
arguments are normalized by thenorm
argument. Other elements are set to zero.At least one basis bit should be specified, otherwise this function returns CUSTATEVEC_STATUS_INVALID_VALUE.
- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] state vector
svDataType – [in] data type of state vector
nIndexBits – [in] the number of index bits
bitString – [in] pointer to a host array of bit string
bitOrdering – [in] pointer to a host array of bit string ordering
bitStringLen – [in] length of bit string
norm – [in] normalization constant
custatevecBatchMeasure
¶
-
custatevecStatus_t custatevecBatchMeasure(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, int32_t *bitString, const int32_t *bitOrdering, const uint32_t bitStringLen, const double randnum, enum custatevecCollapseOp_t collapse)¶
Batched single qubit measurement.
This function does batched single qubit measurement and returns a bit string. The
bitOrdering
argument specifies index bits to be measured. The measurement result is stored inbitString
in the ordering specified by thebitOrdering
argument.If CUSTATEVEC_COLLAPSE_NONE is specified for the
collapse
argument, this function only returns the measured bit string without collapsing the state vector. When CUSTATEVEC_COLLAPSE_NORMALIZE_AND_ZERO is specified, this function collapses the state vector as custatevecCollapseByBitString() does.If a random number is not in [0, 1), this function returns CUSTATEVEC_STATUS_INVALID_VALUE. At least one basis bit should be specified, otherwise this function returns CUSTATEVEC_STATUS_INVALID_VALUE.
- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] state vector
svDataType – [in] data type of the state vector
nIndexBits – [in] the number of index bits
bitString – [out] pointer to a host array of measured bit string
bitOrdering – [in] pointer to a host array of bit string ordering
bitStringLen – [in] length of bitString
randnum – [in] random number, [0, 1).
collapse – [in] Collapse operation
custatevecBatchMeasureWithOffset
¶
-
custatevecStatus_t custatevecBatchMeasureWithOffset(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, int32_t *bitString, const int32_t *bitOrdering, const uint32_t bitStringLen, const double randnum, enum custatevecCollapseOp_t collapse, const double offset, const double abs2sum)¶
Batched single qubit measurement for partial vector.
This function does batched single qubit measurement and returns a bit string. The
bitOrdering
argument specifies index bits to be measured. The measurement result is stored inbitString
in the ordering specified by thebitOrdering
argument.If CUSTATEVEC_COLLAPSE_NONE is specified for the collapse argument, this function only returns the measured bit string without collapsing the state vector. When CUSTATEVEC_COLLAPSE_NORMALIZE_AND_ZERO is specified, this function collapses the state vector as custatevecCollapseByBitString() does.
This function assumes that
sv
is partial state vector and drops some most significant bits. Prefix sums for lower indices and the entire state vector must be provided asoffset
andabs2sum
, respectively. Whenoffset
==abs2sum
== 0, this function behaves in the same way as custatevecBatchMeasure().If a random number is not in [0, 1), this function returns CUSTATEVEC_STATUS_INVALID_VALUE. At least one basis bit should be specified, otherwise this function returns CUSTATEVEC_STATUS_INVALID_VALUE.
- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] partial state vector
svDataType – [in] data type of the state vector
nIndexBits – [in] the number of index bits
bitString – [out] pointer to a host array of measured bit string
bitOrdering – [in] pointer to a host array of bit string ordering
bitStringLen – [in] length of bitString
randnum – [in] random number, [0, 1).
collapse – [in] Collapse operation
offset – [in] partial sum of squared absolute values
abs2sum – [in] sum of squared absolute values for the entire state vector
Expectation¶
Expectation via a Matrix¶
Expectation performs the following operation:
where \(\ket{\phi}\) is a state vector and \(A\) is a matrix or an observer.
The API for expectation custatevecComputeExpectation()
may require external workspace for large matrices,
and custatevecComputeExpectationGetWorkspaceSize()
provides the size of workspace.
If a device memory handler is set, custatevecComputeExpectationGetWorkspaceSize()
can be skipped.
Use case¶
// check the size of external workspace
custatevecComputeExpectationGetWorkspaceSize(
handle, svDataType, nIndexBits, matrix, matrixDataType, layout, nBasisBits, computeType,
&extraWorkspaceSizeInBytes);
// allocate external workspace if necessary
void* extraWorkspace = nullptr;
if (extraWorkspaceSizeInBytes > 0)
cudaMalloc(&extraWorkspace, extraWorkspaceSizeInBytes);
// perform expectation
custatevecComputeExpectation(
handle, sv, svDataType, nIndexBits, expect, expectDataType, residualNorm,
matrix, matrixDataType, layout, basisBits, nBasisBits, computeType,
extraWorkspace, extraWorkspaceSizeInBytes);
API reference¶
custatevecComputeExpectationGetWorkspaceSize
¶
-
custatevecStatus_t custatevecComputeExpectationGetWorkspaceSize(custatevecHandle_t handle, cudaDataType_t svDataType, const uint32_t nIndexBits, const void *matrix, cudaDataType_t matrixDataType, custatevecMatrixLayout_t layout, const uint32_t nBasisBits, custatevecComputeType_t computeType, size_t *extraWorkspaceSizeInBytes)¶
This function gets the required workspace size for custatevecComputeExpectation().
This function returns the size of the extra workspace required to execute custatevecComputeExpectation().
extraWorkspaceSizeInBytes
will be set to 0 if no extra buffer is required.- Parameters
handle – [in] the handle to the cuStateVec context
svDataType – [in] Data type of the state vector
nIndexBits – [in] the number of index bits of the state vector
matrix – [in] host or device pointer to a matrix
matrixDataType – [in] data type of matrix
layout – [in] enumerator specifying the memory layout of matrix
nBasisBits – [in] the number of target bits
computeType – [in] computeType of matrix multiplication
extraWorkspaceSizeInBytes – [out] size of the extra workspace
custatevecComputeExpectation
¶
-
custatevecStatus_t custatevecComputeExpectation(custatevecHandle_t handle, const void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, void *expectationValue, cudaDataType_t expectationDataType, double *residualNorm, const void *matrix, cudaDataType_t matrixDataType, custatevecMatrixLayout_t layout, const int32_t *basisBits, const uint32_t nBasisBits, custatevecComputeType_t computeType, void *extraWorkspace, size_t extraWorkspaceSizeInBytes)¶
Compute expectation of matrix observable.
This function calculates expectation for a given matrix observable. The acceptable values for the
expectationDataType
argument are CUDA_R_64F and CUDA_C_64F.The
basisBits
andnBasisBits
arguments specify the basis to calculate expectation. For thecomputeType
argument, the same combinations for custatevecApplyMatrix() are available.This function may return CUSTATEVEC_STATUS_INSUFFICIENT_WORKSPACE for large
nBasisBits
. In such cases, theextraWorkspace
andextraWorkspaceSizeInBytes
arguments should be specified to provide extra workspace. The size of required extra workspace is obtained by calling custatevecComputeExpectationGetWorkspaceSize(). A null pointer can be passed to theextraWorkspace
argument if no extra workspace is required. Also, if a device memory handler is set, theextraWorkspace
can be set to null, and theextraWorkspaceSizeInBytes
can be set to 0.Note
The
residualNorm
argument is not available in this version. If a matrix given by the matrix argument may not be a Hermitian matrix, please specify CUDA_C_64F to theexpectationDataType
argument and check the imaginary part of the calculated expectation value.- Parameters
handle – [in] the handle to the cuStateVec library
expectationValue – [out] host pointer to a variable to store an expectation value
expectationDataType – [in] data type of expect
residualNorm – [out] result of matrix type test
sv – [in] state vector
svDataType – [in] data type of the state vector
nIndexBits – [in] the number of index bits of the state vector
matrix – [in] observable as matrix
matrixDataType – [in] data type of matrix
layout – [in] matrix memory layout
basisBits – [in] pointer to a host array of basis index bits
nBasisBits – [in] the number of basis bits
computeType – [in] computeType of matrix multiplication
extraWorkspace – [in] pointer to an extra workspace
extraWorkspaceSizeInBytes – [in] the size of extra workspace
Expectation on Pauli Basis¶
cuStateVec API custatevecComputeExpectationsOnPauliBasis()
computes expectation values for a batch of Pauli strings.
Each observable can be expressed as follows:
Each matrix \(P_{\text{basisBits}[i]}\) can be one of the Pauli matrices \(I\), \(X\), \(Y\), and \(Z\), corresponding to the custatevecPauli_t
enums CUSTATEVEC_PAULI_I
, CUSTATEVEC_PAULI_X
, CUSTATEVEC_PAULI_Y
, and CUSTATEVEC_PAULI_Z
, respectively.
Also refer to custatevecPauli_t for details.
Use case¶
// calculate the norm and the expectations for Z(q1) and X(q0)Y(q2)
uint32_t nPauliOperatorArrays = 3;
custatevecPauli_t pauliOperators0[] = {}; // III
int32_t basisBits0[] = {};
custatevecPauli_t pauliOperators1[] = {CUSTATEVEC_PAULI_Z}; // IZI
int32_t basisBits1[] = {1};
custatevecPauli_t pauliOperators2[] = {CUSTATEVEC_PAULI_X, CUSTATEVEC_PAULI_Y}; // XIY
int32_t basisBits2[] = {0, 2};
const uint32_t nBasisBitsArray[] = {0, 1, 2};
const custatevecPauli_t*
pauliOperatorsArray[] = {pauliOperators0, pauliOperators1, pauliOperators2};
const int32_t *basisBitsArray[] = { basisBits0, basisBits1, basisBits2};
uint32_t nIndexBits = 3;
double expectationValues[nPauliOperatorArrays];
custatevecComputeExpectationsOnPauliBasis(
handle, sv, svDataType, nIndexBits, expectationValues,
pauliOperatorsArray, nPauliOperatorArrays,
basisBitsArray, nBasisBitsArray);
API reference¶
custatevecComputeExpectationsOnPauliBasis
¶
-
custatevecStatus_t custatevecComputeExpectationsOnPauliBasis(custatevecHandle_t handle, const void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, double *expectationValues, const custatevecPauli_t **pauliOperatorsArray, const uint32_t nPauliOperatorArrays, const int32_t **basisBitsArray, const uint32_t *nBasisBitsArray)¶
Calculate expectation values for a batch of (multi-qubit) Pauli operators.
This function calculates multiple expectation values for given sequences of Pauli operators by a single call.
A single Pauli operator sequence, pauliOperators, is represented by using an array of custatevecPauli_t. The basis bits on which these Pauli operators are acting are represented by an array of index bit positions. If no Pauli operator is specified for an index bit, the identity operator (CUSTATEVEC_PAULI_I) is implicitly assumed.
The length of pauliOperators and basisBits are the same and specified by nBasisBits.
The number of Pauli operator sequences is specified by the
nPauliOperatorArrays
argument.Multiple sequences of Pauli operators are represented in the form of arrays of arrays in the following manners:
The
pauliOperatorsArray
argument is an array for arrays of custatevecPauli_t.The
basisBitsArray
is an array of the arrays of basis bit positions.The
nBasisBitsArray
argument holds an array of the length of Pauli operator sequences and basis bit arrays.
Calculated expectation values are stored in a host buffer specified by the
expectationValues
argument of lengthnPauliOpeartorsArrays
.This function returns CUSTATEVEC_STATUS_INVALID_VALUE if basis bits specified for a Pauli operator sequence has duplicates and/or out of the range of [0,
nIndexBits
).This function accepts empty Pauli operator sequence to get the norm of the state vector.
- Parameters
handle – [in] the handle to the cuStateVec library
sv – [in] state vector
svDataType – [in] data type of the state vector
nIndexBits – [in] the number of index bits of the state vector
expectationValues – [out] pointer to a host array to store expectation values
pauliOperatorsArray – [in] pointer to a host array of Pauli operator arrays
nPauliOperatorArrays – [in] the number of Pauli operator arrays
basisBitsArray – [in] host array of basis bit arrays
nBasisBitsArray – [in] host array of the number of basis bits
Sampling¶
Sampling enables to obtain measurement results many times by using probability calculated from quantum states.
Use case¶
// create sampler and check the size of external workspace
custatevecSamplerCreate(
handle, sv, svDataType, nIndexBits, &sampler, nMaxShots,
&extraWorkspaceSizeInBytes);
// allocate external workspace if necessary
void extraWorkspace = nullptr;
if (extraWorkspaceSizeInBytes > 0)
cudaMalloc(extraWorkspace, extraWorkspaceSizeInBytes);
// calculate cumulative abs2sum
custatevecSamplerPreprocess(
handle, sampler, extraWorkspace, extraWorkspaceSizeInBytes);
// [User] generate randnums, array of random numbers [0, 1) for sampling
...
// sample bit strings
custatevecSamplerSample(
handle, sampler, bitStrings, bitOrdering, bitStringLen, randnums, nShots,
output);
// deallocate the sampler
custatevecSamplerDestroy(sampler);
For multi-GPU computations, cuStateVec provides custatevecSamplerGetSquaredNorm()
and
custatevecSamplerApplySubSVOffset()
.
Users are required to calculate cumulative abs2sum array with the squared norm of
each sub state vector via custatevecSamplerGetSquaredNorm()
and provide its values
to the sampler descriptor via custatevecSamplerApplySubSVOffset()
.
// The state vector is divided to nSubSvs sub state vectors.
// each state vector has its own ordinal and nLocalBits index bits.
// The ordinals of sub state vectors correspond to the extended index bits.
// create sampler and check the size of external workspace
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
custatevecSamplerCreate(
handle[iSv], d_sv[iSv], CUDA_C_64F, nLocalBits, &sampler[iSv], nMaxShots,
&extraWorkspaceSizeInBytes[iSv]);
}
// allocate external workspace if necessary
for (int iSv = 0; iSv < nSubSvs; iSv++) {
if (extraWorkspaceSizeInBytes[iSv] > 0) {
cudaSetDevice(devices[iSv]);
cudaMalloc(&extraWorkspace[iSv], extraWorkspaceSizeInBytes[iSv]);
}
}
// sample preprocess
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]) );
custatevecSampler_preprocess(
handle[iSv], sampler[iSv], extraWorkspace[iSv],
extraWorkspaceSizeInBytes[iSv]);
}
// get norm of the sub state vectors
double subNorms[nSubSvs];
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
custatevecSamplerGetSquaredNorm(
handle[iSv], sampler[iSv], &subNorms[iSv]);
}
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
cudaDeviceSynchronize();
}
// get cumulative array
double cumulativeArray[nSubSvs + 1];
cumulativeArray[0] = 0.0;
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cumulativeArray[iSv + 1] = cumulativeArray[iSv] + subNorms[iSv];
}
double norm = cumulativeArray[nSubSvs];
// apply offset and norm
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]) );
custatevecSamplerApplySubSVOffset(
handle[iSv], sampler[iSv], iSv, nSubSvs, cumulativeArray[iSv], norm);
}
// divide randnum array. randnums must be sorted in the ascending order.
int shotOffsets[nSubSvs + 1];
shotOffsets[0] = 0;
for (int iSv = 0; iSv < nSubSvs; iSv++) {
double* pos = std::lower_bound(randnums, randnums + nShots,
cumulativeArray[iSv + 1] / norm);
if (iSv == nSubSvs - 1) {
pos = randnums + nShots;
}
shotOffsets[iSv + 1] = pos - randnums;
}
// sample bit strings
for (int iSv = 0; iSv < nSubSvs; iSv++) {
int shotOffset = shotOffsets[iSv];
int nSubShots = shotOffsets[iSv + 1] - shotOffsets[iSv];
if (nSubShots > 0) {
cudaSetDevice(devices[iSv]);
custatevecSamplerSample(
handle[iSv], sampler[iSv], &bitStrings[shotOffset], bitOrdering,
bitStringLen, &randnums[shotOffset], nSubShots,
CUSTATEVEC_SAMPLER_OUTPUT_RANDNUM_ORDER);
}
}
for (int iSv = 0; iSv < nSubSvs; iSv++) {
cudaSetDevice(devices[iSv]);
custatevecSamplerDestroy(sampler[iSv]);
}
Please refer to NVIDIA/cuQuantum repository for further detail.
API reference¶
custatevecSamplerCreate
¶
-
custatevecStatus_t custatevecSamplerCreate(custatevecHandle_t handle, const void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, custatevecSamplerDescriptor_t *sampler, uint32_t nMaxShots, size_t *extraWorkspaceSizeInBytes)¶
Create sampler descriptor.
This function creates a sampler descriptor. If an extra workspace is required, its size is set to
extraWorkspaceSizeInBytes
.- Parameters
handle – [in] the handle to the cuStateVec library
sv – [in] pointer to state vector
svDataType – [in] data type of state vector
nIndexBits – [in] the number of index bits of the state vector
sampler – [out] pointer to a new sampler descriptor
nMaxShots – [in] the max number of shots used for this sampler context
extraWorkspaceSizeInBytes – [out] workspace size
custatevecSamplerPreprocess
¶
-
custatevecStatus_t custatevecSamplerPreprocess(custatevecHandle_t handle, custatevecSamplerDescriptor_t sampler, void *extraWorkspace, const size_t extraWorkspaceSizeInBytes)¶
Preprocess the state vector for preparation of sampling.
This function prepares internal states of the sampler descriptor. If a device memory handler is set, the
extraWorkspace
can be set to null, and theextraWorkspaceSizeInBytes
can be set to 0. Otherwise, a pointer passed to theextraWorkspace
argument is associated to the sampler handle and should be kept during its life time. The size ofextraWorkspace
is obtained when custatevecSamplerCreate() is called.- Parameters
handle – [in] the handle to the cuStateVec library
sampler – [inout] the sampler descriptor
extraWorkspace – [in] extra workspace
extraWorkspaceSizeInBytes – [in] size of the extra workspace
custatevecSamplerGetSquaredNorm
¶
-
custatevecStatus_t custatevecSamplerGetSquaredNorm(custatevecHandle_t handle, custatevecSamplerDescriptor_t sampler, double *norm)¶
Get the squared norm of the state vector.
This function returns the squared norm of the state vector. An intended use case is sampling with multiple devices. This API should be called after custatevecSamplerPreprocess(). Otherwise, the behavior of this function is undefined.
- Parameters
handle – [in] the handle to the cuStateVec library
sampler – [in] the sampler descriptor
norm – [out] the norm of the state vector
custatevecSamplerApplySubSVOffset
¶
-
custatevecStatus_t custatevecSamplerApplySubSVOffset(custatevecHandle_t handle, custatevecSamplerDescriptor_t sampler, int32_t subSVOrd, uint32_t nSubSVs, double offset, double norm)¶
Apply the partial norm and norm to the state vector to the sample descriptor.
This function applies offsets assuming the given state vector is a sub state vector. An intended use case is sampling with distributed state vectors. The
nSubSVs
argument should be a power of 2 andsubSVOrd
should be less thannSubSVs
. Otherwise, this function returns CUSTATEVEC_STATUS_INVALID_VALUE.- Parameters
handle – [in] the handle to the cuStateVec library
sampler – [in] the sampler descriptor
subSVOrd – [in] sub state vector ordinal
nSubSVs – [in] the number of sub state vectors
offset – [in] cumulative sum offset for the sub state vector
norm – [in] norm for all sub vectors
custatevecSamplerSample
¶
-
custatevecStatus_t custatevecSamplerSample(custatevecHandle_t handle, custatevecSamplerDescriptor_t sampler, custatevecIndex_t *bitStrings, const int32_t *bitOrdering, const uint32_t bitStringLen, const double *randnums, const uint32_t nShots, enum custatevecSamplerOutput_t output)¶
Sample bit strings from the state vector.
This function does sampling. The
bitOrdering
andbitStringLen
arguments specify bits to be sampled. Sampled bit strings are represented as an array of custatevecIndex_t and are stored to the host memory buffer that thebitStrings
argument points to.The
randnums
argument is an array of user-generated random numbers whose length isnShots
. The range of random numbers should be in [0, 1). A random number given by therandnums
argument is clipped to [0, 1) if its range is not in [0, 1).The
output
argument specifies the order of sampled bit strings:If CUSTATEVEC_SAMPLER_OUTPUT_RANDNUM_ORDER is specified, the order of sampled bit strings is the same as that in the
randnums
argument.If CUSTATEVEC_SAMPLER_OUTPUT_ASCENDING_ORDER is specified, bit strings are returned in the ascending order.
This API should be called after custatevecSamplerPreprocess(). Otherwise, the behavior of this function is undefined. By calling custatevecSamplerApplySubSVOffset() prior to this function, it is possible to sample bits corresponding to the ordinal of sub state vector.
- Parameters
handle – [in] the handle to the cuStateVec library
sampler – [in] the sampler descriptor
bitStrings – [out] pointer to a host array to store sampled bit strings
bitOrdering – [in] pointer to a host array of bit ordering for sampling
bitStringLen – [in] the number of bits in bitOrdering
randnums – [in] pointer to an array of random numbers
nShots – [in] the number of shots
output – [in] the order of sampled bit strings
custatevecSamplerDestroy
¶
-
custatevecStatus_t custatevecSamplerDestroy(custatevecSamplerDescriptor_t sampler)¶
This function releases resources used by the sampler.
- Parameters
sampler – [in] the sampler descriptor
Accessor¶
An accessor extracts or updates state vector segments.
The APIs custatevecAccessorCreate()
and custatevecAccessorCreateView()
initialize an accessor and also return the size of an extra workspace (if needed by the APIs custatevecAccessorGet()
and custatevecAccessorSet()
to perform the copy).
The workspace must be bound to an accessor by custatevecAccessorSetExtraWorkspace()
, and the lifetime of the workspace must be as long as the accessor’s to cover the entire duration of the copy operation.
If a device memory handler is set, it is not necessary to provide explicit workspace by users.
The begin
and end
arguments in the Get/Set APIs correspond to the state vector elements’ indices such that elements within the specified range are copied.
Use case¶
Extraction¶
// create accessor and check the size of external workspace
custatevecAccessorCreateView(
handle, d_sv, CUDA_C_64F, nIndexBits, &accessor, bitOrdering, bitOrderingLen,
maskBitString, maskOrdering, maskLen, &extraWorkspaceSizeInBytes);
// allocate external workspace if necessary
if (extraWorkspaceSizeInBytes > 0)
cudaMalloc(&extraWorkspace, extraWorkspaceSizeInBytes);
// set external workspace
custatevecAccessorSetExtraWorkspace(
handle, &accessor, extraWorkspace, extraWorkspaceSizeInBytes);
// get state vector elements
custatevecAccessorGet(
handle, &accessor, buffer, accessBegin, accessEnd);
// deallocate the accessor
custatevecAccessorDestroy(accessor);
Update¶
// create accessor and check the size of external workspace
custatevecAccessorCreate(
handle, d_sv, CUDA_C_64F, nIndexBits, &accessor, bitOrdering, bitOrderingLen,
maskBitString, maskOrdering, maskLen, &extraWorkspaceSizeInBytes);
// allocate external workspace if necessary
if (extraWorkspaceSizeInBytes > 0)
cudaMalloc(&extraWorkspace, extraWorkspaceSizeInBytes);
// set external workspace
custatevecAccessorSetExtraWorkspace(
handle, &accessor, extraWorkspace, extraWorkspaceSizeInBytes);
// set state vector elements
custatevecAccessorSet(
handle, &accessor, buffer, 0, nSvSize);
// deallocate the accessor
custatevecAccessorDestroy(accessor);
API reference¶
custatevecAccessorCreate
¶
-
custatevecStatus_t custatevecAccessorCreate(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, custatevecAccessorDescriptor_t *accessor, const int32_t *bitOrdering, const uint32_t bitOrderingLen, const int32_t *maskBitString, const int32_t *maskOrdering, const uint32_t maskLen, size_t *extraWorkspaceSizeInBytes)¶
Create accessor to copy elements between the state vector and an external buffer.
Accessor copies state vector elements between the state vector and external buffers. During the copy, the ordering of state vector elements are rearranged according to the bit ordering specified by the
bitOrdering
argument.The state vector is assumed to have the default ordering: the LSB is the 0th index bit and the (N-1)th index bit is the MSB for an N index bit system. The bit ordering of the external buffer is specified by the
bitOrdering
argument. When 3 is given to thenIndexBits
argument and [1, 2, 0] to thebitOrdering
argument, the state vector index bits are permuted to specified bit positions. Thus, the state vector index is rearranged and mapped to the external buffer index as [0, 4, 1, 5, 2, 6, 3, 7].The
maskBitString
,maskOrdering
andmaskLen
arguments specify the bit mask for the state vector index being accessed. If themaskLen
argument is 0, themaskBitString
and/ormaskOrdering
arguments can be null.All bit positions [0,
nIndexBits
), should appear exactly once, either in thebitOrdering
or themaskOrdering
arguments. If a bit position does not appear in these arguments and/or there are overlaps of bit positions, this function returns CUSTATEVEC_STATUS_INVALID_VALUE.The extra workspace improves performance if the accessor is called multiple times with small external buffers placed on device. A null pointer can be specified to the
extraWorkspaceSizeInBytes
if the extra workspace is not necessary.- Parameters
handle – [in] the handle to the cuStateVec library
sv – [in] state vector
svDataType – [in] Data type of state vector
nIndexBits – [in] the number of index bits of state vector
accessor – [in] pointer to an accessor descriptor
bitOrdering – [in] pointer to a host array to specify the basis bits of the external buffer
bitOrderingLen – [in] the length of bitOrdering
maskBitString – [in] pointer to a host array to specify the mask values to limit access
maskOrdering – [in] pointer to a host array for the mask ordering
maskLen – [in] the length of mask
extraWorkspaceSizeInBytes – [out] the required size of extra workspace
custatevecAccessorCreateView
¶
-
custatevecStatus_t custatevecAccessorCreateView(custatevecHandle_t handle, const void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, custatevecAccessorDescriptor_t *accessor, const int32_t *bitOrdering, const uint32_t bitOrderingLen, const int32_t *maskBitString, const int32_t *maskOrdering, const uint32_t maskLen, size_t *extraWorkspaceSizeInBytes)¶
Create accessor for the constant state vector.
This function is the same as custatevecAccessorCreate(), but only accepts the constant state vector.
- Parameters
handle – [in] the handle to the cuStateVec library
sv – [in] state vector
svDataType – [in] Data type of state vector
nIndexBits – [in] the number of index bits of state vector
accessor – [in] pointer to an accessor descriptor
bitOrdering – [in] pointer to a host array to specify the basis bits of the external buffer
bitOrderingLen – [in] the length of bitOrdering
maskBitString – [in] pointer to a host array to specify the mask values to limit access
maskOrdering – [in] pointer to a host array for the mask ordering
maskLen – [in] the length of mask
extraWorkspaceSizeInBytes – [out] the required size of extra workspace
custatevecAccessorDestroy
¶
-
custatevecStatus_t custatevecAccessorDestroy(custatevecAccessorDescriptor_t accessor)¶
This function releases resources used by the accessor.
- Parameters
accessor – [in] the accessor descriptor
custatevecAccessorSetExtraWorkspace
¶
-
custatevecStatus_t custatevecAccessorSetExtraWorkspace(custatevecHandle_t handle, custatevecAccessorDescriptor_t accessor, void *extraWorkspace, size_t extraWorkspaceSizeInBytes)¶
Set the external workspace to the accessor.
This function sets the extra workspace to the accessor. The required size for extra workspace can be obtained by custatevecAccessorCreate() or custatevecAccessorCreateView(). if a device memory handler is set, the
extraWorkspace
can be set to null, and theextraWorkspaceSizeInBytes
can be set to 0.- Parameters
handle – [in] the handle to the cuStateVec library
accessor – [in] the accessor descriptor
extraWorkspace – [in] extra workspace
extraWorkspaceSizeInBytes – [in] extra workspace size
custatevecAccessorGet
¶
-
custatevecStatus_t custatevecAccessorGet(custatevecHandle_t handle, custatevecAccessorDescriptor_t accessor, void *externalBuffer, const custatevecIndex_t begin, const custatevecIndex_t end)¶
Copy state vector elements to an external buffer.
This function copies state vector elements to an external buffer specified by the
externalBuffer
argument. During the copy, the index bit is permuted as specified by thebitOrdering
argument in custatevecAccessorCreate() or custatevecAccessorCreateView().The
begin
andend
arguments specify the range of state vector elements being copied. Both arguments have the bit ordering specified by thebitOrdering
argument.- Parameters
handle – [in] the handle to the cuStateVec library
accessor – [in] the accessor descriptor
externalBuffer – [out] pointer to a host or device buffer to receive copied elements
begin – [in] index in the permuted bit ordering for the first elements being copied to the state vector
end – [in] index in the permuted bit ordering for the last elements being copied to the state vector (non-inclusive)
custatevecAccessorSet
¶
-
custatevecStatus_t custatevecAccessorSet(custatevecHandle_t handle, custatevecAccessorDescriptor_t accessor, const void *externalBuffer, const custatevecIndex_t begin, const custatevecIndex_t end)¶
Set state vector elements from an external buffer.
This function sets complex numbers to the state vector by using an external buffer specified by the
externalBuffer
argument. During the copy, the index bit is permuted as specified by thebitOrdering
argument in custatevecAccessorCreate().The
begin
andend
arguments specify the range of state vector elements being set to the state vector. Both arguments have the bit ordering specified by thebitOrdering
argument.If a read-only accessor created by calling custatevecAccessorCreateView() is provided, this function returns CUSTATEVEC_STATUS_NOT_SUPPORTED.
- Parameters
handle – [in] the handle to the cuStateVec library
accessor – [in] the accessor descriptor
externalBuffer – [in] pointer to a host or device buffer of complex values being copied to the state vector
begin – [in] index in the permuted bit ordering for the first elements being copied from the state vector
end – [in] index in the permuted bit ordering for the last elements being copied from the state vector (non-inclusive)
Qubit reordering¶
cuStateVec provides custatevecSwapIndexBits()
API to reorder state vector elements in one device.
Use case¶
// This example uses 3 qubits.
const int nIndexBits = 3;
// swap 0th and 2nd qubits
const int nBitSwaps = 1;
const int2 bitSwaps[] = {{0, 2}}; // specify the qubit pairs
// swap the state vector elements only if 1st qubit is 1
const int maskLen = 1;
int maskBitString[] = {1}; // specify the values of mask qubits
int maskOrdering[] = {1}; // specify the mask qubits
// Swap index bit pairs.
// {|000>, |001>, |010>, |011>, |100>, |101>, |110>, |111>} will be permuted to
// {|000>, |001>, |010>, |110>, |100>, |101>, |011>, |111>}.
custatevecSwapIndexBits(handle, sv, svDataType, nIndexBits, bitSwaps, nBitSwaps,
maskBitString, maskOrdering, maskLen);
API reference¶
custatevecSwapIndexBits
¶
-
custatevecStatus_t custatevecSwapIndexBits(custatevecHandle_t handle, void *sv, cudaDataType_t svDataType, const uint32_t nIndexBits, const int2 *bitSwaps, const uint32_t nBitSwaps, const int32_t *maskBitString, const int32_t *maskOrdering, const uint32_t maskLen)¶
Swap index bits and reorder state vector elements in one device.
This function updates the bit ordering of the state vector by swapping the pairs of bit positions.
The state vector is assumed to have the default ordering: the LSB is the 0th index bit and the (N-1)th index bit is the MSB for an N index bit system. The
bitSwaps
argument specifies the swapped bit index pairs, whose values must be in the range [0,nIndexBits
).The
maskBitString
,maskOrdering
andmaskLen
arguments specify the bit mask for the state vector index being accessed. If themaskLen
argument is 0, themaskBitString
and/ormaskOrdering
arguments can be null.A bit position can be included in both
bitSwaps
andmaskOrdering
. When a masked bit is swapped, state vector elements whose original indices match the mask bit string are written to the permuted indices while other elements are not copied.- Parameters
handle – [in] the handle to the cuStateVec library
sv – [inout] state vector
svDataType – [in] Data type of state vector
nIndexBits – [in] the number of index bits of state vector
bitSwaps – [in] pointer to a host array of swapping bit index pairs
nBitSwaps – [in] the number of bit swaps
maskBitString – [in] pointer to a host array to mask output
maskOrdering – [in] pointer to a host array to specify the ordering of maskBitString
maskLen – [in] the length of mask
Matrix property testing¶
The API custatevecTestMatrixType()
is available to check the properties of matrices.
If a matrix \(A\) is unitary, \(AA^{\dagger} = A^{\dagger}A = I\), where \(A^{\dagger}\) is the conjugate transpose of \(A\) and \(I\) is the identity matrix, respectively.
When CUSTATEVEC_MATRIX_TYPE_UNITARY
is given for its argument, this API computes the 1-norm \(||R||_1 = \sum{|r_{ij}|}\), where
\(R = AA^{\dagger} - I\).
This value will be approximately zero if \(A\) is unitary.
If a matrix \(A\) is Hermitian, \(A^{\dagger} = A\).
When CUSTATEVEC_MATRIX_TYPE_HERMITIAN
is given for its argument, this API computes the 2-norm \(||R||_2 = \sum{|r_{ij}|^2}\), where
\(R = (A - A^{\dagger}) / 2\).
This value will be approximately zero if \(A\) is Hermitian.
The API may require external workspace for large matrices,
and custatevecTestMatrixTypeGetWorkspaceSize()
provides the size of workspace.
If a device memory handler is set, it is not necessary to provide explicit workspace by users.
Use case¶
double residualNorm;
void* extraWorkspace = nullptr;
size_t extraWorkspaceSizeInBytes = 0;
// check the size of external workspace
custatevecTestMatrixTypeGetWorkspaceSize(
handle, matrixType, matrix, matrixDataType, layout,
nTargets, adjoint, computeType, &extraWorkspaceSizeInBytes);
// allocate external workspace if necessary
if (extraWorkspaceSizeInBytes > 0)
cudaMalloc(&extraWorkspace, extraWorkspaceSizeInBytes);
// execute testing
custatevecTestMatrixType(
handle, &residualNorm, matrixType, matrix, matrixDataType, layout,
nTargets, adjoint, computeType, extraWorkspace, extraWorkspaceSizeInBytes);
API reference¶
custatevecTestMatrixTypeGetWorkspaceSize
¶
-
custatevecStatus_t custatevecTestMatrixTypeGetWorkspaceSize(custatevecHandle_t handle, custatevecMatrixType_t matrixType, const void *matrix, cudaDataType_t matrixDataType, custatevecMatrixLayout_t layout, const uint32_t nTargets, const int32_t adjoint, custatevecComputeType_t computeType, size_t *extraWorkspaceSizeInBytes)¶
Get extra workspace size for custatevecTestMatrixType()
This function gets the size of an extra workspace required to execute custatevecTestMatrixType().
extraWorkspaceSizeInBytes
will be set to 0 if no extra buffer is required.- Parameters
handle – [in] the handle to cuStateVec library
matrix – [in] host or device pointer to a matrix
matrixType – [in] matrix type
matrixDataType – [in] data type of matrix
layout – [in] enumerator specifying the memory layout of matrix
nTargets – [in] the number of target bits, up to 15
adjoint – [in] flag to control whether the adjoint of matrix is tested
computeType – [in] compute type
extraWorkspaceSizeInBytes – [out] workspace size
custatevecTestMatrixType
¶
-
custatevecStatus_t custatevecTestMatrixType(custatevecHandle_t handle, double *residualNorm, custatevecMatrixType_t matrixType, const void *matrix, cudaDataType_t matrixDataType, custatevecMatrixLayout_t layout, const uint32_t nTargets, const int32_t adjoint, custatevecComputeType_t computeType, void *extraWorkspace, size_t extraWorkspaceSizeInBytes)¶
Test the deviation of a given matrix from a Hermitian (or Unitary) matrix.
This function tests if the type of a given matrix matches the type given by the
matrixType
argument.For tests for the unitary type, \( R = (AA^{\dagger} - I) \) is calculated where \( A \) is the given matrix. The sum of absolute values of \( R \) matrix elements is returned.
For tests for the Hermitian type, \( R = (M - M^{\dagger}) / 2 \) is calculated. The sum of squared absolute values of \( R \) matrix elements is returned.
This function may return CUSTATEVEC_STATUS_INSUFFICIENT_WORKSPACE for large
nTargets
. In such cases, theextraWorkspace
andextraWorkspaceSizeInBytes
arguments should be specified to provide extra workspace. The required size of an extra workspace is obtained by calling custatevecTestMatrixTypeGetWorkspaceSize(). A null pointer can be passed to theextraWorkspace
argument if no extra workspace is required. Also, if a device memory handler is set, theextraWorkspace
can be set to null, and theextraWorkspaceSizeInBytes
can be set to 0.Note
The
nTargets
argument must be no more than 15 in this version. For largernTargets
, this function returns CUSTATEVEC_STATUS_INVALID_VALUE.- Parameters
handle – [in] the handle to cuStateVec library
residualNorm – [out] host pointer, to store the deviation from certain matrix type
matrixType – [in] matrix type
matrix – [in] host or device pointer to a matrix
matrixDataType – [in] data type of matrix
layout – [in] enumerator specifying the memory layout of matrix
nTargets – [in] the number of target bits, up to 15
adjoint – [in] flag to control whether the adjoint of matrix is tested
computeType – [in] compute type
extraWorkspace – [in] extra workspace
extraWorkspaceSizeInBytes – [in] extra workspace size