Release Notes
cuStateVec v1.0.0
Improve performance/functionality:
Gate application APIs are reoptimized:
custatevecApplyMatrix(): reduced API execution latency. Performance with small state vectors may improve for matrix application on 1-4 qubits in single precision and on 1-5 qubits in double precision.
custatevecApplyGeneralizedPermutationMatrix(): reduced API execution latency. Performance with small state vectors may improve for diagonal matrix cases.
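To make the "diagonal matrix case" concrete, the following is a minimal host-side sketch of what applying a diagonal matrix to one qubit of a state vector computes. It is a plain C reference under stated assumptions, not the library's GPU implementation: the `cplx` type, the `apply_diagonal_1q` function, and the convention that bit `target` of the element index holds the qubit value are all hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal complex type so the sketch depends only on standard headers. */
typedef struct { double re, im; } cplx;

static cplx cmul(cplx a, cplx b) {
    cplx r = { a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
    return r;
}

/*
 * Apply the diagonal matrix diag(d[0], d[1]) to qubit `target` of a
 * host-resident state vector with n_qubits qubits.
 * Assumption of this sketch: bit `target` of an element's index is the
 * value of that qubit, so each amplitude is scaled by d[that bit].
 */
void apply_diagonal_1q(cplx *sv, int n_qubits, int target, const cplx d[2]) {
    assert(sv != NULL && n_qubits > 0 && n_qubits < 31);
    assert(target >= 0 && target < n_qubits);
    size_t n = (size_t)1 << n_qubits;
    for (size_t i = 0; i < n; ++i) {
        int bit = (int)((i >> target) & 1u);
        sv[i] = cmul(sv[i], d[bit]);
    }
}
```

Because a diagonal matrix only scales each amplitude in place, no amplitudes are mixed, which is why this case can be handled more cheaply than a general matrix.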
Resolve issues:
Multi-threading issues in custatevecApplyMatrix(), custatevecApplyGeneralizedPermutationMatrix(), and custatevecComputeExpectationsOnPauliBasis() are fixed. All cuStateVec APIs in this version are thread-safe as long as each host thread has its own cuStateVec handle.
Add new APIs:
Binding a user-provided, stream-ordered memory pool to the library (see the introduction to the Workspace and Memory Management API for details)
Extensions for the batch-measure and sampler APIs to accept state vector partitions across multiple GPUs (see custatevecBatchMeasureWithOffset(), custatevecSamplerGetSquaredNorm(), custatevecSamplerApplySubSVOffset())
Optimized state vector element swap algorithm on a single GPU (see custatevecSwapIndexBits())
Testing whether a given matrix is Hermitian or unitary (see custatevecTestMatrixType())
Setting a logger callback with user-provided data (see custatevecLoggerSetCallbackData())
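As an illustration of what a Hermitian or unitary test checks, here is a small host-side sketch in plain C. It is not the implementation or signature of custatevecTestMatrixType(); the `cplx` type, the function names, the row-major layout, and the tolerance parameter are assumptions made for this example. It measures how far a matrix M is from satisfying M = M^dagger (Hermitian) or M^dagger M = I (unitary).

```c
#include <assert.h>
#include <stddef.h>

/* Minimal complex type so the sketch depends only on standard headers. */
typedef struct { double re, im; } cplx;

static double abs2(cplx a) { return a.re * a.re + a.im * a.im; }

/* Squared Frobenius norm of (M - M^dagger); zero for a Hermitian matrix.
 * m is a dim x dim matrix in row-major order (assumption of this sketch). */
double hermitian_residual2(const cplx *m, size_t dim) {
    assert(m != NULL && dim > 0);
    double sum = 0.0;
    for (size_t r = 0; r < dim; ++r) {
        for (size_t c = 0; c < dim; ++c) {
            cplx a = m[r * dim + c];
            cplx b = m[c * dim + r];
            cplx diff = { a.re - b.re, a.im + b.im };  /* a - conj(b) */
            sum += abs2(diff);
        }
    }
    return sum;
}

/* Squared Frobenius norm of (M^dagger M - I); zero for a unitary matrix. */
double unitary_residual2(const cplx *m, size_t dim) {
    assert(m != NULL && dim > 0);
    double sum = 0.0;
    for (size_t r = 0; r < dim; ++r) {
        for (size_t c = 0; c < dim; ++c) {
            cplx acc = { 0.0, 0.0 };
            for (size_t k = 0; k < dim; ++k) {
                cplx a = m[k * dim + r];  /* conj(a) is (M^dagger)[r][k] */
                cplx b = m[k * dim + c];
                acc.re += a.re * b.re + a.im * b.im;  /* real part of conj(a) * b */
                acc.im += a.re * b.im - a.im * b.re;  /* imaginary part of conj(a) * b */
            }
            if (r == c) acc.re -= 1.0;  /* subtract the identity */
            sum += abs2(acc);
        }
    }
    return sum;
}

/* Compare the residual norm against a caller-chosen tolerance. */
int is_hermitian(const cplx *m, size_t dim, double tol) {
    return hermitian_residual2(m, dim) <= tol * tol;
}

int is_unitary(const cplx *m, size_t dim, double tol) {
    return unitary_residual2(m, dim) <= tol * tol;
}
```

For example, the Pauli Y matrix passes both checks, while the phase gate diag(1, i) is unitary but not Hermitian.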
API breaking changes:
The sampler and accessor descriptors are now completely opaque, just like the library handle custatevecHandle_t. For both descriptors there is a corresponding destructor API. They are also now passed by value in various routines. The C and Python APIs are now unified.
Some APIs are renamed as follows:
previous version (< 1.0.0) | new version (= 1.0.0)
custatevecApplyMatrix_bufferSize |
custatevecApplyExp |
custatevecApplyGeneralizedPermutationMatrix_bufferSize | custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize()
custatevecExpectation_bufferSize |
custatevecExpectation |
custatevecExpectationsOnPauliBasis |
custatevecSampler_create |
custatevecSampler_preprocess |
custatevecSampler_sample |
custatevecAccessor_create |
custatevecAccessor_createReadOnly |
custatevecAccessor_setExtraWorkspace |
custatevecAccessor_set |
custatevecAccessor_get |
The arguments of the following APIs are reordered/renamed:
Compatibility notes:
cuStateVec requires CUDA 11.x
Limitation notes:
CUSTATEVEC_STATUS_INTERNAL_ERROR
might be returned if a wrong device pointer is passed to functions. If a function returnsCUSTATEVEC_STATUS_INTERNAL_ERROR
, please check if a correct pointer is passed and the size is correctly specified.
cuStateVec v0.1.1
Support for the NVIDIA cuQuantum Appliance:
Extensions for the batch-measure and sampler APIs to accept state vector partitions across multiple GPUs
Optimized state vector element swap algorithm for multiple GPUs
Note: the multi-GPU features and optimizations are currently available only in the cuQuantum Appliance.
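For readers unfamiliar with the element swap mentioned above, the sketch below shows, on the host and in a single process, what exchanging two bits of a state vector index does to the stored amplitudes. It is a reference illustration only, not the multi-GPU algorithm: the `cplx` type and the `swap_index_bits` function are hypothetical names introduced here. In a partitioned setting one would typically expect the high-order index bits to select the partition, so exchanging such a bit with a low-order one moves amplitudes between partitions; that reading is an assumption of this note.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal complex type so the sketch depends only on standard headers. */
typedef struct { double re, im; } cplx;

/*
 * Permute a state vector of 2^n_index_bits elements so that the element
 * previously stored at index i ends up at the index obtained from i by
 * exchanging bits bit_a and bit_b.
 */
void swap_index_bits(cplx *sv, int n_index_bits, int bit_a, int bit_b) {
    assert(sv != NULL && n_index_bits > 0 && n_index_bits < 31);
    assert(bit_a >= 0 && bit_a < n_index_bits);
    assert(bit_b >= 0 && bit_b < n_index_bits);
    if (bit_a == bit_b) return;

    size_t n = (size_t)1 << n_index_bits;
    size_t mask_a = (size_t)1 << bit_a;
    size_t mask_b = (size_t)1 << bit_b;

    for (size_t i = 0; i < n; ++i) {
        /* Visit only indices with bit_a set and bit_b clear, so every
         * pair is swapped exactly once. Indices where the two bits are
         * equal map to themselves and are left untouched. */
        if ((i & mask_a) != 0 && (i & mask_b) == 0) {
            size_t j = (i & ~mask_a) | mask_b;  /* i with the two bits exchanged */
            cplx tmp = sv[i];
            sv[i] = sv[j];
            sv[j] = tmp;
        }
    }
}
```

With three index bits, swapping bits 0 and 2 exchanges the elements at indices 1 and 4 and at indices 3 and 6, and leaves the other four elements in place.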
cuStateVec v0.1.0
Add support for Linux ppc64le
Add new APIs:
Gate application for generalized permutation matrices
Expectation values of Pauli strings
Accessor to get/set state vector elements
Compatibility notes:
cuStateVec requires CUDA 11.4 or above
cuStateVec requires NVIDIA HPC SDK 21.11 or above
Limitation notes:
CUSTATEVEC_STATUS_INTERNAL_ERROR might be returned if a wrong device pointer is passed to a function. If a function returns CUSTATEVEC_STATUS_INTERNAL_ERROR, please check that a correct pointer is passed and that the size is correctly specified.
cuStateVec v0.0.1
Initial release
Support Linux x86_64 and Linux Arm64
Support Volta and Ampere architectures (compute capability 7.0+)
Compatibility notes:
cuStateVec requires CUDA 11.4 or above
cuStateVec requires NVIDIA HPC SDK 21.7 or above
Limitation notes:
This release is optimized for NVIDIA A100 and V100 GPUs.
CUSTATEVEC_STATUS_INTERNAL_ERROR might be returned if a wrong device pointer is passed to a function. If a function returns CUSTATEVEC_STATUS_INTERNAL_ERROR, please check that a correct pointer is passed and that the size is correctly specified.
Performance optimization is planned in future releases.