*************
Release Notes
*************

=================
cuStateVec v1.0.0
=================

* Improve performance/functionality:

  * Gate application APIs are reoptimized:

    - `custatevecApplyMatrix` reduced API execution latencies. Performance with small state vectors may improve for 1-4 qubits matrix application in single precision and 1-5 qubits matrix application in double precision, respectively.
    - `custatevecApplyGeneralizedPermutationMatrix` reduced API execution latencies. Performance with small state vectors may improve for diagonal matrix cases.

* Resolve issues:

  * Multi-threading issues in `custatevecApplyMatrix`, `custatevecApplyGeneralizedPermutationMatrix`, and `custatevecComputeExpectationsOnPauliBasis` are fixed. All the cuStateVec APIs in this version are thread safe as long as each host thread has its own cuStateVec handle.

* Add new API:

  * Binding a user-provided, stream-ordered memory pool to the library (see the introduction for :ref:`workspace-label` and :ref:`cuStateVec memory management API` for detail).
  * Extensions for the batch-measure and sampler APIs to accept state vector partitions across multiple GPUs (see `custatevecBatchMeasureWithOffset`, `custatevecSamplerGetSquaredNorm`, `custatevecSamplerApplySubSVOffset`)
  * Optimized state vector element swap algorithm on single GPU (see `custatevecSwapIndexBits`)
  * Testing whether a given matrix is Hermitian or unitary (see `custatevecTestMatrixType`)
  * Setting a logger callback with user-provided data (see `custatevecLoggerSetCallbackData`)

*API breaking changes*:

* The sampler and accessor descriptors are made completely opaque, just like the library handle `custatevecHandle_t`. For both descriptors there is a corresponding destructor API. Also, they are now passed by value in various routines. Now the C and Python APIs are unified.
* Some APIs are renamed as follows:

  ====================================================== ========================================================================
  previous version (< 1.0.0)                             new version (= 1.0.0)
  ====================================================== ========================================================================
  custatevecApplyMatrix_bufferSize                       `custatevecApplyMatrixGetWorkspaceSize`
  custatevecApplyExp                                     `custatevecApplyPauliRotation`
  custatevecApplyGeneralizedPermutationMatrix_bufferSize `custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize`
  custatevecExpectation_bufferSize                       `custatevecComputeExpectationGetWorkspaceSize`
  custatevecExpectation                                  `custatevecComputeExpectation`
  custatevecExpectationsOnPauliBasis                     `custatevecComputeExpectationsOnPauliBasis`
  custatevecSampler_create                               `custatevecSamplerCreate`
  custatevecSampler_preprocess                           `custatevecSamplerPreprocess`
  custatevecSampler_sample                               `custatevecSamplerSample`
  custatevecAccessor_create                              `custatevecAccessorCreate`
  custatevecAccessor_createReadOnly                      `custatevecAccessorCreateView`
  custatevecAccessor_setExtraWorkspace                   `custatevecAccessorSetExtraWorkspace`
  custatevecAccessor_set                                 `custatevecAccessorSet`
  custatevecAccessor_get                                 `custatevecAccessorGet`
  ====================================================== ========================================================================

* The arguments of the following APIs are reordered/renamed:

  * `custatevecApplyMatrix`
  * `custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize`
  * `custatevecApplyGeneralizedPermutationMatrix`
  * `custatevecComputeExpectationsOnPauliBasis`

*Compatibility notes*:

* *cuStateVec* requires CUDA 11.x

*Limitation notes*:

* ``CUSTATEVEC_STATUS_INTERNAL_ERROR`` might be returned if a wrong device pointer is passed to functions. If a function returns ``CUSTATEVEC_STATUS_INTERNAL_ERROR``, please check if a correct pointer is passed and the size is correctly specified.

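
The v1.0.0 renames listed above are mechanical, so they can be applied to existing source code with a small script. The following is a minimal sketch, not a tool shipped with cuStateVec: the names ``RENAMES`` and ``migrate_source`` are hypothetical, and the mapping is copied from the rename table in these notes.

```python
import re

# Old name -> new name, copied from the v1.0.0 rename table.
# RENAMES and migrate_source are illustrative names, not part of cuStateVec.
RENAMES = {
    "custatevecApplyMatrix_bufferSize": "custatevecApplyMatrixGetWorkspaceSize",
    "custatevecApplyExp": "custatevecApplyPauliRotation",
    "custatevecApplyGeneralizedPermutationMatrix_bufferSize":
        "custatevecApplyGeneralizedPermutationMatrixGetWorkspaceSize",
    "custatevecExpectation_bufferSize": "custatevecComputeExpectationGetWorkspaceSize",
    "custatevecExpectation": "custatevecComputeExpectation",
    "custatevecExpectationsOnPauliBasis": "custatevecComputeExpectationsOnPauliBasis",
    "custatevecSampler_create": "custatevecSamplerCreate",
    "custatevecSampler_preprocess": "custatevecSamplerPreprocess",
    "custatevecSampler_sample": "custatevecSamplerSample",
    "custatevecAccessor_create": "custatevecAccessorCreate",
    "custatevecAccessor_createReadOnly": "custatevecAccessorCreateView",
    "custatevecAccessor_setExtraWorkspace": "custatevecAccessorSetExtraWorkspace",
    "custatevecAccessor_set": "custatevecAccessorSet",
    "custatevecAccessor_get": "custatevecAccessorGet",
}

# Match whole identifiers only. The \b anchors keep a short name such as
# custatevecExpectation from matching inside custatevecExpectation_bufferSize.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(map(re.escape, RENAMES), key=len, reverse=True)) + r")\b"
)


def migrate_source(text: str) -> str:
    """Return text with pre-1.0.0 API names replaced by their 1.0.0 names."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], text)
```

This sketch only rewrites identifiers. Calls to the APIs whose arguments were reordered or renamed, and code affected by the opaque sampler and accessor descriptors, still need to be updated by hand.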

=================
cuStateVec v0.1.1
=================

* Support for the NVIDIA cuQuantum Appliance (see :doc:`here <../appliance/index>`):

  * Extensions for the batch-measure and sampler APIs to accept state vector partitions across multiple GPUs
  * Optimized state vector element swap algorithm for multiple GPUs
  * Note: the multi-GPU features & optimizations are currently available only in the cuQuantum Appliance

=================
cuStateVec v0.1.0
=================

* Add support for ``Linux ppc64le``
* Add new APIs:

  * Gate application for generalized permutation matrices
  * Expectation values of Pauli strings
  * Accessor to get/set state vector elements

*Compatibility notes*:

* *cuStateVec* requires CUDA 11.4 or above
* *cuStateVec* requires NVIDIA HPC SDK 21.11 or above

*Limitation notes*:

* ``CUSTATEVEC_STATUS_INTERNAL_ERROR`` might be returned if a wrong device pointer is passed to functions. If a function returns ``CUSTATEVEC_STATUS_INTERNAL_ERROR``, please check if a correct pointer is passed and the size is correctly specified.

=================
cuStateVec v0.0.1
=================

* Initial release
* Support ``Linux x86_64``, ``Linux Arm64``
* Support Volta and Ampere architectures (compute capability 7.0+)

*Compatibility notes*:

* *cuStateVec* requires CUDA 11.4 or above
* *cuStateVec* requires NVIDIA HPC SDK 21.7 or above

*Limitation notes*:

* This release is optimized for NVIDIA A100 and V100 GPUs.
* ``CUSTATEVEC_STATUS_INTERNAL_ERROR`` might be returned if a wrong device pointer is passed to functions. If a function returns ``CUSTATEVEC_STATUS_INTERNAL_ERROR``, please check if a correct pointer is passed and the size is correctly specified.
* Performance optimization is planned in future releases.