Release Notes
cuPauliProp v0.3.0
New features:
Added support for reverse-mode automatic differentiation of parameterized quantum circuits and traces. This permits, among other things, rapid calculation of gradients of expectation values as utilised by quantum variational algorithms and classical optimisation techniques. The relevant new functions are:
- cupaulipropPauliExpansionViewPrepareTraceWithExpansionViewBackwardDiff
- cupaulipropPauliExpansionViewComputeTraceWithExpansionViewBackwardDiff
- cupaulipropPauliExpansionViewPrepareTraceWithZeroStateBackwardDiff
- cupaulipropPauliExpansionViewComputeTraceWithZeroStateBackwardDiff
- cupaulipropPauliExpansionViewPrepareOperatorApplicationBackwardDiff
- cupaulipropPauliExpansionViewComputeOperatorApplicationBackwardDiff
- cupaulipropQuantumOperatorAttachCotangentBuffer
- cupaulipropQuantumOperatorGetCotangentBuffer
Compatibility notes:
- The trace scalar formerly output by the functions cupaulipropPauliExpansionViewComputeTraceWithExpansionView and cupaulipropPauliExpansionViewComputeTraceWithZeroState has been separated into the output scalars traceSignificand and traceExponent. The full trace can be obtained in a single scalar as traceSignificand * pow(2, traceExponent), though this may suffer from numerical precision issues for systems of very many qubits.
- cupaulipropQuantumOperatorGetKind has been removed.
- cupaulipropQuantumOperatorKind_t has been removed.
- cupaulipropComputeType_t has been removed.
Bugs fixed:
Previously, cupaulipropPauliExpansionViewComputeTraceWithExpansionView was missing a pow(2,N) prefactor when evaluating the trace of N-qubit views. As such, when given two views both containing only a single, unit-coefficient all-identity Pauli string, the function returned 1 in lieu of the correct pow(2,N) value. The function now returns a separate traceSignificand and traceExponent, as above, where traceExponent presently encodes the (log2 of the) prefactor.
Known issues:
- When truncation removes all terms from a Pauli expansion (producing a 0-term result), subsequent operations on that expansion, such as further gate application, trace computation, or backward differentiation, crash with CUPAULIPROP_STATUS_INVALID_VALUE and the message "Term index is out of range!". This can occur when cupaulipropPauliExpansionViewComputeOperatorApplication is called with a truncation specification whose coefficient or Pauli-weight cutoff eliminates every term. The same error is raised by cupaulipropPauliExpansionPopulateFromView when the source view has zero terms.
- Workspace buffers attached via cupaulipropWorkspaceSetMemory must be aligned to 16 bytes. Passing an unaligned buffer may result in undefined behavior or CUDA errors. This requirement is not currently documented in the API header.
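The 16-byte alignment requirement can be checked on the caller side before a buffer is attached. The following is a minimal sketch in plain C. The helper names are hypothetical and not part of cuPauliProp, and the check relies on the common convention of testing the pointer's integer value. Pointers returned directly by cudaMalloc are documented by CUDA as aligned to at least 256 bytes, so misalignment would typically arise only when a workspace is carved out of a larger allocation at an arbitrary offset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Alignment stated in the known-issue note above. */
#define WORKSPACE_ALIGNMENT 16u

/* Hypothetical helper: true if ptr is a multiple of 16 bytes. */
static bool is_workspace_aligned(const void *ptr)
{
    return ((uintptr_t)ptr % WORKSPACE_ALIGNMENT) == 0;
}

/* Hypothetical helper: round a byte offset up to the next multiple of 16,
 * e.g. when sub-allocating a workspace from a larger pool allocation. */
static size_t align_offset_up(size_t offset)
{
    return (offset + (WORKSPACE_ALIGNMENT - 1)) & ~(size_t)(WORKSPACE_ALIGNMENT - 1);
}
```

If the base of a pool is itself suitably aligned, applying align_offset_up to each sub-allocation offset keeps every carved-out workspace on a 16-byte boundary.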
cuPauliProp v0.2.0
New features:
Added amplitude damping channel support via cupaulipropPrepareAmplitudeDampingChannelApplication and cupaulipropExecuteAmplitudeDampingChannelApplication.
Compatibility notes:
- cupauliprop{Prepare,Execute}CanonicalSort has been renamed to cupauliprop{Prepare,Execute}Sort.
- Boolean isSorted/makeSorted parameters have been replaced with the cupaulipropSortOrder_t enum:
  - CUPAULIPROP_SORT_ORDER_NONE: no sorting (equivalent to makeSorted=false)
  - CUPAULIPROP_SORT_ORDER_INTERNAL: the library may choose any sort order
  - CUPAULIPROP_SORT_ORDER_LITTLE_ENDIAN_BITWISE: little-endian bitwise sort (equivalent to makeSorted=true in the previous release)
  Affected functions include cupaulipropCreatePauliExpansion, cupaulipropPauliExpansionView{Prepare,Compute}OperatorApplication, cupaulipropPauliExpansionView{Prepare,Execute}Sort, and cupaulipropPauliExpansionView{Prepare,Execute}Deduplication.
- The cupaulipropContextSetStream function has been removed. A cudaStream_t stream parameter has been added to the following API functions:
  - cupaulipropPauliExpansionViewComputeOperatorApplication
  - cupaulipropPauliExpansionViewExecuteSort
  - cupaulipropPauliExpansionViewExecuteDeduplication
  - cupaulipropPauliExpansionViewExecuteTruncation
  - cupaulipropPauliExpansionViewComputeTraceWithZeroState
  - cupaulipropPauliExpansionViewComputeTraceWithExpansionView
  - cupaulipropPauliExpansionPopulateFromView
- The PauliKind enum values have been reordered from I=0, X=1, Z=2, Y=3 to I=0, X=1, Y=2, Z=3. The meaning of CUPAULIPROP_SORT_ORDER_LITTLE_ENDIAN_BITWISE for PauliExpansion is not affected by this change. Sorting is based on the (X, Z) bit representation, not enum values.
- The CliffordGateKind enum values have been reordered to align with PauliKind:

  Old                                     New
  CUPAULIPROP_CLIFFORD_GATE_Z = 2         CUPAULIPROP_CLIFFORD_GATE_Y = 2
  CUPAULIPROP_CLIFFORD_GATE_Y = 3         CUPAULIPROP_CLIFFORD_GATE_Z = 3
  CUPAULIPROP_CLIFFORD_GATE_CZ = 8        CUPAULIPROP_CLIFFORD_GATE_CY = 8
  CUPAULIPROP_CLIFFORD_GATE_CY = 9        CUPAULIPROP_CLIFFORD_GATE_CZ = 9
  CUPAULIPROP_CLIFFORD_GATE_SQRTZ = 13    CUPAULIPROP_CLIFFORD_GATE_SQRTY = 13
  CUPAULIPROP_CLIFFORD_GATE_SQRTY = 14    CUPAULIPROP_CLIFFORD_GATE_SQRTZ = 14
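Code that uses the symbolic enum names only needs recompiling, but raw integer values stored or hard-coded against the previous release (for example in serialized circuits) would now be misinterpreted. The following sketch shows one way to translate such values. The helper names are hypothetical and not part of cuPauliProp, and the assumption that all values not listed above are unchanged is inferred from these notes rather than confirmed.

```c
/* Hypothetical migration helpers (not part of cuPauliProp) for integer enum
 * values that were stored or hard-coded against the previous release. */

/* PauliKind: old I=0, X=1, Z=2, Y=3  ->  new I=0, X=1, Y=2, Z=3. */
static int remap_old_pauli_kind(int oldValue)
{
    switch (oldValue) {
    case 2:  return 3;        /* old Z -> new Z */
    case 3:  return 2;        /* old Y -> new Y */
    default: return oldValue; /* I and X are unchanged */
    }
}

/* CliffordGateKind: only the values listed in the release note move. */
static int remap_old_clifford_gate_kind(int oldValue)
{
    switch (oldValue) {
    case 2:  return 3;        /* old Z     -> new Z     */
    case 3:  return 2;        /* old Y     -> new Y     */
    case 8:  return 9;        /* old CZ    -> new CZ    */
    case 9:  return 8;        /* old CY    -> new CY    */
    case 13: return 14;       /* old SQRTZ -> new SQRTZ */
    case 14: return 13;       /* old SQRTY -> new SQRTY */
    default: return oldValue; /* assumed unchanged: not listed in the note */
    }
}
```

Each mapping is a pairwise swap, so applying it twice returns the original value. The same helper therefore also converts new values back to the old numbering.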
Known issues:
prepare calls currently accumulate (max-retained) workspace sizes across successive calls on the same descriptor. However, prepare calls are expected to have overwrite semantics regarding the workspace descriptor: each call sets the required buffer size for that operation alone, discarding any previously stored size or attached buffer. Future releases will transition to overwrite semantics. Code that relies on a single descriptor accumulating the peak of multiple prepare calls should instead query the required size after each prepare call and track the maximum on the caller side. For example:

    // Correct pattern: track maximum workspace size explicitly
    int64_t maxWorkspaceSize = 0;
    int64_t reqSize;

    cupaulipropPauliExpansionViewPrepareOperatorApplication(..., workspace);
    cupaulipropWorkspaceGetMemorySize(handle, workspace, ..., &reqSize);
    if (reqSize > maxWorkspaceSize) maxWorkspaceSize = reqSize;

    cupaulipropPauliExpansionViewPrepareTraceWithZeroState(..., workspace);
    cupaulipropWorkspaceGetMemorySize(handle, workspace, ..., &reqSize);
    if (reqSize > maxWorkspaceSize) maxWorkspaceSize = reqSize;
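To illustrate why caller-side tracking matters once overwrite semantics are in place, the sketch below uses a mock descriptor in plain C. MockWorkspace, mock_prepare, track_max, and peak_over_three_ops are stand-ins invented for this example and are not cuPauliProp types or functions. The byte counts are made up.

```c
#include <stdint.h>

/* Stand-in for a workspace descriptor with overwrite semantics. This is a
 * mock for illustration only; it is not the cuPauliProp descriptor type. */
typedef struct {
    int64_t requiredSize;
} MockWorkspace;

/* Mock "prepare": overwrites any previously stored size. */
static void mock_prepare(MockWorkspace *ws, int64_t sizeForThisOp)
{
    ws->requiredSize = sizeForThisOp;
}

/* Caller-side tracking: fold each queried size into a running maximum. */
static int64_t track_max(int64_t currentMax, int64_t reqSize)
{
    return reqSize > currentMax ? reqSize : currentMax;
}

/* Runs three mock prepares and returns the peak size the caller must allocate. */
static int64_t peak_over_three_ops(MockWorkspace *ws)
{
    const int64_t sizes[3] = {100, 300, 200}; /* made-up byte counts */
    int64_t maxSize = 0;
    for (int i = 0; i < 3; ++i) {
        mock_prepare(ws, sizes[i]);
        maxSize = track_max(maxSize, ws->requiredSize);
    }
    return maxSize;
}
```

After the loop the mock descriptor holds only the size of the last operation (200), while the caller-tracked maximum is 300. A single buffer sized from the descriptor alone would be too small for the second operation.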
cuPauliProp v0.1.0
Initial release of cuPauliProp. We do not yet guarantee API or ABI stability between minor version updates.
- Single-GPU capabilities only
- Support Linux x86_64 and Linux Arm64 targets
- Support Turing, Ampere, Ada and Hopper NVIDIA GPU architectures (compute capability 7.5+)
Compatibility notes:
cuPauliProp requires CUDA 12 or above
Known issues:
The docstring for PauliNoiseChannel incorrectly documents the noise probability ordering as I=0, X=1, Z=2, Y=3, but the actual ordering follows I=0, X=1, Y=2, Z=3.