Release Notes

cuPauliProp v0.3.0

New features:

  • Added support for reverse-mode automatic differentiation of parameterized quantum circuits and traces. This permits, among other things, rapid calculation of gradients of expectation values as utilised by quantum variational algorithms and classical optimisation techniques. The relevant new functions are:

    • cupaulipropPauliExpansionViewPrepareTraceWithExpansionViewBackwardDiff

    • cupaulipropPauliExpansionViewComputeTraceWithExpansionViewBackwardDiff

    • cupaulipropPauliExpansionViewPrepareTraceWithZeroStateBackwardDiff

    • cupaulipropPauliExpansionViewComputeTraceWithZeroStateBackwardDiff

    • cupaulipropPauliExpansionViewPrepareOperatorApplicationBackwardDiff

    • cupaulipropPauliExpansionViewComputeOperatorApplicationBackwardDiff

    • cupaulipropQuantumOperatorAttachCotangentBuffer

    • cupaulipropQuantumOperatorGetCotangentBuffer

Compatibility notes:

  • The trace scalar formerly output by functions cupaulipropPauliExpansionViewComputeTraceWithExpansionView and cupaulipropPauliExpansionViewComputeTraceWithZeroState has been separated into output scalars traceSignificand and traceExponent. The full trace can be obtained in a single scalar as traceSignificand * pow(2,traceExponent), though this may suffer from numerical precision issues for systems of very many qubits.

  • cupaulipropQuantumOperatorGetKind has been removed.

  • cupaulipropQuantumOperatorKind_t has been removed.

  • cupaulipropComputeType_t has been removed.
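
  The recombination traceSignificand * pow(2,traceExponent) mentioned above can be sketched as follows. This is an illustration only: it assumes a real double significand and an integer exponent, which these notes do not specify, and the helper names are hypothetical. The second helper keeps the value as a normalised pair so that very large exponents do not overflow a double.

```c
#include <assert.h>
#include <math.h>

/* Normalised pair: value = mantissa * 2^exponent. */
typedef struct {
    double mantissa;  /* 0.5 <= |mantissa| < 1, or exactly 0 */
    long exponent;
} ScaledTrace;

/* Direct recombination into one double.
 * Overflows to infinity once the result exceeds the double range. */
double trace_as_double(double traceSignificand, int traceExponent) {
    return ldexp(traceSignificand, traceExponent);
}

/* Overflow-free alternative: renormalise the pair instead of collapsing it. */
ScaledTrace trace_as_scaled(double traceSignificand, long traceExponent) {
    assert(isfinite(traceSignificand));
    int shift = 0;
    /* frexp splits the significand as m * 2^shift with 0.5 <= |m| < 1. */
    double m = frexp(traceSignificand, &shift);
    ScaledTrace out = { m, (m == 0.0) ? 0 : traceExponent + shift };
    return out;
}
```

  Two traces held as ScaledTrace values can then be compared or divided through their exponents and mantissas without ever forming the full scalar.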

Bugs fixed:

  • Previously, cupaulipropPauliExpansionViewComputeTraceWithExpansionView was missing a pow(2,N) prefactor when evaluating the trace of N-qubit views. As such, when given two views both containing only a single, unit-coefficient all-identity Pauli string, the function returned 1 in lieu of the correct pow(2,N) value. The function now returns a separate traceSignificand and traceExponent, as above, where traceExponent presently encodes the (log2 of the) prefactor.
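
  The origin of the pow(2,N) prefactor can be seen in a small standalone sketch. It relies on the identity Tr(P Q) = pow(2,N) when the N-qubit Pauli strings P and Q are equal, and 0 otherwise. The Term layout (X and Z bitmasks for up to 64 qubits), the restriction to real coefficients, and the function name are assumptions made for illustration and are not the library's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical term: a Pauli string as X/Z bitmasks plus a real coefficient. */
typedef struct {
    uint64_t x;
    uint64_t z;
    double coeff;
} Term;

/* Trace of the product of two expansions, returned as a
 * (significand, exponent) pair so that 2^numQubits is never formed. */
void trace_of_product(const Term *a, size_t na,
                      const Term *b, size_t nb,
                      int numQubits,
                      double *traceSignificand, int *traceExponent) {
    assert(numQubits >= 0 && numQubits <= 64);
    double sum = 0.0;
    for (size_t i = 0; i < na; ++i) {
        for (size_t j = 0; j < nb; ++j) {
            /* Only identical Pauli strings have a non-zero trace. */
            if (a[i].x == b[j].x && a[i].z == b[j].z) {
                sum += a[i].coeff * b[j].coeff;
            }
        }
    }
    *traceSignificand = sum;
    *traceExponent = numQubits;  /* log2 of the 2^N prefactor */
}
```

  For two expansions that each hold one unit-coefficient all-identity string, this yields a significand of 1 and an exponent of N, matching the corrected behaviour described above.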

Known issues:

  • When truncation removes all terms from a Pauli expansion (producing a 0-term result), subsequent operations on that expansion, such as further gate application, trace computation, or backward differentiation, fail with CUPAULIPROP_STATUS_INVALID_VALUE and the message "Term index is out of range!". This can occur when cupaulipropPauliExpansionViewComputeOperatorApplication is called with a truncation specification whose coefficient or Pauli-weight cutoff eliminates every term. The same error is raised by cupaulipropPauliExpansionPopulateFromView when the source view has zero terms.

  • Workspace buffers attached via cupaulipropWorkspaceSetMemory must be aligned to 16 bytes. Passing an unaligned buffer may result in undefined behavior or CUDA errors. This requirement is not currently documented in the API header.
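
  Until the alignment requirement is documented in the header, callers can check it themselves. The sketch below uses hypothetical helper names and only standard C. Pointers returned directly by cudaMalloc are already aligned far beyond 16 bytes, so the check matters mainly when a workspace is carved out of a larger pool at an arbitrary byte offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORKSPACE_ALIGNMENT 16u

/* Returns non-zero when ptr is a multiple of 16 bytes. */
int is_workspace_aligned(const void *ptr) {
    return ((uintptr_t)ptr % WORKSPACE_ALIGNMENT) == 0;
}

/* Rounds a byte offset into a pool up to the next multiple of 16. */
size_t align_offset_up(size_t offset) {
    size_t mask = (size_t)WORKSPACE_ALIGNMENT - 1;
    assert(offset <= SIZE_MAX - mask);  /* guard against wrap-around */
    return (offset + mask) & ~mask;
}
```

  A sub-allocator would apply align_offset_up to each offset before handing the resulting pointer to cupaulipropWorkspaceSetMemory, and may assert is_workspace_aligned on the final pointer in debug builds.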

cuPauliProp v0.2.0

New features:

  • Added amplitude damping channel support via cupaulipropPrepareAmplitudeDampingChannelApplication and cupaulipropExecuteAmplitudeDampingChannelApplication

Compatibility notes:

  • cupauliprop{Prepare,Execute}CanonicalSort has been renamed to cupauliprop{Prepare,Execute}Sort.

  • The Boolean isSorted/makeSorted parameters have been replaced with the cupaulipropSortOrder_t enum:

    • CUPAULIPROP_SORT_ORDER_NONE: no sorting (equivalent to makeSorted=false)

    • CUPAULIPROP_SORT_ORDER_INTERNAL: the library may choose any sort order

    • CUPAULIPROP_SORT_ORDER_LITTLE_ENDIAN_BITWISE: little-endian bitwise sort (equivalent to makeSorted=true in previous release)

    Affected functions include cupaulipropCreatePauliExpansion, cupaulipropPauliExpansionView{Prepare,Compute}OperatorApplication, cupaulipropPauliExpansionView{Prepare,Execute}Sort, and cupaulipropPauliExpansionView{Prepare,Execute}Deduplication.

  • The cupaulipropContextSetStream function has been removed. A cudaStream_t stream parameter has been added to the following API functions:

    • cupaulipropPauliExpansionViewComputeOperatorApplication

    • cupaulipropPauliExpansionViewExecuteSort

    • cupaulipropPauliExpansionViewExecuteDeduplication

    • cupaulipropPauliExpansionViewExecuteTruncation

    • cupaulipropPauliExpansionViewComputeTraceWithZeroState

    • cupaulipropPauliExpansionViewComputeTraceWithExpansionView

    • cupaulipropPauliExpansionPopulateFromView

  • The PauliKind enum values have been reordered from I=0, X=1, Z=2, Y=3 to I=0, X=1, Y=2, Z=3. The meaning of CUPAULIPROP_SORT_ORDER_LITTLE_ENDIAN_BITWISE for PauliExpansion is not affected by this change. Sorting is based on the (X, Z) bit representation, not enum values.

  • The CliffordGateKind enum values have been reordered to align with PauliKind:

    Old                                     New

    CUPAULIPROP_CLIFFORD_GATE_Z = 2         CUPAULIPROP_CLIFFORD_GATE_Y = 2

    CUPAULIPROP_CLIFFORD_GATE_Y = 3         CUPAULIPROP_CLIFFORD_GATE_Z = 3

    CUPAULIPROP_CLIFFORD_GATE_CZ = 8        CUPAULIPROP_CLIFFORD_GATE_CY = 8

    CUPAULIPROP_CLIFFORD_GATE_CY = 9        CUPAULIPROP_CLIFFORD_GATE_CZ = 9

    CUPAULIPROP_CLIFFORD_GATE_SQRTZ = 13    CUPAULIPROP_CLIFFORD_GATE_SQRTY = 13

    CUPAULIPROP_CLIFFORD_GATE_SQRTY = 14    CUPAULIPROP_CLIFFORD_GATE_SQRTZ = 14
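
  Code that uses the symbolic enum names only needs recompiling after the PauliKind and CliffordGateKind reordering. Raw integers that were hard-coded or written to disk under v0.1.0 need remapping. The following sketch uses hypothetical function names and assumes that every value not listed in the notes above is unchanged.

```c
#include <assert.h>

/* Maps a Pauli kind stored with the v0.1.0 ordering (I=0, X=1, Z=2, Y=3)
 * to the v0.2.0 ordering (I=0, X=1, Y=2, Z=3). */
int remap_pauli_kind(int oldValue) {
    assert(oldValue >= 0 && oldValue <= 3);
    switch (oldValue) {
        case 2:  return 3;  /* Z moved from 2 to 3 */
        case 3:  return 2;  /* Y moved from 3 to 2 */
        default: return oldValue;
    }
}

/* Maps a Clifford gate kind stored with the v0.1.0 values to v0.2.0.
 * Assumption: values outside the listed pairs keep their meaning. */
int remap_clifford_gate_kind(int oldValue) {
    switch (oldValue) {
        case 2:  return 3;   /* Z     */
        case 3:  return 2;   /* Y     */
        case 8:  return 9;   /* CZ    */
        case 9:  return 8;   /* CY    */
        case 13: return 14;  /* SQRTZ */
        case 14: return 13;  /* SQRTY */
        default: return oldValue;
    }
}
```

  Both maps are their own inverse, so the same functions convert v0.2.0 values back to the v0.1.0 numbering.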

Known issues:

  • Prepare calls currently accumulate workspace sizes across successive calls on the same descriptor, retaining the maximum size requested so far. However, prepare calls are expected to have overwrite semantics regarding the workspace descriptor: each call sets the required buffer size for that operation alone, discarding any previously stored size or attached buffer. Future releases will transition to overwrite semantics. Code that relies on a single descriptor accumulating the peak of multiple prepare calls should instead query the required size after each prepare call and track the maximum on the caller side. For example:

    // Correct pattern: track maximum workspace size explicitly
    int64_t maxWorkspaceSize = 0;
    int64_t reqSize;
    
    cupaulipropPauliExpansionViewPrepareOperatorApplication(..., workspace);
    cupaulipropWorkspaceGetMemorySize(handle, workspace, ..., &reqSize);
    if (reqSize > maxWorkspaceSize) maxWorkspaceSize = reqSize;
    
    cupaulipropPauliExpansionViewPrepareTraceWithZeroState(..., workspace);
    cupaulipropWorkspaceGetMemorySize(handle, workspace, ..., &reqSize);
    if (reqSize > maxWorkspaceSize) maxWorkspaceSize = reqSize;
    

cuPauliProp v0.1.0

  • Initial release of cuPauliProp. We do not yet guarantee API or ABI stability between minor version updates.

  • Single-GPU capabilities only

  • Support Linux x86_64 and Linux Arm64 targets

  • Support Turing, Ampere, Ada and Hopper NVIDIA GPU architectures (compute capability 7.5+)

Compatibility notes:

  • cuPauliProp requires CUDA 12 or above

Known issues:

  • The docstring for PauliNoiseChannel incorrectly documents the noise probability ordering as I=0, X=1, Z=2, Y=3, but the actual ordering follows I=0, X=1, Y=2, Z=3.
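
  To avoid relying on the incorrect docstring, probabilities can be written through named indices. The sketch below assumes a single-qubit channel described by four probabilities in the actual I, X, Y, Z order. The index names and the helper are hypothetical, and how the array is handed to cuPauliProp is not shown.

```c
#include <assert.h>

/* Actual probability ordering: I=0, X=1, Y=2, Z=3. */
enum { NOISE_IDX_I = 0, NOISE_IDX_X = 1, NOISE_IDX_Y = 2, NOISE_IDX_Z = 3 };

/* Fills the probabilities of a dephasing channel that applies Z with
 * probability p and leaves the qubit untouched otherwise. */
void fill_dephasing_probabilities(double p, double probs[4]) {
    assert(p >= 0.0 && p <= 1.0);
    probs[NOISE_IDX_I] = 1.0 - p;
    probs[NOISE_IDX_X] = 0.0;
    probs[NOISE_IDX_Y] = 0.0;
    probs[NOISE_IDX_Z] = p;
}
```

  Following the docstring's ordering instead would place p at index 2, which the library interprets as a Y error.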