Unified Coulomb/Exchange Gradient

group JKGrad

Functions

cuestStatus_t cuestDFSymmetricDerivativeCompute( cuestHandle_t handle, const cuestDFIntPlan_t plan, const cuestDFSymmetricDerivativeComputeParameters_t parameters, const cuestWorkspaceDescriptor_t *variableBufferSize, cuestWorkspace_t *temporaryWorkspace, double densityScale, const double *densityMatrix, double coefficientScale, uint64_t numCoefficientMatrices, const uint64_t *numOccupied, const double *coefficientMatrices, double *outGradient )

Compute the nuclear gradient of the symmetric DF J/K matrices contracted with a density matrix and multiple coefficient matrices.

This routine evaluates the nuclear derivatives (gradients) of the Coulomb (J) and exchange (K) matrices in the density-fitting (DF) approximation. The J matrix is contracted with a single density matrix, and the K matrices are contracted with one or more occupied-orbital coefficient matrices. The result is accumulated into the output gradient buffer (size: natom × 3).

All required temporary workspace must be sized using the corresponding workspace query function, cuestDFSymmetricDerivativeComputeWorkspaceQuery. A variableBufferSize descriptor must also be provided; it sets the size of an internal scratch buffer and does not cap total memory usage.

The bare Coulomb and exchange energies are given as:

\[ E_{J} = \sum_{mnrs} D_{mn} (mn|rs) D_{rs} \]

\[ E_{K}^{N} = \sum_{mnrs} \sum_{ij} C_{im}^{N} C_{jn}^{N} (mn|rs) C_{ir}^{N} C_{js}^{N} \]

An effective J/K energy can be written as:

\[ E_{JK} = s_D \, E_{J} + \sum_{N} s_C \, f_E \, E_{K}^{N} \]

where

\( s_D \) is the density scaling factor, densityScale;
\( s_C \) is the coefficient scaling factor, coefficientScale;
\( f_E \) is the fraction of Hartree-Fock exchange to add, CUEST_DFINTPLAN_PARAMETERS_EXCHANGE_FRACTION.

This function computes: \(dE_{JK}/dR\), where R represents the basis function centers of the electron repulsion integrals.
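The energy expression above can be checked on a toy problem with a host-side reference implementation. The sketch below is not part of cuEST: the `toy_*` function names, the dense row-major storage of the four-index integrals (mn|rs), and the row-major [nocc, nao] coefficient layout are all assumptions made for illustration. It evaluates the bare energies directly from their definitions and then combines them with the scaling factors.

```c
#include <assert.h>
#include <stddef.h>

/* E_J = sum_{mnrs} D_mn (mn|rs) D_rs.
   eri holds n^4 doubles, indexed as eri[((m*n + n2)*n + r)*n + s] (assumed layout). */
double toy_coulomb_energy(size_t n, const double *eri, const double *D)
{
    assert(eri != NULL && D != NULL);
    double e = 0.0;
    for (size_t mu = 0; mu < n; ++mu)
        for (size_t nu = 0; nu < n; ++nu)
            for (size_t la = 0; la < n; ++la)
                for (size_t si = 0; si < n; ++si)
                    e += D[mu * n + nu]
                       * eri[((mu * n + nu) * n + la) * n + si]
                       * D[la * n + si];
    return e;
}

/* E_K = sum_{mnrs} sum_{ij} C_im C_jn (mn|rs) C_ir C_js for one coefficient matrix.
   C is nocc x n, row-major (assumed layout). */
double toy_exchange_energy(size_t n, size_t nocc, const double *eri, const double *C)
{
    assert(eri != NULL && C != NULL);
    double e = 0.0;
    for (size_t i = 0; i < nocc; ++i)
        for (size_t j = 0; j < nocc; ++j)
            for (size_t mu = 0; mu < n; ++mu)
                for (size_t nu = 0; nu < n; ++nu)
                    for (size_t la = 0; la < n; ++la)
                        for (size_t si = 0; si < n; ++si)
                            e += C[i * n + mu] * C[j * n + nu]
                               * eri[((mu * n + nu) * n + la) * n + si]
                               * C[i * n + la] * C[j * n + si];
    return e;
}

/* E_JK = s_D * E_J + sum_N s_C * f_E * E_K^N. */
double toy_jk_energy(double sD, double eJ, double sC, double fE,
                     const double *eK, size_t numMatrices)
{
    double e = sD * eJ;
    for (size_t k = 0; k < numMatrices; ++k)
        e += sC * fE * eK[k];
    return e;
}
```

The loops scale as O(n^4) and O(nocc^2 n^4), so this is only suitable for validating conventions on very small systems, not as a substitute for the DF routine.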

CUEST_DFINTPLAN_PARAMETERS_EXCHANGE_FRACTION is set when the cuestDFIntPlan_t is constructed and is intended to describe the fraction of HF exchange in hybrid functionals.

The densityScale and coefficientScale parameters are intended to account for the spin summations needed to obtain the correct two-electron energy.

For example, in a B3LYP RKS computation the intended usage would be:

CUEST_DFINTPLAN_PARAMETERS_EXCHANGE_FRACTION = 0.2
densityScale = 2.0
coefficientScale = -1.0
numCoefficientMatrices = 1
\( D_{mn} = \sum_{i} C_{im} C_{in} \)
\( I_{ij} = \sum_{mn} C_{im} S_{mn} C_{jn} \)

For a PBE0 UKS computation the intended usage would be:

CUEST_DFINTPLAN_PARAMETERS_EXCHANGE_FRACTION = 0.25
densityScale = 0.5
coefficientScale = -0.5
numCoefficientMatrices = 2
\(D_{mn} = \sum_{i} C_{im}^{\alpha} C_{in}^{\alpha} + \sum_{i} C_{im}^{\beta} C_{in}^{\beta} \)
\(I_{ij} = \sum_{mn} C_{im}^{\alpha} S_{mn} C_{jn}^{\alpha} \)
\(I_{ij} = \sum_{mn} C_{im}^{\beta} S_{mn} C_{jn}^{\beta} \)
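The two usage examples differ only in the scale factors and in how many coefficient matrices contribute to the density. The following host-side sketch illustrates this; the `Toy*` and `toy_*` names are hypothetical, and the row-major [numOccupied[i], nao] storage of each concatenated matrix is an assumption. It builds D from the concatenated occupied coefficients, so one matrix gives the RKS density and two give the UKS total density. The arrays passed to the actual routine must reside on the GPU, as stated in the parameter list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bundle of the scale factors from the examples above. */
typedef struct {
    double   densityScale;
    double   coefficientScale;
    uint64_t numCoefficientMatrices;
} ToyJKScales;

/* RKS: one spatial coefficient matrix, spin summation folded into the scales. */
ToyJKScales toy_scales_rks(void) { return (ToyJKScales){ 2.0, -1.0, 1 }; }

/* UKS: separate alpha and beta coefficient matrices. */
ToyJKScales toy_scales_uks(void) { return (ToyJKScales){ 0.5, -0.5, 2 }; }

/* D_mn = sum over all coefficient matrices and occupied orbitals i of C_im C_in.
   Each matrix is assumed row-major with shape [numOccupied[k], nao]. */
void toy_build_density(size_t nao, uint64_t numCoefficientMatrices,
                       const uint64_t *numOccupied,
                       const double *coefficientMatrices, double *D)
{
    assert(D != NULL);
    assert(numCoefficientMatrices == 0 ||
           (numOccupied != NULL && coefficientMatrices != NULL));

    for (size_t k = 0; k < nao * nao; ++k)
        D[k] = 0.0;

    const double *C = coefficientMatrices;
    for (uint64_t k = 0; k < numCoefficientMatrices; ++k) {
        for (uint64_t i = 0; i < numOccupied[k]; ++i) {
            const double *row = C + (size_t)i * nao;  /* orbital i of matrix k */
            for (size_t m = 0; m < nao; ++m)
                for (size_t n = 0; n < nao; ++n)
                    D[m * nao + n] += row[m] * row[n];
        }
        C += (size_t)numOccupied[k] * nao;  /* advance to the next matrix */
    }
}
```

With these conventions, D carries no factor of two in the RKS case; the spin summation enters through densityScale = 2.0 instead.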

For pure functionals, where coefficientScale = 0.0 (and/or CUEST_DFINTPLAN_PARAMETERS_EXCHANGE_FRACTION = 0.0), the coefficient matrices do not need to be provided (if they are, they will not be used). For the pure functional case, the numCoefficientMatrices can be 0. The numOccupied array can be NULL (or contain zeros). The coefficientMatrices array can be NULL.

The following memory usage considerations apply only to cases where the fraction of Hartree-Fock exchange is non-zero.

Memory usage during the gradient evaluation can be controlled at a high level through CUEST_DFSYMMETRICDERIVATIVECOMPUTE_PARAMETERS_MEMORY_POLICY. The default, CUEST_DFSYMMETRICDERIVATIVECOMPUTE_MEMORY_POLICY_FULL, usually provides the best performance but may try to allocate a large amount of memory. If that exceeds the capacity of the current device, switch to CUEST_DFSYMMETRICDERIVATIVECOMPUTE_MEMORY_POLICY_BLOCKED, which uses less memory but is usually slightly slower.
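The fallback rule can be expressed as a small decision helper. This is a sketch only: the `ToyMemoryPolicy` enum is a stand-in for the two documented policy values, and it assumes the caller has already obtained a device-memory requirement for each policy (for example from separate workspace queries) and the amount of free device memory. How those numbers are obtained is outside the scope of this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the FULL and BLOCKED memory policies,
   plus a value for the case where neither fits. */
typedef enum {
    TOY_POLICY_FULL,
    TOY_POLICY_BLOCKED,
    TOY_POLICY_NONE
} ToyMemoryPolicy;

/* Prefer FULL for performance; fall back to BLOCKED when FULL does not fit. */
ToyMemoryPolicy toy_choose_policy(size_t fullBytes, size_t blockedBytes,
                                  size_t freeBytes)
{
    /* BLOCKED is documented as the lower-memory option. */
    assert(blockedBytes <= fullBytes);

    if (fullBytes <= freeBytes)
        return TOY_POLICY_FULL;
    if (blockedBytes <= freeBytes)
        return TOY_POLICY_BLOCKED;
    return TOY_POLICY_NONE;
}
```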

Note that variableBufferSize only applies to the formation of certain intermediates during cuestDFSymmetricDerivativeCompute and does not limit overall memory usage. We recommend a buffer size of at least 2 GB; larger buffers may give modest performance improvements.

Parameters:

handle – [in] cuEST handle. Must not be NULL.
plan – [in] DF integral computation plan (opaque handle). Must not be NULL.
parameters – [in] Compute parameters (cuestDFSymmetricDerivativeComputeParameters_t). Must not be NULL.
variableBufferSize – [in] Size of an internal scratch buffer used for certain transformations. A buffer of 2 GB is a reasonable default. Host memory is currently unused. This value does not limit total memory usage and must not change between the WorkspaceQuery and Compute calls.
temporaryWorkspace – [in] Temporary workspace buffers (preallocated for this operation using the query function). Must not be NULL.
densityScale – [in] Scaling factor for the J (Coulomb) contribution.
densityMatrix – [in] Input density matrix (size: nao × nao) on GPU. Must not be NULL.
coefficientScale – [in] Scaling factor for each K (exchange) contribution.
numCoefficientMatrices – [in] Number of coefficient matrices. Must be > 0 when coefficientScale != 0.0; may be 0 for pure functionals.
numOccupied – [in] Array of occupied orbital counts for each coefficient matrix (size: numCoefficientMatrices) on CPU. When coefficientScale != 0.0, must not be NULL and each entry must be > 0; for pure functionals it may be NULL or contain zeros.
coefficientMatrices – [in] Concatenated coefficient matrices. Each matrix has shape [numOccupied[i], nao] on GPU. Total size: sum(numOccupied[i]) × nao. Must not be NULL when coefficientScale != 0.0; may be NULL for pure functionals.
outGradient – [out] Output nuclear gradient (size: natom × 3) on GPU. Gradient results overwrite this buffer. Must not be NULL.
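The array sizes listed above translate directly into allocation sizes. The helper below is a minimal sketch with hypothetical names (`ToyBufferSizes`, `toy_buffer_sizes`); it only performs the size arithmetic stated in the parameter descriptions and leaves the actual GPU allocation to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte counts for the three double-precision arrays passed to the routine. */
typedef struct {
    size_t densityBytes;      /* nao x nao */
    size_t coefficientBytes;  /* sum(numOccupied[i]) x nao */
    size_t gradientBytes;     /* natom x 3 */
} ToyBufferSizes;

ToyBufferSizes toy_buffer_sizes(size_t nao, size_t natom,
                                uint64_t numCoefficientMatrices,
                                const uint64_t *numOccupied)
{
    /* numOccupied is only read when there is at least one matrix. */
    assert(numCoefficientMatrices == 0 || numOccupied != NULL);

    uint64_t totalOccupied = 0;
    for (uint64_t k = 0; k < numCoefficientMatrices; ++k)
        totalOccupied += numOccupied[k];

    ToyBufferSizes sizes;
    sizes.densityBytes     = nao * nao * sizeof(double);
    sizes.coefficientBytes = (size_t)totalOccupied * nao * sizeof(double);
    sizes.gradientBytes    = natom * 3 * sizeof(double);
    return sizes;
}
```

For a pure functional with numCoefficientMatrices = 0, coefficientBytes evaluates to zero, which matches the case where coefficientMatrices may be NULL.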

Returns:

CUEST_STATUS_SUCCESS on success;
CUEST_STATUS_INVALID_HANDLE if the cuEST handle is NULL;
CUEST_STATUS_NULL_POINTER if any required pointer is NULL;
CUEST_STATUS_INVALID_TYPE if opaque handles are not the correct type;
CUEST_STATUS_INVALID_ARGUMENT if coefficientScale != 0.0 and either numCoefficientMatrices == 0 or any numOccupied[i] == 0;
CUEST_STATUS_EXCEPTION or CUEST_STATUS_UNKNOWN_ERROR otherwise.
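The CUEST_STATUS_INVALID_ARGUMENT condition can be mirrored by a caller-side check before the call. The function below is a hypothetical helper, not a cuEST API; it encodes only the documented rule and treats a NULL numOccupied array with a non-zero matrix count as a separate programming error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when the exchange-related arguments satisfy the documented rule:
   invalid only if coefficientScale != 0.0 and either there are no coefficient
   matrices or any occupied count is zero. */
bool toy_exchange_args_valid(double coefficientScale,
                             uint64_t numCoefficientMatrices,
                             const uint64_t *numOccupied)
{
    if (coefficientScale == 0.0)
        return true;   /* pure functional: exchange inputs are not used */
    if (numCoefficientMatrices == 0)
        return false;

    /* A NULL array here belongs to a different error class (NULL pointer). */
    assert(numOccupied != NULL);

    for (uint64_t k = 0; k < numCoefficientMatrices; ++k)
        if (numOccupied[k] == 0)
            return false;
    return true;
}
```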

cuestStatus_t cuestDFSymmetricDerivativeComputeWorkspaceQuery( cuestHandle_t handle, const cuestDFIntPlan_t plan, const cuestDFSymmetricDerivativeComputeParameters_t parameters, const cuestWorkspaceDescriptor_t *variableBufferSize, cuestWorkspaceDescriptor_t *temporaryWorkspaceDescriptor, double densityScale, const double *densityMatrix, double coefficientScale, uint64_t numCoefficientMatrices, const uint64_t *numOccupied, const double *coefficientMatrices, double *outGradient )

Query the temporary workspace required for a DF J/K nuclear gradient computation.

This function determines the workspace required for a symmetric DF J/K gradient evaluation with the given plan, density matrix, coefficient matrices, and maximum workspace constraint. The output descriptor is filled on success and can be used to allocate host and device workspace buffers prior to calling cuestDFSymmetricDerivativeCompute.

Some input pointers (densityMatrix, coefficientMatrices, outGradient) are optional and may be NULL. If provided, they can refine workspace estimates but are not modified.

Parameters:

handle – [in] cuEST handle. Must not be NULL.
plan – [in] DF integral computation plan (opaque handle). Must not be NULL.
parameters – [in] Compute parameters (cuestDFSymmetricDerivativeComputeParameters_t). Must not be NULL.
variableBufferSize – [in] Size of an internal scratch buffer used for certain transformations. A buffer of 2 GB is a reasonable default. Host memory is currently unused. This value does not limit total memory usage and must not change between the WorkspaceQuery and Compute calls.
temporaryWorkspaceDescriptor – [out] Output descriptor for required temporary workspace sizes (host and device). Must not be NULL.
densityScale – [in] Scaling factor for the J (Coulomb) contribution.
densityMatrix – [in] Optional density matrix. May be NULL.
coefficientScale – [in] Scaling factor for each K (exchange) contribution.
numCoefficientMatrices – [in] Number of coefficient matrices. Must be > 0 when coefficientScale != 0.0.
numOccupied – [in] Array of occupied orbital counts for each coefficient matrix (size: numCoefficientMatrices) on CPU. When coefficientScale != 0.0, must not be NULL and each entry must be > 0.
coefficientMatrices – [in] Optional concatenated coefficient matrices. May be NULL.
outGradient – [in] Optional output gradient buffer. May be NULL.

Returns:

CUEST_STATUS_SUCCESS on success;
CUEST_STATUS_INVALID_HANDLE if the cuEST handle is NULL;
CUEST_STATUS_NULL_POINTER if any required pointer is NULL;
CUEST_STATUS_INVALID_TYPE if opaque handles are not the correct type;
CUEST_STATUS_INVALID_ARGUMENT if coefficientScale != 0.0 and either numCoefficientMatrices == 0 or any numOccupied[i] == 0;
CUEST_STATUS_EXCEPTION or CUEST_STATUS_UNKNOWN_ERROR otherwise.