Available CUDA Kernels#

Here we list the available CUDA kernels in cuEquivariance and their use cases.

Fused TP#

This kernel is useful for tensor products whose operands are small.

import cuequivariance as cue
import cuequivariance_torch as cuet

ns, nv = 48, 10
irreps_feat = cue.Irreps("O3", f"{ns}x0e+{nv}x1o+{nv}x1e+{ns}x0o")
irreps_sh = cue.Irreps("O3", "0e + 1o + 2e")
cuet.FullyConnectedTensorProduct(irreps_feat, irreps_sh, irreps_feat, layout=cue.ir_mul)

Low-level interface (non-public API): cuequivariance_ops_torch.FusedTensorProductOp3 and cuequivariance_ops_torch.FusedTensorProductOp4

Uniform 1d#

This kernel works for segmented tensor products (STPs) whose subscripts match one of the patterns
  • ^(|u),(|u),(|u),(|u)$

  • ^(|u),(|u),(|u)$

i.e., three or four operands, each of which is either purely scalar (empty subscript) or carries the single shared mode u.

A typical use case is the channel-wise tensor product used in NequIP.

import cuequivariance as cue
import cuequivariance_torch as cuet

irreps_feat = 128 * cue.Irreps("O3", "0e + 1o + 2e")
irreps_sh = cue.Irreps("O3", "0e + 1o + 2e + 3o")
cuet.ChannelWiseTensorProduct(irreps_feat, irreps_sh, irreps_feat, layout=cue.ir_mul)

Low-level interface (non-public API): cuequivariance_ops_torch.TensorProductUniform1d

Symmetric Contractions#

This kernel is designed for the symmetric contraction of MACE. It supports subscripts u,u,u and u,u,u,u, and so on up to 8 operands. The first operand holds the weights, which can optionally be indexed by integers; the last operand is the output; the remaining operands are repeated copies of the input.

import cuequivariance as cue
import cuequivariance_torch as cuet

irreps_feat = 128 * cue.Irreps("O3", "0e + 1o + 2e")
# degree-3 contraction with a single set of indexed weights
cuet.SymmetricContraction(irreps_feat, irreps_feat, 3, 1, layout=cue.ir_mul)

Low-level interface (non-public API): cuequivariance_ops_torch.SymmetricTensorContraction