NVPL TENSOR: Developer Guide and Reference
Welcome to the NVPL TENSOR library documentation.
NVPL TENSOR (NVIDIA Performance Libraries TENSOR) is the component of the NVIDIA Performance Libraries that provides tensor primitives.
NVPL TENSOR runs on any 64-bit Arm-based processor that implements the Armv8.1 architecture extension, and it is specifically optimized for:
Arm Neoverse V2 based CPUs, such as NVIDIA Grace
Arm Neoverse V1 based CPUs, such as Amazon (AWS) Graviton3
Key Features
Support for up to 64-dimensional tensors.
Arbitrary data layouts.
Main computational routines:
Element-wise tensor operations:
Arbitrary tensor permutations.
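To make "arbitrary data layouts" and "arbitrary tensor permutations" concrete, the following is a minimal plain-C sketch of a strided, element-wise permutation. It does not use or reproduce the NVPL TENSOR API: the function name permute_strided and its parameters are hypothetical and exist only for this illustration. The sketch assumes single-precision data, strides counted in elements, and a rank of at most 64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT part of NVPL TENSOR.
 * Copies a rank-`rank` tensor element by element from `in` to `out`.
 *   extent[i]     : number of elements along mode i
 *   stride_in[i]  : element stride of mode i in the input buffer
 *   stride_out[i] : element stride of mode i in the output buffer
 * Choosing different strides for the same mode in `in` and `out`
 * reorders the modes in memory, which is a tensor permutation. */
void permute_strided(const float *in, float *out, int rank,
                     const int64_t *extent,
                     const int64_t *stride_in,
                     const int64_t *stride_out)
{
    assert(rank >= 0 && rank <= 64);   /* assumed upper bound on the rank */

    int64_t idx[64] = {0};             /* current multi-index */
    int64_t total = 1;                 /* total number of elements */
    for (int i = 0; i < rank; ++i)
        total *= extent[i];

    for (int64_t n = 0; n < total; ++n) {
        /* Linear offsets of the current multi-index in both layouts. */
        int64_t off_in = 0, off_out = 0;
        for (int i = 0; i < rank; ++i) {
            off_in  += idx[i] * stride_in[i];
            off_out += idx[i] * stride_out[i];
        }
        out[off_out] = in[off_in];

        /* Advance the multi-index, mode 0 varying fastest. */
        for (int i = 0; i < rank; ++i) {
            if (++idx[i] < extent[i])
                break;
            idx[i] = 0;
        }
    }
}
```

For example, for a tensor A with modes (a, b, c), extents (2, 3, 4), and packed column-major strides (1, 2, 6), passing output strides (4, 8, 1) for the same three modes writes the data as B with modes ordered (c, a, b). An optimized library would avoid recomputing the offsets for every element and would block the loops for cache locality; this sketch favors clarity over speed.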
The documentation consists of three main components:
A User Guide that introduces the basics of NVPL TENSOR, including details on notation and accuracy.
A Getting Started guide that steps through a simple tensor contraction example.
An API Reference that provides a comprehensive overview of all library routines, constants, and data types.