NVIDIA cuDNN
The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. It provides highly tuned implementations of operations arising frequently in deep neural network (DNN) applications:
Scaled dot-product attention
Convolution, including cross-correlation
Matrix multiplication
Normalizations, softmax, and pooling
Arithmetic, mathematical, relational, and logical pointwise operations
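The "convolution, including cross-correlation" entry can be pictured with a small reference computation. The sketch below is plain Python, not cuDNN code: the function names `cross_correlate_1d` and `convolve_1d` are hypothetical, and it assumes a one-dimensional, "valid" (no padding, stride 1) case purely to show that the two operations differ only in whether the kernel is reversed.

```python
def cross_correlate_1d(signal, kernel):
    """Valid-mode cross-correlation: slide the kernel over the signal as-is."""
    n, k = len(signal), len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def convolve_1d(signal, kernel):
    """Valid-mode convolution: cross-correlation with the kernel reversed."""
    return cross_correlate_1d(signal, kernel[::-1])


signal = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 0.0, -1.0]

xcorr = cross_correlate_1d(signal, kernel)  # [-2.0, -2.0]
conv = convolve_1d(signal, kernel)          # [2.0, 2.0]
print(xcorr, conv)
```

With an asymmetric kernel such as this one, the two results differ in sign; with a symmetric kernel they would coincide.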
Beyond high-performance implementations of individual operations, cuDNN supports a flexible set of multi-operation fusion patterns for further optimization. The goal is to achieve the best available performance on NVIDIA GPUs for important deep learning use cases.
In cuDNN, both single-operation and multi-operation computations are expressed as operation graphs. The following API layers are available for constructing these graphs:
Python frontend API
C++ frontend API
C backend API
The NVIDIA cuDNN frontend API provides a simplified programming model that is sufficient for most use cases.
Use the NVIDIA cuDNN backend API only if you need the legacy fixed-function routines, which are not graph-based and are not exposed by the frontend API layers, or if you need a C-only interface.
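To make the operation-graph idea concrete, here is a toy model in plain Python. It is a conceptual sketch only and does not use or mirror the cuDNN API: the `Node` and `OperationGraph` classes, the helper functions, and the tensor names are all hypothetical. It assumes operations are added in dependency order, and shows that a single operation and a chain of several operations can be described by the same kind of graph object.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One operation in the toy graph (hypothetical, not a cuDNN type)."""
    name: str
    op: Callable
    inputs: List[str] = field(default_factory=list)  # names of inputs or earlier nodes


class OperationGraph:
    """Toy graph: an ordered list of nodes evaluated one after another."""

    def __init__(self):
        self.nodes = []

    def add(self, name, op, *inputs):
        self.nodes.append(Node(name, op, list(inputs)))
        return name

    def execute(self, feeds):
        values = dict(feeds)
        for node in self.nodes:  # assumes nodes were added in dependency order
            values[node.name] = node.op(*(values[i] for i in node.inputs))
        return values


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_bias(m, bias):
    return [[v + b for v, b in zip(row, bias)] for row in m]


def relu(m):
    return [[max(v, 0.0) for v in row] for row in m]


# Single-operation graph: just a matrix multiplication.
single = OperationGraph()
single.add("y", matmul, "x", "w")

# Multi-operation graph: matmul -> bias add -> ReLU.
multi = OperationGraph()
multi.add("xw", matmul, "x", "w")
multi.add("xw_b", add_bias, "xw", "b")
multi.add("y", relu, "xw_b")

feeds = {"x": [[1.0, -2.0]], "w": [[1.0, 0.0], [0.0, 1.0]], "b": [0.5, 0.5]}
single_out = single.execute(feeds)["y"]
multi_out = multi.execute(feeds)["y"]
print(single_out, multi_out)
```

This toy evaluates each node separately and stores every intermediate result. A library that receives the whole graph up front is in a position to fuse such a chain into fewer kernels, which is the motivation for describing computations as graphs rather than as individual calls.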
