Deep Learning SDK Documentation - Last updated September 19, 2018

NVIDIA Deep Learning SDK


Introduction
The two major operations from which deep learning produces insight are training and inference. While similar, there are significant differences. Training feeds the network examples of objects to be detected or recognized, such as animals and traffic signs, allowing it to make predictions as to what these objects are. The training process reinforces correct predictions and corrects the wrong ones. Once trained, a production neural network can achieve upwards of 90-98% correct results. Inference is the deployment of a trained network to evaluate new objects and make predictions with similar predictive accuracy. Inference comes after training; therefore, you must obtain a trained neural network before you can perform inference.
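For orientation, here is a minimal sketch of the two phases, assuming PyTorch as the framework; the model, data, and loop below are illustrative stand-ins, not part of the SDK.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Training: feed labeled examples, reinforce correct predictions,
# and correct wrong ones via backpropagation.
for step in range(100):
    inputs = torch.randn(16, 32)            # stand-in for real training data
    labels = torch.randint(0, 10, (16,))    # stand-in for real labels
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()

# Inference: deploy the trained network to evaluate new, unseen inputs.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 32)).argmax(dim=1)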

Training


Training with Mixed Precision
The Training with Mixed Precision User Guide introduces NVIDIA's latest architecture, Volta. This guide summarizes the ways a framework can be fine-tuned to gain additional speedups by leveraging Volta's architectural features.
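As background for the guide, the following sketch shows the core mixed-precision recipe it describes: FP16 compute with FP32 master weights and loss scaling. It is written in PyTorch with an illustrative model and a static loss scale chosen for the example; consult the guide itself for framework-specific, tuned recipes. A CUDA-capable GPU is assumed (Tensor Core speedups require a Volta-class GPU).

import torch
import torch.nn as nn

model = nn.Linear(32, 10).cuda().half()   # FP16 compute
# FP32 master copy of the weights, updated by the optimizer.
master = [p.detach().clone().float() for p in model.parameters()]
optimizer = torch.optim.SGD(master, lr=0.01)

# Static loss scale; keeps small FP16 gradients from flushing to zero.
loss_scale = 128.0

for step in range(100):
    inputs = torch.randn(16, 32, device="cuda").half()   # stand-in data
    labels = torch.randint(0, 10, (16,), device="cuda")
    loss = nn.functional.cross_entropy(model(inputs).float(), labels)
    model.zero_grad()
    (loss * loss_scale).backward()
    # Copy the scaled FP16 gradients into the FP32 master weights, unscaling.
    for m, p in zip(master, model.parameters()):
        m.grad = p.grad.float() / loss_scale
    optimizer.step()
    # Copy the updated FP32 master weights back into the FP16 model.
    with torch.no_grad():
        for m, p in zip(master, model.parameters()):
            p.copy_(m.half())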
cuDNN Release Notes
This document describes the key features, software enhancements and improvements, and known issues for cuDNN v7.3.0.
cuDNN SLA
This document is the Software License Agreement (SLA) for NVIDIA cuDNN. The following contains specific license terms and conditions for NVIDIA cuDNN. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
cuDNN Installation Guide
This guide provides step-by-step instructions on how to install and check for correct operation of NVIDIA cuDNN v7.3.0 on Linux, Mac OS X, and Microsoft Windows systems.
cuDNN Developer Guide
This NVIDIA CUDA Deep Neural Network (cuDNN) Developer Guide provides an overview of cuDNN 7.3.0, and details about the types, enums, and routines within the cuDNN library API.
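cuDNN exposes a C API that deep learning frameworks call on your behalf. As a small illustration, the PyTorch flags below show how an application typically interacts with cuDNN without touching the C API directly.

import torch

print(torch.backends.cudnn.version())   # e.g. 7300 for cuDNN 7.3.0
torch.backends.cudnn.enabled = True     # route convolutions through cuDNN
torch.backends.cudnn.benchmark = True   # autotune: benchmark cuDNN algorithms
                                        # per convolution shape, cache the fastest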
NCCL Release Notes
This document describes the key features, software enhancements and improvements, and known issues for NCCL 2.3.4.
NCCL SLA
This document is the Software License Agreement (SLA) for NVIDIA NCCL. The following contains specific license terms and conditions for NVIDIA NCCL. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
NCCL Installation Guide
This NVIDIA Collective Communication Library (NCCL) Installation Guide provides step-by-step instructions for downloading and installing NCCL 2.3.4.
NCCL Developer Guide
This NVIDIA Collective Communication Library (NCCL) Developer Guide provides a detailed discussion of the NCCL programming model, creating collective communications and working with operations.
NCCL API
This is the API documentation for the NVIDIA Collective Communications Library (NCCL). It provides information on individual functions, classes and methods.
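NCCL, too, is usually driven through a framework. The sketch below illustrates a collective all-reduce using PyTorch's NCCL backend; it assumes one process per GPU on a single node and the standard MASTER_ADDR, MASTER_PORT, RANK, and WORLD_SIZE environment variables, all of which are assumptions of this example rather than NCCL requirements.

import torch
import torch.distributed as dist

# Each participating process joins the NCCL process group.
dist.init_process_group(backend="nccl", init_method="env://")
rank = dist.get_rank()
torch.cuda.set_device(rank)   # one process per GPU

# Sum a tensor across all GPUs with an NCCL all-reduce.
tensor = torch.ones(4, device="cuda") * rank
dist.all_reduce(tensor, op=dist.ReduceOp.SUM)
print("rank %d:" % rank, tensor)

dist.destroy_process_group()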
Additional Resources
The Additional Resources topic provides you with important related links that are outside of this product documentation.

Inference


TensorRT Support Matrix
This support matrix is for TensorRT. It provides a look into the supported features and software for the TensorRT APIs, parsers, and layers.
TensorRT Release Notes
NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA GPUs. It is designed to work in conjunction with the deep learning frameworks commonly used for training. TensorRT focuses specifically on running an already-trained network quickly and efficiently on a GPU to generate a result, a process also known as inferencing. These release notes describe the key features, software enhancements and improvements, and known issues for the TensorRT 5.0 Release Candidate (RC) product package.
TensorRT Installation Guide
This TensorRT Installation Guide provides the installation requirements, a list of what is included in the TensorRT package, and step-by-step instructions for installing TensorRT 5.0 Release Candidate (RC).
TensorRT Developer Guide
This TensorRT 5.0 Release Candidate (RC) Developer Guide demonstrates how to use the C++ and Python APIs to implement the most common deep learning layers. It shows how you can take an existing model built with a deep learning framework and use the provided parsers to build a TensorRT engine from it. The Developer Guide also provides step-by-step instructions for common user tasks such as creating a TensorRT network definition, invoking the TensorRT builder, serializing and deserializing, and feeding the engine with data to perform inference, all using either the C++ or Python API. Lastly, it describes every sample included in the package.
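To make that build flow concrete, here is a minimal sketch using the TensorRT Python API with the ONNX parser; model.onnx is a placeholder for a trained, exported network, and the batch and workspace sizes are illustrative.

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network()
parser = trt.OnnxParser(network, logger)

# Import the trained model into the TensorRT network definition.
with open("model.onnx", "rb") as f:
    parser.parse(f.read())

builder.max_batch_size = 8             # largest batch the engine must serve
builder.max_workspace_size = 1 << 30   # 1 GiB of scratch space for tactics

# Optimize for the target GPU, then serialize the engine for deployment;
# deserialize it later with trt.Runtime for inference.
engine = builder.build_cuda_engine(network)
with open("model.engine", "wb") as f:
    f.write(engine.serialize())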
Best Practices For TensorRT Performance
This Best Practices guide covers various performance considerations related to deploying networks using TensorRT 5.0 Release Candidate (RC). These sections assume that you have a model that is working at an appropriate level of accuracy and that you are able to successfully use TensorRT to do inference for your model.
TensorRT API
This is the API documentation for the NVIDIA TensorRT library. The TensorRT API allows developers to import pre-trained models, calibrate their networks using INT8, and build and deploy optimized networks. Networks can be imported directly from NVCaffe, or from other frameworks via the UFF or ONNX formats. They may also be created programmatically using the C++ or Python API by instantiating individual layers and setting parameters and weights directly.
TensorRT SLA
This document is the Software License Agreement (SLA) for NVIDIA TensorRT. This document contains specific license terms and conditions for NVIDIA TensorRT. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
TensorRT Container Release Notes
The TensorRT container is an easy-to-use container for TensorRT development. The container allows for the TensorRT samples to be built, modified, and executed. These release notes provide a list of key features, packaged software included in the container, software enhancements and improvements, and any known issues for the 18.09 and earlier releases. The TensorRT container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all tested, tuned, and optimized.
Inference Server Container Release Notes
The Inference Server itself is packaged within the Inference Server container. This document walks you through the process of getting up and running with the Inference Server container, from the prerequisites to running the container. Additionally, the release notes provide a list of key features, packaged software included in the container, software enhancements and improvements, and any known issues for the 18.09 and earlier releases. The Inference Server container is released on a monthly basis to provide you with up-to-date software that is tested, tuned, and optimized.
Inference Server User Guide
This User Guide focuses on documenting the Inference Server and its benefits. The Inference Server is included within the Inference Server container. This guide provides step-by-step instructions for pulling and running the Inference Server container, along with the details of the model store and the Inference API.
Additional Resources
The Additional Resources topic provides you with important related links that are outside of this product documentation.

Data Loading


DALI Release Notes
This document describes the key features, software enhancements and improvements, and known issues for DALI 0.2 beta and earlier releases.
DALI Quick Start Guide
This NVIDIA Data Loading Library (DALI) 0.2 Quick Start Guide provides the installation requirements and step-by-step instructions for installing DALI as a beta release. The guide demonstrates how to obtain compatible MXNet, TensorFlow, and PyTorch frameworks, and how to install DALI from a binary release or from source on GitHub. This guide also provides a sample for running a DALI-accelerated, pre-configured ResNet-50 model on MXNet, TensorFlow, or PyTorch for image classification training.
DALI Developer Guide
This NVIDIA Data Loading Library (DALI) 0.2 Developer Guide demonstrates how to define, build, and run a DALI pipeline as a single library that can be integrated into different deep learning training and inference applications. By exposing optimized building blocks that are executed using an efficient engine, and by enabling operations to be offloaded to the GPU, DALI provides both the performance and the flexibility to accelerate different data pipelines. DALI is available as a beta release.
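For a concrete picture of what such a pipeline looks like, here is a minimal sketch in the 0.2-era graph API; images/ is a placeholder directory of labeled JPEG files, and op names may differ in later DALI releases.

from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types

class SimplePipeline(Pipeline):
    def __init__(self, batch_size, num_threads, device_id):
        super(SimplePipeline, self).__init__(batch_size, num_threads, device_id)
        # Reader: walks images/ and yields (jpeg bytes, label) pairs.
        self.input = ops.FileReader(file_root="images/")
        # "mixed" decodes jointly on CPU and GPU via nvJPEG.
        self.decode = ops.nvJPEGDecoder(device="mixed", output_type=types.RGB)

    def define_graph(self):
        jpegs, labels = self.input()
        images = self.decode(jpegs)
        return images, labels

pipe = SimplePipeline(batch_size=32, num_threads=2, device_id=0)
pipe.build()
images, labels = pipe.run()   # one batch, ready to feed a framework iterator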
DALI SLA
This document is the Software License Agreement (SLA) for NVIDIA Data Loading Library (DALI). This document contains specific license terms and conditions for NVIDIA DALI. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.

Archives


cuDNN Archives
This Archives document provides access to previously released cuDNN documentation versions.
NCCL Archives
This Archives document provides access to previously released NCCL documentation versions.
TensorRT Archives
This Archives document provides access to previously released TensorRT documentation versions.
DALI Archives
This Archives document provides access to previously released DALI documentation versions.