Deep Learning SDK Documentation - Last updated November 26, 2018

NVIDIA Deep Learning SDK


Introduction
The two major operations through which deep learning produces insight are training and inference. While similar, they differ significantly. Training feeds a neural network examples of objects to be detected or recognized, such as animals and traffic signs, allowing it to make predictions about what those objects are. The training process reinforces correct predictions and corrects wrong ones. Once trained, a production neural network can achieve upwards of 90-98% correct results. Inference is the deployment of a trained network to evaluate new objects and make predictions with similar accuracy. Because inference comes after training, you must obtain a trained neural network before you can perform inference.

Training Library


Training with Mixed Precision
The Training with Mixed Precision User Guide introduces NVIDIA's Volta architecture and summarizes the ways a framework can be fine-tuned to gain additional speedups by leveraging Volta architectural features.
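One technique central to mixed-precision training is loss scaling: small gradient values that would underflow in half precision are multiplied up before being stored in FP16, then divided back down in FP32. The sketch below is a minimal pure-Python analogue of that idea (the constant and function names are illustrative, not from the guide):

```python
import numpy as np

LOSS_SCALE = 1024.0  # illustrative constant; frameworks often adjust this dynamically

def cast_fp16(x):
    # Simulate storing a gradient in half precision.
    return np.float16(x)

# A tiny FP32 gradient that underflows to zero when stored directly in FP16.
grad = np.float32(1e-8)

naive = cast_fp16(grad)                                          # flushes to 0.0
scaled = np.float32(cast_fp16(grad * LOSS_SCALE)) / LOSS_SCALE   # magnitude survives
```

Scaling before the FP16 cast moves the value into the representable range; unscaling afterward in FP32 recovers an accurate gradient.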
cuDNN Support Matrix
This document describes, for each cuDNN version, the supported versions of the OS, CUDA, the CUDA driver, and the NVIDIA hardware.
cuDNN Release Notes
This document describes the key features, software enhancements and improvements, and known issues for cuDNN v7.4.1.
cuDNN SLA
This document is the Software License Agreement (SLA) for NVIDIA cuDNN. The following contains specific license terms and conditions for NVIDIA cuDNN. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
cuDNN Installation Guide
This guide provides step-by-step instructions on how to install and check for correct operation of NVIDIA cuDNN v7.4.1 on Linux, Mac OS X, and Microsoft Windows systems.
cuDNN Developer Guide
This NVIDIA CUDA Deep Neural Network (cuDNN) Developer Guide provides an overview of cuDNN 7.4.1, and details about the types, enums, and routines within the cuDNN library API.
NCCL Release Notes
This document describes the key features, software enhancements and improvements, and known issues for NCCL 2.3.7.
NCCL Installation Guide
This NVIDIA Collective Communication Library (NCCL) Installation Guide provides step-by-step instructions for downloading and installing NCCL 2.3.7.
NCCL Developer Guide
This NVIDIA Collective Communication Library (NCCL) 2.3.7 Developer Guide provides a detailed discussion of the NCCL programming model, creating collective communications, and working with operations. It also includes the API reference, with information on individual functions, classes, and methods.
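NCCL's core primitive is the collective. An all-reduce, for example, sums a buffer across all participating GPUs and leaves the result on every one of them. The pure-Python sketch below simulates that semantic with lists standing in for per-device buffers; it is a conceptual analogue of what `ncclAllReduce` with a sum operation produces, not NCCL's API:

```python
def all_reduce_sum(device_buffers):
    """Conceptual all-reduce: every 'device' ends up with the element-wise
    sum of all devices' buffers (GPU and transport details omitted)."""
    total = [sum(vals) for vals in zip(*device_buffers)]
    return [list(total) for _ in device_buffers]

# Three hypothetical devices, each holding a local gradient shard.
buffers = [[1, 2], [10, 20], [100, 200]]
result = all_reduce_sum(buffers)  # every device sees [111, 222]
```

In real NCCL the same result is achieved in place on GPU memory, with the library choosing an efficient communication pattern (such as a ring) across the devices.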
NCCL SLA
This document is the Software License Agreement (SLA) for NVIDIA NCCL. The following contains specific license terms and conditions for NVIDIA NCCL. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
NCCL BSD License
This document is the Berkeley Software Distribution (BSD) license for NVIDIA NCCL. The following contains specific license terms and conditions for NVIDIA NCCL open sourced. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
Additional Resources
The Additional Resources topic provides you with important related links that are outside of this product documentation.

Inference Library


TensorRT Support Matrix
This support matrix for TensorRT provides a look into the supported features and software for TensorRT APIs, parsers, and layers.
TensorRT Release Notes
NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA GPUs. It is designed to work in conjunction with deep learning frameworks that are commonly used for training. TensorRT focuses specifically on running an already-trained network quickly and efficiently on a GPU to generate a result, a process also known as inferencing. These release notes describe the key features, software enhancements and improvements, and known issues for the TensorRT 5.0.4 product package.
TensorRT Installation Guide
This TensorRT Installation Guide provides the installation requirements, a list of what is included in the TensorRT package, and step-by-step instructions for installing TensorRT 5.0.4.
TensorRT Developer Guide
This TensorRT 5.0.4 Developer Guide demonstrates how to use the C++ and Python APIs for implementing the most common deep learning layers. It shows how you can take an existing model built with a deep learning framework and use it to build a TensorRT engine using the provided parsers. The Developer Guide also provides step-by-step instructions for common user tasks such as creating a TensorRT network definition, invoking the TensorRT builder, serializing and deserializing, and feeding the engine with data to perform inference, all while using either the C++ or Python API. Lastly, a section on every sample included in the package is also provided.
Best Practices For TensorRT Performance
This Best Practices guide covers various performance considerations related to deploying networks using TensorRT 5.0.4. These sections assume that you have a model that is working at an appropriate level of accuracy and that you are able to successfully use TensorRT to do inference for your model.
TensorRT API
This is the API documentation for the NVIDIA TensorRT library. The TensorRT API allows developers to import pre-trained models, calibrate their networks using INT8, and build and deploy optimized networks. Networks can be imported directly from NVCaffe, or from other frameworks via the UFF or ONNX formats. They may also be created programmatically using the C++ or Python API by instantiating individual layers and setting parameters and weights directly.
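The INT8 calibration mentioned above rests on a simple idea: map FP32 values into the int8 range using a per-tensor scale derived from calibration data. The sketch below shows symmetric max-abs quantization in pure Python as an illustration of that mapping; it is not TensorRT's calibrator, which chooses its scale with its own entropy-based method:

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: the scale is chosen so the
    largest magnitude maps to 127 (a max-abs rule; illustrative only)."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

vals = [0.5, -1.0, 0.25]
q, scale = quantize_int8(vals)
approx = dequantize(q, scale)  # close to the original FP32 values
```

The quantized tensor occupies a quarter of the memory of FP32, and dequantizing with the stored scale recovers the values to within the int8 step size.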
TensorRT SLA
This document is the Software License Agreement (SLA) for NVIDIA TensorRT. This document contains specific license terms and conditions for NVIDIA TensorRT. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
TensorRT Container Release Notes
The TensorRT container is an easy-to-use container for TensorRT development. It allows the TensorRT samples to be built, modified, and executed. These release notes provide a list of key features, packaged software included in the container, software enhancements and improvements, and any known issues for the 18.11 and earlier releases. The TensorRT container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.
Additional Resources
The Additional Resources topic provides you with important related links that are outside of this product documentation.

Inference Server


TensorRT Inference Server Container Release Notes
The actual Inference Server is packaged within the TensorRT Inference Server container. This document walks you through the process of getting up and running with the Inference Server container; from the prerequisites to running the container. Additionally, the release notes provide a list of key features, packaged software included in the container, software enhancements and improvements, and any known issues for the 18.11 and earlier releases. The Inference Server container is released on a monthly basis to provide you with up-to-date software that is tested, tuned, and optimized.
TensorRT Inference Server Guide
This Guide covers topics such as getting started, installing, and how to use the TensorRT inference server. In addition, this guide encompasses a Developer Guide, configurations, and an API Reference.
TensorRT Inference Server SLA
This document is the Software License Agreement (SLA) for NVIDIA TensorRT Inference Server. The following contains specific license terms and conditions for NVIDIA TensorRT Inference Server. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
TensorRT Inference Server BSD License
This document is the Berkeley Software Distribution (BSD) license for NVIDIA TensorRT Inference Server. The following contains specific license terms and conditions for NVIDIA TensorRT Inference Server open sourced. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.

Data Loading


DALI Release Notes
This document describes the key features, software enhancements and improvements, and known issues for DALI 0.5.0 beta and earlier releases.
DALI Quick Start Guide
This NVIDIA Data Loading Library (DALI) 0.5.0 Quick Start Guide provides the installation requirements and step-by-step instructions for installing DALI as a beta release. The guide demonstrates how to get compatible MXNet, TensorFlow, and PyTorch frameworks, and how to install DALI from a binary or from GitHub. This guide also provides a sample for running a DALI-accelerated, pre-configured ResNet-50 model on MXNet, TensorFlow, or PyTorch for image classification training.
DALI Developer Guide
This NVIDIA Data Loading Library (DALI) 0.5.0 Developer Guide demonstrates how to define, build, and run a DALI pipeline, as a single library, that can be integrated into different deep learning training and inference applications. By exposing optimized building blocks that are executed using an efficient engine, and by enabling offloading of operations onto the GPU, DALI provides both performance and flexibility for accelerating different data pipelines. DALI is available as a beta release.
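Conceptually, such a pipeline chains data-loading stages (read, augment, batch) into a graph that an engine executes efficiently. The pure-Python generator chain below sketches only that staged-pipeline shape; the stage names are hypothetical and this is not the DALI API:

```python
from itertools import islice
import random

def read(samples):
    # Source stage: yields raw samples one at a time.
    for s in samples:
        yield s

def augment(stream, rng):
    # Stand-in for an augmentation stage (here, a random sign flip).
    for s in stream:
        yield -s if rng.random() < 0.5 else s

def batch(stream, size):
    # Grouping stage: collects samples into fixed-size batches.
    it = iter(stream)
    while chunk := list(islice(it, size)):
        yield chunk

# Build the pipeline by composing stages, then pull batches from it.
rng = random.Random(0)
pipeline = batch(augment(read(range(8)), rng), size=4)
batches = list(pipeline)  # two batches of four samples each
```

In DALI the analogous stages are optimized operators, and the engine can place them on the GPU; the composition-then-execution structure is the part this sketch illustrates.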
DALI SLA
This document is the Software License Agreement (SLA) for NVIDIA Data Loading Library (DALI). This document contains specific license terms and conditions for NVIDIA DALI. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.

Archives


cuDNN Archives
This Archives document provides access to previously released cuDNN documentation versions.
NCCL Archives
This Archives document provides access to previously released NCCL documentation versions.
TensorRT Archives
This Archives document provides access to previously released TensorRT documentation versions.
TensorRT Inference Server Archives
This Archives document provides access to previously released TensorRT Inference Server documentation versions.
DALI Archives
This Archives document provides access to previously released DALI documentation versions.