Deep Learning SDK Documentation
Last updated August 3, 2018
NVIDIA Deep Learning SDK
- Introduction
- Deep learning produces insight through two major operations: training and inference. While similar, the two differ significantly. Training feeds a network examples of objects to be detected or recognized, such as animals or traffic signs, allowing it to make predictions as to what these objects are. The training process reinforces correct predictions and corrects the wrong ones. Once trained, a production neural network can achieve upwards of 90-98% correct results. Inference is the deployment of a trained network to evaluate new objects and make predictions with similar predictive accuracy. Inference comes after training; therefore, you must obtain a trained neural network before you can perform inference.
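The training and inference phases described above can be sketched with a toy model. The example below is illustrative only and not part of any NVIDIA library: it trains a one-parameter linear model by gradient descent, then uses the trained weight for inference on a new input.

```python
# Minimal sketch of training vs. inference with a one-parameter linear
# model y = w * x, trained by gradient descent on squared error.
def train(examples, lr=0.01, epochs=200):
    w = 0.0  # untrained weight
    for _ in range(epochs):
        for x, y in examples:
            pred = w * x
            grad = 2 * (pred - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad             # reinforce/correct the prediction
    return w

def infer(w, x):
    # Inference: apply the trained weight to a new, unseen input.
    return w * x

# Training phase: examples labeled with the right answer (here y = 3x).
w = train([(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)])

# Inference phase: the trained model evaluates a new object.
print(round(infer(w, 4.0), 2))  # close to 12.0
```

Real networks have millions of weights and use frameworks for the gradient computation, but the two-phase structure is the same: fit the weights on labeled examples, then freeze them and evaluate new inputs.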
Training
- Training with Mixed Precision
- The Training with Mixed Precision User Guide introduces NVIDIA's latest architecture called Volta. This guide summarizes the ways that a framework can be fine-tuned to gain additional speedups by leveraging the Volta architectural features.
- cuDNN Release Notes
- This document describes the key features, software enhancements and improvements, and known issues for cuDNN v7.2.1.
- cuDNN SLA
- This document is the Software License Agreement (SLA) for NVIDIA cuDNN. The following contains specific license terms and conditions for NVIDIA cuDNN. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
- cuDNN Installation Guide
- This guide provides step-by-step instructions on how to install and check for correct operation of NVIDIA cuDNN v7.2.1 on Linux, Mac OS X, and Microsoft Windows systems.
- cuDNN Developer Guide
- This NVIDIA CUDA Deep Neural Network (cuDNN) Developer Guide provides an overview about cuDNN and details about the types, enums, and routines within the cuDNN library API.
- NCCL Release Notes
- This document describes the key features, software enhancements and improvements, and known issues for NCCL 2.2.13.
- NCCL SLA
- This document is the Software License Agreement (SLA) for NVIDIA NCCL. The following contains specific license terms and conditions for NVIDIA NCCL. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
- NCCL Installation Guide
- This NVIDIA Collective Communication Library (NCCL) Installation Guide provides step-by-step instructions for downloading and installing NCCL 2.2.13.
- NCCL Developer Guide
- This NVIDIA Collective Communication Library (NCCL) Developer Guide provides a detailed discussion of the NCCL programming model, including how to create communicators and work with collective operations.
- NCCL API
- This is the API documentation for the NVIDIA Collective Communications Library (NCCL). It provides information on individual functions, classes and methods.
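As a rough mental model of what NCCL's all-reduce collective computes (independent of the actual C API and its transport details), the pure-Python sketch below sums each element across ranks and delivers the result to every rank:

```python
# Reference model of the all-reduce collective that NCCL implements:
# every rank contributes a buffer, and every rank receives the
# element-wise reduction (here, a sum) of all contributions.
def all_reduce_sum(rank_buffers):
    reduced = [sum(vals) for vals in zip(*rank_buffers)]
    # In NCCL the reduced result is delivered to every rank.
    return [list(reduced) for _ in rank_buffers]

# Gradients held by 3 ranks (e.g. 3 GPUs) before synchronization.
buffers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(all_reduce_sum(buffers))  # [[9.0, 12.0], [9.0, 12.0], [9.0, 12.0]]
```

This is the operation data-parallel training relies on to keep gradients synchronized across GPUs; NCCL's job is to compute it with topology-aware communication rather than this naive gather.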
- Additional Resources
- The Additional Resources topic provides you with important related links that are outside of this product documentation.
Inference
- TensorRT Support Matrix
- This support matrix is for TensorRT. It provides a look into the supported features and software for the TensorRT APIs, parsers, and layers.
- TensorRT Release Notes
- NVIDIA TensorRT is a C++ library that facilitates high performance inference on NVIDIA GPUs. It is designed to work in conjunction with deep learning frameworks that are commonly used for training. TensorRT focuses specifically on running an already trained network quickly and efficiently on a GPU for the purpose of generating a result; also known as inferencing. These release notes describe the key features, software enhancements and improvements, and known issues for the TensorRT 4.0.1 product package.
- TensorRT Installation Guide
- This TensorRT Installation Guide provides the installation requirements, a list of what is included in the TensorRT package, and step-by-step instructions for installing TensorRT 4.0.1.
- TensorRT Developer Guide
- This TensorRT 4.0.1 Developer Guide demonstrates how to use the C++ and Python APIs for implementing the most common deep learning layers. It shows how you can take an existing model built with a deep learning framework and use the provided parsers to build a TensorRT engine from it. The Developer Guide also provides step-by-step instructions for common user tasks such as creating a TensorRT network definition, invoking the TensorRT builder, serializing and deserializing an engine, and feeding the engine with data to perform inference, all while using either the C++ or Python API. Lastly, a section on every sample included in the package is also provided.
- TensorRT API
- This is the API documentation for the NVIDIA TensorRT library. The TensorRT API allows developers to import pre-trained models, calibrate their networks using INT8, and build and deploy optimized networks. Networks can be imported directly from NVCaffe, or from other frameworks via the UFF or ONNX formats. They may also be created programmatically using the C++ or Python API by instantiating individual layers and setting parameters and weights directly.
- TensorRT SLA
- This document is the Software License Agreement (SLA) for NVIDIA TensorRT. This document contains specific license terms and conditions for NVIDIA TensorRT. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
- TensorRT Container Release Notes
- The TensorRT container is an easy-to-use container for TensorRT development. The container allows for the TensorRT samples to be built, modified, and executed. These release notes provide a list of key features, packaged software included in the container, software enhancements and improvements, and any known issues for the 18.08 and earlier releases. The TensorRT container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.
- Inference Server Container Release Notes
- The Inference Server itself is packaged within the Inference Server container. This document walks you through getting up and running with the Inference Server container, from the prerequisites to running the container. Additionally, the release notes provide a list of key features, packaged software included in the container, software enhancements and improvements, and any known issues for the 18.08 and earlier releases. The Inference Server container is released monthly to provide you with up-to-date software that is tested, tuned, and optimized.
- Inference Server User Guide
- This User Guide focuses on documenting the Inference Server and its benefits. The Inference Server is included within the Inference Server container. This guide provides step-by-step instructions for pulling and running the Inference Server container, along with the details of the model store and the Inference API.
- Additional Resources
- The Additional Resources topic provides you with important related links that are outside of this product documentation.
Data Loading
- DALI Release Notes
- This document describes the key features, software enhancements and improvements, and known issues for DALI 0.2 beta and earlier releases.
- DALI Quick Start Guide
- This NVIDIA Data Loading Library (DALI) 0.2 Quick Start Guide provides the installation requirements and step-by-step instructions for installing DALI as a beta release. The guide demonstrates how to get compatible MXNet, TensorFlow, and PyTorch frameworks, and install DALI from a binary or GitHub installation. This guide also provides a sample for running a DALI accelerated pre-configured ResNet-50 model on MXNet, TensorFlow, or PyTorch for image classification training.
- DALI Developer Guide
- This NVIDIA Data Loading Library (DALI) 0.2 Developer Guide demonstrates how to define, build, and run a DALI pipeline as a single library that can be integrated into different deep learning training and inference applications. By exposing optimized building blocks that are executed by an efficient engine, and by enabling operations to be offloaded onto the GPU, DALI provides both the performance and the flexibility to accelerate different data pipelines. DALI is available as a beta release.
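As a conceptual illustration of the pipeline idea, and not DALI's actual API, the sketch below chains read, decode, and batch stages as Python generators; the stage names and their stand-in behaviors are invented for illustration:

```python
# Conceptual sketch of a data-loading pipeline built from composable
# stages, in the spirit of DALI's building blocks: read -> decode ->
# batch. These are illustrative stand-ins, not DALI operators.
def read(paths):
    for p in paths:
        yield p  # stand-in for reading raw file bytes

def decode(stream):
    for item in stream:
        yield item.upper()  # stand-in for image decoding/augmentation

def batch(stream, size):
    buf = []
    for item in stream:
        buf.append(item)
        if len(buf) == size:
            yield buf
            buf = []

pipeline = batch(decode(read(["a.jpg", "b.jpg", "c.jpg", "d.jpg"])), 2)
print(list(pipeline))  # [['A.JPG', 'B.JPG'], ['C.JPG', 'D.JPG']]
```

Composing stages this way keeps each step independently replaceable; DALI's contribution is running such stages through an optimized engine and offloading them to the GPU.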
- DALI SLA
- This document is the Software License Agreement (SLA) for NVIDIA Data Loading Library (DALI). This document contains specific license terms and conditions for NVIDIA DALI. By accepting this agreement, you agree to comply with all the terms and conditions applicable to the specific product(s) included herein.
Archives
- cuDNN Archives
- This Archives document provides access to previously released cuDNN documentation versions.
- NCCL Archives
- This Archives document provides access to previously released NCCL documentation versions.
- TensorRT Archives
- This Archives document provides access to previously released TensorRT documentation versions.
- DALI Archives
- This Archives document provides access to previously released DALI documentation versions.