Deep Learning Frameworks Documentation - Last updated September 25, 2019

Deep Learning Frameworks


Preparing To Use NVIDIA Containers
This guide provides the first-step instructions for preparing to use NVIDIA containers on your DGX system. You must set up your DGX system before you can access the NVIDIA GPU Cloud (NGC) container registry to pull a container.
Containers And Frameworks User Guide
This guide provides a detailed overview of containers and step-by-step instructions for pulling and running a container, as well as customizing and extending containers.
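The pull-and-run workflow described above can be sketched as shell commands. This is an illustrative sketch only: the framework name and tag below are examples of the registry naming scheme, not a pointer to a specific supported release, so check the guide for current tags.

```shell
# Build the image reference used by the NGC registry.
# Illustrative values only -- substitute the framework and tag you need.
REGISTRY="nvcr.io/nvidia"
FRAMEWORK="tensorflow"
TAG="19.09-py3"                     # example <yy.mm>-py<x> release tag
IMAGE="${REGISTRY}/${FRAMEWORK}:${TAG}"
echo "${IMAGE}"

# With Docker and NGC access configured, the steps look like:
#   docker login nvcr.io             # authenticate with your NGC API key
#   docker pull "${IMAGE}"
#   docker run --gpus all -it --rm "${IMAGE}"
```

The commented commands require a configured Docker installation and NGC credentials, which is why they are not executed here.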

Support Matrix


Frameworks Support Matrix
This support matrix is for NVIDIA optimized frameworks. The matrix provides a single view into the supported software and specific versions that come packaged with the frameworks based on the container image.

Optimized Frameworks Release Notes


Kaldi Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 19.09 and earlier releases. The Kaldi speech recognition framework turns spoken audio into text based on an acoustic model and a language model. The Kaldi container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been or will be sent upstream, all of which are tested, tuned, and optimized.
MXNet Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 19.09 and earlier releases. The MXNet framework delivers high performance for convolutional neural networks, supports multi-GPU training, and provides automatic differentiation and optimized predefined layers. It’s a useful framework for those who need their model inference to “run anywhere”; for example, a data scientist can train a model on a DGX-1 with Volta by writing the model in Python, while a data engineer can deploy the trained model using a Scala API tied to the company’s existing infrastructure. The MXNet container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.
NVCaffe Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 19.09 and earlier releases. The NVCaffe framework can be used for image recognition, specifically for creating, training, analyzing, and deploying deep neural networks. NVCaffe is based on the Caffe Deep Learning Framework by BVLC. The NVCaffe container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.
PyTorch Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 19.09 and earlier releases. The PyTorch framework enables you to develop deep learning models with flexibility. With PyTorch, you can make full use of Python packages such as SciPy and NumPy. The PyTorch framework is known for being convenient and flexible, with examples covering reinforcement learning, image classification, and machine translation among the more common use cases. The PyTorch container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.
TensorFlow Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 19.09 and earlier releases. The TensorFlow framework can be used for education, research, and production; typical uses include speech, voice, and sound recognition, information retrieval, and image recognition and classification. The TensorFlow framework can also be used for text-based applications, such as fraud and threat detection, analyzing time series data to extract statistics, and video detection, such as motion and real-time threat detection in gaming and security. The TensorFlow container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.

Optimized Frameworks User Guides


NVCaffe User Guide
Caffe is a deep learning framework made with flexibility, speed, and modularity in mind. NVCaffe is an NVIDIA-maintained fork of BVLC Caffe tuned for NVIDIA GPUs, particularly in multi-GPU configurations. This guide provides a detailed overview of the NVCaffe deep learning framework and describes how to use and customize it. The guide also documents the NVCaffe parameters that you can use to apply the container's optimizations in your environment.
TensorFlow User Guide
TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. The TensorFlow User Guide provides a detailed overview of using and customizing the TensorFlow deep learning framework. The guide also documents the NVIDIA TensorFlow parameters that you can use to apply the container's optimizations in your environment.
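The data-flow-graph model described above can be sketched in plain Python. This is a toy illustration of the concept (nodes as operations, edges as values flowing between them), not the TensorFlow API:

```python
# Toy data flow graph: nodes hold operations, edges carry values between them.
# Illustrative only -- TensorFlow's graph runtime is far more general.

class Node:
    def __init__(self, op, *inputs):
        self.op = op          # callable performing the math at this node
        self.inputs = inputs  # upstream nodes (the incoming graph edges)

    def run(self):
        # Evaluate upstream nodes first, then apply this node's operation.
        return self.op(*(n.run() for n in self.inputs))

def const(v):
    return Node(lambda: v)

def add(a, b):
    return Node(lambda x, y: x + y, a, b)

def mul(a, b):
    return Node(lambda x, y: x * y, a, b)

# (2 + 3) * 4 expressed as a graph, then evaluated by walking the edges.
graph = mul(add(const(2.0), const(3.0)), const(4.0))
print(graph.run())  # 20.0
```

Because the computation is described as a graph rather than executed eagerly, a runtime is free to place each node on a CPU or GPU, which is the property the paragraph above refers to.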

Installing Frameworks For Jetson


TensorFlow For Jetson Platform
This guide provides the instructions for installing TensorFlow on the Jetson Platform. The Jetson Platform includes modules such as Jetson Nano, Jetson AGX Xavier, and Jetson TX2. This guide describes the prerequisites for installing TensorFlow on the Jetson Platform, the detailed steps for installation and verification, and best practices for optimizing performance on the Jetson Platform.
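The pip-based install flow the guide describes can be sketched as follows. The JetPack version segment of the index URL (`v42` here) is a placeholder assumption, not a verified value; consult the guide for the path matching your JetPack release.

```shell
# Sketch of the TensorFlow-for-Jetson install command. The JetPack version
# segment (v42) is a placeholder -- check the install guide for the correct
# path for your JetPack release before running anything.
JP_VERSION="v42"
INDEX_URL="https://developer.download.nvidia.com/compute/redist/jp/${JP_VERSION}"
echo "pip3 install --extra-index-url ${INDEX_URL} tensorflow-gpu"

# On the Jetson device itself you would run the echoed command (typically
# with sudo), after installing the system prerequisites listed in the guide.
```

The command is echoed rather than executed because it only makes sense on a Jetson device with the prerequisites installed.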
Release Notes For Jetson Platform
This document contains the release notes for installing TensorFlow on the Jetson Platform. The Jetson Platform includes modules such as Jetson Nano, Jetson AGX Xavier, and Jetson TX2. These release notes describe the key features, software enhancements, and known issues when installing TensorFlow on the Jetson Platform.

Accelerating Inference In Frameworks With TensorRT


Accelerating Inference In TF-TRT User Guide
During TensorFlow with TensorRT (TF-TRT) optimization, TensorRT performs several important transformations and optimizations on the neural network graph. First, layers with unused output are eliminated to avoid unnecessary computation. Next, where possible, convolution, bias, and ReLU layers are fused into a single layer. Another transformation is horizontal layer fusion, or layer aggregation, along with the required division of the aggregated layers to their respective outputs. Horizontal layer fusion improves performance by combining layers that take the same source tensor and apply the same operations with similar parameters. This guide provides instructions on how to accelerate inference with TF-TRT.
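The vertical conv/bias/ReLU fusion described above can be illustrated with a toy pass over a linear chain of layer names. This is a conceptual sketch only; TensorRT performs fusion on a real graph representation, not on name lists:

```python
# Toy sketch of vertical layer fusion: each conv/bias/relu run in a linear
# chain of layer names is rewritten as one fused layer, so the three
# operations would execute as a single kernel. Illustrative only.

def fuse_conv_bias_relu(layers):
    fused, i = [], 0
    pattern = ["conv", "bias", "relu"]
    while i < len(layers):
        if layers[i:i + 3] == pattern:
            fused.append("conv+bias+relu")  # one fused layer replaces three
            i += 3
        else:
            fused.append(layers[i])         # non-fusible layer kept as-is
            i += 1
    return fused

chain = ["conv", "bias", "relu", "pool", "conv", "bias", "relu"]
print(fuse_conv_bias_relu(chain))
# ['conv+bias+relu', 'pool', 'conv+bias+relu']
```

The benefit in the real engine comes from launching one kernel instead of three and keeping the intermediate tensors out of global memory.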
Accelerating Inference In TF-TRT Release Notes
TensorRT optimizes the largest subgraphs possible in the TensorFlow graph. The more compute in a subgraph, the greater the benefit obtained from TensorRT. For best performance, you want most of the graph optimized and replaced with the smallest number of TensorRT nodes. Depending on the operations in your graph, the final graph might contain more than one TensorRT node. TensorFlow integration with TensorRT (TF-TRT) optimizes and executes compatible subgraphs, allowing TensorFlow to execute the remaining graph. This document describes the key features, software enhancements and improvements, and known issues when integrating TensorRT.
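Why a graph can end up with more than one TensorRT node can be illustrated with a toy segmentation pass: any unsupported operation splits the surrounding compatible operations into separate segments. The op names and supported set below are made up for illustration; the real segmenter works on TensorFlow's graph structure:

```python
# Toy sketch of subgraph segmentation: a linear chain of op names is split
# into maximal runs of "TensorRT-compatible" ops. Each run would become one
# TensorRT node; incompatible ops stay in TensorFlow. Illustrative only --
# the op names and SUPPORTED set are invented for this sketch.

SUPPORTED = {"conv2d", "bias_add", "relu", "matmul"}

def segment(ops):
    segments, current = [], []
    for op in ops:
        if op in SUPPORTED:
            current.append(op)           # extend the current TRT segment
        elif current:
            segments.append(current)     # unsupported op closes the segment
            current = []
    if current:
        segments.append(current)
    return segments

ops = ["conv2d", "bias_add", "relu", "custom_op", "matmul", "relu"]
print(segment(ops))
# [['conv2d', 'bias_add', 'relu'], ['matmul', 'relu']] -- two TRT nodes
```

A single unsupported op in the middle of the chain doubles the number of TensorRT nodes here, which is why fewer, larger segments give better performance.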

Profiling With DLProf


DLProf User Guide
The Deep Learning Profiler (DLProf) User Guide provides instructions on using the DLProf tool to improve the performance of deep learning models.
DLProf Release Notes
The Deep Learning Profiler (DLProf) Release Notes provide a brief description of the DLProf tool.

Archived Optimized Frameworks Release Notes


Caffe2 Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 18.08 and earlier releases. The Caffe2 framework is used primarily for detection, segmentation, and translation tasks in production Facebook applications, and it focuses on cross-platform deployment and performance. Because the Caffe2 and PyTorch frameworks share many features, the two have been merged into a single package. However, for now, the Caffe2 container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.
Microsoft Cognitive Toolkit Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 18.08 and earlier releases. The Microsoft Cognitive Toolkit framework, previously known as CNTK, can be used for applications involving large datasets, including object detection and recognition, speech, text, vision, and any combination of them. The Microsoft Cognitive Toolkit also supports inference from C++, Python, C#/.NET, and Java, which enables you to deploy services on Linux, Windows, the Universal Windows Platform (UWP), and Azure. The Microsoft Cognitive Toolkit container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized.
Theano Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 18.08 and earlier releases. The Theano framework enables you to define, analyze, and optimize mathematical expressions using its Python library. Developers use Theano to manipulate and analyze expressions, including matrix-valued expressions. Theano focuses on recognizing numerically unstable expressions, building symbolic graphs automatically, and compiling parts of your numeric expressions into CPU or GPU instructions. The Theano container is currently released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized. However, we will discontinue container updates once the next major CUDA version is released.
Torch Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container for the 18.08 and earlier releases. The Torch framework is a scientific computing framework built on the Lua scripting language, and you are strongly advised to be familiar with Lua before using Torch. Torch provides numerous deep learning algorithms used mostly by researchers, and it focuses on speeding up the time it takes to build a scientific algorithm. The Torch container is currently released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream, all of which are tested, tuned, and optimized. However, we will discontinue container updates once the next major CUDA version is released.