NVIDIA Optimized Frameworks
Deep learning (DL) frameworks offer building blocks for designing, training, and validating deep neural networks through a high-level programming interface. Widely-used DL frameworks, such as PyTorch, TensorFlow, PyTorch Geometric, DGL, and others, rely on GPU-accelerated libraries, such as cuDNN, NCCL, and DALI to deliver high-performance, multi-GPU-accelerated training.

Developers, researchers, and data scientists can get easy access to NVIDIA AI-optimized DL framework containers with DL examples that are performance-tuned and tested for NVIDIA GPUs. This eliminates the need to manage packages and dependencies or build DL frameworks from source. Containerized DL frameworks, with all dependencies included, provide an easy place to start developing common applications, such as conversational AI, natural language understanding (NLU), recommenders, and computer vision. Visit the NVIDIA NGC™ catalog to learn more.

Documentation Center
These documents provide information regarding the current NVIDIA Optimized Frameworks release.
Getting Started
This guide provides the first-step instructions for preparing to use Docker containers on your DGX system. You must set up your DGX system before you can access the NVIDIA GPU Cloud (NGC) container registry to pull a container.
This guide provides a detailed overview of containers and step-by-step instructions for pulling, running, customizing, and extending containers.
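As a quick orientation, pulling and running a framework container from NGC typically looks like the following sketch. The PyTorch image and the 23.01-py3 tag are examples only; substitute the framework and tag for the release you want from the NGC catalog.

```shell
# Pull a framework container from the NGC registry
# (image name and tag are examples; use a current release tag).
docker pull nvcr.io/nvidia/pytorch:23.01-py3

# Run it interactively with all GPUs visible; the extra shared-memory
# and memlock settings are commonly recommended for data-loader workers.
docker run --gpus all -it --rm \
    --shm-size=1g --ulimit memlock=-1 \
    -v "$PWD":/workspace/host \
    nvcr.io/nvidia/pytorch:23.01-py3
```

The `-v` mount is optional; it makes the current host directory visible inside the container at /workspace/host.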
Support Matrix
02/01/23
This support matrix is for NVIDIA optimized frameworks. The matrix provides a single view into the supported software and specific versions that come packaged with the frameworks based on the container image.
More Coverage
Optimized Frameworks Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container. The Kaldi framework turns spoken audio into text based on acoustic and language models, making it useful for speech recognition. The Kaldi container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been, or will be, sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container. The Apache MXNet framework delivers high performance for convolutional neural networks and multi-GPU training, and provides automatic differentiation and optimized predefined layers. It is a useful framework for those who need their model inference to run anywhere. For example, a data scientist can train a model on a DGX-1™ system with Volta by writing a model in Python, and a data engineer can deploy the trained model by using a Scala API that is tied to the company’s existing infrastructure. The Optimized Deep Learning Framework container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container. The PyTorch framework enables you to develop deep learning models with flexibility and to use Python packages such as SciPy and NumPy. The PyTorch framework is convenient and flexible, with examples that cover reinforcement learning, image classification, and machine translation as the more common use cases. The PyTorch container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
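As a minimal sketch of PyTorch's define-by-run style, the following builds a tensor computation and lets autograd compute gradients. It assumes the torch package is available, as it is inside the PyTorch container.

```python
import torch

# A tensor that tracks gradients through subsequent operations.
x = torch.tensor([2.0, 3.0], requires_grad=True)

# y = x0^2 + x1^2; the computation graph is recorded as the code runs.
y = (x ** 2).sum()

# Backpropagate: populates x.grad with dy/dx = 2 * x.
y.backward()

print(x.grad)  # tensor([4., 6.])
```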
This document describes the key features, software enhancements and improvements, known issues, and how to run this container. The PaddlePaddle framework can be used for education, research, and production, including speech, voice, and sound recognition, information retrieval, and image recognition and classification. The framework can also be used for text-based applications, such as detecting fraud and threats, analyzing time-series data to extract statistics, and video detection, such as motion and real-time threat detection in gaming, security, and so on. The PaddlePaddle container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
These release notes provide information about the key features, software enhancements and improvements, known issues, and how to run this container. The TensorFlow framework can be used for education, research, and production, including speech, voice, and sound recognition, information retrieval, and image recognition and classification. The TensorFlow framework can also be used for text-based applications, such as detecting fraud and threats, analyzing time-series data to extract statistics, and video detection, such as motion and real-time threat detection in gaming, security, and so on. The TensorFlow container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
These release notes describe the key features, software enhancements and improvements, known issues, and how to run the container for this release.
Optimized Frameworks User Guides
TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. The TensorFlow User Guide provides a detailed overview and look into using and customizing the TensorFlow deep learning framework. This guide also provides documentation on the NVIDIA TensorFlow parameters that you can use to help implement the optimizations of the container into your environment.
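The data-flow model described above can be sketched with a traced tf.function: TensorFlow compiles the Python operations into a graph whose nodes are operations and whose edges carry tensors. This assumes the tensorflow package is available, as it is inside the TensorFlow container.

```python
import tensorflow as tf

@tf.function  # traces the Python code into a reusable data-flow graph
def affine(x, w, b):
    # matvec and add are graph nodes; x, w, b flow along the edges.
    return tf.linalg.matvec(w, x) + b

w = tf.constant([[1.0, 2.0], [3.0, 4.0]])
x = tf.constant([1.0, 1.0])
b = tf.constant([0.5, 0.5])

print(affine(x, w, b).numpy())  # [3.5 7.5]
```

Because the computation is expressed as a graph, TensorFlow can place its nodes on CPUs or GPUs without changes to the code.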
Installing Frameworks for Jetson
This guide provides instructions for installing TensorFlow on the Jetson Platform, which includes modules such as Jetson Nano, Jetson AGX Xavier, and Jetson TX2. The guide describes the prerequisites for installing TensorFlow on the Jetson Platform, the detailed steps for installation and verification, and best practices for optimizing the performance of the Jetson Platform.
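Installation follows NVIDIA's pip wheel index for Jetson; the following is a sketch only, and the JetPack path segment (v51 here) and wheel version are placeholders that must match your JetPack release as documented in the guide.

```shell
# Install prerequisites for building Python wheels on the Jetson module.
sudo apt-get update
sudo apt-get install -y python3-pip libhdf5-dev

# Install the NVIDIA-built TensorFlow wheel for JetPack.
# The v51 path segment is an example; use the value for your JetPack release.
sudo pip3 install --extra-index-url \
    https://developer.download.nvidia.com/compute/redist/jp/v51 \
    tensorflow
```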
This document contains the release notes for installing TensorFlow for Jetson Platform. It describes the key features, software enhancements, and known issues when installing TensorFlow for Jetson Platform.
This guide provides instructions for installing PyTorch for Jetson Platform.
This document contains the release notes for installing PyTorch for Jetson Platform. It describes the key features, software enhancements, and known issues when installing PyTorch for Jetson Platform.
Accelerating Inference In Frameworks With TensorRT
TensorFlow-TensorRT (TF-TRT) is a deep-learning compiler for TensorFlow that optimizes TF models for inference on NVIDIA devices. TF-TRT is the TensorFlow integration for NVIDIA’s TensorRT (TRT) High-Performance Deep-Learning Inference SDK, allowing users to take advantage of its functionality directly within the TensorFlow framework.
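A typical TF-TRT workflow converts a SavedModel offline and saves the optimized result. The sketch below uses the TF2 TrtGraphConverterV2 API; the model paths are placeholders, and running the conversion requires a machine with an NVIDIA GPU and TensorRT installed, such as the TensorFlow container.

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Paths are placeholders for illustration.
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="my_saved_model",
    precision_mode=trt.TrtPrecisionMode.FP16,  # run eligible ops in FP16
)

# Replace supported subgraphs with TensorRT-optimized ops,
# then save the converted SavedModel for serving.
converter.convert()
converter.save("my_saved_model_trt")
```

The converted model is loaded and served like any other SavedModel; unsupported operations fall back to native TensorFlow execution.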
Profiling With DLProf
The Deep Learning Profiler (DLProf) User Guide provides instructions on using the DLProf tool to improve the performance of deep learning models.
The Deep Learning Profiler (DLProf) Release Notes provide a brief description of the DLProf tool.
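DLProf wraps a training command and emits profile reports. The following invocation is a sketch; train.py is a placeholder script name, and the mode flag should match your framework as described in the DLProf User Guide.

```shell
# Profile a PyTorch training script with DLProf
# (train.py is a placeholder; --mode selects the framework).
dlprof --mode=pytorch python train.py
```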