NVIDIA Optimized Frameworks
Deep learning (DL) frameworks offer building blocks for designing, training, and validating deep neural networks through a high-level programming interface. Widely-used DL frameworks, such as PyTorch, TensorFlow, PyTorch Geometric, DGL, and others, rely on GPU-accelerated libraries, such as cuDNN, NCCL, and DALI to deliver high-performance, multi-GPU-accelerated training.

Developers, researchers, and data scientists can get easy access to NVIDIA AI-optimized DL framework containers with DL examples that are performance-tuned and tested for NVIDIA GPUs. This eliminates the need to manage packages and dependencies or build DL frameworks from source. Containerized DL frameworks, with all dependencies included, provide an easy place to start developing common applications, such as conversational AI, natural language understanding (NLU), recommenders, and computer vision. Visit the NVIDIA NGC™ catalog to learn more.

Documentation Center
These documents provide information regarding the current NVIDIA Optimized Frameworks release.
Getting Started
This guide provides the first-step instructions for preparing to use Docker containers on your DGX system. You must set up your DGX system before you can access the NVIDIA GPU Cloud (NGC) container registry to pull a container.
This guide provides a detailed overview of containers and step-by-step instructions for pulling, running, customizing, and extending containers.
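As a quick orientation, pulling and running a framework container from NGC typically looks like the following sketch. The PyTorch image and the 23.01-py3 tag are examples only; substitute the framework and tag for the release you want from the NGC catalog.

```shell
# Pull a framework container from the NGC registry
# (image name and tag are examples; use a current release tag).
docker pull nvcr.io/nvidia/pytorch:23.01-py3

# Run it interactively with all GPUs visible; the extra shared-memory
# and memlock settings are commonly recommended for data-loader workers.
docker run --gpus all -it --rm \
    --shm-size=1g --ulimit memlock=-1 \
    -v "$PWD":/workspace/host \
    nvcr.io/nvidia/pytorch:23.01-py3
```

The `-v` mount is optional; it makes the current host directory visible inside the container at /workspace/host.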
Support Matrix
02/01/23
This support matrix is for NVIDIA optimized frameworks. The matrix provides a single view into the supported software and specific versions that come packaged with the frameworks based on the container image.
More Coverage
Optimized Frameworks Release Notes
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container. The Kaldi framework turns spoken audio into text based on acoustic and language models, making it useful for speech recognition. The Kaldi container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been, or will be, sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container. The Apache MXNet framework delivers high performance for convolutional neural networks and multi-GPU training, and provides automatic differentiation and optimized predefined layers. It is a useful framework for those who need their model inference to run anywhere. For example, a data scientist can train a model on a DGX-1™ system with Volta by writing a model in Python, and a data engineer can deploy the trained model by using a Scala API that is tied to the company’s existing infrastructure. The Optimized Deep Learning Framework container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
These release notes describe the key features, software enhancements and improvements, known issues, and how to run this container. The PyTorch framework enables you to develop deep learning models with flexibility and to use Python packages such as SciPy and NumPy. The PyTorch framework is convenient and flexible, with examples that cover reinforcement learning, image classification, and machine translation as the more common use cases. The PyTorch container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
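As a minimal sketch of PyTorch's define-by-run style, the following builds a tensor computation and lets autograd compute gradients. It assumes the torch package is available, as it is inside the PyTorch container.

```python
import torch

# A tensor that tracks gradients through subsequent operations.
x = torch.tensor([2.0, 3.0], requires_grad=True)

# y = x0^2 + x1^2; the computation graph is recorded as the code runs.
y = (x ** 2).sum()

# Backpropagate: populates x.grad with dy/dx = 2 * x.
y.backward()

print(x.grad)  # tensor([4., 6.])
```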
This document describes the key features, software enhancements and improvements, known issues, and how to run this container. The PaddlePaddle framework can be used for education, research, and production, including speech, voice, and sound recognition, information retrieval, and image recognition and classification. The framework can also be used for text-based applications, such as detecting fraud and threats, analyzing time-series data to extract statistics, and video detection, such as motion and real-time threat detection in gaming, security, and so on. The PaddlePaddle container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
These release notes provide information about the key features, software enhancements and improvements, known issues, and how to run this container. The TensorFlow framework can be used for education, research, and production, including speech, voice, and sound recognition, information retrieval, and image recognition and classification. The TensorFlow framework can also be used for text-based applications, such as detecting fraud and threats, analyzing time-series data to extract statistics, and video detection, such as motion and real-time threat detection in gaming, security, and so on. The TensorFlow container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
These release notes describe the key features, software enhancements and improvements, known issues, and how to run the container for this release.
Optimized Frameworks User Guides
TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. The TensorFlow User Guide provides a detailed overview and look into using and customizing the TensorFlow deep learning framework. This guide also provides documentation on the NVIDIA TensorFlow parameters that you can use to help implement the optimizations of the container into your environment.
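The data-flow model described above can be sketched with a traced tf.function: TensorFlow compiles the Python operations into a graph whose nodes are operations and whose edges carry tensors. This assumes the tensorflow package is available, as it is inside the TensorFlow container.

```python
import tensorflow as tf

@tf.function  # traces the Python code into a reusable data-flow graph
def affine(x, w, b):
    # matvec and add are graph nodes; x, w, b flow along the edges.
    return tf.linalg.matvec(w, x) + b

w = tf.constant([[1.0, 2.0], [3.0, 4.0]])
x = tf.constant([1.0, 1.0])
b = tf.constant([0.5, 0.5])

print(affine(x, w, b).numpy())  # [3.5 7.5]
```

Because the computation is expressed as a graph, TensorFlow can place its nodes on CPUs or GPUs without changes to the code.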
Installing Frameworks for Jetson
This guide provides instructions for installing TensorFlow on the Jetson Platform, which includes modules such as Jetson Nano, Jetson AGX Xavier, and Jetson TX2. The guide describes the prerequisites for installing TensorFlow on the Jetson Platform, the detailed steps for installation and verification, and best practices for optimizing the performance of the Jetson Platform.
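Installation follows NVIDIA's pip wheel index for Jetson; the following is a sketch only, and the JetPack path segment (v51 here) and wheel version are placeholders that must match your JetPack release as documented in the guide.

```shell
# Install prerequisites for building Python wheels on the Jetson module.
sudo apt-get update
sudo apt-get install -y python3-pip libhdf5-dev

# Install the NVIDIA-built TensorFlow wheel for JetPack.
# The v51 path segment is an example; use the value for your JetPack release.
sudo pip3 install --extra-index-url \
    https://developer.download.nvidia.com/compute/redist/jp/v51 \
    tensorflow
```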
This document contains the release notes for installing TensorFlow for Jetson Platform. It describes the key features, software enhancements, and known issues when installing TensorFlow for Jetson Platform.
This guide provides instructions for installing PyTorch for Jetson Platform.
This document contains the release notes for installing PyTorch for Jetson Platform. It describes the key features, software enhancements, and known issues when installing PyTorch for Jetson Platform.
Accelerating Inference In Frameworks With TensorRT
TensorFlow-TensorRT (TF-TRT) is a deep-learning compiler for TensorFlow that optimizes TF models for inference on NVIDIA devices. TF-TRT is the TensorFlow integration for NVIDIA’s TensorRT (TRT) High-Performance Deep-Learning Inference SDK, allowing users to take advantage of its functionality directly within the TensorFlow framework.
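A typical TF-TRT workflow converts a SavedModel offline and saves the optimized result. The sketch below uses the TF2 TrtGraphConverterV2 API; the model paths are placeholders, and running the conversion requires a machine with an NVIDIA GPU and TensorRT installed, such as the TensorFlow container.

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Paths are placeholders for illustration.
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="my_saved_model",
    precision_mode=trt.TrtPrecisionMode.FP16,  # run eligible ops in FP16
)

# Replace supported subgraphs with TensorRT-optimized ops,
# then save the converted SavedModel for serving.
converter.convert()
converter.save("my_saved_model_trt")
```

The converted model is loaded and served like any other SavedModel; unsupported operations fall back to native TensorFlow execution.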
Profiling With DLProf
The Deep Learning Profiler (DLProf) User Guide provides instructions on using the DLProf tool to improve the performance of deep learning models.
The Deep Learning Profiler (DLProf) Release Notes provide a brief description of the DLProf tool.
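DLProf wraps a training command and emits profile reports. The following invocation is a sketch; train.py is a placeholder script name, and the mode flag should match your framework as described in the DLProf User Guide.

```shell
# Profile a PyTorch training script with DLProf
# (train.py is a placeholder; --mode selects the framework).
dlprof --mode=pytorch python train.py
```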