Installing AI and Data Science Applications and Frameworks

The AI and data science applications and frameworks are distributed as NGC container images through the NVIDIA NGC Enterprise Catalog. Each container image contains the entire user-space software stack that is required to run the application or framework; namely, the CUDA libraries, cuDNN, any required Magnum IO components, TensorRT, and the framework.

Perform the following workflow steps within the VM to pull the AI and data science containers.

  1. Generate your API key.

  2. Access the NVIDIA NGC Enterprise Catalog.

  3. For each AI or data science application that you are interested in, load the container.
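
Steps 1 and 2 come together when Docker authenticates to the NGC registry. Below is a minimal sketch, assuming your API key is exported in an `NGC_API_KEY` environment variable (the variable and function names are illustrative, not part of any NVIDIA tooling); note that the NGC registry username is always the literal string `$oauthtoken`, not your account name:

```shell
# Log in to nvcr.io with an NGC API key.
# Assumption: the key has been exported as NGC_API_KEY beforehand.
# Usage: export NGC_API_KEY=<your-key>; ngc_login
ngc_login() {
    if [ -z "${NGC_API_KEY:-}" ]; then
        echo "NGC_API_KEY is not set; generate an API key in NGC first" >&2
        return 1
    fi
    # The NGC registry username is always the literal string '$oauthtoken'.
    echo "$NGC_API_KEY" | sudo docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

Once the login succeeds, the docker pull commands in the following sections are authorized against the NGC Enterprise Catalog.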

Detailed below are the Docker pull commands for downloading the container for each application or framework.

NVIDIA TensorRT

NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA graphics processing units (GPUs). TensorRT takes a trained network and produces a highly optimized runtime engine that performs inference for that network.

sudo docker pull nvcr.io/nvaie/tensorrt:21.07-py3

NVIDIA Triton Inference Server

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, the data center, or embedded devices.

  • The xx.yy-py3 image contains the Triton Inference Server with support for TensorFlow, PyTorch, TensorRT, ONNX, and OpenVINO models.

sudo docker pull nvcr.io/nvaie/tritonserver:21.07-py3

  • The xx.yy-py3-sdk image contains Python and C++ client libraries, client examples, and the Model Analyzer.

sudo docker pull nvcr.io/nvaie/tritonserver:21.07-py3-sdk

  • The xx.yy-py3-min image is used as the base for creating custom Triton server containers, as described in Customize Triton Container.

sudo docker pull nvcr.io/nvaie/tritonserver:21.07-py3-min
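
The three Triton images differ only in their tag suffix, so the pulls can be scripted together. A small sketch (the function name is illustrative; the tag is the 21.07 release shown above) that prints the pull command for each variant:

```shell
# Print the docker pull command for each Triton image variant in the 21.07 release.
triton_pull_cmds() {
    tag="21.07"
    for suffix in py3 py3-sdk py3-min; do
        echo "sudo docker pull nvcr.io/nvaie/tritonserver:${tag}-${suffix}"
    done
}

triton_pull_cmds    # pipe the output to sh to run the pulls: triton_pull_cmds | sh
```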

NVIDIA RAPIDS

The NVIDIA RAPIDS suite of software libraries gives you the freedom to execute end-to-end data science, machine learning, and analytics pipelines entirely on GPUs.

sudo docker pull nvcr.io/nvaie/nvidia-rapids:21.08-cuda11.4-ubuntu20.04-py3.8

NVIDIA GPU Operator

Deploy and manage NVIDIA GPU resources in Kubernetes.

sudo docker pull nvcr.io/nvaie/gpu-operator:v1.8.1
sudo docker pull nvcr.io/nvaie/vgpu-guest-driver:470.63.01-ubuntu20.04

NVIDIA Network Operator

Deploy and manage NVIDIA networking resources in Kubernetes.

sudo docker pull nvcr.io/nvaie/network-operator:v1.0.0

PyTorch

PyTorch is a GPU-accelerated tensor computation framework. Its functionality can be extended with common Python libraries such as NumPy and SciPy. Automatic differentiation is done with a tape-based system at both the functional and neural network layer levels.

sudo docker pull nvcr.io/nvaie/pytorch:21.07-py3

TensorFlow

TensorFlow is an open-source platform for machine learning. It provides comprehensive tools and libraries in a flexible architecture that allows easy deployment across a variety of platforms and devices.

sudo docker pull nvcr.io/nvaie/tensorflow:21.07-tf1-py3
sudo docker pull nvcr.io/nvaie/tensorflow:21.07-tf2-py3
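
TensorFlow 1.x and 2.x ship as separate image tags. A small helper sketch (the function name is illustrative, not an NVIDIA tool) that maps a TensorFlow major version to the corresponding 21.07 image reference:

```shell
# Map a TensorFlow major version (1 or 2) to its 21.07 NGC image reference.
tf_image() {
    case "$1" in
        1) echo "nvcr.io/nvaie/tensorflow:21.07-tf1-py3" ;;
        2) echo "nvcr.io/nvaie/tensorflow:21.07-tf2-py3" ;;
        *) echo "unsupported TensorFlow major version: $1" >&2; return 1 ;;
    esac
}

tf_image 2    # prints nvcr.io/nvaie/tensorflow:21.07-tf2-py3
```

The result can be fed straight into a pull, e.g. `sudo docker pull "$(tf_image 1)"`.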