NVIDIA AI Enterprise Documentation - v4.2 (all) - Last updated October 3, 2024

NVIDIA AI Enterprise v4.2

NVIDIA® AI Enterprise is an end-to-end, secure AI software platform that accelerates the data science pipeline and streamlines the development and deployment of production AI.

Release Notes: Current status, information on validated platforms, and known issues with NVIDIA AI Enterprise.
Product Support Matrix: Matrix of all products and platforms that are supported for NVIDIA AI Enterprise.
Quick Start Guide: Documentation for system administrators that provides minimal instructions for installing and configuring NVIDIA AI Enterprise.
User Guide: Documentation for administrators that explains how to install and configure NVIDIA AI Enterprise.

Infrastructure and Workload Management Components and Related Software

NVIDIA License System: NVIDIA® License System is used to serve a pool of floating licenses to NVIDIA licensed products. The NVIDIA License System is configured with licenses obtained from the NVIDIA Licensing Portal.
NVIDIA GPU Operator: NVIDIA GPU Operator simplifies the deployment of NVIDIA AI Enterprise with software container platforms.
NVIDIA Network Operator: NVIDIA Network Operator uses Kubernetes CRDs and the Operator SDK to manage networking-related components and to enable fast networking, RDMA, and GPUDirect® technology for workloads in a Kubernetes cluster. The Network Operator works in conjunction with the NVIDIA GPU Operator to enable GPUDirect RDMA on compatible systems.
NVIDIA Base Command™ Manager Essentials: NVIDIA Base Command Manager streamlines cluster provisioning, workload management, and infrastructure monitoring. It provides all the tools you need to deploy and manage an AI data center. NVIDIA Base Command Manager Essentials comprises the features of NVIDIA Base Command Manager that are certified for use with NVIDIA AI Enterprise.

Tools for AI Development and Use Cases

NVIDIA TensorRT: NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA GPUs. It is designed to work in conjunction with deep learning frameworks that are commonly used for training. TensorRT focuses specifically on running an already trained network quickly and efficiently on a GPU to generate a result, a process also known as inferencing.
PyTorch: The PyTorch framework enables you to develop deep learning models with flexibility. With the PyTorch framework, you can make full use of Python packages such as SciPy and NumPy.
NVIDIA RAPIDS: The RAPIDS data science framework is a collection of libraries for running end-to-end data science pipelines completely on the GPU. The interface is designed to look and feel familiar to anyone working in Python, but it uses optimized NVIDIA® CUDA® Toolkit primitives and high-bandwidth GPU memory.
NVIDIA RAPIDS Accelerator for Apache Spark: NVIDIA RAPIDS Accelerator for Apache Spark uses NVIDIA GPUs to accelerate Spark data frame workloads transparently, that is, without code changes.
TensorFlow: TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code.
NVIDIA Triton Inference Server: Triton Inference Server provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. Triton supports HTTP/REST and gRPC protocols that allow remote clients to request inferencing for any model managed by the server.
NVIDIA Triton Management Service: Triton Management Service automates the deployment of Triton Inference Server instances at scale in Kubernetes with resource-efficient model orchestration on GPUs and CPUs.
NVIDIA Clara Parabricks: NVIDIA Clara Parabricks is a software suite for genomic analysis. It delivers major improvements in throughput time for common analytical tasks in genomics, including germline and somatic analysis.
NVIDIA DeepStream: DeepStream is a streaming analytics toolkit for building AI-powered applications. It takes streaming data (from USB and CSI cameras, video from files, or streams over RTSP) as input and uses AI and computer vision to generate insights from pixels for a better understanding of the environment.
MONAI (Medical Open Network for Artificial Intelligence) Enterprise: MONAI is the domain-specific, open-source medical AI framework that drives research breakthroughs and accelerates AI into clinical impact. MONAI unlocks the power of medical data to build deep learning models for medical AI workflows. MONAI provides the essential domain-specific tools from data labeling to model training, making it easy to develop, reproduce, and standardize medical AI lifecycles.
TAO Toolkit: The NVIDIA TAO Toolkit allows you to combine NVIDIA pre-trained models with your own data to create custom Computer Vision (CV) and Conversational AI models.
NVIDIA DGL: NVIDIA DGL is an easy-to-use, high-performance, and scalable Python package for deep learning on graphs. NVIDIA DGL containers are built on top of an optimized deep learning framework container, such as the PyTorch NGC container, with the latest stable DGL open source code.
NVIDIA Modulus: NVIDIA Modulus blends physics, as expressed by governing partial differential equations (PDEs), boundary conditions, and training data to build high-fidelity, parameterized, surrogate deep learning models. The platform abstracts the complexity of setting up a scalable training pipeline, so you can leverage your domain expertise to map problems to an AI model’s training and develop better neural network architectures.
NVIDIA NeMo™: NVIDIA NeMo is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures.
NVIDIA Maxine: NVIDIA Maxine is a suite of GPU-accelerated AI SDKs and cloud-native microservices for deploying AI features that enhance audio, video, and augmented reality effects in real time. Maxine’s state-of-the-art models create high-quality effects that can be achieved with standard microphone and camera equipment. Maxine can be deployed on premises, in the cloud, or at the edge.
Note: NVIDIA Maxine documentation is available only in the NVIDIA AI Enterprise distribution of NVIDIA Maxine.
NVIDIA Riva: NVIDIA Riva is a set of GPU-accelerated multilingual speech and translation microservices for building fully customizable, real-time conversational AI pipelines. Riva includes automatic speech recognition (ASR), text-to-speech (TTS), and neural machine translation (NMT) and is deployable in all clouds, in data centers, at the edge, or on embedded devices.
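
The Triton Inference Server entry above mentions an HTTP/REST protocol for remote inference requests. As a rough illustration, the following sketch builds such a request using only the Python standard library. It assumes the KServe v2 style endpoint that Triton exposes (/v2/models/<model>/infer) and the default HTTP port 8000. The model name my_model and the tensor name INPUT0 are hypothetical placeholders and must be replaced with values from a model that your server actually manages. The sketch only constructs the request; it does not contact a server.

```python
import json

# Assumption: Triton's HTTP endpoint is reachable at this host and port
# (8000 is Triton's default HTTP port); adjust for your deployment.
TRITON_URL = "http://localhost:8000"


def build_infer_request(model_name, input_name, values, datatype="FP32"):
    """Return (url, body) for a KServe v2 style inference request.

    model_name and input_name are hypothetical placeholders; they must
    match the configuration of a model that the server manages.
    """
    url = f"{TRITON_URL}/v2/models/{model_name}/infer"
    payload = {
        "inputs": [
            {
                "name": input_name,
                "shape": [1, len(values)],  # one batch row of len(values) elements
                "datatype": datatype,
                "data": list(values),  # flattened, row-major tensor contents
            }
        ]
    }
    return url, json.dumps(payload).encode("utf-8")


# Build (but do not send) a request for a hypothetical model.
url, body = build_infer_request("my_model", "INPUT0", [1.0, 2.0, 3.0])
print(url)
print(body.decode("utf-8"))
```

To send the request, the body could be posted to the URL with a Content-Type of application/json, for example through urllib.request from the standard library. The response format and the required input names, shapes, and data types depend on the deployed model, so consult the Triton Inference Server documentation for the authoritative protocol details.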
