Infrastructure Layer Software

Infrastructure Layer Components: The infrastructure layer includes GPU and networking drivers optimized for bare-metal and virtualized environments; Kubernetes operators that manage GPU and networking resources as well as the lifecycle of microservices and AI pipelines; software for AI workload and GPU orchestration; and cluster management software for provisioning and monitoring servers at scale.

AI Infrastructure Management: These components ensure efficient deployment, management, and scaling of AI compute resources across bare-metal, virtualized, and containerized environments. Each infrastructure release provides versioned updates to maintain compatibility and stability, along with feature updates, security patches, and performance improvements. For application layer components, refer to Application Layer Software.

Release Distribution: Infrastructure software is distributed through the Infrastructure Branch, which provides regular updates with 1-year support windows and can be designated as a Long-Term Support Branch (LTSB) for extended 3-year support. For more information, refer to the Lifecycle Policy.

Software Components

NVIDIA AI Enterprise includes the following NVIDIA infrastructure layer software components. For application layer components, refer to Application Layer Software.

Table 3 Infrastructure Layer Software Components

| Category | Component | Description | NGC Catalog | Documentation |
|---|---|---|---|---|
| NGC Catalog Resources | NVIDIA AI Enterprise | Lists all NVIDIA AI Enterprise supported software available on NGC, including AI frameworks, microservices, pre-trained models, SDKs, and tools with enterprise support. | NVIDIA AI Enterprise on NGC | NVIDIA AI Enterprise Docs Hub |
| NGC Catalog Resources | NVIDIA Infra Collections | Lists all NVIDIA AI Enterprise Infrastructure collections available on NGC, including infrastructure software components such as drivers, operators, and management tools for deploying and managing AI workloads. | Infra Collections on NGC | Software Branch Documentation |
| Core Infrastructure Drivers | NVIDIA GPU Driver | Provides hardware support for NVIDIA GPUs. Refer to the Infrastructure Release Notes for supported GPUs and operating systems. | NVIDIA GPU Driver on NGC | NVIDIA Data Center Driver Documentation |
| Core Infrastructure Drivers | NVIDIA DOCA Driver for Networking | Provides hardware support for NVIDIA BlueField DPUs and SuperNICs. Installing DOCA on the host provides all necessary drivers and tools to manage BlueField and ConnectX devices. | NVIDIA DOCA Driver on NGC | NVIDIA DOCA Drivers Documentation |
| Core Infrastructure Drivers | NVIDIA Fabric Manager | Manages NVSwitch-based GPU-to-GPU interconnects in multi-GPU systems such as DGX and HGX platforms. | For NVIDIA vGPU: NVIDIA Fabric Manager on NGC | For NVIDIA vGPU Passthrough: NVIDIA Fabric Manager User Guide |
| Data Center Services | NVIDIA DOCA Microservices | Infrastructure acceleration and offload services for NVIDIA BlueField, enabling accelerated networking, storage, and security workloads. | NVIDIA DOCA Microservices on NGC | NVIDIA DOCA Documentation |
| Virtualization [1] | NVIDIA Virtual GPU Manager | NVIDIA GPU driver deployed in the hypervisor for virtualized environments. Enables multi-tenant GPU sharing, live migration, and monitoring. | NVIDIA vGPU Manager on NGC | NVIDIA vGPU for Compute Documentation |
| Virtualization [1] | NVIDIA vGPU for Compute Guest Driver | NVIDIA GPU driver deployed in the VM or on bare metal OS to enable multiple VMs to have simultaneous, direct access to a single physical GPU. | NVIDIA vGPU Guest Driver on NGC | NVIDIA vGPU for Compute Documentation |
| Container Platform | NVIDIA Container Toolkit | Enables GPU-accelerated containers by providing runtime components and utilities for container engines (Docker, containerd, CRI-O). | NVIDIA Container Toolkit on NGC | NVIDIA Container Toolkit Documentation |
| GPU Orchestration | NVIDIA Run:ai | Provides a Kubernetes-native orchestration and management platform that maximizes GPU utilization for AI workloads through advanced scheduling and resource management. | NVIDIA Run:ai on NGC | NVIDIA Run:ai Documentation |
| Kubernetes Operators | NVIDIA DPU Operator (DPF) | Enables cluster administrators to automate provisioning, orchestration, and lifecycle management of BlueField DPUs and DOCA Microservices to enable DPU-accelerated North-South networking in Kubernetes. | NVIDIA DPU Operator on NGC | NVIDIA DPU Operator Documentation |
| Kubernetes Operators | NVIDIA GPU Operator | Simplifies deployment of NVIDIA AI Enterprise by automating management of all NVIDIA software components needed to provision GPUs in Kubernetes. | NVIDIA GPU Operator on NGC | NVIDIA GPU Operator Documentation |
| Kubernetes Operators | NVIDIA Network Operator | Simplifies deployment of high-speed networking by automating management of NVIDIA ConnectX NICs and SuperNICs required to optimize East-West traffic and RDMA transfers in Kubernetes. | NVIDIA Network Operator on NGC | NVIDIA Network Operator Documentation |
| Kubernetes Operators | NVIDIA NIM Operator | Enables cluster administrators to operate the software components and services required to run LLM, embedding, and other models using NVIDIA NIM microservices in Kubernetes. | NVIDIA NIM Operator on NGC | NVIDIA NIM Operator Documentation |
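
As an illustration of what the GPU Operator entry in Table 3 enables, the sketch below builds a Kubernetes Pod manifest that requests GPUs through the `nvidia.com/gpu` extended resource, which the NVIDIA device plugin advertises on GPU nodes. This is a minimal sketch, not a procedure from this guide: the helper name `gpu_pod_manifest`, the Pod name, and the image reference `registry.example.com/cuda-sample:latest` are hypothetical placeholders, and it assumes a cluster where the GPU Operator is already installed.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that asks the scheduler for `gpus` NVIDIA GPUs.

    Hypothetical helper for illustration; `name` and `image` are supplied
    by the caller and are not values defined by NVIDIA AI Enterprise.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resource advertised by the NVIDIA device plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder Pod name and image; substitute values valid in your cluster.
    manifest = gpu_pod_manifest(
        "gpu-smoke-test", "registry.example.com/cuda-sample:latest"
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and submitted with `kubectl apply -f`, which accepts JSON as well as YAML. Whether the Pod is scheduled depends on the GPUs and drivers actually present in the cluster.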

Supported Infrastructure Software Not on NGC

Attention

In this context, Supported means NVIDIA accepts bug reports from customers with valid NVIDIA AI Enterprise subscriptions and will work to address reported issues. All supported software is covered by NVIDIA AI Enterprise Support Services. NVIDIA supports the individual software components listed below, but not glue code, third-party applications, or full deployment integration.

The following infrastructure software is supported for use with NVIDIA AI Enterprise but is not distributed through NGC. Some components require enterprise licensing or are tightly coupled with hardware support, while third-party infrastructure software is obtained directly from the respective vendors.

Table 4 Supported Infrastructure Software Not on NGC

| Category | Component | Description | Access |
|---|---|---|---|
| NVIDIA Infrastructure Software | NVIDIA Base Command Manager (BCM) | Cluster management and provisioning tool for NVIDIA DGX systems. | NVIDIA Enterprise Support Portal |
| NVIDIA Infrastructure Software | NVIDIA vGPU Software Licensing Server | Enables GPU virtualization for VDI and AI workloads. | NVIDIA Licensing Portal or partner channels |
| Networking Drivers and SDKs | NVIDIA DOCA Host Package (with DOCA-OFED Drivers) | DOCA-Host includes all needed host drivers and tools for NVIDIA BlueField and ConnectX devices. | NVIDIA Networking Support Portal |
| Networking Drivers and SDKs | NVIDIA BlueField Software Bundle | BlueField software bundle (BF-Bundle) is installed on the BlueField Arm cores to provide a complete DOCA experience on the BlueField networking platform. | NVIDIA Networking Support Portal |
| Networking Drivers and SDKs | NVIDIA BlueField Firmware Bundle | BlueField Firmware Bundle (BF-FWBundle) is a minimal software package installed on the BlueField Arm cores. | NVIDIA Networking Support Portal |

Release Branches and Lifecycle

Infrastructure software is distributed through the Infrastructure Branch, which provides regular updates with 1-year support windows and can be designated as a Long-Term Support Branch (LTSB) for extended 3-year support.
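
The support-window arithmetic can be sketched as follows. This is a minimal sketch under one stated assumption: the window is counted from the branch release date. The Lifecycle Policy is the authoritative definition and may count it differently. The function names and the dates in the usage note are hypothetical.

```python
from datetime import date

STANDARD_SUPPORT_YEARS = 1  # from the text: 1-year support window
LTSB_SUPPORT_YEARS = 3      # from the text: extended 3-year LTSB support


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later.

    A Feb 29 start falls back to Feb 28 when the target year is not a leap year.
    """
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def support_end(branch_release: date, is_ltsb: bool = False) -> date:
    """Estimate the end of support for a branch.

    Assumption: the window starts on the branch release date.
    """
    years = LTSB_SUPPORT_YEARS if is_ltsb else STANDARD_SUPPORT_YEARS
    return add_years(branch_release, years)
```

For example, with a hypothetical branch release date of `date(2025, 3, 1)`, `support_end(date(2025, 3, 1))` returns `date(2026, 3, 1)`, and `support_end(date(2025, 3, 1), is_ltsb=True)` returns `date(2028, 3, 1)`.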

Infrastructure Support Matrix

NVIDIA AI Enterprise is certified to run across public cloud, data center, workstation, DGX platform, and edge environments. Use the support matrix to verify supported configurations for your deployment.

Important

Interactive Support Matrix

Explore the support matrix with interactive filtering and version comparison across releases 7.0-7.4:

Launch Interactive Tool

Interactive Features:

  • Compare configurations across multiple releases side-by-side

  • Filter by deployment type (bare metal, virtualized, cloud)

  • Progressive filtering guides you through OS, hypervisor, and orchestration options

  • Interactive footnotes with hover tooltips

  • Dynamic version badges show exactly which releases support each configuration
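
The filtering and comparison features listed above can be pictured with a small sketch. It does not describe how the interactive tool is implemented, and every row in `ENTRIES` is a placeholder rather than real support data; only the deployment types and the 7.0-7.4 release range come from the text. The names `MatrixEntry`, `filter_entries`, and `compare_releases` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixEntry:
    deployment: str      # "bare metal", "virtualized", or "cloud"
    os: str              # placeholder operating system label
    releases: frozenset  # releases that support this configuration


# Placeholder rows only; these are NOT real support data.
ENTRIES = [
    MatrixEntry("bare metal", "example-os-a", frozenset({"7.0", "7.1", "7.2"})),
    MatrixEntry("virtualized", "example-os-a", frozenset({"7.2", "7.3", "7.4"})),
    MatrixEntry("cloud", "example-os-b", frozenset({"7.4"})),
]


def filter_entries(entries, deployment=None, os=None, release=None):
    """Apply each filter that is set; unset filters match every entry."""
    result = []
    for entry in entries:
        if deployment is not None and entry.deployment != deployment:
            continue
        if os is not None and entry.os != os:
            continue
        if release is not None and release not in entry.releases:
            continue
        result.append(entry)
    return result


def compare_releases(entry, releases):
    """Side-by-side view: map each release to whether it supports the entry."""
    return {release: release in entry.releases for release in releases}
```

Progressive filtering corresponds to calling `filter_entries` with one more argument at each step, for example first `deployment="virtualized"` and then adding `release="7.4"`.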

Infrastructure Release Notes