Infrastructure Layer Software
Infrastructure Layer Components: The infrastructure layer includes three groups of software: GPU and networking drivers optimized for bare-metal and virtualized environments; Kubernetes operators that manage GPU and networking resources and the lifecycle of microservices and AI pipelines; and cluster management software for provisioning and monitoring servers at scale.
AI Infrastructure Management: These components ensure efficient deployment, management, and scaling of AI compute resources across bare-metal, virtualized, and containerized environments. Each infrastructure release provides versioned updates to maintain compatibility and stability, along with feature updates, security patches, and performance improvements. For application layer components, see Application Layer Software.
Release Distribution: Infrastructure software is distributed through the Infrastructure Branch, which provides regular updates with 1-year support windows and can be designated as a Long-Term Support Branch (LTSB) for extended 3-year support. For more information, see Lifecycle Policy.
Software Components
NVIDIA AI Enterprise includes the following infrastructure software components.
| Component | Description | NGC Catalog | Documentation |
|---|---|---|---|
| NGC Catalog Resources | | | |
| NVIDIA AI Enterprise | Lists all NVIDIA AI Enterprise supported software available on NGC, including AI frameworks, microservices, pre-trained models, SDKs, and tools with enterprise support. | | |
| NVIDIA Infra Collections | Lists all NVIDIA AI Enterprise Infrastructure collections available on NGC, including infrastructure software components such as drivers, operators, and management tools for deploying and managing AI workloads. | | |
| Core Infrastructure Drivers | | | |
| NVIDIA Data Center Driver | Provides hardware support for NVIDIA GPUs. Consult the appropriate NVIDIA AI Enterprise Release Notes to see which GPUs and operating systems are supported. | | |
| NVIDIA DOCA Driver for Networking | Provides hardware support for NVIDIA BlueField DPUs and SuperNICs. Installing DOCA on the host provides all necessary drivers and tools to manage BlueField and ConnectX devices. | | |
| NVIDIA Fabric Manager | Manages NVSwitch-based GPU-to-GPU interconnects in multi-GPU systems such as DGX and HGX platforms. | For vGPU: Fabric Manager on NGC | For Passthrough: Fabric Manager User Guide |
| Data Center Services | | | |
| NVIDIA DOCA Microservices | Infrastructure acceleration and offload services for NVIDIA BlueField, enabling accelerated networking, storage, and security workloads. | | |
| Virtualization [1] | | | |
| NVIDIA Virtual GPU Manager | GPU driver deployed in the hypervisor for virtualized environments. Enables multi-tenant GPU sharing, live migration, and monitoring. | | |
| NVIDIA vGPU for Compute Guest Driver | GPU driver deployed in the VM or on bare metal OS to enable multiple VMs to have simultaneous, direct access to a single physical GPU. | | |
| Container Platform | | | |
| NVIDIA Container Toolkit | Enables GPU-accelerated containers by providing runtime components and utilities for container engines (Docker, containerd, CRI-O). | | |
| Kubernetes Operators | | | |
| NVIDIA DPU Operator (DPF) | Enables cluster administrators to automate provisioning, orchestration, and lifecycle management of BlueField DPUs and DOCA Microservices to enable DPU-accelerated North-South networking in Kubernetes. | | |
| NVIDIA GPU Operator | Simplifies deployment of NVIDIA AI Enterprise by automating management of all NVIDIA software components needed to provision GPUs in Kubernetes. | | |
| NVIDIA Network Operator | Simplifies deployment of high-speed networking by automating management of NVIDIA ConnectX NICs and SuperNICs required to optimize East-West traffic and RDMA transfers in Kubernetes. | | |
| NVIDIA NIM Operator | Enables cluster administrators to operate the software components and services required to run LLM, embedding, and other models using NVIDIA NIM microservices in Kubernetes. | | |
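
Once GPUs have been provisioned in a Kubernetes cluster, one quick sanity check is to confirm that nodes advertise GPU capacity. The following is a minimal sketch, not an official tool: it assumes the nodes expose GPUs under the `nvidia.com/gpu` extended resource name and that the input has the shape produced by `kubectl get nodes -o json`. The function name `gpus_per_node` and the node names in `SAMPLE_NODE_LIST` are hypothetical and exist only for illustration.

```python
GPU_RESOURCE = "nvidia.com/gpu"  # assumed extended resource name for NVIDIA GPUs

# Hypothetical, trimmed-down NodeList used only to make the sketch runnable.
SAMPLE_NODE_LIST = {
    "items": [
        {
            "metadata": {"name": "gpu-node-a"},
            "status": {"allocatable": {"cpu": "64", "nvidia.com/gpu": "8"}},
        },
        {
            "metadata": {"name": "cpu-node-b"},
            "status": {"allocatable": {"cpu": "32"}},
        },
    ]
}


def gpus_per_node(node_list: dict) -> dict:
    """Map each node name to its allocatable GPU count (0 if none advertised)."""
    counts = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        # Kubernetes serializes resource quantities as strings, for example "8".
        counts[name] = int(allocatable.get(GPU_RESOURCE, "0"))
    return counts


if __name__ == "__main__":
    print(gpus_per_node(SAMPLE_NODE_LIST))
```

Against a live cluster, the same function could be fed the parsed output of `kubectl get nodes -o json` (for example via `json.load`). A node reporting zero GPUs where some are expected usually means the driver or operator components on that node deserve a closer look.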
Supported Infrastructure Software Not on NGC
Attention
In this context, Supported means NVIDIA accepts bug reports from customers with valid NVIDIA AI Enterprise subscriptions and will work to address reported issues. All supported software is covered by NVIDIA AI Enterprise Support Services. NVIDIA supports the individual software components listed below, but not glue code, third-party applications, or full deployment integration.
The following infrastructure software is supported for use with NVIDIA AI Enterprise but is not distributed through NGC. Some components require enterprise licensing or are tightly coupled with hardware support, while third-party infrastructure software is obtained directly from the respective vendors.
| Component | Description | Access |
|---|---|---|
| NVIDIA Infrastructure Software | | |
| NVIDIA Base Command Manager (BCM) | Cluster management and provisioning tool for NVIDIA DGX systems. | |
| NVIDIA vGPU Software Licensing Server | Enables GPU virtualization for VDI and AI workloads. | NVIDIA Licensing Portal or partner channels |
| Networking Drivers and SDKs | | |
| NVIDIA DOCA Host Package (with DOCA-OFED Drivers) | DOCA-Host includes all needed host drivers and tools for NVIDIA BlueField and ConnectX devices. | |
| NVIDIA BlueField Software Bundle | BlueField software bundle (BF-Bundle) is installed on the BlueField Arm cores to provide a complete DOCA experience on the BlueField networking platform. | |
| NVIDIA BlueField Firmware Bundle | BlueField Firmware Bundle (BF-FWBundle) is a minimal software package installed on the BlueField Arm cores. | |
Release Branches and Lifecycle
Infrastructure software is distributed through the Infrastructure Branch, which provides regular updates with 1-year support windows and can be designated as a Long-Term Support Branch (LTSB) for extended 3-year support.
- Lifecycle Policy: Defines each branch type, support periods, and update cadence.
- Choosing the Right Release Branch: Decision guide with comparison table and industry scenarios.
- Infrastructure Software Releases: Active and archived release branches.
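
For planning purposes, the 1-year Infrastructure Branch window and the 3-year LTSB window can be turned into a rough end-of-support estimate. The sketch below is illustrative only: it assumes the support window is counted from the branch release date, which may not match the exact rules in the Lifecycle Policy, and the function name `end_of_support`, the branch keys, and the dates in the usage example are hypothetical.

```python
from datetime import date

# Support windows in years, as stated for each branch type in this document.
SUPPORT_YEARS = {"infrastructure": 1, "ltsb": 3}


def end_of_support(release: date, branch: str) -> date:
    """Estimate the end-of-support date, assuming the window starts at release."""
    target_year = release.year + SUPPORT_YEARS[branch]
    try:
        return release.replace(year=target_year)
    except ValueError:
        # A February 29 release maps to February 28 when the target year is not a leap year.
        return release.replace(year=target_year, day=28)


if __name__ == "__main__":
    hypothetical_release = date(2025, 1, 15)  # placeholder date, not a real release
    print(end_of_support(hypothetical_release, "infrastructure"))
    print(end_of_support(hypothetical_release, "ltsb"))
```

Treat the result as an estimate; the Lifecycle Policy and the release pages remain the authoritative source for actual support dates.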
Infrastructure Support Matrix
NVIDIA AI Enterprise is certified to run across public clouds, data centers, workstations, the DGX platform, and edge environments. Use the support matrix to verify supported configurations for your deployment.
Important
Interactive Support Matrix
Explore the support matrix with interactive filtering and version comparison across releases 7.0-7.4:
Interactive Features:
- Compare configurations across multiple releases side-by-side
- Filter by deployment type (bare metal, virtualized, cloud)
- Progressive filtering guides you through OS, hypervisor, and orchestration options
- Interactive footnotes with hover tooltips
- Dynamic version badges show exactly which releases support each configuration