NVIDIA AI Enterprise Documentation
This documentation covers the NVIDIA AI Enterprise Infrastructure Layer software. NVIDIA AI Enterprise is a software suite for building and running AI applications across cloud, data center, and edge environments with optimized performance, security, and stability.
NVIDIA AI Enterprise Components
Infrastructure Software (this documentation): Tools for managing your compute resources, including GPU and network drivers, Kubernetes operators for container orchestration, and NVIDIA Run:ai (self-hosted) for AI workload management and optimization. Refer to Infrastructure Software.
Application Software: Tools for building AI solutions, including generative AI, AI agents, 3D applications and digital twins with NVIDIA Omniverse, and domain-specific SDKs for speech AI, computer vision, cybersecurity, and more. Refer to Application Software.
Enterprise Support: Direct access to NVIDIA AI experts who provide technical guidance, performance optimization assistance, onboarding and training services, and infrastructure troubleshooting across bare-metal, virtualized, and containerized environments, all backed by service-level agreements. Refer to Support.
Quick Start
🆕 New to NVIDIA AI Enterprise? → Start with the Quick Start Guide to deploy your first AI workload
⬆️ Upgrading from 7.3 or earlier? → Refer to What’s New in 7.4 in the following section
🔧 Need help with a specific task? → Jump to the Deployment Guide
🆕 What’s New in NVIDIA AI Enterprise Infra 7.4
Latest Release Highlights
Blackwell Architecture Support - NVIDIA GPU Data Center Driver 580.126.09 adds support for the latest Blackwell GPU architecture
vGPU for Compute Updates - Enhancements and bug fixes based on vGPU Software 19.4
Updated Kubernetes Operators - GPU Operator 25.10.1, Network Operator 25.10.0, DPU Operator 25.10.1, and NIM Operator 3.0.2 deliver improved lifecycle automation and streamlined deployment for GPU workloads
Run:ai Updates - NVIDIA Run:ai 2.24 provides AI workload and GPU orchestration capabilities for self-hosted deployments
DOCA Ecosystem Updates - DOCA Driver 3.2.0 and DOCA Microservices 3.2.1 provide enhanced networking performance and infrastructure acceleration for data-intensive workloads
Enterprise Management - Base Command Manager 11.31.0 offers refined cluster provisioning and workload orchestration for large-scale AI infrastructure
Fabric Manager Support - NVIDIA Fabric Manager is supported in the GPU Passthrough and vGPU for Compute deployment modes
Interactive Support Matrix - New web-based support matrix tool for exploring infrastructure compatibility across releases 7.0 through 7.4 with progressive filtering, cross-version comparison, and dynamic search capabilities
Lifecycle and Compatibility Explorer - New interactive tool for verifying cross-stack compatibility between infrastructure components, with query modes for browsing by branch, release, component, or full stack validation
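Before upgrading, it can help to check whether a host's installed driver is at or above the driver version listed for 7.4. The following is a minimal Python sketch under that assumption. The helper names (`parse_version`, `meets_baseline`, `installed_driver_version`) are hypothetical and are not part of any NVIDIA tool; the only external call is the standard `nvidia-smi --query-gpu=driver_version --format=csv,noheader` query. Whether a particular driver is actually supported for a deployment is determined by the support matrix, not by this comparison.

```python
import subprocess

# Driver version listed for release 7.4 in the highlights above.
RELEASE_7_4_DRIVER = "580.126.09"


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '580.126.09' into (580, 126, 9)."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_baseline(installed: str, baseline: str = RELEASE_7_4_DRIVER) -> bool:
    """Return True when the installed version is at or above the baseline."""
    return parse_version(installed) >= parse_version(baseline)


def installed_driver_version() -> str:
    """Ask nvidia-smi for the driver version of the first GPU it reports."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[0].strip()
```

On a host with the driver installed, `meets_baseline(installed_driver_version())` reports whether the running driver is at least 580.126.09. Versions are compared as integer tuples rather than strings so that, for example, 580.95.05 sorts below 580.105.08.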
What You’ll Find Here
🚀 Getting Started - Account activation, software installation, and first workload deployment
⚙️ Infrastructure Software - NVIDIA vGPU for Compute configuration, licensing, and management
📊 Support - Platform compatibility matrices and release information
📝 Overview - Release notes and version information
📖 Glossary - Key terms and concepts explained
Previous Releases
📋 Release 7.3 Highlights
Blackwell Platform Introduction - Initial support for NVIDIA Blackwell-based systems including DGX B300, HGX B300, DGX GB300 NVL72, and GB300 NVL72 configurations
Updated Infrastructure Components - GPU Data Center Driver 580.105.08, DOCA-OFED 25.7.0, and Container Toolkit 1.18.0 for enhanced performance and compatibility
Kubernetes Orchestration - GPU Operator 25.10.0, Network Operator 25.7.0, DPU Operator 25.7.1, and NIM Operator 3.0.1 for streamlined GPU workload management
Enhanced Virtualization - vGPU for Compute 19.3 with improved performance and feature enhancements
Enterprise Management - Base Command Manager 11.25.08 for cluster provisioning and workload orchestration
📋 Release 7.2 Highlights
RTX PRO Blackwell Support - vGPU for Compute 19.2 adds support for NVIDIA RTX PRO 6000 Blackwell Server Edition on VMware vSphere 9.0.1
Infrastructure Updates - GPU Data Center Driver 580.95.05, DOCA-OFED 25.7.0, and Container Toolkit 1.18.0 for enhanced performance
Kubernetes Orchestration - GPU Operator 25.10.0, Network Operator 25.7.0, DPU Operator 25.7.1, and NIM Operator 3.0.1 for streamlined GPU workload management
Enterprise Management - Base Command Manager 11.25.08 for cluster provisioning and workload orchestration
📋 Release 7.1 Highlights
DPU and Networking Advancements - BlueField-3 DPU support with integrated DOCA Platform Framework (DPF) DPU Operator for cluster-wide provisioning and lifecycle management
NIC Automation - Network Operator 25.7.0 introduces NIC Configuration Operator for automated firmware upgrades and configuration management for ConnectX NICs and SuperNICs
Infrastructure Updates - GPU Data Center Driver 580.82.07, DOCA-OFED 25.7.0, and Container Toolkit 1.17.8 for enhanced performance
Kubernetes Orchestration - GPU Operator 25.3.2, Network Operator 25.7.0, DPU Operator 25.7.0, and NIM Operator 3.0.0 for streamlined GPU workload management
Virtualization - vGPU for Compute 19.1 with performance enhancements
📋 Release 7.0 Highlights
Blackwell GPU Introduction - Initial support for NVIDIA HGX B200 and NVIDIA RTX PRO 6000 Blackwell Server Edition across all supported hypervisors
Product Naming Updates - NVIDIA vGPU C-Series officially renamed to NVIDIA vGPU for Compute; NVIDIA vGPU Host Driver renamed to NVIDIA Virtual GPU Manager
Infrastructure Foundation - GPU Data Center Driver 580.65.06, DOCA-OFED 25.4.0, and Container Toolkit 1.17.8 establishing the 7.x infrastructure baseline
Kubernetes Orchestration - GPU Operator 25.3.2, Network Operator 25.4.0, and NIM Operator 2.0.2 for GPU workload management
Enterprise Management - Base Command Manager 11.25.05 and 10.25.03 for cluster provisioning