NVIDIA AI Enterprise Documentation

This documentation covers the NVIDIA AI Enterprise Infrastructure Layer software. NVIDIA AI Enterprise is a software suite for building and running AI applications across cloud, data center, and edge environments with optimized performance, security, and stability.

NVIDIA AI Enterprise Components

  • Infrastructure Software (this documentation): Tools for managing your compute resources, including GPU and network drivers, Kubernetes operators for container orchestration, and NVIDIA Run:ai (self-hosted) for AI workload management and optimization. Refer to Infrastructure Software.

  • Application Software: Tools for building AI solutions, including generative AI, AI agents, 3D applications and digital twins with NVIDIA Omniverse, and domain-specific SDKs for speech AI, computer vision, cybersecurity, and more. Refer to Application Software.

  • Enterprise Support: Direct access to NVIDIA AI experts who provide technical guidance, performance optimization assistance, onboarding and training services, and infrastructure troubleshooting across bare-metal, virtualized, and containerized environments, all backed by service-level agreements. Refer to Support.

Quick Start

  • 🆕 New to NVIDIA AI Enterprise? → Start with the Quick Start Guide to deploy your first AI workload

  • ⬆️ Upgrading from 7.3 or earlier? → Refer to the What’s New in 7.4 section below

  • 🔧 Need help with a specific task? → Jump to the Deployment Guide

🆕 What’s New in NVIDIA AI Enterprise Infra 7.4

Latest Release Highlights

  • Blackwell Architecture Support - NVIDIA GPU Data Center Driver 580.126.09 adds support for the latest Blackwell GPU architecture

  • vGPU for Compute Updates - Enhancements and bug fixes based on vGPU Software 19.4

  • Updated Kubernetes Operators - GPU Operator 25.10.1, Network Operator 25.10.0, DPU Operator 25.10.1, and NIM Operator 3.0.2 deliver improved lifecycle automation and streamlined deployment for GPU workloads

  • Run:ai Updates - NVIDIA Run:ai 2.24 provides AI workload and GPU orchestration capabilities for self-hosted deployments

  • DOCA Ecosystem Updates - DOCA Driver 3.2.0 and DOCA Microservices 3.2.1 provide enhanced networking performance and infrastructure acceleration for data-intensive workloads

  • Enterprise Management - Base Command Manager 11.31.0 offers refined cluster provisioning and workload orchestration for large-scale AI infrastructure

  • Fabric Manager Support - NVIDIA Fabric Manager is supported in the GPU Passthrough and vGPU for Compute deployment modes

  • Interactive Support Matrix - New web-based support matrix tool for exploring infrastructure compatibility across releases 7.0-7.4 with progressive filtering, cross-version comparison, and dynamic search capabilities

  • Lifecycle and Compatibility Explorer - New interactive tool for verifying cross-stack compatibility between infrastructure components, with query modes for browsing by branch, release, component, or full stack validation

View Full 7.4 Release Notes

What You’ll Find Here

  • 🚀 Getting Started - Account activation, software installation, and first workload deployment

  • ⚙️ Infrastructure Software - NVIDIA vGPU for Compute configuration, licensing, and management

  • 📊 Support - Platform compatibility matrices and release information

  • 📝 Overview - Release notes and version information

  • 📖 Glossary - Key terms and concepts explained

Previous Releases

📋 Release 7.3 Highlights

  • Blackwell Platform Introduction - Initial support for NVIDIA Blackwell-based systems including DGX B300, HGX B300, DGX GB300 NVL72, and GB300 NVL72 configurations

  • Updated Infrastructure Components - GPU Data Center Driver 580.105.08, DOCA-OFED 25.7.0, and Container Toolkit 1.18.0 for enhanced performance and compatibility

  • Kubernetes Orchestration - GPU Operator 25.10.0, Network Operator 25.7.0, DPU Operator 25.7.1, and NIM Operator 3.0.1 for streamlined GPU workload management

  • Enhanced Virtualization - vGPU for Compute 19.3 with improved performance and feature enhancements

  • Enterprise Management - Base Command Manager 11.25.08 for cluster provisioning and workload orchestration

View Full 7.3 Release Notes

📋 Release 7.2 Highlights

  • RTX PRO Blackwell Support - vGPU for Compute 19.2 adds support for NVIDIA RTX PRO 6000 Blackwell Server Edition on VMware vSphere 9.0.1

  • Infrastructure Updates - GPU Data Center Driver 580.95.05, DOCA-OFED 25.7.0, and Container Toolkit 1.18.0 for enhanced performance

  • Kubernetes Orchestration - GPU Operator 25.10.0, Network Operator 25.7.0, DPU Operator 25.7.1, and NIM Operator 3.0.1 for streamlined GPU workload management

  • Enterprise Management - Base Command Manager 11.25.08 for cluster provisioning and workload orchestration

View Full 7.2 Release Notes

📋 Release 7.1 Highlights

  • DPU and Networking Advancements - BlueField-3 DPU support with integrated DOCA Platform Framework (DPF) DPU Operator for cluster-wide provisioning and lifecycle management

  • NIC Automation - Network Operator 25.7.0 introduces NIC Configuration Operator for automated firmware upgrades and configuration management for ConnectX NICs and SuperNICs

  • Infrastructure Updates - GPU Data Center Driver 580.82.07, DOCA-OFED 25.7.0, and Container Toolkit 1.17.8 for enhanced performance

  • Kubernetes Orchestration - GPU Operator 25.3.2, Network Operator 25.7.0, DPU Operator 25.7.0, and NIM Operator 3.0.0 for streamlined GPU workload management

  • Virtualization - vGPU for Compute 19.1 with performance enhancements

View Full 7.1 Release Notes

📋 Release 7.0 Highlights

  • Blackwell GPU Introduction - Initial support for NVIDIA HGX B200 and RTX PRO 6000 Blackwell SE across all supported hypervisors

  • Product Naming Updates - NVIDIA vGPU C-Series officially renamed to NVIDIA vGPU for Compute; NVIDIA vGPU Host Driver renamed to NVIDIA Virtual GPU Manager

  • Infrastructure Foundation - GPU Data Center Driver 580.65.06, DOCA-OFED 25.4.0, and Container Toolkit 1.17.8 establishing the 7.x infrastructure baseline

  • Kubernetes Orchestration - GPU Operator 25.3.2, Network Operator 25.4.0, and NIM Operator 2.0.2 for GPU workload management

  • Enterprise Management - Base Command Manager 11.25.05 and 10.25.03 for cluster provisioning

View Full 7.0 Release Notes
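
When planning an upgrade, it can help to see which component versions differ between two releases. The following is a minimal Python sketch that tabulates the versions listed in the 7.3 and 7.4 highlights on this page and reports what changed. The `RELEASES` dictionary layout and the `parse_version` and `changed_components` helpers are hypothetical names chosen for illustration; they are not part of any NVIDIA tool or API, and the release notes remain the authoritative source for versions.

```python
# Component versions copied from the 7.3 and 7.4 highlights on this page.
# The data layout and helper names are illustrative assumptions, not an NVIDIA API.
RELEASES = {
    "7.3": {
        "GPU Data Center Driver": "580.105.08",
        "GPU Operator": "25.10.0",
        "Network Operator": "25.7.0",
        "DPU Operator": "25.7.1",
        "NIM Operator": "3.0.1",
        "vGPU software": "19.3",
        "Base Command Manager": "11.25.08",
    },
    "7.4": {
        "GPU Data Center Driver": "580.126.09",
        "GPU Operator": "25.10.1",
        "Network Operator": "25.10.0",
        "DPU Operator": "25.10.1",
        "NIM Operator": "3.0.2",
        "vGPU software": "19.4",
        "Base Command Manager": "11.31.0",
    },
}


def parse_version(text):
    """Turn a dotted version string such as 'v25.10.1' into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def changed_components(old, new):
    """Return {component: (old_version, new_version)} for versions that differ."""
    changes = {}
    for name, new_version in RELEASES[new].items():
        old_version = RELEASES[old].get(name)
        if old_version is None:
            continue  # component not listed for the older release
        if parse_version(old_version) != parse_version(new_version):
            changes[name] = (old_version, new_version)
    return changes


if __name__ == "__main__":
    for name, (before, after) in changed_components("7.3", "7.4").items():
        print(f"{name}: {before} -> {after}")
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating 25.7.0 as newer than 25.10.0, and it tolerates an optional leading `v` in the version string.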