NVIDIA vGPU for Compute Features#

NVIDIA vGPU (Virtual GPU) for Compute virtualizes NVIDIA GPUs for AI, machine learning, and high-performance computing workloads. The subsections below describe MIG (Multi-Instance GPU) partitioning, topology-aware provisioning, direct data paths, live migration, multi-GPU guests, interconnect fabrics, scheduling policies, power-state management (suspend-resume), and unified memory.

Table 18 Feature Summary#

| Feature | Description |
|---|---|
| MIG-Backed vGPU | Hardware-level GPU partitioning with spatial isolation for multi-tenant workloads |
| Device Groups | Automated topology-aware device provisioning for multi-GPU and GPU-NIC pairs |
| GPUDirect RDMA and Storage | Direct memory access and storage I/O bypass between GPUs and network/storage devices |
| Heterogeneous vGPU | Mixed vGPU profiles with different framebuffer sizes on a single GPU |
| Live Migration | Zero-downtime VM migration between physical hosts |
| Multi-vGPU and P2P | Multiple vGPUs per VM with peer-to-peer NVLink communication |
| NVIDIA NVSwitch | High-bandwidth GPU-to-GPU interconnect fabric for HGX systems |
| NVLink Multicast | Efficient one-to-many data distribution across NVLink-connected GPUs |
| Scheduling Policies | Workload-specific GPU scheduling algorithms (Best Effort, Equal Share, Fixed Share) |
| Suspend-Resume | VM state preservation and resumption without losing GPU context |
| Unified Virtual Memory | Single memory address space across CPU and GPU for simplified CUDA programming |
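As a brief illustration of the Unified Virtual Memory row above, the following CUDA sketch allocates managed memory that is addressable from both the CPU and the GPU through a single pointer. The kernel name, grid dimensions, and array size are illustrative choices, not part of this document; only `cudaMallocManaged` and the absence of explicit host-device copies are the point.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: increments each element of the array in place.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 1 << 20;
    int *data = nullptr;

    // cudaMallocManaged returns one pointer usable from both host and
    // device -- the "single memory address space" described above.
    cudaMallocManaged(&data, n * sizeof(int));

    for (int i = 0; i < n; ++i) data[i] = i;       // host writes directly
    increment<<<(n + 255) / 256, 256>>>(data, n);  // device updates the same memory
    cudaDeviceSynchronize();                       // wait for the GPU before reading

    printf("data[0] = %d\n", data[0]);             // host reads, no explicit copy back
    cudaFree(data);
    return 0;
}
```

Without managed memory, the same program would need separate host and device allocations plus `cudaMemcpy` calls in each direction; unified memory lets the driver migrate pages on demand instead.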