Multi-vGPU and P2P

Multi-vGPU and Peer-to-Peer (P2P) CUDA transfers are two related capabilities for multi-GPU VM workloads. Each capability is documented on its own page, and the hypervisor support table on this page applies to both.

Table 22 Multi-vGPU and P2P Subpages

| Subpage | Use when you need |
| --- | --- |
| Multi-vGPU | To attach several vGPU devices to one VM (time-sliced or MIG-backed, possibly across different physical GPUs) for training and inference workloads that need multiple GPUs inside a single guest. |
| Peer-to-Peer (P2P) CUDA Transfers | To allow CUDA kernels to access device memory between vGPUs in the same VM over NVLink, including the supported board/vGPU profile combinations and the known A100 and UVM caveat. |
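As a minimal sketch of what P2P access between two vGPUs in the same VM looks like from the CUDA runtime API (device ordinals 0 and 1 are assumptions; the actual indices depend on how the vGPUs were attached to the guest):

```cuda
// Sketch: check and enable P2P between the first two vGPUs in the guest.
// Device ordinals 0 and 1 are assumptions; adjust to your VM's topology.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    if (count < 2) {
        printf("This VM exposes %d vGPU(s); P2P needs at least two.\n", count);
        return 0;
    }

    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);  // can device 0 map device 1's memory?
    if (canAccess) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // after this, kernels running on device 0
                                           // may dereference pointers into device 1 memory
    }
    printf("P2P 0 -> 1: %s\n", canAccess ? "supported" : "not supported");
    return 0;
}
```

Whether `cudaDeviceCanAccessPeer` reports support inside a VM depends on the board, the vGPU profile combination, and the NVLink topology, per the subpage above.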

Hypervisor Platform Support for Multi-vGPU and P2P

The hypervisor support below applies to both Multi-vGPU and P2P features.

Table 23 Hypervisor Platform Support for Multi-vGPU and P2P

| Hypervisor Platform | NVIDIA AI Enterprise Infra Release | Supported vGPU Types | Documentation |
| --- | --- | --- | --- |
| Red Hat Enterprise Linux with KVM | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | Setting up Multi-vGPU VMs on RHEL KVM |
| Ubuntu with KVM | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | Setting up Multi-vGPU VMs on Ubuntu KVM |
| VMware vSphere | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | Setting up Multi-vGPU on VMware vSphere 8 |
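Once the vGPUs are attached per the platform-specific guide above, a quick guest-side check with `nvidia-smi` confirms that all devices are visible and shows whether each is running in MIG mode (the exact device list depends on the profiles you attached):

```shell
# Inside the guest: list every vGPU device the VM can see.
nvidia-smi -L

# Show each device's current MIG mode (Enabled = MIG-backed, Disabled = time-sliced).
nvidia-smi --query-gpu=index,name,mig.mode.current --format=csv
```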

Note

P2P CUDA transfers are not supported on Windows; only the Linux distributions listed in the NVIDIA AI Enterprise Infrastructure Support Matrix are supported.