Multi-vGPU and P2P#
Multi-vGPU and Peer-to-Peer (P2P) CUDA transfers are two related capabilities for multi-GPU VM workloads. Each capability is documented on its own page, and the hypervisor support table on this page applies to both.
| Subpage | Use when you need |
|---|---|
| Multi-vGPU | To attach several vGPU devices to one VM (time-sliced or MIG-backed, possibly across different physical GPUs) for training and inference workloads that need multiple GPUs inside a single guest. |
| Peer-to-Peer CUDA Transfers | To allow CUDA kernels to access device memory between vGPUs in the same VM over NVLink, including the supported board and vGPU profile combinations and the known caveat for A100 with unified virtual memory (UVM). |
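Inside the guest, P2P access between two vGPUs is queried and enabled with the standard CUDA runtime peer-access APIs. The sketch below is illustrative, not taken from the product documentation: device ordinals 0 and 1 and the 1 MiB buffer size are assumptions, and on a real VM you would enumerate devices with `cudaGetDeviceCount` and check each pair.

```cuda
// Minimal sketch: check and enable P2P between two vGPUs in one guest.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    // Peer access is directional, so query both directions.
    int canAccess01 = 0, canAccess10 = 0;
    cudaDeviceCanAccessPeer(&canAccess01, 0, 1);
    cudaDeviceCanAccessPeer(&canAccess10, 1, 0);
    if (!canAccess01 || !canAccess10) {
        printf("P2P not available between devices 0 and 1\n");
        return 1;
    }

    // Enable peer access from each device's context.
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    // With peer access enabled, a kernel running on device 0 can
    // dereference memory allocated on device 1, and cudaMemcpyPeer
    // copies between devices without staging through host memory.
    float *buf0 = nullptr, *buf1 = nullptr;
    cudaSetDevice(0);
    cudaMalloc(&buf0, 1 << 20);
    cudaSetDevice(1);
    cudaMalloc(&buf1, 1 << 20);
    cudaMemcpyPeer(buf0, /*dstDevice=*/0, buf1, /*srcDevice=*/1, 1 << 20);

    cudaFree(buf1);
    cudaSetDevice(0);
    cudaFree(buf0);
    return 0;
}
```

If `cudaDeviceCanAccessPeer` reports 0 for a pair, the transfer still works through `cudaMemcpyPeer`, but it is routed through host memory rather than directly over NVLink.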
Hypervisor Platform Support for Multi-vGPU and P2P#
The hypervisor support below applies to both Multi-vGPU and P2P features.
| Hypervisor Platform | NVIDIA AI Enterprise Infra Release | Supported vGPU Types | Documentation |
|---|---|---|---|
| Red Hat Enterprise Linux with KVM | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
| Ubuntu with KVM | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
| VMware vSphere | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
Note

P2P CUDA transfers are not supported on Windows. Only the Linux distributions outlined in the NVIDIA AI Enterprise Infrastructure Support Matrix are supported.