# Multi-vGPU and P2P
## Multi vGPU
Multi-vGPU attaches several vGPU devices to one VM. Devices may be time-sliced or MIG-backed and can sit on different physical GPUs—you are not limited to slicing one physical GPU across many VMs.
That layout suits training and inference workloads that need multiple GPUs inside one guest: each vGPU is dedicated to that VM, so workloads in the VM do not compete with other VMs for those devices on the same physical GPU (for example, a VM can own two A100-class vGPUs rather than one).
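Inside the guest, each assigned vGPU enumerates as an ordinary CUDA device, so standard multi-GPU code runs unchanged. The following is a minimal sketch, not a prescribed workflow; error checking is omitted, and the device count depends on what is assigned to the VM:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel, used only to show work dispatched to each vGPU independently.
__global__ void touch(float *x) { x[threadIdx.x] += 1.0f; }

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);  // One entry per vGPU assigned to this VM.
    printf("CUDA devices visible in this VM: %d\n", count);

    for (int dev = 0; dev < count; ++dev) {
        cudaSetDevice(dev);
        float *buf = nullptr;
        cudaMalloc(&buf, 256 * sizeof(float));
        touch<<<1, 256>>>(buf);  // Runs on this vGPU only; other VMs cannot contend for it.
        cudaDeviceSynchronize();
        cudaFree(buf);
    }
    return 0;
}
```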
### vGPU Support for Multi-vGPU
You can assign multiple vGPUs with differing amounts of frame buffer to a single VM, provided the board type and the series of all the vGPUs are the same. For example, you can assign an A40-48C vGPU and an A40-16C time-sliced vGPU to the same VM. You can also assign an A100-4-20C vGPU and an A100-2-10C vGPU to a VM, both backed by MIG instances on an A100 board. However, you cannot assign an A30-8C vGPU and an A16-8C vGPU to the same VM.
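A quick way to confirm a mixed-frame-buffer assignment from inside the guest is to query each device's properties: with the A40-48C plus A40-16C example above, the two devices report the same board name but different memory sizes. A minimal sketch, with error checking omitted:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop{};
        cudaGetDeviceProperties(&prop, dev);
        // Same series, differing frame buffer: roughly 48 GiB and 16 GiB for
        // an A40-48C and an A40-16C assigned to the same VM.
        printf("device %d: %s, %.1f GiB\n", dev, prop.name,
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```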
| Board | vGPU [1] |
|---|---|
| NVIDIA HGX B300 279 GB | Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu: All NVIDIA vGPU for Compute |
| NVIDIA HGX B200 180 GB | Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu: All NVIDIA vGPU for Compute |
| NVIDIA RTX Pro 6000 Blackwell Server Edition 96 GB | |
| NVIDIA RTX Pro 4500 Blackwell Server Edition 32 GB | |
| Board | vGPU [1] |
|---|---|
| NVIDIA H800 PCIe 94 GB (H800 NVL) | All NVIDIA vGPU for Compute |
| NVIDIA H800 PCIe 80 GB | All NVIDIA vGPU for Compute |
| NVIDIA H800 SXM5 80 GB | NVIDIA vGPU for Compute |
| NVIDIA H200 PCIe 141 GB (H200 NVL) | All NVIDIA vGPU for Compute |
| NVIDIA H200 SXM5 141 GB | NVIDIA vGPU for Compute |
| NVIDIA H100 PCIe 94 GB (H100 NVL) | All NVIDIA vGPU for Compute |
| NVIDIA H100 SXM5 94 GB | NVIDIA vGPU for Compute |
| NVIDIA H100 PCIe 80 GB | All NVIDIA vGPU for Compute |
| NVIDIA H100 SXM5 80 GB | NVIDIA vGPU for Compute |
| NVIDIA H100 SXM5 64 GB | NVIDIA vGPU for Compute |
| NVIDIA H20 SXM5 141 GB | NVIDIA vGPU for Compute |
| NVIDIA H20 SXM5 96 GB | NVIDIA vGPU for Compute |
| Board | vGPU |
|---|---|
| NVIDIA L40 | |
| NVIDIA L40S | |
| NVIDIA L20 | |
| NVIDIA L4 | |
| NVIDIA L2 | |
| NVIDIA RTX 6000 Ada | |
| NVIDIA RTX 5880 Ada | |
| NVIDIA RTX 5000 Ada | |
| Board | vGPU [1] |
|---|---|
| NVIDIA A800 PCIe 80 GB | |
| NVIDIA A800 PCIe 40 GB active-cooled | |
| NVIDIA A800 HGX 80 GB | |
| NVIDIA A100 PCIe 80 GB | |
| NVIDIA A100 HGX 80 GB | |
| NVIDIA A100 PCIe 40 GB | |
| NVIDIA A100 HGX 40 GB | |
| NVIDIA A40 | |
| NVIDIA A30 | |
| NVIDIA A16 | |
| NVIDIA A10 | |
| NVIDIA RTX A6000 | |
| NVIDIA RTX A5500 | |
| NVIDIA RTX A5000 | |
| Board | vGPU |
|---|---|
| Tesla T4 | |
| Quadro RTX 6000 passive | |
| Quadro RTX 8000 passive | |
| Board | vGPU |
|---|---|
| Tesla V100 SXM2 | |
| Tesla V100 SXM2 32 GB | |
| Tesla V100 PCIe | |
| Tesla V100 PCIe 32 GB | |
| Tesla V100S PCIe 32 GB | |
| Tesla V100 FHHL | |
## Peer-to-Peer (P2P) CUDA Transfers
Peer-to-peer (P2P) CUDA transfers allow device memory on one vGPU to be accessed from within CUDA kernels running on another vGPU, where the vGPUs sit on different physical GPUs but are assigned to the same VM. NVLink is a high-bandwidth interconnect that enables fast communication between such vGPUs.
P2P CUDA transfers over NVLink are supported only on a subset of vGPUs, hypervisor releases, and guest OS releases.
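The runtime calls involved are the standard CUDA peer-access APIs. Below is a minimal sketch, assuming a guest where devices 0 and 1 are NVLink-connected vGPUs in a supported configuration; the device indices are illustrative and error checking is omitted:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel running on one device that writes directly into a peer device's memory.
__global__ void fill_peer(float *peer_buf, float v) { peer_buf[threadIdx.x] = v; }

int main() {
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);  // Can device 0 map device 1's memory?
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    if (!can01 || !can10) {
        printf("P2P is not available between devices 0 and 1\n");
        return 1;
    }

    // Map each device's memory into the other's address space.
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    // Allocate on device 1, then write to it from a kernel launched on device 0.
    float *buf1 = nullptr;
    cudaSetDevice(1);
    cudaMalloc(&buf1, 256 * sizeof(float));
    cudaSetDevice(0);
    fill_peer<<<1, 256>>>(buf1, 3.0f);  // In-kernel access to peer memory.
    cudaDeviceSynchronize();

    // Bulk copies between the two devices also take the direct path.
    float *buf0 = nullptr;
    cudaMalloc(&buf0, 256 * sizeof(float));  // On device 0 (the current device).
    cudaMemcpyPeer(buf0, 0, buf1, 1, 256 * sizeof(float));

    cudaFree(buf0);
    cudaSetDevice(1);
    cudaFree(buf1);
    return 0;
}
```

On supported NVLink configurations, both the in-kernel access and `cudaMemcpyPeer` travel over NVLink rather than bouncing through host memory.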
### Peer-to-Peer CUDA Transfers Known Issues and Limitations
- Only time-sliced vGPUs are supported. MIG-backed vGPUs are not supported.
- P2P transfers over PCIe are not supported.
### vGPU Support for P2P
Only NVIDIA vGPU for Compute time-sliced vGPUs that are allocated all of the physical GPU's frame buffer, on physical GPUs that support NVLink, are supported.
| Board | vGPU |
|---|---|
| NVIDIA HGX B300 279 GB | NVIDIA B300X-279C |
| NVIDIA HGX B200 180 GB | NVIDIA B200X-180C |
| Board | vGPU |
|---|---|
| NVIDIA H800 PCIe 94 GB (H800 NVL) | H800L-94C |
| NVIDIA H800 PCIe 80 GB | H800-80C |
| NVIDIA H200 PCIe 141 GB (H200 NVL) | H200-141C |
| NVIDIA H200 SXM5 141 GB | H200X-141C |
| NVIDIA H100 PCIe 94 GB (H100 NVL) | H100L-94C |
| NVIDIA H100 SXM5 94 GB | H100XL-94C |
| NVIDIA H100 PCIe 80 GB | H100-80C |
| NVIDIA H100 SXM5 80 GB | H100XM-80C |
| NVIDIA H100 SXM5 64 GB | H100XS-64C |
| NVIDIA H20 SXM5 141 GB | H20X-141C |
| NVIDIA H20 SXM5 96 GB | H20-96C |
| Board | vGPU |
|---|---|
| NVIDIA A800 PCIe 80 GB | A800D-80C |
| NVIDIA A800 PCIe 40 GB active-cooled | A800-40C |
| NVIDIA A800 HGX 80 GB | A800DX-80C [2] |
| NVIDIA A100 PCIe 80 GB | A100D-80C |
| NVIDIA A100 HGX 80 GB | A100DX-80C [2] |
| NVIDIA A100 PCIe 40 GB | A100-40C |
| NVIDIA A100 HGX 40 GB | A100X-40C [2] |
| NVIDIA A40 | A40-48C |
| NVIDIA A30 | A30-24C |
| NVIDIA A16 | A16-16C |
| NVIDIA A10 | A10-24C |
| NVIDIA RTX A6000 | A6000-48C |
| NVIDIA RTX A5500 | A5500-24C |
| NVIDIA RTX A5000 | A5000-24C |
| Board | vGPU |
|---|---|
| Quadro RTX 8000 passive | RTX8000P-48C |
| Quadro RTX 6000 passive | RTX6000P-24C |
| Board | vGPU |
|---|---|
| Tesla V100 SXM2 | V100X-16C |
| Tesla V100 SXM2 32 GB | V100DX-32C |
## Hypervisor Platform Support for Multi-vGPU and P2P
| Hypervisor Platform | NVIDIA AI Enterprise Infra Release | Supported vGPU Types | Documentation |
|---|---|---|---|
| Red Hat Enterprise Linux with KVM | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
| Ubuntu with KVM | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
| VMware vSphere | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
**Note:** P2P CUDA transfers are not supported on Windows. Only the Linux distributions outlined in the NVIDIA AI Enterprise Infrastructure Support Matrix are supported.
Footnotes