Multi-vGPU and P2P#

Multi vGPU#

Multi-vGPU attaches several vGPU devices to one VM. The devices may be time-sliced or MIG-backed and may reside on different physical GPUs; you are not limited to slicing one physical GPU across many VMs.

That layout suits training and inference workloads that need multiple GPUs inside one guest: each vGPU is dedicated to the VM, so workloads in the VM do not compete with other VMs for those devices (for example, a VM with two full-sized A100-class vGPUs instead of one).

vGPU Support for Multi-vGPU#

You can assign multiple vGPUs with differing amounts of frame buffer to a single VM, provided the board type and the series of all the vGPUs are the same. For example, you can assign an A40-48C vGPU and an A40-16C vGPU, both time-sliced, to the same VM. You can also assign an A100-4-20C vGPU and an A100-2-10C vGPU, both on MIG instances from an A100 board, to the same VM. However, you cannot assign an A30-8C vGPU and an A16-8C vGPU to the same VM.
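The rule above can be sketched in code. This is an illustrative sketch only, with hypothetical helper names: it assumes the standard vGPU type naming, in which a profile ends with the frame-buffer size in GB plus a series letter, and a MIG-backed profile carries an extra slice-count token (as in A100-4-20C).

```python
def parse_vgpu(profile: str):
    """Split a vGPU profile such as 'A40-48C' or 'A100-4-20C' into
    (board, frame_buffer_gb, series). MIG-backed profiles carry an
    extra slice-count token between the board and the frame buffer."""
    parts = profile.split("-")
    board = parts[0]
    tail = parts[-1]          # e.g. '48C'
    series = tail[-1]         # trailing letter is the series
    fb_gb = int(tail[:-1])    # leading digits are the frame buffer in GB
    return board, fb_gb, series

def same_vm_compatible(a: str, b: str) -> bool:
    """Two vGPUs may share a VM when board type and series match;
    their frame-buffer sizes are allowed to differ."""
    board_a, _, series_a = parse_vgpu(a)
    board_b, _, series_b = parse_vgpu(b)
    return board_a == board_b and series_a == series_b

# The examples from the text:
print(same_vm_compatible("A40-48C", "A40-16C"))        # True
print(same_vm_compatible("A100-4-20C", "A100-2-10C"))  # True
print(same_vm_compatible("A30-8C", "A16-8C"))          # False
```

The check keys off the profile name only; the real constraint is that the vGPUs come from the same board type, which the name prefix encodes.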

Table 23 vGPU Support for Multi-vGPU on the NVIDIA Blackwell Architecture#

Board

vGPU [1]

NVIDIA HGX B300 279 GB

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

NVIDIA HGX B200 180 GB

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

NVIDIA RTX Pro 6000 Blackwell Server Edition 96 GB

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX Pro 4500 Blackwell Server Edition 32 GB

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Table 24 vGPU Support for Multi-vGPU on the NVIDIA Hopper GPU Architecture#

Board

vGPU [1]

NVIDIA H800 PCIe 94 GB (H800 NVL)

All NVIDIA vGPU for Compute

NVIDIA H800 PCIe 80 GB

All NVIDIA vGPU for Compute

NVIDIA H800 SXM5 80 GB

NVIDIA vGPU for Compute

NVIDIA H200 PCIe 141 GB (H200 NVL)

All NVIDIA vGPU for Compute

NVIDIA H200 SXM5 141 GB

NVIDIA vGPU for Compute

NVIDIA H100 PCIe 94 GB (H100 NVL)

All NVIDIA vGPU for Compute

NVIDIA H100 SXM5 94 GB

NVIDIA vGPU for Compute

NVIDIA H100 PCIe 80 GB

All NVIDIA vGPU for Compute

NVIDIA H100 SXM5 80 GB

NVIDIA vGPU for Compute

NVIDIA H100 SXM5 64 GB

NVIDIA vGPU for Compute

NVIDIA H20 SXM5 141 GB

NVIDIA vGPU for Compute

NVIDIA H20 SXM5 96 GB

NVIDIA vGPU for Compute

Table 25 vGPU Support for Multi-vGPU on the NVIDIA Ada Lovelace Architecture#

Board

vGPU

NVIDIA L40

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA L40S

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA L20

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA L4

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA L2

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX 6000 Ada

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX 5880 Ada

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX 5000 Ada

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Table 26 vGPU Support for Multi-vGPU on the NVIDIA Ampere GPU Architecture#

Board

vGPU [1]

  • NVIDIA A800 PCIe 80 GB

  • NVIDIA A800 PCIe 80 GB liquid-cooled

  • NVIDIA AX800

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A800 PCIe 40 GB active-cooled

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A800 HGX 80 GB

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

  • NVIDIA A100 PCIe 80 GB

  • NVIDIA A100 PCIe 80 GB liquid-cooled

  • NVIDIA A100X

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A100 HGX 80 GB

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A100 PCIe 40 GB

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A100 HGX 40 GB

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A40

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

  • NVIDIA A30

  • NVIDIA A30X

  • NVIDIA A30 liquid-cooled

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A16

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A10

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX A6000

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX A5500

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX A5000

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Table 27 vGPU Support for Multi-vGPU on the NVIDIA Turing GPU Architecture#

Board

vGPU

Tesla T4

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Quadro RTX 6000 passive

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Quadro RTX 8000 passive

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Table 28 vGPU Support for Multi-vGPU on the NVIDIA Volta GPU Architecture#

Board

vGPU

Tesla V100 SXM2

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100 SXM2 32 GB

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100 PCIe

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100 PCIe 32 GB

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100S PCIe 32 GB

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100 FHHL

  • Generic Linux with KVM hypervisors [3], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Peer-To-Peer (P2P) CUDA Transfers#

Peer-to-Peer (P2P) CUDA transfers allow CUDA kernels in a VM to access device memory on other vGPUs assigned to that VM, even when those vGPUs reside on different physical GPUs. NVLink is the high-bandwidth interconnect that enables fast communication between such vGPUs.

P2P CUDA transfers over NVLink are supported only on a subset of vGPUs, hypervisor releases, and guest OS releases.
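From inside the guest, peer capability can be probed with the CUDA runtime. The following is a minimal sketch, not an official recipe: it assumes the cuda-python bindings are installed and simply returns an empty list where CUDA is unavailable.

```python
def peer_pairs():
    """Return (i, j) pairs of CUDA devices that can access each
    other's memory, or [] where CUDA is unavailable in this guest."""
    try:
        from cuda import cudart  # cuda-python package
    except ImportError:
        return []
    err, count = cudart.cudaGetDeviceCount()
    if err != cudart.cudaError_t.cudaSuccess:
        return []
    pairs = []
    for i in range(count):
        for j in range(count):
            if i == j:
                continue
            err, can_access = cudart.cudaDeviceCanAccessPeer(i, j)
            if err == cudart.cudaError_t.cudaSuccess and can_access:
                pairs.append((i, j))
    return pairs

print(peer_pairs())
```

A pair appearing in the result means a kernel on device i may map device j's memory once peer access is enabled with cudaDeviceEnablePeerAccess.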

Peer-to-Peer CUDA Transfers Known Issues and Limitations#

  • Only time-sliced vGPUs are supported. MIG-backed vGPUs are not supported.

  • P2P transfers over PCIe are not supported.

vGPU Support for P2P#

P2P is supported only for NVIDIA vGPU for Compute time-sliced vGPUs that are allocated the entire frame buffer of a physical GPU that supports NVLink.
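This constraint can be restated as a quick check on the profile name. A sketch under stated assumptions: `is_p2p_candidate` is a hypothetical helper, the frame-buffer map holds sample sizes drawn from the tables in this section, and the NVLink requirement on the physical board is not encoded here.

```python
# Sample board frame-buffer sizes (GB), taken from the tables in this section.
BOARD_FB_GB = {"A40": 48, "A10": 24, "A16": 16, "H100": 80}

def is_p2p_candidate(profile: str) -> bool:
    """True when the profile is time-sliced (no MIG slice-count token),
    is a vGPU for Compute (C series), and is allocated the board's
    entire frame buffer. NVLink support must be checked separately."""
    parts = profile.split("-")
    if len(parts) != 2:        # e.g. 'A100-4-20C' is MIG-backed
        return False
    board, tail = parts
    if tail[-1] != "C":        # P2P applies only to the C series
        return False
    return BOARD_FB_GB.get(board) == int(tail[:-1])

print(is_p2p_candidate("A40-48C"))     # True: full 48 GB, time-sliced
print(is_p2p_candidate("A40-16C"))     # False: partial frame buffer
print(is_p2p_candidate("A100-4-20C"))  # False: MIG-backed
```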

Table 29 vGPU Support for P2P on the NVIDIA Blackwell GPU Architecture#

Board

vGPU

NVIDIA HGX B300 279 GB

NVIDIA B300X-279C

NVIDIA HGX B200 180 GB

NVIDIA B200X-180C

Table 30 vGPU Support for P2P on the NVIDIA Hopper GPU Architecture#

Board

vGPU

NVIDIA H800 PCIe 94 GB (H800 NVL)

H800L-94C

NVIDIA H800 PCIe 80 GB

H800-80C

NVIDIA H200 PCIe 141 GB (H200 NVL)

H200-141C

NVIDIA H200 SXM5 141 GB

H200X-141C

NVIDIA H100 PCIe 94 GB (H100 NVL)

H100L-94C

NVIDIA H100 SXM5 94 GB

H100XL-94C

NVIDIA H100 PCIe 80 GB

H100-80C

NVIDIA H100 SXM5 80 GB

H100XM-80C

NVIDIA H100 SXM5 64 GB

H100XS-64C

NVIDIA H20 SXM5 141 GB

H20X-141C

NVIDIA H20 SXM5 96 GB

H20-96C

Table 31 vGPU Support for P2P on the NVIDIA Ampere GPU Architecture#

Board

vGPU

  • NVIDIA A800 PCIe 80 GB

  • NVIDIA A800 PCIe 80 GB liquid-cooled

  • NVIDIA AX800

A800D-80C

NVIDIA A800 PCIe 40 GB active-cooled

A800-40C

NVIDIA A800 HGX 80 GB

A800DX-80C [2]

  • NVIDIA A100 PCIe 80 GB

  • NVIDIA A100 PCIe 80 GB liquid-cooled

  • NVIDIA A100X

A100D-80C

NVIDIA A100 HGX 80 GB

A100DX-80C [2]

NVIDIA A100 PCIe 40 GB

A100-40C

NVIDIA A100 HGX 40 GB

A100X-40C [2]

NVIDIA A40

A40-48C

  • NVIDIA A30

  • NVIDIA A30X

  • NVIDIA A30 liquid-cooled

A30-24C

NVIDIA A16

A16-16C

NVIDIA A10

A10-24C

NVIDIA RTX A6000

A6000-48C

NVIDIA RTX A5500

A5500-24C

NVIDIA RTX A5000

A5000-24C

Table 32 vGPU Support for P2P on the NVIDIA Turing GPU Architecture#

Board

vGPU

Quadro RTX 8000 passive

RTX8000P-48C

Quadro RTX 6000 passive

RTX6000P-24C

Table 33 vGPU Support for P2P on the NVIDIA Volta GPU Architecture#

Board

vGPU

Tesla V100 SXM2

V100X-16C

Tesla V100 SXM2 32 GB

V100DX-32C

Hypervisor Platform Support for Multi-vGPU and P2P#

Table 34 Hypervisor Platform Support for Multi-vGPU and P2P#

Hypervisor Platform

NVIDIA AI Enterprise Infra Release

Supported vGPU Types

Documentation

Red Hat Enterprise Linux with KVM

All active NVIDIA AI Enterprise Infra Releases

All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported.

Setting up Multi-vGPU VMs on RHEL KVM

Ubuntu with KVM

All active NVIDIA AI Enterprise Infra Releases

All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported.

Setting up Multi-vGPU VMs on Ubuntu KVM

VMware vSphere

All active NVIDIA AI Enterprise Infra Releases

All NVIDIA vGPU for Compute; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported.

Setting up Multi-vGPU on VMware vSphere 8

Note

P2P CUDA transfers are not supported on Windows. Only the Linux distributions listed in the NVIDIA AI Enterprise Infrastructure Support Matrix are supported.

Footnotes