NVIDIA vGPU for Compute#
NVIDIA AI Enterprise is a cloud-native suite of software tools, libraries and frameworks designed to deliver optimized performance, robust security, and stability for production AI deployments. Easy-to-use microservices optimize model performance with enterprise-grade security, support, and stability, ensuring a streamlined transition from prototype to production for enterprises that run their businesses on AI. It consists of two primary layers: the application layer and the infrastructure layer.
NVIDIA vGPU for Compute is licensed exclusively through NVIDIA AI Enterprise. NVIDIA vGPU for Compute enables multiple virtual machines (VMs) to have simultaneous, direct access to a single physical GPU while offering the compute capabilities required for AI model training, fine-tuning, and inference workloads. By distributing GPU resources efficiently across multiple VMs, NVIDIA vGPU for Compute optimizes utilization and lowers overall hardware costs. In addition, it offers advanced monitoring and management capabilities, including Suspend/Resume, Live Migration, and Warm Updates, making it ideal for Cloud Service Providers (CSPs) and organizations that require scalable, cost-effective GPU acceleration.
Key Concepts#
Glossary#
| Term | Definition |
|---|---|
| NVIDIA Virtual GPU (vGPU) Manager | The Virtual GPU (vGPU) Manager enables GPU virtualization by allowing multiple VMs to share a physical GPU, optimizing GPU allocation for different workloads. The NVIDIA Virtual GPU Manager is installed on the hypervisor. |
| NVIDIA vGPU for Compute Guest Driver | The NVIDIA vGPU for Compute Guest Driver is installed on each VM’s operating system. It provides the necessary interface and support to ensure that applications running within the VM can fully leverage the virtualized GPU’s capabilities, similar to how they would on a physical machine with a dedicated GPU. |
| NVIDIA Licensing System | The NVIDIA Licensing System for NVIDIA AI Enterprise manages the software licenses required to use NVIDIA’s AI tools and infrastructure. This system ensures that organizations are compliant with licensing terms while providing flexibility in managing and deploying NVIDIA AI Enterprise across their infrastructure. |
| NVIDIA AI Enterprise Infra Collection | The NVIDIA AI Enterprise Infrastructure (Infra) Collection, hosted on the NVIDIA NGC Catalog, is a suite of software and tools designed to support the deployment and management of AI workloads in enterprise environments. It provides a robust and scalable foundation for running AI workloads, ensuring that enterprises can leverage the full power of NVIDIA GPUs and software to accelerate their AI initiatives. |
The NVIDIA vGPU for Compute Drivers can be downloaded from the NVIDIA AI Enterprise Infra Collection.
NVIDIA vGPU Architecture Overview#
The high-level architecture of the NVIDIA vGPU is illustrated in the following diagram. Under the control of the NVIDIA Virtual GPU Manager (running on the hypervisor), a single NVIDIA physical GPU is capable of supporting multiple virtual GPU devices (vGPUs) that can be assigned directly to guest VMs, each functioning like a dedicated GPU.
Guest VMs use NVIDIA vGPUs in the same manner as a physical GPU that has been passed through by the hypervisor: the NVIDIA vGPU for Compute driver loaded in the guest VM provides direct access to the GPU for performance-critical fast paths.

Each NVIDIA vGPU is analogous to a conventional GPU with a fixed amount of GPU framebuffer/memory. The vGPU’s framebuffer is allocated out of the physical GPU’s framebuffer at the time the vGPU is created, and the vGPU retains exclusive use of that framebuffer until it is destroyed.
NVIDIA vGPU for Compute Configurations#
Depending on the physical GPU, NVIDIA vGPU for Compute supports the following vGPU modes:
Time-sliced vGPUs can be created on all NVIDIA AI Enterprise supported GPUs.
Additionally, on GPUs that support the Multi-Instance GPU (MIG) feature, the following types of MIG-backed vGPU are supported:
MIG-backed vGPUs that occupy an entire GPU instance
Time-sliced, MIG-backed vGPUs
| vGPU Mode | Description | GPU Partitioning | Isolation | Use Cases |
|---|---|---|---|---|
| Time-Sliced vGPU | A time-sliced vGPU for Compute VM shares access to all of the GPU’s compute resources, including streaming multiprocessors (SMs) and GPU engines, with other vGPUs on the same GPU. Processes are scheduled sequentially, with each vGPU for Compute VM gaining exclusive use of GPU engines during its time slice. | Temporal | Strong hardware-based memory and fault isolation. Good performance and QoS with round-robin scheduling. | Deployments with non-strict isolation requirements, or in environments where MIG-backed vGPU is not available. Suitable for light to moderate AI workloads such as small-scale inferencing, preprocessing pipelines, and development/testing of models in a pre-training phase. |
| MIG-Backed vGPU | A MIG-backed vGPU for Compute VM is created from one or more MIG slices and assigned to a VM on a MIG-capable physical GPU. Each MIG-backed vGPU for Compute VM has exclusive access to the compute resources of its GPU instance, including SMs and GPU engines. Processes running on one VM execute in parallel with processes running on other vGPUs on the same physical GPU; each process runs only on its assigned vGPU. For more information on configuring MIG-backed vGPU VMs, refer to the Virtual GPU Types for Supported GPUs. | Spatial | Strong hardware-based memory and fault isolation. Better performance and QoS with dedicated cache/memory bandwidth and lower scheduling latency. | Most virtualization deployments that require strong isolation, multi-tenancy, and consistent performance. Well-suited for consistent high-performance AI inferencing, multi-tenant fine-tuning jobs, or parallel execution of small to medium training tasks with predictable throughput requirements. |
| Time-Sliced, MIG-Backed vGPU | A time-sliced, MIG-backed vGPU for Compute VM occupies only a fraction of a MIG instance on a MIG-capable physical GPU. Processes are scheduled sequentially as each VM shares access to the GPU instance’s compute resources, including the SMs and compute engines, with all other vGPUs on the same MIG instance. This mode was introduced with the RTX Pro 6000 Blackwell Server Edition. For more information on configuring MIG-backed vGPU VMs, refer to the Virtual GPU Types for Supported GPUs. | Spatial partitioning between MIG instances; temporal partitioning within each MIG instance. | Strong hardware-based memory and fault isolation. Better performance and QoS with dedicated cache/memory bandwidth and lower scheduling latency. | Most virtualization deployments that require strong isolation, multi-tenancy, and consistent performance while maximizing GPU utilization. Ideal for high-density AI workloads such as serving multiple concurrent inferencing endpoints, hosting AI models across multiple tenants, or running light training jobs on shared GPU resources. |
Installing NVIDIA vGPU for Compute#
Prerequisites#
System Requirements#
Before proceeding, ensure the following system prerequisites are met:
At least one NVIDIA data center GPU in a single NVIDIA AI Enterprise compatible NVIDIA-Certified System. NVIDIA recommends using the following GPUs based on your infrastructure.
System Requirements Use Cases#

| Use Case | GPU |
|---|---|
| Adding AI to mainstream servers (single to 4-GPU NVLink) | NVIDIA A30, 1-8x NVIDIA L4, NVIDIA L40S, NVIDIA H100 NVL, NVIDIA H200 NVL, NVIDIA RTX Pro 6000 Blackwell Server Edition |
| AI Model Inference | NVIDIA A100, NVIDIA H200 NVL, NVIDIA RTX Pro 6000 Blackwell Server Edition |
| AI Model Training (Large) and Inference (HGX Scale Up and Out Server) | NVIDIA H100 HGX, NVIDIA H200 HGX, NVIDIA B200 HGX |
If you are using GPUs based on the NVIDIA Ampere architecture or later, ensure that the following BIOS settings are enabled on your server platform:
Single Root I/O Virtualization (SR-IOV) - Enabled
VT-d/IOMMU - Enabled
NVIDIA AI Enterprise License
NVIDIA AI Enterprise Software:
NVIDIA Virtual GPU Manager
NVIDIA vGPU for Compute Guest Driver
You can leverage the NVIDIA System Management Interface (nvidia-smi) management and monitoring tool for testing and benchmarking.
The following server configuration details are considered best practices:
Hyperthreading - Enabled
Power Setting or System Profile - High Performance
CPU Performance - Enterprise or High Throughput (if available in the BIOS)
Memory Mapped I/O above 4-GB - Enabled (if available in the BIOS)
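After the platform is configured, a quick host-side sanity check can confirm that the IOMMU (VT-d/AMD-Vi) and the GPUs are visible to the hypervisor. The commands below are a minimal sketch; exact messages and available tools vary by platform and kernel.

```bash
# Confirm the IOMMU was enabled at boot (look for DMAR/IOMMU initialization messages).
dmesg | grep -i -e DMAR -e IOMMU

# Confirm the NVIDIA GPUs are visible on the PCIe bus.
lspci -nn | grep -i nvidia
```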
Installing NGC CLI#
To access the NVIDIA Virtual GPU Manager and NVIDIA vGPU for Compute Guest Driver, you must first download and install the NGC Catalog CLI.
To install the NGC Catalog CLI:
Log in to the NVIDIA NGC Catalog.
In the top right corner, click Welcome and then select Setup from the menu.
Click Downloads under Install NGC CLI from the Setup page.
From the CLI Install page, click the Windows, Linux, or MacOS tab, according to the platform from which you will be running NGC Catalog CLI.
Follow the instructions to install the CLI.
Verify the installation by entering ngc --version in a terminal or command prompt. The output should be NGC Catalog CLI x.y.z, where x.y.z indicates the version.
You must configure the NGC CLI for your use so that you can run the commands. You will be prompted to enter your NGC API key. Enter the following command:
$ ngc config set
Enter API key [no-apikey]. Choices: [<VALID_APIKEY>, 'no-apikey']: (COPY/PASTE API KEY)
Enter CLI output format type [ascii]. Choices: [ascii, csv, json]: ascii
Enter org [no-org]. Choices: ['no-org']:
Enter team [no-team]. Choices: ['no-team']:
Enter ace [no-ace]. Choices: ['no-ace']:
Successfully saved NGC configuration to /home/$username/.ngc/config
After the NGC Catalog CLI is installed, you will need to launch a command window and run the following commands to download the software.
NVIDIA Virtual GPU Manager
ngc registry resource download-version "nvidia/vgpu/vgpu-host-driver-X:X.X"
NVIDIA vGPU for Compute Guest Driver
ngc registry resource download-version "nvidia/vgpu/vgpu-guest-driver-X:X.X"
For more information on configuring the NGC CLI, refer to the Getting Started with the NGC CLI documentation.
Installing NVIDIA Virtual GPU Manager#
The process of installing the NVIDIA Virtual GPU Manager depends on the hypervisor that you are using. This section assumes the following:
You have downloaded the Virtual GPU Manager software from NVIDIA NGC Catalog
You want to deploy the NVIDIA vGPU for Compute on a single server node
| Hypervisor Platform | Installation Instructions |
|---|---|
| Red Hat Enterprise Linux KVM | Installing and Configuring the NVIDIA Virtual GPU Manager for Red Hat Enterprise Linux KVM |
| Ubuntu KVM | Installing and Configuring the NVIDIA Virtual GPU Manager for Ubuntu |
| VMware vSphere | Installing and Configuring the NVIDIA Virtual GPU Manager for VMware vSphere |
After you complete this process, you can install the vGPU Guest Driver on your Guest VM.
Installing NVIDIA Fabric Manager on HGX Servers#
NVIDIA Fabric Manager must be installed in addition to the Virtual GPU Manager on NVIDIA HGX platforms to enable the multi-GPU VM configurations required for AI training, complex simulations, and processing massive datasets. Fabric Manager is responsible for enabling and managing high-bandwidth interconnect topologies between multiple GPUs on the same node.
On Ampere, Hopper, and Blackwell HGX systems equipped with NVSwitch, Fabric Manager configures the NVSwitch memory fabric to create a unified memory fabric among all participating GPUs and monitors the supporting NVLinks, enabling the deployment of multi-GPU VMs with 1, 2, 4, or 8 GPUs.
Note
For information about NVIDIA Fabric Manager integration or support for deploying 1‑, 2‑, 4- or 8‑GPU VMs on your hypervisor, consult the documentation from your hypervisor vendor.
The Fabric Manager service must be running before creating VMs with multi-GPU configurations. Failure to enable Fabric Manager on HGX platforms may result in incomplete or non-functional GPU topologies inside the VM. For details on capabilities, configuration, and usage, refer to the NVIDIA Fabric Manager User Guide.
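As an illustration, on a hypervisor host where Fabric Manager has been installed from NVIDIA's packages, the service is typically managed through systemd; the following sketch assumes the standard nvidia-fabricmanager service name and is not hypervisor-specific guidance.

```bash
# Enable and start the Fabric Manager service before creating multi-GPU VMs.
sudo systemctl enable nvidia-fabricmanager
sudo systemctl start nvidia-fabricmanager

# Verify that the service is active.
systemctl status nvidia-fabricmanager
```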
Installing NVIDIA vGPU Guest Driver#
The process for installing the driver is the same in a VM configured with vGPU, in a VM that is running pass-through GPU, or on a physical host in a bare-metal deployment. This section assumes the following:
You have downloaded the vGPU for Compute Guest Driver from NVIDIA NGC Catalog
The Guest VM has been created and booted on the hypervisor
| Guest Operating System | Installation Instructions |
|---|---|
| Ubuntu | Installing the NVIDIA vGPU for Compute Guest Driver on Ubuntu from a Debian Package |
| Red Hat | Installing the NVIDIA vGPU for Compute Guest Driver on Red Hat Distributions from an RPM Package |
| Windows | Installing the NVIDIA vGPU for Compute Guest Driver and NVIDIA Control Panel |
| Other Linux distributions | Installing the NVIDIA vGPU for Compute Guest Driver on a Linux VM from a .run Package |
After you install the NVIDIA vGPU for Compute Guest driver, you are required to license the Guest VM. After a license from the NVIDIA License System is obtained, the Guest VM operates at full capability and can be used to run AI/ML workloads.
Licensing an NVIDIA vGPU for Compute Guest VM#
Note
The NVIDIA AI Enterprise license is enforced through software when you deploy NVIDIA vGPU for Compute VMs.
When booted on a supported GPU, a vGPU for Compute VM initially operates at full capability but its performance degrades over time if the VM fails to obtain a license. In such a scenario, the full capability of the VM is restored when the license is acquired.
Once licensing is configured, a vGPU VM automatically obtains a license from the license server when booted on a supported GPU. The VM retains the license until it is shut down. It then releases the license back to the license server. Licensing settings persist across reboots and need only be modified if the license server address changes, or the VM is switched to running GPU pass through.
For more information on how to license a vGPU for Compute VM from the NVIDIA License System, including step-by-step instructions, refer to the Virtual GPU Client Licensing User Guide.
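For a Linux guest using a networked license served by a CLS or DLS instance, the client-side configuration typically amounts to placing the client configuration token and restarting the licensing daemon. The following is a minimal sketch of that flow; the token filename is a placeholder, and the authoritative steps are in the Virtual GPU Client Licensing User Guide.

```bash
# On the guest VM: copy the client configuration token obtained from the NVIDIA Licensing Portal.
sudo cp client_configuration_token_*.tok /etc/nvidia/ClientConfigToken/

# Restart the licensing daemon so the license is checked out.
sudo systemctl restart nvidia-gridd

# Confirm that the license was acquired.
nvidia-smi -q | grep -A 2 "vGPU Software Licensed Product"
```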
Note
For vGPU for Compute deployments, one license per vGPU assigned to a VM is enforced through software. This license is valid for up to sixteen vGPU instances on a single GPU or for the assignment to a VM of one vGPU that is assigned all the physical GPU’s framebuffer. If multiple NVIDIA C‑series vGPUs are assigned to a single VM, a separate license must be obtained for each vGPU from the NVIDIA Licensing System, regardless of whether it is a Networked or Node‑Locked license.
Verifying the License Status of a Licensed NVIDIA vGPU for Compute Guest VM#
After configuring an NVIDIA vGPU for Compute client VM with a license, verify the license status by displaying the licensed product name and status.
To verify the license status of a licensed client, run nvidia-smi with the -q or --query option from within the client VM, not the hypervisor host. If the product is licensed, the expiration date is shown in the license status.
==============NVSMI LOG==============

Timestamp                                 : Tue Jun 17 16:49:09 2025
Driver Version                            : 580.46
CUDA Version                              : 13.0

Attached GPUs                             : 2
GPU 00000000:02:01.0
    Product Name                          : NVIDIA H100-80C
    Product Brand                         : NVIDIA Virtual Compute Server
    Product Architecture                  : Hopper
    Display Mode                          : Requested functionality has been deprecated
    Display Attached                      : Yes
    Display Active                        : Disabled
    Persistence Mode                      : Enabled
    Addressing Mode                       : HMM
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-a1833a31-1dd2-11b2-8e58-a589b8170988
    GPU PDI                               : N/A
    Minor Number                          : 0
    VBIOS Version                         : 00.00.00.00.00
    MultiGPU Board                        : No
    Board ID                              : 0x201
    Board Part Number                     : N/A
    GPU Part Number                       : 2331-882-A1
    FRU Part Number                       : N/A
    Platform Info
        Chassis Serial Number             : N/A
        Slot Number                       : N/A
        Tray Index                        : N/A
        Host ID                           : N/A
        Peer Type                         : N/A
        Module Id                         : N/A
        GPU Fabric GUID                   : N/A
    Inforom Version
        Image Version                     : N/A
        OEM Object                        : N/A
        ECC Object                        : N/A
        Power Management Object           : N/A
    Inforom BBX Object Flush
        Latest Timestamp                  : N/A
        Latest Duration                   : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GPU C2C Mode                          : Disabled
    GPU Virtualization Mode
        Virtualization Mode               : VGPU
        Host VGPU Mode                    : N/A
        vGPU Heterogeneous Mode           : N/A
    vGPU Software Licensed Product
        Product Name                      : NVIDIA Virtual Compute Server
        License Status                    : Licensed (Expiry: 2025-6-18 8:59:55 GMT)
    ...
Installing the NVIDIA GPU Operator Using a Bash Shell Script#
A bash shell script for installing the NVIDIA GPU Operator with the NVIDIA vGPU for Compute Driver is available for download from the NVIDIA AI Enterprise Infra Collection.
Note
This approach assumes there is no vGPU for Compute Driver installed on the Guest VM. The vGPU for Compute Guest Driver is installed by the GPU Operator.
Refer to the GPU Operator documentation for detailed instructions on deploying the NVIDIA vGPU for Compute Driver using the bash shell script.
Installing NVIDIA AI Enterprise Applications Software#
Installing NVIDIA AI Enterprise Applications Software using Docker and NVIDIA Container Toolkit#
Prerequisites#
Before you install any NVIDIA AI Enterprise container:
Ensure your vGPU for Compute Guest VM is running a supported OS distribution.
Ensure the VM has obtained a valid vGPU for Compute license from the NVIDIA License System.
Confirm that one or more NVIDIA GPUs are available and recognized by your system.
Make sure the vGPU for Compute Guest Driver is installed correctly. You can verify this by running nvidia-smi. If you see your GPU listed, you’re ready to proceed.
Installing Docker Engine#
Refer to the official Docker Installation Guide for your vGPU for Compute Guest VM OS Linux distribution.
Installing the NVIDIA Container Toolkit#
The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includes a container runtime library and utilities to configure containers to leverage NVIDIA GPUs automatically. Complete documentation and frequently asked questions are available on the repository wiki. Refer to the Installing the NVIDIA Container Toolkit documentation to enable the Docker repository and install the NVIDIA Container Toolkit on the Guest VM.
Once the NVIDIA Container Toolkit is installed, to configure the Docker container runtime, refer to the Configuration documentation.
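On most Linux distributions, this amounts to pointing Docker at the NVIDIA runtime and restarting the daemon. A typical sequence, assuming the nvidia-ctk utility installed by the toolkit, looks like the following sketch.

```bash
# Register the NVIDIA runtime with Docker, then restart the Docker daemon.
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```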
Verifying the Installation: Run a Sample CUDA Container#
Refer to the Running a Sample Workload documentation to run a sample CUDA container test on your GPU.
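As an illustration, a minimal smoke test simply runs nvidia-smi inside a CUDA base image; the image tag below is an example and should be replaced with one that matches the CUDA version supported by your driver.

```bash
# If the vGPU is visible inside the container, the driver and toolkit are working.
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```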
Accessing NVIDIA AI Enterprise Containers on NGC#
NVIDIA AI Enterprise Application Software is available through the NVIDIA NGC Catalog and identifiable by the NVIDIA AI Enterprise Supported label.
The container image for each application or framework contains the entire user-space software stack required to run it, namely, the CUDA libraries, cuDNN, any required Magnum IO components, TensorRT, and the framework itself.
Generate an NGC API key to access the NVIDIA AI Enterprise Software in the NGC Catalog using the URL provided to you by NVIDIA.
Authenticate with Docker to NGC Registry. In your shell, run:
docker login nvcr.io
Username: $oauthtoken
Password: <paste-your-NGC_API_key-here>
A successful login (``Login Succeeded``) lets you pull containers from NGC.
From the NVIDIA vGPU for Compute VM, browse the NGC Catalog for containers labeled NVIDIA AI Enterprise Supported.
Copy the relevant docker pull command:
sudo docker pull nvcr.io/nvaie/rapids-pb25h1:x.y.z-runtime
Where x.y.z is the version of your container.
Run the container with GPU access:
sudo docker run --gpus all -it --rm nvcr.io/nvaie/rapids-pb25h1:x.y.z-runtime
Where x.y.z is the version of your container. This command launches an interactive container using the vGPUs available on the Guest VM.
Installing the NVIDIA AI Enterprise Software Components Using Podman#
You can use Podman (an alternative container runtime to Docker) for running NVIDIA AI Enterprise containers. The installation flow is similar to Docker. For more information, refer to the NVIDIA AI Enterprise: RHEL with KVM Deployment Guide.
Installing NVIDIA AI Enterprise Software Components Using Kubernetes and NVIDIA Cloud Native Stack#
NVIDIA provides the Cloud Native Stack (CNS), which is a collection of software to run cloud native workloads on NVIDIA GPUs. NVIDIA Cloud Native Stack is based on Ubuntu/RHEL, Kubernetes, Helm, and the NVIDIA GPU and Network Operator.
Refer to this repository for a series of installation guides with step-by-step instructions based on your OS distribution. The installation guides also offer instructions to deploy an application from the NGC Catalog to validate that GPU resources are accessible and functional.
NVIDIA vGPU for Compute Key Features#
MIG Backed vGPU#
A Multi Instance GPU (MIG)-backed vGPU is a vGPU that resides on a GPU instance in a MIG-capable physical GPU. MIG-backed vGPUs are created from individual MIG slices and assigned to virtual machines. Each MIG-backed vGPU resident on a GPU has exclusive access to the GPU instance’s engines, including the compute and video decode engines. This model combines MIG’s hardware-level spatial partitioning with the temporal partitioning capabilities of vGPU, offering flexibility in how GPU resources are shared across workloads.
In a MIG-backed vGPU, processes running on one vGPU execute in parallel with processes running on other vGPUs on the same physical GPU. Each process runs only on its assigned vGPU, alongside processes on other vGPUs.
Note
NVIDIA vGPU for Compute supports MIG-Backed vGPUs on all the GPU boards that support Multi Instance GPU (MIG).
Universal MIG technology on Blackwell enables both compute and graphics workloads to be consolidated and securely isolated on the same physical GPU.
A MIG-backed vGPU is ideal when running multiple high-priority workloads that require guaranteed, consistent performance and strong isolation, such as in multi-tenant environments, MLOps platforms, or shared research clusters. By partitioning a GPU into dedicated hardware instances, teams can run training, inference, video analytics, and data processing jobs simultaneously with consistent performance, maximizing utilization while ensuring each workload meets its SLA.
Supported MIG-Backed vGPU Configurations on a Single GPU#
NVIDIA vGPU supports both homogeneous and mixed MIG-backed virtual GPU configurations, and on GPUs with MIG time-slicing support, each MIG instance supports multiple time-sliced vGPU VMs.
On the NVIDIA RTX PRO 6000 Blackwell Server Edition, up to 4 MIG slices can be created on a single GPU. Within each MIG slice, one to three time-sliced vGPUs for Compute with 8 GB of frame buffer each can be created, depending on workload requirements and user density goals. Each of these vGPU instances can be assigned to a separate VM, enabling up to 12 virtual machines to share a single physical GPU while still benefiting from the isolation boundaries provided by MIG.

The figure above shows how each MIG slice on the NVIDIA RTX PRO 6000 Blackwell can be time-sliced across multiple VMs - supporting up to 3 NVIDIA vGPU for Compute VMs per slice - to maximize user density while maintaining performance isolation through hardware-level partitioning.
Note
You can determine whether time-sliced, MIG-backed vGPUs are supported with your GPU on your chosen hypervisor by running the nvidia-smi -q command.
$ nvidia-smi -q
vGPU Device Capability
MIG Time-Slicing : Supported
MIG Time-Slicing Mode : Enabled
If MIG Time-Slicing is shown as Supported, the GPU supports time-sliced, MIG-backed vGPUs. If MIG Time-Slicing Mode is shown as Enabled, your chosen hypervisor supports time-sliced, MIG-backed vGPUs on GPUs that also support this feature.
The Ampere NVIDIA A100 PCIe 40GB card has one physical GPU and can support several types of MIG-backed vGPU configurations. The following figure shows examples of valid homogeneous and mixed MIG-backed virtual GPU configurations on NVIDIA A100 PCIe 40GB.
A valid homogeneous configuration with 3 A100-2-10C vGPUs on 3 MIG.2g.10b GPU instances
A valid homogeneous configuration with 2 A100-3-20C vGPUs on 3 MIG.3g.20b GPU instances
A valid mixed configuration with 1 A100-4-20C vGPU on a MIG.4g.20b GPU instance, 1 A100-2-10C vGPU on a MIG.2g.10b GPU instance, and 1 A100-1-5C vGPU on a MIG.1g.5b instance

Configuring MIG-Backed vGPU#
Configuring a GPU for MIG-Backed vGPUs#
To support GPU Instances with NVIDIA vGPU, a GPU must be configured with MIG mode enabled, and GPU Instances and Compute Instances must be created and configured on the physical GPU.
Prerequisites
The NVIDIA Virtual GPU Manager is installed on the hypervisor host.
You have root user privileges on your hypervisor host machine.
You have determined which GPU instances correspond to the vGPU types of the MIG-backed vGPUs you will create.
Other processes, such as CUDA applications, monitoring applications, or the nvidia-smi command, do not use the GPU.
Steps
Follow the procedures in the subsections that follow: enable MIG mode on the GPU, then create the GPU instances (and, optionally, non-default compute instances) that correspond to the vGPU types you plan to use.
Note
For VMware vSphere, only enabling MIG mode is required because VMware vSphere creates the GPU Instances and Compute Instances.
After configuring a GPU for MIG-backed vGPUs, create the vGPUs you need and add them to their VMs.
Enabling MIG Mode for a GPU#
Perform this task in your hypervisor command shell.
Open a command shell as the root user on your hypervisor host machine. You can use a secure shell (SSH) on all supported hypervisors. Individual hypervisors may provide additional means for logging in. For details, refer to the documentation for your hypervisor.
Determine whether MIG mode is enabled. Use the nvidia-smi command for this purpose. By default, MIG mode is disabled. This example shows that MIG mode is disabled on GPU 0.
Note
In the output from nvidia-smi, the NVIDIA A100 HGX 40GB GPU is referred to as A100-SXM4-40GB.
$ nvidia-smi -i 0
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.16    Driver Version: 550.54.16    CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-SXM4-40GB      On   | 00000000:36:00.0 Off |                    0 |
| N/A   29C    P0    62W / 400W |      0MiB / 40537MiB |      6%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
If MIG mode is disabled, enable it.
$ nvidia-smi -i [gpu-ids] -mig 1
gpu-ids - A comma-separated list of GPU indexes, PCI bus IDs, or UUIDs that specifies the GPUs on which you want to enable MIG mode. If gpu-ids is omitted, MIG mode is enabled on all GPUs on the system.
This example enables MIG mode on GPU 0.
$ nvidia-smi -i 0 -mig 1
Enabled MIG Mode for GPU 00000000:36:00.0
All done.
Note
If another process is using the GPU, this command fails and displays a warning message that MIG mode for the GPU is in the pending enable state. In this situation, stop all GPU processes and retry the command.
VMware vSphere ESXi with GPUs based only on the NVIDIA Ampere architecture: Reboot the hypervisor host. If you are using a different hypervisor or GPUs based on the NVIDIA Hopper GPU architecture or a later architecture, omit this step.
Query the GPUs on which you enabled MIG mode to confirm that MIG mode is enabled. This example queries GPU 0 for the PCI bus ID and MIG mode in comma-separated values (CSV) format.
$ nvidia-smi -i 0 --query-gpu=pci.bus_id,mig.mode.current --format=csv
pci.bus_id, mig.mode.current
00000000:36:00.0, Enabled
Creating GPU Instances on a MIG-Enabled GPU#
Note
If you are using VMware vSphere, omit this task. VMware vSphere creates the GPU instances automatically.
Perform this task in your hypervisor command shell.
Open a command shell as the root user on your hypervisor host machine if necessary.
List the GPU instance profiles that are available on your GPU. When you create GPU instances, you must specify the profiles by their IDs, not their names.
$ nvidia-smi mig -lgip
+--------------------------------------------------------------------------+
| GPU instance profiles:                                                   |
| GPU   Name            ID   Instances   Memory   P2P   SM   DEC   ENC    |
|                            Free/Total   GiB           CE   JPEG  OFA    |
|==========================================================================|
|   0  MIG 1g.5gb       19     7/7        4.95    No    14     0     0    |
|                                                         1     0     0    |
+--------------------------------------------------------------------------+
|   0  MIG 2g.10gb      14     3/3        9.90    No    28     1     0    |
|                                                         2     0     0    |
+--------------------------------------------------------------------------+
|   0  MIG 3g.20gb       9     2/2       19.79    No    42     2     0    |
|                                                         3     0     0    |
+--------------------------------------------------------------------------+
|   0  MIG 4g.20gb       5     1/1       19.79    No    56     2     0    |
|                                                         4     0     0    |
+--------------------------------------------------------------------------+
|   0  MIG 7g.40gb       0     1/1       39.59    No    98     5     0    |
|                                                         7     1     1    |
+--------------------------------------------------------------------------+
Discover the GPU instance profiles that are mapped to the different vGPU types.
nvidia-smi vgpu -s -v
For example, for H100, some of the listing attributes of certain vGPU profiles look something like:
# nvidia-smi vgpu -s -v -i 1
GPU 00000000:1A:00.0
    vGPU Type ID                : 0x335
        Name                    : NVIDIA H100-1-10C
        Class                   : Compute
        GPU Instance Profile ID : 19
    ...
    vGPU Type ID                : 0x336
        Name                    : NVIDIA H100-2-20C
        Class                   : Compute
        GPU Instance Profile ID : 14
    ...
Create the GPU instances with a default compute instance corresponding to the vGPU types of the MIG-backed vGPUs you will create.
$ nvidia-smi mig -cgi gpu-instance-profile-ids -C
gpu-instance-profile-ids - A comma-separated list of GPU instance profile IDs that specifies the GPU instances you want to create.
This example creates two GPU instances of type 2g.10gb with profile ID 14.
$ nvidia-smi mig -cgi 14,14 -C
Successfully created GPU instance ID 5 on GPU 2 using profile MIG 2g.10gb (ID 14)
Successfully created GPU instance ID 3 on GPU 2 using profile MIG 2g.10gb (ID 14)
Note
If you are creating a GPU Instance to support a 1:1 MIG-backed vGPU on a platform other than VMware vSphere, you can optionally create non-default Compute Instances for that vGPU, by following the steps outlined in the Creating Non-Default Compute Instances in a GPU Instance for 1:1 vGPUs section.
Creating Non-Default Compute Instances in a GPU Instance for 1:1 vGPUs#
This task is required only if you plan to use a 1:1, MIG-backed vGPU on a GPU Instance and wish to create non-default Compute Instances for that vGPU. This option is only available on platforms other than VMware vSphere.
Perform this task in your hypervisor command shell.
Open a command shell as the root user on your hypervisor host machine if necessary.
List the available GPU instances.
$ nvidia-smi mig -lgi
+----------------------------------------------------+
| GPU instances:                                     |
| GPU   Name          Profile  Instance   Placement |
|                     ID       ID         Start:Size|
|====================================================|
|   2  MIG 2g.10gb    14        3          0:2      |
+----------------------------------------------------+
|   2  MIG 2g.10gb    14        5          4:2      |
+----------------------------------------------------+
Create the compute instances that you need within each GPU instance.
$ nvidia-smi mig -cci -gi gpu-instance-ids
gpu-instance-ids - A comma-separated list of GPU instance IDs that specifies the GPU instances within which you want to create the compute instances.
Caution
To avoid an inconsistent state between a guest VM and the hypervisor host, do not create compute instances from the hypervisor on a GPU instance on which an active guest VM is running. Runtime changes to the vGPU’s Compute Instance configuration may be done by the guest VM itself, as explained in Modifying a MIG-Backed vGPU’s Configuration.
This example creates a compute instance on each GPU instance 3 and 5.
$ nvidia-smi mig -cci -gi 3,5
Successfully created compute instance on GPU 0 GPU instance ID 1 using profile ID 2
Successfully created compute instance on GPU 0 GPU instance ID 2 using profile ID 2
Verify that the compute instances were created within each GPU instance.
$ nvidia-smi
+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|        Shared         |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  2    3   0   0  |      0MiB /  9984MiB | 28      0 |  2    0    1    0    0|
|                  |      0MiB / 16383MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  2    5   0   1  |      0MiB /  9984MiB | 28      0 |  2    0    1    0    0|
|                  |      0MiB / 16383MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
Note
Additional Compute Instances created in a VM at runtime are destroyed when the VM is shut down or rebooted. After the shutdown or reboot, only one Compute Instance remains in the VM.
On NVIDIA B200 HGX, the following Compute Instance combinations are blocked in vGPU for Compute Guest VMs running on full-sized (7-slice) GPU Instances:
4-slice
2-slice + 2-slice + 2-slice
3-slice + 2-slice + 2-slice
2-slice + 2-slice + 3-slice
Disabling MIG Mode for One or More GPUs#
If a GPU you want to use for time-sliced vGPUs or GPU passthrough has previously been configured for MIG-backed vGPUs, disable MIG mode on the GPU.
Prerequisites
The NVIDIA Virtual GPU Manager is installed on the hypervisor host.
You have root user privileges on your hypervisor host machine.
Other processes, such as CUDA applications, monitoring applications, or the nvidia-smi command, do not use the GPU.
Steps
Perform this task in your hypervisor command shell.
Open a command shell as the root user on your hypervisor host machine. You can use a secure shell (SSH) on all supported hypervisors. Individual hypervisors may provide additional means for logging in. For details, refer to the documentation for your hypervisor.
Determine whether MIG mode is disabled. Use the nvidia-smi command for this purpose. By default, MIG mode is disabled but might have previously been enabled. This example shows that MIG mode is enabled on GPU 0.
Note
In the output from nvidia-smi, the NVIDIA A100 HGX 40GB GPU is referred to as A100-SXM4-40GB.
$ nvidia-smi -i 0
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.16    Driver Version: 550.54.16    CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-SXM4-40GB      Off  | 00000000:36:00.0 Off |                    0 |
| N/A   29C    P0    62W / 400W |      0MiB / 40537MiB |      6%      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+
If MIG mode is enabled, disable it.
$ nvidia-smi -i [gpu-ids] -mig 0
gpu-ids - A comma-separated list of GPU indexes, PCI bus IDs, or UUIDs that specifies the GPUs for which you want to disable MIG mode. If gpu-ids is omitted, MIG mode is disabled for all GPUs in the system.
This example disables MIG mode on GPU 0.
$ sudo nvidia-smi -i 0 -mig 0
Disabled MIG Mode for GPU 00000000:36:00.0
All done.
Confirm that MIG mode was disabled. Use the nvidia-smi command for this purpose. This example shows that MIG mode is disabled on GPU 0.
$ nvidia-smi -i 0
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.16    Driver Version: 550.54.16    CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-SXM4-40GB      Off  | 00000000:36:00.0 Off |                    0 |
| N/A   29C    P0    62W / 400W |      0MiB / 40537MiB |      6%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
Modifying a MIG-Backed vGPU’s Configuration From a Guest VM#
If you want to replace the compute instances created when the GPU was configured for MIG-backed vGPUs, you can delete them before adding the compute instances from within the guest VM.
Note
From within a guest VM, you can modify the configuration only of MIG-backed vGPUs that occupy an entire GPU instance. For time-sliced, MIG-backed vGPUs, you must create compute instances as explained in Creating Non-Default Compute Instances in a GPU Instance for 1:1 vGPUs.
On NVIDIA B200 HGX, the following Compute Instance combinations are blocked in vGPU for Compute Guest VMs running on full-sized (7-slice) GPU Instances:
4-slice
2-slice + 2-slice + 2-slice
3-slice + 2-slice + 2-slice
2-slice + 2-slice + 3-slice
A MIG-backed vGPU that occupies an entire GPU instance is assigned all of the instance’s framebuffer. For such vGPUs, the maximum vGPUs per GPU instance in the tables in Virtual GPU Types for Supported GPUs is always 1.
Prerequisites
You have root user privileges in the guest VM.
Other processes, such as CUDA applications, monitoring applications, or the nvidia-smi command, do not use the GPU instance.
Steps
Perform this task in a guest VM command shell.
Open a command shell as the root user in the guest VM. You can use a secure shell (SSH) on all supported hypervisors. Individual hypervisors may provide additional means for logging in. For details, refer to the documentation for your hypervisor.
List the available GPU instances.
$ nvidia-smi mig -lgi
+----------------------------------------------------+
| GPU instances:                                     |
| GPU   Name          Profile  Instance   Placement |
|                     ID       ID         Start:Size|
|====================================================|
|   0  MIG 2g.10gb     0        0          0:8      |
+----------------------------------------------------+
Optional: If compute instances were created when the GPU was configured for MIG-backed vGPUs that you no longer require, delete them.
$ nvidia-smi mig -dci -ci compute-instance-id -gi gpu-instance-id
compute-instance-id - The ID of the compute instance that you want to delete.
gpu-instance-id - The ID of the GPU instance from which you want to delete the compute instance.
Note
This command fails if another process is using the GPU instance. In this situation, stop all processes using the GPU instance and retry the command.
This example deletes compute instance 0 from GPU instance 0 on GPU 0.
$ nvidia-smi mig -dci -ci 0 -gi 0
Successfully destroyed compute instance ID 0 from GPU 0 GPU instance ID 0
List the compute instance profiles that are available for your GPU instance.
$ nvidia-smi mig -lcip
This example shows that one MIG 2g.10gb compute instance or two MIG 1c.2g.10gb compute instances can be created within the GPU instance.
$ nvidia-smi mig -lcip
+-------------------------------------------------------------------------------+
| Compute instance profiles:                                                    |
| GPU     GPU       Name             Profile  Instances   Exclusive     Shared |
|       Instance                     ID       Free/Total     SM      DEC ENC OFA|
|         ID                                                          CE JPEG  |
|===============================================================================|
|   0      0       MIG 1c.2g.10gb       0      2/2           14       1   0   0|
|                                                                      2   0    |
+-------------------------------------------------------------------------------+
|   0      0       MIG 2g.10gb          1*     1/1           28       1   0   0|
|                                                                      2   0    |
+-------------------------------------------------------------------------------+
Create the compute instances that you need within the available GPU instance. Run the following command to create each compute instance individually.
$ nvidia-smi mig -cci compute-instance-profile-id -gi gpu-instance-id
compute-instance-profile-id - The compute instance profile ID that specifies the compute instance.
gpu-instance-id - The GPU instance ID that specifies the GPU instance within which you want to create the compute instance.
Note
This command fails if another process is using the GPU instance. In this situation, stop all GPU processes and retry the command.
This example creates a MIG 2g.10gb compute instance on GPU instance 0.
$ nvidia-smi mig -cci 1 -gi 0
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 0 using profile MIG 2g.10gb (ID 1)
This example creates two MIG 1c.2g.10gb compute instances on GPU instance 0 by running the same command twice.
$ nvidia-smi mig -cci 0 -gi 0
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 0 using profile MIG 1c.2g.10gb (ID 0)
$ nvidia-smi mig -cci 0 -gi 0
Successfully created compute instance ID 1 on GPU 0 GPU instance ID 0 using profile MIG 1c.2g.10gb (ID 0)
Verify that the compute instances were created within the GPU instance. Use the nvidia-smi command for this purpose. This example confirms that a MIG 2g.10gb compute instance was created on GPU instance 0.
nvidia-smi
Mon Mar 25 19:01:24 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.16    Driver Version: 550.54.16    CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID A100X-2-10C    On   | 00000000:00:08.0 Off |                   On |
| N/A   N/A    P0    N/A /  N/A |   1058MiB / 10235MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|        Shared         |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    0   0   0  |   1058MiB / 10235MiB | 28      0 |  2    0    1    0    0|
|                  |      0MiB /  4096MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
This example confirms that two MIG 1c.2g.10gb compute instances were created on GPU instance 0.
$ nvidia-smi
Mon Mar 25 19:01:24 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.16    Driver Version: 550.54.16    CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID A100X-2-10C    On   | 00000000:00:08.0 Off |                   On |
| N/A   N/A    P0    N/A /  N/A |   1058MiB / 10235MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|        Shared         |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    0   0   0  |   1058MiB / 10235MiB | 14      0 |  2    0    1    0    0|
|                  |      0MiB /  4096MiB |           |                       |
+------------------+                      +-----------+-----------------------+
|  0    0   1   1  |                      | 14      0 |  2    0    1    0    0|
|                  |                      |           |                       |
+------------------+----------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Monitoring MIG-backed vGPU Activity#
Note
MIG-backed vGPU activity cannot be monitored on GPUs based on the NVIDIA Ampere GPU architecture because the required hardware feature is absent.
On the NVIDIA RTX Pro 6000 Blackwell Server Edition, GPM metrics are supported only for 1:1 MIG-backed vGPUs and are not available for time-sliced, MIG-backed vGPUs.
The --gpm-metrics option is supported only on MIG-backed vGPUs that are allocated all of the GPU instance’s frame buffer.
For more information, refer to the Monitoring MIG-backed vGPU Activity documentation.
Device Groups#
Device Groups provide an abstraction layer for multi-device virtual hardware provisioning. They enable platforms to automatically detect sets of physically connected devices (such as GPUs linked via NVLink or GPU-NIC pairs) at the hardware level and present them as a single logical unit to VMs. This abstraction is particularly valuable for AI workloads that depend on low-latency, high-bandwidth communication, such as distributed model training, inference, and large-scale data processing, because it ensures maximum utilization of the underlying hardware topology.
Device groups can consist of two or more hardware devices that share a common PCIe switch or a direct interconnect. This simplifies virtual hardware assignment and enables:
Optimized Multi-GPU and GPU-NIC communication: NVLink-connected GPUs can be provisioned together to maximize peer-to-peer bandwidth and minimize latency, which is ideal for large-batch training and NCCL all-reduce-heavy workloads. Similarly, GPU-NIC pairs located under the same PCIe switch or capable of delivering optimal GPUDirect RDMA performance are grouped together, enabling high-throughput data ingestion directly into GPU memory for training or inference workloads. Adjacent NICs that do not meet the required performance thresholds are automatically excluded to avoid bottlenecks.
Topology consistency: Unlike manual device assignment, Device Groups guarantee correct placement across PCIe switches and interconnects, even after reboots or events like live migration.
Simplified and reliable provisioning: By abstracting the PCIe/NVLink topology into logical units, device groups eliminate the need for scripting or topology mapping, reducing the risk of misconfiguration and enabling faster deployment of AI clusters.

This figure illustrates how devices (GPUs and NICs) that share a common PCIe switch or a direct GPU interconnect can be presented as a device group. On the right side, we can see that although two NICs are connected to the same PCIe switch as the GPU, only one NIC is included in the device group. This is because the NVIDIA driver identifies and exposes only the GPU-NIC pairings that meet the necessary criteria like GPUDirect RDMA. Adjacent NICs that do not satisfy these requirements are excluded.
For more information regarding Hypervisor Platform support for Device Groups, refer to the vGPU Device Groups documentation.
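Although device groups themselves are surfaced by the hypervisor, the underlying GPU/NIC topology that drives them can be inspected with nvidia-smi. The following sketch shows the interconnect matrix; output is abridged and platform-dependent.

```bash
# Show the GPU/NIC interconnect topology (NVLink, PCIe switch, NUMA affinity).
nvidia-smi topo -m
```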
GPUDirect RDMA and GPUDirect Storage#
NVIDIA GPUDirect Remote Direct Memory Access (RDMA) is a technology in NVIDIA GPUs that enables direct data exchange between GPUs and a third-party peer device using PCIe. GPUDirect RDMA enables network devices to access the vGPU frame buffer directly, bypassing CPU host memory altogether. The third-party devices could be network interfaces such as NVIDIA ConnectX SmartNICs or BlueField DPUs, or video acquisition adapters.
GPUDirect Storage (GDS) enables a direct data path between local or remote storage, such as NFS servers or NVMe/NVMe over Fabric (NVMe-oF), and GPU memory. GDS performs direct memory access (DMA) transfers between GPU memory and storage. DMA avoids a bounce buffer through the CPU. This direct path increases system bandwidth and decreases the latency and utilization load on the CPU.
GPUDirect technology is supported only on a subset of vGPUs and guest OS releases.
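Inside a guest that meets these requirements, GPUDirect Storage readiness can be checked with the gdscheck utility shipped with GDS. The install path below is typical but may vary with your CUDA version.

```bash
# Report GDS platform support and the drivers/filesystems it detects.
/usr/local/cuda/gds/tools/gdscheck -p
```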
GPUDirect RDMA and GPUDirect Storage Known Issues and Limitations#
Starting with GPUDirect Storage technology release 1.7.2, the following limitations apply:
GPUDirect Storage technology is not supported on GPUs based on the NVIDIA Ampere GPU architecture.
On GPUs based on the NVIDIA Ada Lovelace, Hopper, and Blackwell GPU architectures, GPUDirect Storage technology is supported only with the guest driver for Linux based on NVIDIA Linux open GPU kernel modules.
GPUDirect Storage technology releases before 1.7.2 are supported only with guest drivers with Linux kernel versions earlier than 6.6.
GPUDirect Storage technology is supported only on the following guest OS releases:
Red Hat Enterprise Linux 8.8+
Ubuntu 22.04 LTS
Ubuntu 24.04 LTS
Hypervisor Platform Support for GPUDirect RDMA and GPUDirect Storage#
| Hypervisor Platform | Version |
|---|---|
| Red Hat Enterprise Linux with KVM | 8.8+ |
| Ubuntu | |
| VMware vSphere | |
vGPU Support for GPUDirect RDMA and GPUDirect Storage#
GPUDirect RDMA and GPUDirect Storage technology are supported on all time-sliced and MIG-backed NVIDIA vGPU for Compute on physical GPUs that support single root I/O virtualization (SR-IOV).
For a list of supported GPUs, refer to the Supported NVIDIA GPUs and Networking section in the NVIDIA AI Enterprise Infra Support Matrix.
Guest OS Releases Support for GPUDirect RDMA and GPUDirect Storage#
Linux only. GPUDirect technology is not supported on Windows.
Network Interface Cards Support for GPUDirect RDMA and GPUDirect Storage#
GPUDirect technology is supported on the following network interface cards:
NVIDIA ConnectX-8 SmartNIC
NVIDIA ConnectX-7 SmartNIC
Mellanox ConnectX-6 SmartNIC
Mellanox ConnectX-5 Ethernet adapter card
Heterogeneous vGPU#
Heterogeneous vGPU allows a single physical GPU to simultaneously support multiple vGPU profiles with different memory allocations (framebuffer sizes). This configuration is particularly beneficial for environments where VMs have diverse GPU resource requirements. By enabling the same physical GPU to host vGPUs of varying sizes, heterogeneous vGPU optimizes overall resource usage, ensuring VMs access only the necessary GPU resources and preventing underutilization.
When a GPU is configured for heterogeneous vGPU, its behavior during events like a host reboot, NVIDIA Virtual GPU Manager reload, or GPU reset varies by hypervisor. This configuration only supports the Best Effort and Equal Share schedulers.
Heterogeneous vGPU is supported on Volta and later GPUs. For additional information and operational instructions across different hypervisors, refer to the Heterogeneous vGPU documentation.
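The current setting is reported by nvidia-smi in the vGPU Heterogeneous Mode field, as seen in the query output shown earlier. For example, on the hypervisor host:

```bash
nvidia-smi -q | grep -i "vGPU Heterogeneous Mode"
```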
Platform Support for Heterogeneous vGPUs#
| Hypervisor Platform | NVIDIA AI Enterprise Infra Release | Documentation |
|---|---|---|
| Red Hat Enterprise Linux with KVM | | |
| Canonical Ubuntu with KVM | | |
| VMware vSphere | | |
Live Migration#
Live migration enables the seamless transfer of VMs configured with NVIDIA vGPUs from one physical host to another without downtime. This capability enables enterprises to maintain continuous operations during infrastructure changes, balancing workloads, or reallocating resources with minimal disruption. Live migration offers significant operational benefits, including enhanced business continuity, scalability, and agility.
For additional information about this feature and instructions on how to perform the operation across different hypervisors, refer to the vGPU Live Migration documentation.
Live Migration Known Issues and Limitations#
| Hypervisor Platform | Documentation |
|---|---|
| Red Hat Enterprise Linux with KVM | Known Issues and Limitations with NVIDIA vGPU for Compute Migration on RHEL KVM |
| Ubuntu with KVM | Known Issues and Limitations with NVIDIA vGPU for Compute Migration on Ubuntu KVM |
| VMware vSphere | Known Issues and Limitations with NVIDIA vGPU for Compute Migration on VMware vSphere |
Platform Support for Live Migration#
| Hypervisor Platform | Version | NVIDIA AI Enterprise Infra Release | Documentation |
|---|---|---|---|
| Red Hat Enterprise Linux with KVM | | | Migrating a VM Configured with NVIDIA vGPU for Compute on RHEL KVM |
| Ubuntu with KVM | 24.04 LTS | | Migrating a VM Configured with NVIDIA vGPU for Compute on Linux KVM |
| VMware vSphere | | All active NVIDIA AI Enterprise Infra Releases | Migrating a VM Configured with NVIDIA vGPU for Compute on VMware vSphere |
Note
Live Migration is not supported between RHEL 10 and RHEL 9.4.
vGPU Support for Live Migration#
For a list of supported GPUs, refer to the Supported NVIDIA GPUs and Networking section in the NVIDIA AI Enterprise Infra Support Matrix.
Note
Live Migration is not supported between 80GB PCIe and 94GB NVL variants of GPU Boards
Live Migration is not supported between H200 / H800 / H100 GPU Boards
Multi-vGPU and P2P#
Multi-vGPU technology allows a single VM to simultaneously leverage multiple vGPUs, significantly enhancing its computational capabilities. Unlike standard vGPU configurations that virtualize a single physical GPU for sharing across multiple VMs, Multi-vGPU presents resources from several vGPU devices to a single VM. These vGPU devices are not required to reside on the same physical GPU; they can be distributed across separate physical GPUs, pooling their collective power to meet the demands of high-performance workloads.
This technology is particularly advantageous for AI training and inference workloads that require extensive computational power. It optimizes resource allocation by enabling applications within a VM to access dedicated GPU resources. For instance, a VM configured with two NVIDIA A100 GPUs using Multi-vGPU can run large-scale AI models more efficiently than with a single GPU. This dedicated assignment eliminates resource contention between different AI processes within the same VM, ensuring optimal and predictable performance for critical tasks. The ability to aggregate computational power from multiple vGPUs makes Multi-vGPU an ideal solution for scaling complex AI model development and deployment.
Peer-To-Peer (P2P) CUDA Transfers#
Peer-to-Peer (P2P) CUDA transfers enable device memory between vGPUs on different GPUs that are assigned to the same VM to be accessed from within the CUDA kernels. NVLink is a high-bandwidth interconnect that enables fast communication between such vGPUs.
P2P CUDA transfers over NVLink are supported only on a subset of vGPUs, hypervisor releases, and guest OS releases.
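Within the guest VM, the NVLink peer-to-peer capability between the assigned vGPUs can be checked with nvidia-smi before running P2P-dependent CUDA code. This is a sketch; the -p2p query may not be available on every driver branch.

```bash
# Show the interconnect matrix and the peer-to-peer read capability between GPUs.
nvidia-smi topo -m
nvidia-smi topo -p2p r
```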
Peer-to-Peer CUDA Transfers Known Issues and Limitations#
Only time-sliced vGPUs are supported. MIG-backed vGPUs are not supported.
P2P transfers over PCIe are not supported.
Hypervisor Platform Support for Multi-vGPU and P2P#
| Hypervisor Platform | NVIDIA AI Enterprise Infra Release | Supported vGPU Types | Documentation |
|---|---|---|---|
| Red Hat Enterprise Linux with KVM | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
| Ubuntu with KVM | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
| VMware vSphere | All active NVIDIA AI Enterprise Infra Releases | All NVIDIA vGPU for Compute; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported. | |
Note
P2P CUDA transfers are not supported on Windows. Only Linux OS distros as outlined in NVIDIA AI Enterprise Infrastructure Support Matrix are supported.
vGPU Support for Multi-vGPU#
You can assign multiple vGPUs with differing amounts of frame buffer to a single VM, provided the board type and the series of all the vGPUs are the same. For example, you can assign an A40-48C vGPU and an A40-16C vGPU to the same VM. However, you cannot assign an A30-8C vGPU and an A16-8C vGPU to the same VM.
Board |
vGPU [1] |
---|---|
NVIDIA HGX B200 180GB |
Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu: - All NVIDIA vGPU for Compute |
NVIDIA RTX PRO 6000 Blackwell SE 96GB |
|
Board |
vGPU [1] |
---|---|
NVIDIA H800 PCIe 94GB (H800 NVL) |
All NVIDIA vGPU for Compute |
NVIDIA H800 PCIe 80GB |
All NVIDIA vGPU for Compute |
NVIDIA H800 SXM5 80GB |
NVIDIA vGPU for Compute [5] |
NVIDIA H200 PCIe 141GB (H200 NVL) |
All NVIDIA vGPU for Compute |
NVIDIA H200 SXM5 141GB |
NVIDIA vGPU for Compute [5] |
NVIDIA H100 PCIe 94GB (H100 NVL) |
All NVIDIA vGPU for Compute |
NVIDIA H100 SXM5 94GB |
NVIDIA vGPU for Compute [5] |
NVIDIA H100 PCIe 80GB |
All NVIDIA vGPU for Compute |
NVIDIA H100 SXM5 80GB |
NVIDIA vGPU for Compute [5] |
NVIDIA H100 SXM5 64GB |
NVIDIA vGPU for Compute [5] |
NVIDIA H20 SXM5 141GB |
NVIDIA vGPU for Compute [5] |
NVIDIA H20 SXM5 96GB |
NVIDIA vGPU for Compute [5] |
Board |
vGPU |
---|---|
NVIDIA L40 |
|
NVIDIA L40S |
|
NVIDIA L20 |
|
NVIDIA L4 |
|
NVIDIA L2 |
|
NVIDIA RTX 6000 Ada |
|
NVIDIA RTX 5880 Ada |
|
NVIDIA RTX 5000 Ada |
|
Board |
vGPU [1] |
---|---|
|
|
NVIDIA A800 PCIe 40GB active-cooled |
|
NVIDIA A800 HGX 80GB |
|
|
|
NVIDIA A100 HGX 80GB |
|
NVIDIA A100 PCIe 40GB |
|
NVIDIA A100 HGX 40GB |
|
NVIDIA A40 |
|
|
|
NVIDIA A16 |
|
NVIDIA A10 |
|
NVIDIA RTX A6000 |
|
NVIDIA RTX A5500 |
|
NVIDIA RTX A5000 |
|
Board |
vGPU |
---|---|
Tesla T4 |
|
Quadro RTX 6000 passive |
|
Quadro RTX 8000 passive |
|
Board |
vGPU |
---|---|
Tesla V100 SXM2 |
|
Tesla V100 SXM2 32GB |
|
Tesla V100 PCIe |
|
Tesla V100 PCIe 32GB |
|
Tesla V100S PCIe 32GB |
|
Tesla V100 FHHL |
|
vGPU Support for P2P#
Only NVIDIA vGPU for Compute time-sliced vGPUs that are allocated all of the framebuffer of a physical GPU that supports NVLink are supported.
Board |
vGPU |
---|---|
NVIDIA HGX B200 180GB |
NVIDIA B200X-180C |
Board |
vGPU |
---|---|
NVIDIA H800 PCIe 94GB (H800 NVL) |
H800L-94C |
NVIDIA H800 PCIe 80GB |
H800-80C |
NVIDIA H200 PCIe 141GB (H200 NVL) |
H200-141C |
NVIDIA H200 SXM5 141GB |
H200X-141C |
NVIDIA H100 PCIe 94GB (H100 NVL) |
H100L-94C |
NVIDIA H100 SXM5 94GB |
H100XL-94C |
NVIDIA H100 PCIe 80GB |
H100-80C |
NVIDIA H100 SXM5 80GB |
H100XM-80C |
NVIDIA H100 SXM5 64GB |
H100XS-64C |
NVIDIA H20 SXM5 141GB |
H20X-141C |
NVIDIA H20 SXM5 96GB |
H20-96C |
Board |
vGPU |
---|---|
NVIDIA A800 PCIe 80GB |
A800D-80C |
NVIDIA A800 PCIe 40GB active-cooled |
A800-40C |
NVIDIA A800 HGX 80GB |
A800DX-80C [2] |
NVIDIA A100 PCIe 80GB |
A100D-80C |
NVIDIA A100 HGX 80GB |
A100DX-80C [2] |
NVIDIA A100 PCIe 40GB |
A100-40C |
NVIDIA A100 HGX 40GB |
A100X-40C [2] |
NVIDIA A40 |
A40-48C |
NVIDIA A30 |
A30-24C |
NVIDIA A16 |
A16-16C |
NVIDIA A10 |
A10-24C |
NVIDIA RTX A6000 |
A6000-48C |
NVIDIA RTX A5500 |
A5500-24C |
NVIDIA RTX A5000 |
A5000-24C |
Board |
vGPU |
---|---|
Quadro RTX 8000 passive |
RTX8000P-48C |
Quadro RTX 6000 passive |
RTX6000P-24C |
Board |
vGPU |
---|---|
Tesla V100 SXM2 |
V100X-16C |
Tesla V100 SXM2 32GB |
V100DX-32C |
NVIDIA NVSwitch#
NVIDIA NVSwitch provides a high-bandwidth, low-latency interconnect fabric that enables seamless, direct communication between multiple GPUs within a system. NVIDIA NVSwitch enables peer-to-peer vGPU communication within a single node over the NVLink fabric. The NVSwitch acts as a high-speed crossbar, allowing any GPU to communicate with any other GPU at full NVLink speed, significantly improving communication efficiency and bandwidth compared to traditional PCIe-based interconnections. It facilitates the creation of large GPU clusters, enabling AI and deep learning applications to efficiently utilize pooled GPU memory and compute resources for complex, computationally intensive tasks. It is supported only on a subset of hardware platforms, vGPUs, hypervisor software releases, and guest OS releases.
For information about using the NVSwitch, refer to the NVIDIA Fabric Manager documentation.
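For illustration, the sketch below uses the NVML C API (shipped with the driver; not an NVSwitch-specific interface) to report the NVLink link state of the first device visible in a guest. Whether these queries are exposed on a given vGPU type depends on the driver release, so treat this as an optional diagnostic rather than a guaranteed interface.

```cpp
// Sketch: query NVLink link state with NVML from inside a guest VM.
// Build with a C++ compiler and link against -lnvidia-ml.
#include <cstdio>
#include <nvml.h>

int main() {
    if (nvmlInit_v2() != NVML_SUCCESS) {
        std::printf("NVML initialization failed.\n");
        return 1;
    }
    nvmlDevice_t device;
    if (nvmlDeviceGetHandleByIndex_v2(0, &device) == NVML_SUCCESS) {
        for (unsigned int link = 0; link < NVML_NVLINK_MAX_LINKS; ++link) {
            nvmlEnableState_t isActive;
            // Links that are not present or not exposed return an error and are skipped.
            if (nvmlDeviceGetNvLinkState(device, link, &isActive) == NVML_SUCCESS) {
                std::printf("NVLink link %u: %s\n", link,
                            isActive == NVML_FEATURE_ENABLED ? "active" : "inactive");
            }
        }
    }
    nvmlShutdown();
    return 0;
}
```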
Platform Support for NVIDIA NVSwitch#
NVIDIA HGX B200 8-GPU baseboard
NVIDIA HGX H200 8-GPU baseboard
NVIDIA HGX H100 8-GPU baseboard
NVIDIA HGX H800 8-GPU baseboard
NVIDIA HGX A100 8-GPU baseboard
NVIDIA NVSwitch Limitations#
Only time-sliced vGPUs are supported. MIG-backed vGPUs are not supported.
GPU passthrough is not supported on NVIDIA Systems that include NVSwitch when using VMware vSphere.
All vGPUs communicating peer-to-peer must be assigned to the same VM.
On GPUs based on the NVIDIA Hopper and Blackwell GPU architectures, multicast is supported when unified memory (UVM) is enabled.
VMware vSphere is not supported on NVIDIA HGX B200.
Hypervisor Platform Support for NVSwitch#
Consult the documentation from your hypervisor vendor for information about which generic Linux with KVM hypervisor software releases support NVIDIA NVSwitch.
All supported Red Hat Enterprise Linux KVM and Ubuntu KVM releases support NVIDIA NVSwitch.
The earliest VMware vSphere Hypervisor (ESXi) release that supports NVIDIA NVSwitch depends on the GPU architecture.
GPU Architecture |
Earliest Supported VMware vSphere Hypervisor (ESXi) Release |
---|---|
NVIDIA Blackwell |
Not supported on VMware |
NVIDIA Hopper |
VMware vSphere Hypervisor (ESXi) 8 update 2 |
NVIDIA Ampere |
VMware vSphere Hypervisor (ESXi) 8 update 1 |
vGPU Support for NVSwitch#
Only NVIDIA vGPU for Compute time-sliced vGPUs that are allocated all of the physical GPU’s framebuffer on the following GPU boards are supported:
NVIDIA A800
NVIDIA A100 HGX
NVIDIA B200 HGX
NVIDIA H800
NVIDIA H200 HGX
NVIDIA H100 SXM5
NVIDIA H20
Board |
vGPU |
---|---|
NVIDIA A800 HGX 80GB |
A800DX-80C |
NVIDIA A100 HGX 80GB |
A100DX-80C |
NVIDIA A100 HGX 40GB |
A100X-40C |
Board |
vGPU |
---|---|
NVIDIA B200 HGX 180GB |
B200X-180C |
Board |
vGPU |
---|---|
NVIDIA H800 SXM5 80GB |
H800XM-80C |
NVIDIA H200 SXM5 141GB |
H200X-141C |
NVIDIA H100 SXM5 80GB |
H100XM-80C |
NVIDIA H20 SXM5 141GB |
H20X-141C |
NVIDIA H20 SXM5 96GB |
H20-96C |
Guest OS Releases Support for NVSwitch#
Linux only. NVIDIA NVSwitch is not supported on Windows.
NVLink Multicast#
NVLink multicast support requires that unified memory is enabled. For more information about enabling unified memory, refer to the Enabling Unified Memory for a vGPU documentation.
vGPU Support for NVLink Multicast#
Only full-sized, time-sliced NVIDIA vGPU for Compute vGPUs (vGPUs that are allocated the entire framebuffer of the physical GPU) support NVLink multicast.
Scheduling Policies#
NVIDIA vGPU for Compute offers a range of scheduling policies that allow administrators to customize resource allocation based on workload intensity and organizational priorities, ensuring optimal resource utilization and alignment with business needs. These policies determine how GPU resources are shared across multiple VMs and directly impact factors like latency, throughput, and performance stability in multi-tenant environments.
For workloads with varying demands, time slicing plays a critical role in determining scheduling efficiency. The vGPU scheduler time slice represents the duration a VM’s work is allowed to run on the GPU before it is preempted. A longer time slice maximizes throughput for compute-heavy workloads, such as CUDA applications, by minimizing context switching. In contrast, a shorter time slice reduces latency, making it ideal for latency-sensitive tasks like graphics applications.
NVIDIA provides three scheduling modes: Best Effort, Equal Share, and Fixed Share, each designed for different workload requirements and environments. For more information, refer to the vGPU Schedulers documentation.
Refer to the Changing Scheduling Behavior for Time-Sliced vGPUs documentation for how to configure and adjust scheduling policies to meet specific resource distribution needs.
Suspend-Resume#
The suspend-resume feature allows NVIDIA vGPU-configured VMs to be temporarily paused and later resumed without losing their operational state. During suspension, the entire VM state, including GPU and compute resources, is saved to disk, thereby freeing these resources on the host. Upon resumption, the state is fully restored, enabling seamless workload continuation.
This capability provides operational flexibility and optimizes resource utilization. It is valuable for planned host maintenance, freeing up resources by pausing non-critical workloads, and ensuring consistent environments for development and testing.
Unlike live migration, suspend-resume involves downtime during both suspension and resumption. Cross-host operations require strict compatibility across hosts, encompassing GPU type, Virtual GPU manager version, memory configuration, and NVLink topology.
Suspend-resume is supported on all GPUs that enable vGPU functionality; however, compatibility varies by hypervisor, NVIDIA vGPU software release, and guest operating system.
For additional information and operational instructions across different hypervisors, refer to the vGPU Suspend-Resume documentation.
Suspend-Resume Known Issues and Limitations#
Hypervisor Platform |
Documentation |
---|---|
VMware vSphere |
Known Issues and Limitations with Suspend Resume on VMware vSphere |
Note
While a suspended VM can generally be resumed on any host running a compatible Virtual GPU Manager, a current bug in Red Hat Enterprise Linux 9.4 and Ubuntu 24.04 LTS limits suspend, resume, and migration to hosts with an identical Virtual GPU Manager version. The issue has been resolved in Red Hat Enterprise Linux 9.6 and later.
Platform Support for Suspend-Resume#
Suspend-resume is supported on all GPUs that support NVIDIA vGPU for Compute, but compatibility varies by hypervisor, release version, and guest operating system.
Hypervisor Platform |
Version |
NVIDIA AI Enterprise Infra Release |
Documentation |
---|---|---|---|
Red Hat Enterprise Linux with KVM |
|
|
Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on RHEL KVM |
Ubuntu with KVM |
24.04 LTS |
|
Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on Ubuntu KVM |
VMware vSphere |
|
All active NVIDIA AI Enterprise Infra Releases |
Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on VMware vSphere |
vGPU Support for Suspend-Resume#
For a list of supported GPUs, refer to the Supported NVIDIA GPUs and Networking section in the NVIDIA AI Enterprise Infra Support Matrix.
Unified Virtual Memory (UVM)#
Unified Virtual Memory (UVM) provides a single, cohesive memory address space accessible by both the CPUs and GPUs within a system. This feature creates a managed memory pool, allowing data to be allocated and accessed by code executing on either processor. The primary benefit is the simplification of programming and enhanced performance for GPU-accelerated workloads, as it eliminates the need for applications to explicitly manage data transfers between CPU and GPU memory. For additional information about this feature, refer to the Unified Virtual Memory documentation.
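Once unified memory has been enabled for the vGPU (see the hypervisor-specific instructions later in this section), guest applications use managed allocations exactly as they would on a bare-metal GPU. A minimal sketch using only the standard CUDA runtime API:

```cpp
// Sketch: a managed (unified memory) allocation touched by both CPU and GPU
// inside a guest VM whose vGPU has unified memory enabled. If UVM is not
// enabled for the vGPU, cudaMallocManaged is expected to fail.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float* data = nullptr;
    cudaError_t status = cudaMallocManaged(&data, n * sizeof(float));
    if (status != cudaSuccess) {
        std::printf("cudaMallocManaged failed: %s (is UVM enabled for this vGPU?)\n",
                    cudaGetErrorString(status));
        return 1;
    }
    for (int i = 0; i < n; ++i) data[i] = 1.0f;      // CPU writes the managed buffer
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);  // GPU updates the same pointer
    cudaDeviceSynchronize();
    std::printf("data[0] = %f\n", data[0]);          // CPU reads the result back
    cudaFree(data);
    return 0;
}
```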

UVM Known Issues and Limitations#
Unified Virtual Memory (UVM) is restricted to 1:1 time-sliced and MIG vGPU for Compute profiles that allocate the entire framebuffer of a compatible physical GPU or GPU Instance. Fractional time-sliced vGPUs do not support UVM.
UVM is only supported on Linux Guest OS distros. Windows Guest OS is not supported.
Enabling UVM disables vGPU migration for the VM, which may reduce operational flexibility in environments reliant on live migration.
UVM is disabled by default and must be explicitly enabled for each vGPU that requires it by setting a specific vGPU plugin parameter for the VM.
When deploying NVIDIA NIM, if UVM is enabled and an optimized engine is available, the model will run on the TensorRT-LLM (TRT-LLM) backend. Otherwise, it will typically run on the vLLM backend.
Hypervisor Platform Support for UVM#
Unified Virtual Memory (UVM) is disabled by default. To use it, you must enable unified memory individually for each vGPU for Compute VM that requires it by setting a vGPU plugin parameter. How you enable UVM for a vGPU VM depends on the hypervisor that you are using.
Hypervisor Platform |
Documentation |
---|---|
Red Hat Enterprise Linux with KVM |
Enabling Unified Memory for NVIDIA vGPU for Compute VM on Red Hat Enterprise Linux KVM |
Ubuntu with KVM |
Enabling Unified Memory for NVIDIA vGPU for Compute VM on Ubuntu KVM |
VMware vSphere |
Enabling Unified Memory for NVIDIA vGPU for Compute VM on VMware vSphere |
vGPU Support for UVM#
UVM is supported on 1:1 MIG-backed and time-sliced vGPUs, that is, vGPUs that are allocated the entire framebuffer of a MIG GPU Instance or physical GPU.
Board |
vGPU |
---|---|
NVIDIA HGX B200 SXM |
|
NVIDIA RTX PRO 6000 Blackwell SE |
|
Board |
vGPU |
---|---|
NVIDIA H800 PCIe 94GB (H800 NVL) |
|
NVIDIA H800 PCIe 80GB |
|
NVIDIA H800 SXM5 80GB |
|
NVIDIA H200 SXM5 |
|
NVIDIA H200 NVL |
|
NVIDIA H100 PCIe 94GB (H100 NVL) |
|
NVIDIA H100 SXM5 94GB |
|
NVIDIA H100 PCIe 80GB |
|
NVIDIA H100 SXM5 80GB |
|
NVIDIA H100 SXM5 64GB |
|
NVIDIA H20 SXM5 141GB |
|
NVIDIA H20 SXM5 96GB |
|
Board |
vGPU |
---|---|
NVIDIA L40 |
L40-48C |
NVIDIA L40S |
L40S-48C |
NVIDIA L20 |
L20-48C |
NVIDIA L4 |
L4-24C |
NVIDIA L2 |
L2-24C |
NVIDIA RTX 6000 Ada |
RTX 6000 Ada-48C |
NVIDIA RTX 5880 Ada |
RTX 5880 Ada-48C |
NVIDIA RTX 5000 Ada |
RTX 5000 Ada-32C |
Board |
vGPU |
---|---|
|
|
NVIDIA A800 PCIe 40GB active-cooled |
|
NVIDIA A800 HGX 80GB |
|
|
|
NVIDIA A100 HGX 80GB |
|
NVIDIA A100 PCIe 40GB |
|
NVIDIA A100 HGX 40GB |
|
NVIDIA A40 |
A40-48C |
|
|
NVIDIA A16 |
A16-16C |
NVIDIA A10 |
A10-24C |
NVIDIA RTX A6000 |
A6000-48C |
NVIDIA RTX A5500 |
A5500-24C |
NVIDIA RTX A5000 |
A5000-24C |
Product Limitations and Known Issues#
Red Hat Enterprise Linux with KVM Limitations and Known Issues#
Refer to the following lists of known Red Hat Enterprise Linux with KVM product limitations and product issues.
Ubuntu KVM Limitations and Known Issues#
Refer to the following lists of known Ubuntu KVM product limitations and product issues.
VMware vSphere Limitations and Known Issues#
Refer to the following lists of known VMware vSphere product limitations and product issues.
Requirements for Using vGPU for Compute on VMware vSphere for GPUs Requiring 64 GB+ of MMIO Space with Large-Memory VMs#
Some GPUs require 64 GB or more of MMIO space. When a vGPU on a GPU that requires 64 GB or more of MMIO space is assigned to a VM with 32 GB or more of memory on ESXi, the VM’s MMIO space must be increased to the amount of MMIO space that the GPU requires.
For detailed information about this limitation, refer to the Requirements for Using vGPU on GPUs Requiring 64 GB or More of MMIO Space with Large-Memory VMs documentation.
GPU |
MMIO Space Required |
---|---|
NVIDIA B200 |
768GB |
NVIDIA H200 (all variants) |
512GB |
NVIDIA H100 (all variants) |
256GB |
NVIDIA H800 (all variants) |
256GB |
NVIDIA H20 141GB |
512GB |
NVIDIA H20 96GB |
256GB |
NVIDIA L40 |
128GB |
NVIDIA L20 |
128GB |
NVIDIA L4 |
64GB |
NVIDIA L2 |
64GB |
NVIDIA RTX 6000 Ada |
128GB |
NVIDIA RTX 5000 Ada |
64GB |
NVIDIA A40 |
128GB |
NVIDIA A30 |
64GB |
NVIDIA A10 |
64GB |
NVIDIA A100 80GB (all variants) |
256GB |
NVIDIA A100 40GB (all variants) |
128GB |
NVIDIA RTX A6000 |
128GB |
NVIDIA RTX A5500 |
64GB |
NVIDIA RTX A5000 |
64GB |
Quadro RTX 8000 Passive |
64GB |
Quadro RTX 6000 Passive |
64GB |
Tesla V100 (all variants) |
64GB |
Microsoft Windows Server Limitations and Known Issues#
Refer to the following lists of known Microsoft Windows Server product limitations and product issues.
NVIDIA AI Enterprise supports only the Tesla Compute Cluster (TCC) driver model for Windows guest drivers.
Windows guest OS support is limited to running applications natively in Windows VMs without containers. NVIDIA AI Enterprise features that depend on the containerization of applications are not supported on Windows guest operating systems.
If you are using a generic Linux supported by the KVM hypervisor, consult the documentation from your hypervisor vendor for information about Windows releases supported as a guest OS.
For more information, refer to the Non-containerized Applications on Hypervisors and Guest Operating Systems Supported with vGPU table.
Virtual GPU Types for Supported GPUs#
NVIDIA Blackwell GPU Architecture#
MIG-Backed and Time-Sliced NVIDIA vGPU for Compute for the NVIDIA Blackwell Architecture#
Physical GPUs per board: 1
The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.
Required license edition: NVIDIA AI Enterprise
MIG-Backed NVIDIA vGPU for Compute
For details on GPU instance profiles, refer to the NVIDIA Multi-Instance GPU User Guide.
Time-Sliced NVIDIA vGPU for Compute
Intended use cases:
vGPUs with more than 40 GB of framebuffer: Training Workloads
vGPUs with 40 GB of framebuffer: Inference Workloads
These vGPU types support a single display with a fixed maximum resolution.
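As a quick in-guest sanity check (ordinary CUDA runtime calls, nothing vGPU-specific), an application can confirm which vGPU type it received by reporting the device name and the framebuffer the profile exposes, which should roughly match the Framebuffer column in the tables that follow.

```cpp
// Sketch: report the device name and framebuffer visible inside the guest to
// confirm the assigned vGPU profile.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop{};
    cudaGetDeviceProperties(&prop, 0);
    size_t freeBytes = 0, totalBytes = 0;
    cudaMemGetInfo(&freeBytes, &totalBytes);
    std::printf("Device 0: %s\n", prop.name);
    std::printf("Framebuffer: %.1f GiB total, %.1f GiB free\n",
                totalBytes / (1024.0 * 1024.0 * 1024.0),
                freeBytes / (1024.0 * 1024.0 * 1024.0));
    return 0;
}
```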
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
B200X-7-180C |
180 |
1 |
7 |
7 |
MIG 7g.180gb |
B200X-4-90C |
90 |
1 |
4 |
4 |
MIG 4g.90gb |
B200X-3-90C |
90 |
2 |
3 |
3 |
MIG 3g.90gb |
B200X-2-45C |
45 |
3 |
2 |
2 |
MIG 2g.45gb |
B200X-1-45C |
45 |
4 |
1 |
1 |
MIG 1g.45gb |
B200X-1-23C |
22.5 |
7 |
1 |
1 |
MIG 1g.23gb |
B200X-1-23CME |
22.5 |
1 |
1 |
1 |
MIG 1g.23gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
DC-4-96C |
96 |
1 |
4 |
4 |
MIG 4g.96gb |
DC-4-48C |
48 |
2 |
4 |
1 |
MIG 4g.48gb |
DC-2-48C |
48 |
1 |
2 |
2 |
MIG 2g.48gb |
DC-4-32C |
32 |
3 |
4 |
1 |
MIG 4g.32gb |
DC-4-24C |
24 |
4 |
4 |
1 |
MIG 4g.24gb |
DC-2-24C |
24 |
2 |
2 |
1 |
MIG 2g.24gb |
DC-1-24C |
24 |
1 |
1 |
1 |
MIG 1g.24gb |
DC-2-16C |
16 |
3 |
2 |
1 |
MIG 2g.16gb |
DC-2-12C |
12 |
4 |
2 |
1 |
MIG 2g.12gb |
DC-1-12C |
12 |
2 |
1 |
1 |
MIG 1g.12gb |
DC-1-8C |
8 |
3 |
1 |
1 |
MIG 1g.8gb |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
DC-96C |
96 |
1 |
1 |
3840x2400 |
1 |
DC-48C |
48 |
2 |
2 |
3840x2400 |
1 |
DC-32C |
32 |
3 |
3 |
3840x2400 |
1 |
DC-24C |
24 |
4 |
4 |
3840x2400 |
1 |
DC-16C |
16 |
6 |
6 |
3840x2400 |
1 |
DC-12C |
12 |
8 |
8 |
3840x2400 |
1 |
DC-8C |
8 |
12 |
12 |
3840x2400 |
1 |
NVIDIA Hopper GPU Architecture#
MIG-Backed and Time-Sliced NVIDIA vGPU for Compute for the NVIDIA Hopper GPU Architecture#
Physical GPUs per board: 1
The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.
Required license edition: NVIDIA AI Enterprise
MIG-Backed NVIDIA vGPU for Compute
For details on GPU instance profiles, refer to the NVIDIA Multi-Instance GPU User Guide.
Time-Sliced NVIDIA vGPU for Compute
Intended use cases:
vGPUs with more than 40 GB of framebuffer: Training Workloads
vGPUs with 40 GB of framebuffer: Inference Workloads
These vGPU types support a single display with a fixed maximum resolution.
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H800L-7-94C |
96 |
1 |
7 |
7 |
MIG 7g.94gb |
H800L-4-47C |
48 |
1 |
4 |
4 |
MIG 4g.47gb |
H800L-3-47C |
48 |
2 |
3 |
3 |
MIG 3g.47gb |
H800L-2-24C |
24 |
3 |
2 |
2 |
MIG 2g.24gb |
H800L-1-24C |
24 |
4 |
1 |
1 |
MIG 1g.24gb |
H800L-1-12C |
12 |
7 |
1 |
1 |
MIG 1g.12gb |
H800L-1-12CME [3] |
12 |
1 |
1 |
1 |
MIG 1g.12gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H800L-94C |
96 |
1 |
1 |
3840x2400 |
1 |
H800L-47C |
48 |
2 |
2 |
3840x2400 |
1 |
H800L-23C |
23 |
4 |
4 |
3840x2400 |
1 |
H800L-15C |
15 |
6 |
4 |
3840x2400 |
1 |
H800L-11C |
11 |
8 |
8 |
3840x2400 |
1 |
H800L-6C |
6 |
15 |
8 |
3840x2400 |
1 |
H800L-4C |
4 |
23 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H800-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
H800-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
H800-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
H800-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
H800-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
H800-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
H800-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H800-80C |
81 |
1 |
1 |
3840x2400 |
1 |
H800-40C |
40 |
2 |
2 |
3840x2400 |
1 |
H800-20C |
20 |
4 |
4 |
3840x2400 |
1 |
H800-16C |
16 |
5 |
4 |
3840x2400 |
1 |
H800-10C |
10 |
8 |
8 |
3840x2400 |
1 |
H800-8C |
8 |
10 |
8 |
3840x2400 |
1 |
H800-5C |
5 |
16 |
16 |
3840x2400 |
1 |
H800-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H800XM-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
H800XM-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
H800XM-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
H800XM-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
H800XM-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
H800XM-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
H800XM-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H800XM-80C |
81 |
1 |
1 |
3840x2400 |
1 |
H800XM-40C |
40 |
2 |
2 |
3840x2400 |
1 |
H800XM-20C |
20 |
4 |
4 |
3840x2400 |
1 |
H800XM-16C |
16 |
5 |
4 |
3840x2400 |
1 |
H800XM-10C |
10 |
8 |
8 |
3840x2400 |
1 |
H800XM-8C |
8 |
10 |
8 |
3840x2400 |
1 |
H800XM-5C |
5 |
16 |
16 |
3840x2400 |
1 |
H800XM-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H200-7-141C |
144 |
1 |
7 |
7 |
MIG 7g.141gb |
H200-4-71C |
72 |
1 |
4 |
4 |
MIG 4g.71gb |
H200-3-71C |
72 |
2 |
3 |
3 |
MIG 3g.71gb |
H200-2-35C |
35 |
3 |
2 |
2 |
MIG 2g.35gb |
H200-1-35C [3] |
35 |
4 |
1 |
1 |
MIG 1g.35gb |
H200-1-18C |
18 |
7 |
1 |
1 |
MIG 1g.18gb |
H200-1-18CME [3] |
18 |
1 |
1 |
1 |
MIG 1g.18gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H200-141C |
144 |
1 |
1 |
3840x2400 |
1 |
H200-70C |
71 |
2 |
2 |
3840x2400 |
1 |
H200-35C |
35 |
4 |
4 |
3840x2400 |
1 |
H200-28C |
28 |
5 |
5 |
3840x2400 |
1 |
H200-17C |
17 |
8 |
8 |
3840x2400 |
1 |
H200-14C |
14 |
10 |
10 |
3840x2400 |
1 |
H200-8C |
8 |
16 |
16 |
3840x2400 |
1 |
H200-7C |
7 |
20 |
20 |
3840x2400 |
1 |
H200-4C |
4 |
32 |
32 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H200X-7-141C |
144 |
1 |
7 |
7 |
MIG 7g.141gb |
H200X-4-71C |
72 |
1 |
4 |
4 |
MIG 4g.71gb |
H200X-3-71C |
72 |
2 |
3 |
3 |
MIG 3g.71gb |
H200X-2-35C |
35 |
3 |
2 |
2 |
MIG 2g.35gb |
H200X-1-35C [3] |
35 |
4 |
1 |
1 |
MIG 1g.35gb |
H200X-1-18C |
18 |
7 |
1 |
1 |
MIG 1g.18gb |
H200X-1-18CME [3] |
18 |
1 |
1 |
1 |
MIG 1g.18gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H200X-141C |
144 |
1 |
1 |
3840x2400 |
1 |
H200X-70C |
71 |
2 |
2 |
3840x2400 |
1 |
H200X-35C |
35 |
4 |
4 |
3840x2400 |
1 |
H200X-28C |
28 |
5 |
5 |
3840x2400 |
1 |
H200X-17C |
17 |
8 |
8 |
3840x2400 |
1 |
H200X-14C |
14 |
10 |
10 |
3840x2400 |
1 |
H200X-8C |
8 |
16 |
16 |
3840x2400 |
1 |
H200X-7C |
7 |
20 |
20 |
3840x2400 |
1 |
H200X-4C |
4 |
32 |
32 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H100L-7-94C |
96 |
1 |
7 |
7 |
MIG 7g.94gb |
H100L-4-47C |
48 |
1 |
4 |
4 |
MIG 4g.47gb |
H100L-3-47C |
48 |
2 |
3 |
3 |
MIG 3g.47gb |
H100L-2-24C |
24 |
3 |
2 |
2 |
MIG 2g.24gb |
H100L-1-24C [3] |
24 |
4 |
1 |
1 |
MIG 1g.24gb |
H100L-1-12C |
12 |
7 |
1 |
1 |
MIG 1g.12gb |
H100L-1-12CME [3] |
12 |
1 |
1 |
1 |
MIG 1g.12gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H100L-94C |
96 |
1 |
1 |
3840x2400 |
1 |
H100L-47C |
48 |
2 |
2 |
3840x2400 |
1 |
H100L-23C |
23 |
4 |
4 |
3840x2400 |
1 |
H100L-15C |
15 |
6 |
4 |
3840x2400 |
1 |
H100L-11C |
11 |
8 |
8 |
3840x2400 |
1 |
H100L-6C |
6 |
15 |
8 |
3840x2400 |
1 |
H100L-4C |
4 |
23 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H100XL-7-94C |
96 |
1 |
7 |
7 |
MIG 7g.94gb |
H100XL-4-47C |
48 |
1 |
4 |
4 |
MIG 4g.47gb |
H100XL-3-47C |
48 |
2 |
3 |
3 |
MIG 3g.47gb |
H100XL-2-24C |
24 |
3 |
2 |
2 |
MIG 2g.24gb |
H100XL-1-24C [3] |
24 |
4 |
1 |
1 |
MIG 1g.24gb |
H100XL-1-12C |
12 |
7 |
1 |
1 |
MIG 1g.12gb |
H100XL-1-12CME [3] |
12 |
1 |
1 |
1 |
MIG 1g.12gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H100XL-94C |
96 |
1 |
1 |
3840x2400 |
1 |
H100XL-47C |
48 |
2 |
2 |
3840x2400 |
1 |
H100XL-23C |
23 |
4 |
4 |
3840x2400 |
1 |
H100XL-15C |
15 |
6 |
4 |
3840x2400 |
1 |
H100XL-11C |
11 |
8 |
8 |
3840x2400 |
1 |
H100XL-6C |
6 |
15 |
8 |
3840x2400 |
1 |
H100XL-4C |
4 |
23 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H100-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
H100-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
H100-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
H100-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
H100-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
H100-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
H100-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H100-80C |
81 |
1 |
1 |
3840x2400 |
1 |
H100-40C |
40 |
2 |
2 |
3840x2400 |
1 |
H100-20C |
20 |
4 |
4 |
3840x2400 |
1 |
H100-16C |
16 |
6 |
4 |
3840x2400 |
1 |
H100-10C |
10 |
8 |
8 |
3840x2400 |
1 |
H100-8C |
8 |
10 |
8 |
3840x2400 |
1 |
H100-5C |
5 |
16 |
16 |
3840x2400 |
1 |
H100-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H100XM-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
H100XM-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
H100XM-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
H100XM-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
H100XM-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
H100XM-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
H100XM-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H100XM-80C |
81 |
1 |
1 |
3840x2400 |
1 |
H100XM-40C |
40 |
2 |
2 |
3840x2400 |
1 |
H100XM-20C |
20 |
4 |
4 |
3840x2400 |
1 |
H100XM-16C |
16 |
6 |
4 |
3840x2400 |
1 |
H100XM-10C |
10 |
8 |
8 |
3840x2400 |
1 |
H100XM-8C |
8 |
10 |
8 |
3840x2400 |
1 |
H100XM-5C |
5 |
16 |
16 |
3840x2400 |
1 |
H100XM-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H100XS-7-64C |
65 |
1 |
7 |
7 |
MIG 7g.64gb |
H100XS-4-32C |
32 |
1 |
4 |
4 |
MIG 4g.32gb |
H100XS-3-32C |
32 |
2 |
3 |
3 |
MIG 3g.32gb |
H100XS-2-16C |
16 |
3 |
2 |
2 |
MIG 2g.16gb |
H100XS-1-16C [3] |
16 |
4 |
1 |
1 |
MIG 1g.16gb |
H100XS-1-8C |
8 |
7 |
1 |
1 |
MIG 1g.8gb |
H100XS-1-8CME [3] |
8 |
1 |
1 |
1 |
MIG 1g.8gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H100XS-64C |
65 |
1 |
1 |
3840x2400 |
1 |
H100XS-32C |
32 |
2 |
2 |
3840x2400 |
1 |
H100XS-16C |
16 |
4 |
4 |
3840x2400 |
1 |
H100XS-8C |
8 |
8 |
8 |
3840x2400 |
1 |
H100XS-4C |
4 |
16 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H20X-7-141C |
144 |
1 |
7 |
7 |
MIG 7g.141gb |
H20X-4-71C |
72 |
1 |
4 |
4 |
MIG 4g.71gb |
H20X-3-71C |
72 |
2 |
3 |
3 |
MIG 3g.71gb |
H20X-2-35C |
35 |
3 |
2 |
2 |
MIG 2g.35gb |
H20X-1-35C [3] |
35 |
4 |
1 |
1 |
MIG 1g.35gb |
H20X-1-18C |
18 |
7 |
1 |
1 |
MIG 1g.18gb |
H20X-1-18CME [3] |
18 |
1 |
1 |
1 |
MIG 1g.18gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H20X-141C |
144 |
1 |
1 |
3840x2400 |
1 |
H20X-70C |
71 |
2 |
2 |
3840x2400 |
1 |
H20X-35C |
35 |
4 |
4 |
3840x2400 |
1 |
H20X-28C |
28 |
5 |
5 |
3840x2400 |
1 |
H20X-17C |
17 |
8 |
8 |
3840x2400 |
1 |
H20X-14C |
14 |
10 |
10 |
3840x2400 |
1 |
H20X-8C |
8 |
16 |
16 |
3840x2400 |
1 |
H20X-7C |
7 |
20 |
20 |
3840x2400 |
1 |
H20X-4C |
4 |
32 |
32 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
H20-7-96C |
98 |
1 |
7 |
7 |
MIG 7g.96gb |
H20-4-48C |
49 |
1 |
4 |
4 |
MIG 4g.48gb |
H20-3-48C |
49 |
2 |
3 |
3 |
MIG 3g.48gb |
H20-2-24C |
24 |
3 |
2 |
2 |
MIG 2g.24gb |
H20-1-24C [3] |
24 |
4 |
1 |
1 |
MIG 1g.24gb |
H20-1-12C |
12 |
7 |
1 |
1 |
MIG 1g.12gb |
H20-1-12CME [3] |
12 |
1 |
1 |
1 |
MIG 1g.12gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
H20-96C |
98 |
1 |
1 |
3840x2400 |
1 |
H20-48C |
49 |
2 |
2 |
3840x2400 |
1 |
H20-24C |
24 |
4 |
2 |
3840x2400 |
1 |
H20-16C |
16 |
6 |
4 |
3840x2400 |
1 |
H20-12C |
12 |
8 |
4 |
3840x2400 |
1 |
H20-6C |
6 |
16 |
8 |
3840x2400 |
1 |
H20-4C |
4 |
24 |
8 |
3840x2400 |
1 |
NVIDIA Ada Lovelace GPU Architecture#
Physical GPUs per board: 1
The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.
Required license edition: NVIDIA AI Enterprise
Intended use cases:
vGPUs with more than 40 GB of framebuffer: Training Workloads
vGPUs with 40 GB of framebuffer: Inference Workloads
These vGPU types support a single display with a fixed maximum resolution.
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
L40-48C |
49 |
1 |
1 |
3840x2400 |
1 |
L40-24C |
24 |
2 |
2 |
3840x2400 |
1 |
L40-16C |
16 |
3 |
2 |
3840x2400 |
1 |
L40-12C |
12 |
4 |
4 |
3840x2400 |
1 |
L40-8C |
8 |
6 |
4 |
3840x2400 |
1 |
L40-6C |
6 |
8 |
8 |
3840x2400 |
1 |
L40-4C |
4 |
12 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
L40S-48C |
49 |
1 |
1 |
3840x2400 |
1 |
L40S-24C |
24 |
2 |
2 |
3840x2400 |
1 |
L40S-16C |
16 |
3 |
2 |
3840x2400 |
1 |
L40S-12C |
12 |
4 |
4 |
3840x2400 |
1 |
L40S-8C |
8 |
6 |
4 |
3840x2400 |
1 |
L40S-6C |
6 |
8 |
8 |
3840x2400 |
1 |
L40S-4C |
4 |
12 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
L20-48C |
49 |
1 |
1 |
3840x2400 |
1 |
L20-24C |
24 |
2 |
2 |
3840x2400 |
1 |
L20-16C |
16 |
3 |
2 |
3840x2400 |
1 |
L20-12C |
12 |
4 |
4 |
3840x2400 |
1 |
L20-8C |
8 |
6 |
4 |
3840x2400 |
1 |
L20-6C |
6 |
8 |
8 |
3840x2400 |
1 |
L20-4C |
4 |
12 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
L20-48C |
49 |
1 |
1 |
3840x2400 |
1 |
L20-24C |
24 |
2 |
2 |
3840x2400 |
1 |
L20-16C |
16 |
3 |
2 |
3840x2400 |
1 |
L20-12C |
12 |
4 |
4 |
3840x2400 |
1 |
L20-8C |
8 |
6 |
4 |
3840x2400 |
1 |
L20-6C |
6 |
8 |
8 |
3840x2400 |
1 |
L20-4C |
4 |
12 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
L4-24C |
24 |
1 |
1 |
3840x2400 |
1 |
L4-12C |
12 |
2 |
2 |
3840x2400 |
1 |
L4-8C |
8 |
3 |
2 |
3840x2400 |
1 |
L4-6C |
6 |
4 |
4 |
3840x2400 |
1 |
L4-4C |
4 |
6 |
4 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
L2-24C |
24 |
1 |
1 |
3840x2400 |
1 |
L2-12C |
12 |
2 |
2 |
3840x2400 |
1 |
L2-8C |
8 |
3 |
2 |
3840x2400 |
1 |
L2-6C |
6 |
4 |
4 |
3840x2400 |
1 |
L2-4C |
4 |
6 |
4 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
RTX 6000 Ada-48C |
49 |
1 |
1 |
3840x2400 |
1 |
RTX 6000 Ada-24C |
24 |
2 |
2 |
3840x2400 |
1 |
RTX 6000 Ada-16C |
16 |
3 |
2 |
3840x2400 |
1 |
RTX 6000 Ada-12C |
12 |
4 |
4 |
3840x2400 |
1 |
RTX 6000 Ada-8C |
8 |
6 |
4 |
3840x2400 |
1 |
RTX 6000 Ada-6C |
6 |
8 |
8 |
3840x2400 |
1 |
RTX 6000 Ada-4C |
4 |
12 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
RTX 5880 Ada-48C |
49 |
1 |
1 |
3840x2400 |
1 |
RTX 5880 Ada-24C |
24 |
2 |
2 |
3840x2400 |
1 |
RTX 5880 Ada-16C |
16 |
3 |
2 |
3840x2400 |
1 |
RTX 5880 Ada-12C |
12 |
4 |
4 |
3840x2400 |
1 |
RTX 5880 Ada-8C |
8 |
6 |
4 |
3840x2400 |
1 |
RTX 5880 Ada-6C |
6 |
8 |
8 |
3840x2400 |
1 |
RTX 5880 Ada-4C |
4 |
12 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
RTX 5000 Ada-32C |
32 |
1 |
1 |
3840x2400 |
1 |
RTX 5000 Ada-16C |
16 |
2 |
2 |
3840x2400 |
1 |
RTX 5000 Ada-8C |
8 |
4 |
4 |
3840x2400 |
1 |
RTX 5000 Ada-4C |
4 |
8 |
8 |
3840x2400 |
1 |
NVIDIA Ampere GPU Architecture#
Physical GPUs per board: 1 (with the exception of NVIDIA A16)
The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.
Required license edition: NVIDIA AI Enterprise
Intended use cases:
vGPUs with more than 40 GB of framebuffer: Training Workloads
vGPUs with 40 GB of framebuffer: Inference Workloads
These vGPU types support a single display with a fixed maximum resolution.
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A40-48C |
49 |
1 |
1 |
3840x2400 |
1 |
A40-24C |
24 |
2 |
2 |
3840x2400 |
1 |
A40-16C |
16 |
3 |
2 |
3840x2400 |
1 |
A40-12C |
12 |
4 |
4 |
3840x2400 |
1 |
A40-8C |
8 |
6 |
4 |
3840x2400 |
1 |
A40-6C |
6 |
8 |
8 |
3840x2400 |
1 |
A40-4C |
4 |
12 |
8 |
3840x2400 |
1 |
Physical GPUs per board: 4
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A10-24C |
24 |
1 |
1 |
3840x2400 |
1 |
A10-12C |
12 |
2 |
2 |
3840x2400 |
1 |
A10-8C |
8 |
3 |
2 |
3840x2400 |
1 |
A10-6C |
6 |
4 |
4 |
3840x2400 |
1 |
A10-4C |
4 |
6 |
4 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
RTXA6000-48C |
49 |
1 |
1 |
3840x2400 |
1 |
RTXA6000-24C |
24 |
2 |
2 |
3840x2400 |
1 |
RTXA6000-16C |
16 |
3 |
2 |
3840x2400 |
1 |
RTXA6000-12C |
12 |
4 |
4 |
3840x2400 |
1 |
RTXA6000-8C |
8 |
6 |
4 |
3840x2400 |
1 |
RTXA6000-6C |
6 |
8 |
8 |
3840x2400 |
1 |
RTXA6000-4C |
4 |
12 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
RTXA5500-24C |
24 |
1 |
1 |
3840x2400 |
1 |
RTXA5500-12C |
12 |
2 |
2 |
3840x2400 |
1 |
RTXA5500-8C |
8 |
3 |
2 |
3840x2400 |
1 |
RTXA5500-6C |
6 |
4 |
4 |
3840x2400 |
1 |
RTXA5500-4C |
4 |
6 |
4 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
RTXA5000-24C |
24 |
1 |
1 |
3840x2400 |
1 |
RTXA5000-12C |
12 |
2 |
2 |
3840x2400 |
1 |
RTXA5000-8C |
8 |
3 |
2 |
3840x2400 |
1 |
RTXA5000-6C |
6 |
4 |
4 |
3840x2400 |
1 |
RTXA5000-4C |
4 |
6 |
4 |
3840x2400 |
1 |
MIG-Backed and Time-Sliced NVIDIA vGPU for Compute for the NVIDIA Ampere GPU Architecture#
Physical GPUs per board: 1
The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.
Required license edition: NVIDIA AI Enterprise
MIG-Backed NVIDIA vGPU for Compute
For details on GPU instance profiles, refer to the NVIDIA Multi-Instance GPU User Guide.
Time-Sliced NVIDIA vGPU for Compute
Intended use cases:
vGPUs with more than 40 GB of framebuffer: Training Workloads
vGPUs with 40 GB of framebuffer: Inference Workloads
These vGPU types support a single display with a fixed maximum resolution.
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A800D-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
A800D-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
A800D-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
A800D-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
A800D-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
A800D-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
A800D-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A800D-80C |
81 |
1 |
1 |
3840x2400 |
1 |
A800D-40C |
40 |
2 |
2 |
3840x2400 |
1 |
A800D-20C |
20 |
4 |
4 |
3840x2400 |
1 |
A800D-16C |
16 |
5 |
4 |
3840x2400 |
1 |
A800D-10C |
10 |
8 |
8 |
3840x2400 |
1 |
A800D-8C |
8 |
10 |
8 |
3840x2400 |
1 |
A800D-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A800D-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
A800D-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
A800D-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
A800D-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
A800D-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
A800D-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
A800D-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A800D-80C |
81 |
1 |
1 |
3840x2400 |
1 |
A800D-40C |
40 |
2 |
2 |
3840x2400 |
1 |
A800D-20C |
20 |
4 |
4 |
3840x2400 |
1 |
A800D-16C |
16 |
5 |
4 |
3840x2400 |
1 |
A800D-10C |
10 |
8 |
8 |
3840x2400 |
1 |
A800D-8C |
8 |
10 |
8 |
3840x2400 |
1 |
A800D-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A800D-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
A800D-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
A800D-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
A800D-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
A800D-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
A800D-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
A800D-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A800D-80C |
81 |
1 |
1 |
3840x2400 |
1 |
A800D-40C |
40 |
2 |
2 |
3840x2400 |
1 |
A800D-20C |
20 |
4 |
4 |
3840x2400 |
1 |
A800D-16C |
16 |
5 |
4 |
3840x2400 |
1 |
A800D-10C |
10 |
8 |
8 |
3840x2400 |
1 |
A800D-8C |
8 |
10 |
8 |
3840x2400 |
1 |
A800D-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A800-7-40C |
40 |
1 |
7 |
7 |
MIG 7g.40gb |
A800-4-20C |
20 |
1 |
4 |
4 |
MIG 4g.20gb |
A800-3-20C |
20 |
2 |
3 |
3 |
MIG 3g.20gb |
A800-2-10C |
10 |
3 |
2 |
2 |
MIG 2g.10gb |
A800-1-10C [3] |
10 |
4 |
1 |
1 |
MIG 1g.10gb |
A800-1-5C |
5 |
7 |
1 |
1 |
MIG 1g.5gb |
A800-1-5CME [3] |
5 |
1 |
1 |
1 |
MIG 1g.5gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A800-40C |
40 |
1 |
1 |
3840x2400 |
1 |
A800-20C |
20 |
2 |
2 |
3840x2400 |
1 |
A800-10C |
10 |
4 |
4 |
3840x2400 |
1 |
A800-8C |
8 |
5 |
4 |
3840x2400 |
1 |
A800-5C |
5 |
8 |
8 |
3840x2400 |
1 |
A800-4C |
4 |
10 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A800DX-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
A800DX-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
A800DX-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
A800DX-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
A800DX-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
A800DX-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
A800DX-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A800DX-80C |
81 |
1 |
1 |
3840x2400 |
1 |
A800DX-40C |
40 |
2 |
2 |
3840x2400 |
1 |
A800DX-20C |
20 |
4 |
4 |
3840x2400 |
1 |
A800DX-16C |
16 |
5 |
4 |
3840x2400 |
1 |
A800DX-10C |
10 |
8 |
8 |
3840x2400 |
1 |
A800DX-8C |
8 |
10 |
8 |
3840x2400 |
1 |
A800DX-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A100D-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
A100D-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
A100D-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
A100D-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
A100D-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
A100D-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
A100D-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A100D-80C |
81 |
1 |
1 |
3840x2400 |
1 |
A100D-40C |
40 |
2 |
2 |
3840x2400 |
1 |
A100D-20C |
20 |
4 |
4 |
3840x2400 |
1 |
A100D-16C |
16 |
5 |
4 |
3840x2400 |
1 |
A100D-10C |
10 |
8 |
8 |
3840x2400 |
1 |
A100D-8C |
8 |
10 |
8 |
3840x2400 |
1 |
A100D-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A100D-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
A100D-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
A100D-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
A100D-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
A100D-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
A100D-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
A100D-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A100D-80C |
81 |
1 |
1 |
3840x2400 |
1 |
A100D-40C |
40 |
2 |
2 |
3840x2400 |
1 |
A100D-20C |
20 |
4 |
4 |
3840x2400 |
1 |
A100D-16C |
16 |
5 |
4 |
3840x2400 |
1 |
A100D-10C |
10 |
8 |
8 |
3840x2400 |
1 |
A100D-8C |
8 |
10 |
8 |
3840x2400 |
1 |
A100D-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A100D-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
A100D-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
A100D-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
A100D-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
A100D-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
A100D-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
A100D-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A100D-80C |
81 |
1 |
1 |
3840x2400 |
1 |
A100D-40C |
40 |
2 |
2 |
3840x2400 |
1 |
A100D-20C |
20 |
4 |
4 |
3840x2400 |
1 |
A100D-16C |
16 |
5 |
4 |
3840x2400 |
1 |
A100D-10C |
10 |
8 |
8 |
3840x2400 |
1 |
A100D-8C |
8 |
10 |
8 |
3840x2400 |
1 |
A100D-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A100DX-7-80C |
81 |
1 |
7 |
7 |
MIG 7g.80gb |
A100DX-4-40C |
40 |
1 |
4 |
4 |
MIG 4g.40gb |
A100DX-3-40C |
40 |
2 |
3 |
3 |
MIG 3g.40gb |
A100DX-2-20C |
20 |
3 |
2 |
2 |
MIG 2g.20gb |
A100DX-1-20C [3] |
20 |
4 |
1 |
1 |
MIG 1g.20gb |
A100DX-1-10C |
10 |
7 |
1 |
1 |
MIG 1g.10gb |
A100DX-1-10CME [3] |
10 |
1 |
1 |
1 |
MIG 1g.10gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A100DX-80C |
81 |
1 |
1 |
3840x2400 |
1 |
A100DX-40C |
40 |
2 |
2 |
3840x2400 |
1 |
A100DX-20C |
20 |
4 |
4 |
3840x2400 |
1 |
A100DX-16C |
16 |
5 |
4 |
3840x2400 |
1 |
A100DX-10C |
10 |
8 |
8 |
3840x2400 |
1 |
A100DX-8C |
8 |
10 |
8 |
3840x2400 |
1 |
A100DX-4C |
4 |
20 |
16 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A100-7-40C |
40 |
1 |
7 |
7 |
MIG 7g.40gb |
A100-4-20C |
20 |
1 |
4 |
4 |
MIG 4g.20gb |
A100-3-20C |
20 |
2 |
3 |
3 |
MIG 3g.20gb |
A100-2-10C |
10 |
3 |
2 |
2 |
MIG 2g.10gb |
A100-1-10C [3] |
10 |
4 |
1 |
1 |
MIG 1g.10gb |
A100-1-5C |
5 |
7 |
1 |
1 |
MIG 1g.5gb |
A100-1-5CME [3] |
5 |
1 |
1 |
1 |
MIG 1g.5gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A100-40C |
40 |
1 |
1 |
3840x2400 |
1 |
A100-20C |
20 |
2 |
2 |
3840x2400 |
1 |
A100-10C |
10 |
4 |
4 |
3840x2400 |
1 |
A100-8C |
8 |
5 |
4 |
3840x2400 |
1 |
A100-5C |
5 |
8 |
8 |
3840x2400 |
1 |
A100-4C |
4 |
10 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A100X-7-40C |
40 |
1 |
7 |
7 |
MIG 7g.40gb |
A100X-4-20C |
20 |
1 |
4 |
4 |
MIG 4g.20gb |
A100X-3-20C |
20 |
2 |
3 |
3 |
MIG 3g.20gb |
A100X-2-10C |
10 |
3 |
2 |
2 |
MIG 2g.10gb |
A100X-1-10C [3] |
10 |
4 |
1 |
1 |
MIG 1g.10gb |
A100X-1-5C |
5 |
7 |
1 |
1 |
MIG 1g.5gb |
A100X-1-5CME [3] |
5 |
1 |
1 |
1 |
MIG 1g.5gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A100X-40C |
40 |
1 |
1 |
3840x2400 |
1 |
A100X-20C |
20 |
2 |
2 |
3840x2400 |
1 |
A100X-10C |
10 |
4 |
4 |
3840x2400 |
1 |
A100X-8C |
8 |
5 |
4 |
3840x2400 |
1 |
A100X-5C |
5 |
8 |
8 |
3840x2400 |
1 |
A100X-4C |
4 |
10 |
8 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A30-4-24C |
24 |
1 |
4 |
4 |
MIG 4g.24gb |
A30-2-12C |
12 |
2 |
2 |
2 |
MIG 2g.12gb |
A30-2-12CME [3] |
12 |
1 |
2 |
2 |
MIG 2g.12gb+me |
A30-1-6C |
6 |
4 |
1 |
1 |
MIG 1g.6gb |
A30-1-6CME [3] |
6 |
1 |
1 |
1 |
MIG 1g.6gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A30-24C |
24 |
1 |
1 |
3840x2400 |
1 |
A30-12C |
12 |
2 |
2 |
3840x2400 |
1 |
A30-8C |
8 |
3 |
2 |
3840x2400 |
1 |
A30-6C |
6 |
4 |
4 |
3840x2400 |
1 |
A30-4C |
4 |
6 |
4 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A30-4-24C |
24 |
1 |
4 |
4 |
MIG 4g.24gb |
A30-2-12C |
12 |
2 |
2 |
2 |
MIG 2g.12gb |
A30-2-12CME [3] |
12 |
1 |
2 |
2 |
MIG 2g.12gb+me |
A30-1-6C |
6 |
4 |
1 |
1 |
MIG 1g.6gb |
A30-1-6CME [3] |
6 |
1 |
1 |
1 |
MIG 1g.6gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A30-24C |
24 |
1 |
1 |
3840x2400 |
1 |
A30-12C |
12 |
2 |
2 |
3840x2400 |
1 |
A30-8C |
8 |
3 |
2 |
3840x2400 |
1 |
A30-6C |
6 |
4 |
4 |
3840x2400 |
1 |
A30-4C |
4 |
6 |
4 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Slices per vGPU |
Compute Instances per vGPU |
Corresponding GPU Instance Profile |
---|---|---|---|---|---|
A30-4-24C |
24 |
1 |
4 |
4 |
MIG 4g.24gb |
A30-2-12C |
12 |
2 |
2 |
2 |
MIG 2g.12gb |
A30-2-12CME [3] |
12 |
1 |
2 |
2 |
MIG 2g.12gb+me |
A30-1-6C |
6 |
4 |
1 |
1 |
MIG 1g.6gb |
A30-1-6CME [3] |
6 |
1 |
1 |
1 |
MIG 1g.6gb+me |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU in Equal-Size Mode |
Maximum vGPUs per GPU in Mixed-Size Mode |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|
A30-24C |
24 |
1 |
1 |
3840x2400 |
1 |
A30-12C |
12 |
2 |
2 |
3840x2400 |
1 |
A30-8C |
8 |
3 |
2 |
3840x2400 |
1 |
A30-6C |
6 |
4 |
4 |
3840x2400 |
1 |
A30-4C |
4 |
6 |
4 |
3840x2400 |
1 |
NVIDIA Turing GPU Architecture#
Physical GPUs per board: 1
The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.
These GPUs do not support mixed-size mode.
Intended use cases:
vGPUs with more than 40 GB of framebuffer: Training Workloads
vGPUs with 40 GB of framebuffer: Inference Workloads
Required license edition: NVIDIA AI Enterprise
These vGPU types support a single display with a fixed maximum resolution.
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|
RTX6000P-24C |
24 |
1 |
3840x2400 |
1 |
RTX6000P-12C |
12 |
2 |
3840x2400 |
1 |
RTX6000P-8C |
8 |
3 |
3840x2400 |
1 |
RTX6000P-6C |
6 |
4 |
3840x2400 |
1 |
RTX6000P-4C |
4 |
6 |
3840x2400 |
1 |
Virtual GPU Type |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|
RTX8000P-48C |
49 |
1 |
3840x2400 |
1 |
RTX8000P-24C |
24 |
2 |
3840x2400 |
1 |
RTX8000P-16C |
16 |
3 |
3840x2400 |
1 |
RTX8000P-12C |
12 |
4 |
3840x2400 |
1 |
RTX8000P-8C |
8 |
6 |
3840x2400 |
1 |
RTX8000P-6C |
6 |
8 |
3840x2400 |
1 |
RTX8000P-4C |
4 |
8 |
3840x2400 |
1 |
NVIDIA Volta GPU Architecture#
Physical GPUs per board: 1
The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.
These GPUs do not support mixed-size mode.
Intended use cases:
vGPUs with more than 40 GB of framebuffer: Training Workloads
vGPUs with 40 GB of framebuffer: Inference Workloads
Required license edition: NVIDIA AI Enterprise
These vGPU types support a single display with a fixed maximum resolution.
Virtual GPU Type |
Intended Use Case |
Framebuffer (GB) |
Maximum vGPUs per GPU |
Maximum vGPUs per Board |
Maximum Display Resolution [4] |
Virtual Displays per vGPU |
---|---|---|---|---|---|---|
V100L-16C |
Training Workloads |
16 |
1 |
1 |
3840x2400 |
1 |
V100L-8C |
Training Workloads |
8 |
2 |
2 |
3840x2400 |
1 |
V100L-4C |
Inference Workloads |
4 |
4 |
4 |
3840x2400 |
1 |
vGPU for Compute FAQs#
Q. What are the differences between NVIDIA vGPU for Compute and GPU passthrough?
NVIDIA vGPU for Compute and GPU passthrough are two different approaches to deploying NVIDIA GPUs in a virtualized environment supported by NVIDIA AI Enterprise. NVIDIA vGPU for Compute enables multiple VMs to share a single physical GPU concurrently. This approach is highly cost-effective and scalable because GPU resources are efficiently distributed among various workloads. It also delivers excellent compute performance while utilizing NVIDIA drivers. vGPU deployments offer live migration and suspend/resume capabilities, providing greater flexibility in VM management. In contrast, GPU passthrough dedicates an entire physical GPU to a single VM. While this provides maximum performance as the VM has exclusive access to the GPU, it does not support live migration or suspend/resume features. Since the GPU cannot be shared with other VMs, passthrough is less scalable and is typically more suitable for workloads that demand dedicated GPU power.
Q. Where do I download the NVIDIA vGPU for Compute from?
NVIDIA vGPU for Compute is available to download from the NVIDIA AI Enterprise Infra Collection, which you can access by logging in to the NVIDIA NGC Catalog. If you have not already purchased NVIDIA AI Enterprise and want to try it, you can obtain a NVIDIA AI Enterprise 90 Day Trial License.
Q. What is the difference between vGPU and MIG?
The fundamental distinction between vGPU and MIG lies in their approach to GPU resource partitioning.
MIG (Multi-Instance GPU) employs spatial partitioning, dividing a single GPU into several independent, isolated instances. Each MIG instance possesses its own dedicated compute cores, memory, and resources, operating simultaneously and independently. This architecture guarantees predictable performance by eliminating resource contention. While an entire MIG-enabled GPU can be passed through to a single VM, individual MIG instances cannot be directly assigned to multiple VMs without the integration of vGPU. For multi-tenancy across VMs utilizing MIG, vGPU is essential. It empowers the hypervisor to manage and allocate distinct MIG-backed vGPUs to different virtual machines. Once assigned, each MIG instance functions as a separate, isolated GPU, delivering strict resource isolation and consistent performance for workloads. For more information on using vGPU with MIG, refer to the technical brief.
vGPU (Virtual GPU) utilizes temporal partitioning. This method allows multiple virtual machines to share GPU resources by alternating access through a time-slicing mechanism. The GPU scheduler dynamically assigns time slices to each VM, effectively balancing workload demands. While this approach offers greater flexibility and higher GPU utilization, performance can vary based on the specific demands of the concurrent workloads. To enable multi-tenancy, where multiple VMs share a single physical GPU, vGPU is a prerequisite. Without vGPU, a GPU can only be assigned to one VM at a time, thereby limiting scalability and overall resource efficiency.
Q. What is the difference between time-sliced vGPUs and MIG-backed vGPUs?
Time-sliced vGPUs and MIG-backed vGPUs are two different approaches to sharing GPU resources in virtualized environments. Here are the key differences:
Differences Between Time-Sliced and MIG-Backed vGPUs#
Time-Sliced vGPUs |
MIG-Backed vGPUs |
---|---|
Share the entire GPU among multiple VMs. |
Partition the GPU into smaller, dedicated instances. |
Each vGPU gets full access to all streaming multiprocessors (SMs) and engines, but only for a specific time slice. |
Each vGPU gets exclusive access to a portion of the GPU’s memory and compute resources. |
Processes run in series, with each vGPU waiting while others use the GPU. |
Processes run in parallel on dedicated hardware slices. |
The number of VMs per GPU is limited only by framebuffer size. |
Depending on the number of MIG instances supported on a GPU, this can range from 4 to 7 VMs per GPU. |
Better for workloads that require occasional bursts of full GPU power. |
Provides better performance isolation and more consistent latency. |
Q. Where can I find more information on the NVIDIA License System (NLS), which is the licensing solution for vGPU for Compute?
You can refer to the NVIDIA License System documentation and the NLS FAQ.
Footnotes
NVIDIA HGX A100 4-GPU baseboard with four fully connected GPUs
NVIDIA HGX A100 8-GPU baseboards with eight fully connected GPUs
Fully connected means that each GPU is connected to every other GPU on the baseboard.
These vGPU types are supported on ESXi, starting with vSphere 8.0 update 3.
NVIDIA vGPU for Compute is optimized for compute-intensive workloads. As a result, these vGPU types support only a single display head and do not provide Quadro graphics acceleration.
The SXM GPU Boards listed are supported only on VMware vSphere 7.x and 8.x.
Refers to Linux with KVM hypervisors listed in the NVIDIA AI Enterprise Infrastructure Support Matrix.
NVSwitch for B200 HGX is currently supported only on Linux with KVM hypervisors.