NVIDIA vGPU for Compute#

NVIDIA AI Enterprise is a cloud-native suite of software tools, libraries and frameworks designed to deliver optimized performance, robust security, and stability for production AI deployments. Easy-to-use microservices optimize model performance with enterprise-grade security, support, and stability, ensuring a streamlined transition from prototype to production for enterprises that run their businesses on AI. It consists of two primary layers: the application layer and the infrastructure layer.

NVIDIA vGPU for Compute is licensed exclusively through NVIDIA AI Enterprise. NVIDIA vGPU for Compute enables multiple virtual machines (VMs) to have simultaneous, direct access to a single physical GPU while offering the compute capabilities required for AI model training, fine-tuning, and inference workloads. By distributing GPU resources efficiently across multiple VMs, NVIDIA vGPU for Compute optimizes utilization and lowers overall hardware costs. In addition, it offers advanced monitoring and management capabilities, including Suspend/Resume, Live Migration, and Warm Updates, making it ideal for Cloud Service Providers (CSPs) and organizations that require scalable, cost-effective GPU acceleration.

Key Concepts#

Glossary#

Commonly Used Terms#

Term

Definition

NVIDIA Virtual GPU (vGPU) Manager

The Virtual GPU (vGPU) Manager enables GPU virtualization by allowing multiple VMs to share a physical GPU, optimizing GPU allocation for different workloads. The NVIDIA Virtual GPU Manager is installed on the hypervisor.

NVIDIA vGPU for Compute Guest Driver

The NVIDIA vGPU for Compute Guest Driver is installed on each VM’s operating system, allowing it to use the virtualized GPU resources. The Guest Driver provides the interface and support needed so that applications running within the VMs can fully leverage the GPU’s capabilities, similar to how they would on a physical machine with a dedicated GPU.

NVIDIA Licensing System

The NVIDIA Licensing System for NVIDIA AI Enterprise manages the software licenses required to use NVIDIA’s AI tools and infrastructure. This system ensures that organizations are compliant with licensing terms while providing flexibility in managing and deploying NVIDIA AI Enterprise across their infrastructure.

NVIDIA AI Enterprise Infra Collection

The NVIDIA AI Enterprise Infrastructure (Infra) Collection, hosted on the NVIDIA NGC Catalog, is a suite of software and tools designed to support the deployment and management of AI workloads in enterprise environments. The NVIDIA AI Enterprise Infra Collection provides a robust and scalable foundation for running AI workloads, ensuring that enterprises can leverage the full power of NVIDIA GPUs and software to accelerate their AI initiatives.

The NVIDIA vGPU for Compute Drivers can be downloaded from the NVIDIA AI Enterprise Infra Collection.

NVIDIA vGPU Architecture Overview#

The high-level architecture of the NVIDIA vGPU is illustrated in the following diagram. Under the control of the NVIDIA Virtual GPU Manager (running on the hypervisor), a single NVIDIA physical GPU is capable of supporting multiple virtual GPU devices (vGPUs) that can be assigned directly to guest VMs, each functioning like a dedicated GPU.

Guest VMs use NVIDIA vGPUs in the same manner as a physical GPU that has been passed through by the hypervisor: the NVIDIA vGPU for Compute driver loaded in the guest VM provides direct access to the GPU for performance-critical fast paths.

NVIDIA vGPU Architecture

Each NVIDIA vGPU is analogous to a conventional GPU with a fixed amount of GPU framebuffer/memory. The vGPU’s framebuffer is allocated out of the physical GPU’s framebuffer at the time the vGPU is created, and the vGPU retains exclusive use of that framebuffer until it is destroyed.

NVIDIA vGPU for Compute Configurations#

Depending on the physical GPU, NVIDIA vGPU for Compute supports the following vGPU modes:

  • Time-sliced vGPUs can be created on all NVIDIA AI Enterprise supported GPUs.

  • Additionally, on GPUs that support the Multi-Instance GPU (MIG) feature, the following types of MIG-backed vGPU are supported:

    • MIG-backed vGPUs that occupy an entire GPU instance

    • Time-sliced, MIG-backed vGPUs

Supported vGPU Modes#

Time-Sliced vGPU

  Description: A time-sliced vGPU for Compute VM shares access to all of the GPU’s compute resources, including streaming multiprocessors (SMs) and GPU engines, with other vGPUs on the same GPU. Processes are scheduled sequentially, with each vGPU for Compute VM gaining exclusive use of the GPU engines during its time slice.

  GPU Partitioning: Temporal.

  Isolation: Strong hardware-based memory and fault isolation. Good performance and QoS with round-robin scheduling.

  Use Cases: Deployments with non-strict isolation requirements, or environments where MIG-backed vGPU is not available. Suitable for light to moderate AI workloads such as small-scale inferencing, preprocessing pipelines, and development/testing of models in a pre-training phase.

MIG-Backed vGPU

  Description: A MIG-backed vGPU for Compute VM is created from one or more MIG slices and assigned to a VM on a MIG-capable physical GPU. Each MIG-backed vGPU for Compute VM has exclusive access to the compute resources of its GPU instance, including SMs and GPU engines. Processes running on one VM execute in parallel with processes running on other vGPUs on the same physical GPU; each process runs only on its assigned vGPU. For more information on configuring MIG-backed vGPU VMs, refer to the Virtual GPU Types for Supported GPUs.

  GPU Partitioning: Spatial.

  Isolation: Strong hardware-based memory and fault isolation. Better performance and QoS with dedicated cache/memory bandwidth and lower scheduling latency.

  Use Cases: Virtualization deployments that require strong isolation, multi-tenancy, and consistent performance. Well suited for consistent, high-performance AI inferencing, multi-tenant fine-tuning jobs, or parallel execution of small to medium training tasks with predictable throughput requirements.

Time-Sliced, MIG-Backed vGPU

  Description: A time-sliced, MIG-backed vGPU for Compute VM occupies only a fraction of a MIG instance on a MIG-capable physical GPU. Processes are scheduled sequentially as each VM shares access to the GPU instance’s compute resources, including the SMs and compute engines, with all other vGPUs on the same MIG instance. This mode was introduced with the NVIDIA RTX PRO 6000 Blackwell Server Edition. For more information on configuring MIG-backed vGPU VMs, refer to the Virtual GPU Types for Supported GPUs.

  GPU Partitioning: Spatial partitioning between MIG instances; temporal partitioning within each MIG instance.

  Isolation: Strong hardware-based memory and fault isolation. Better performance and QoS with dedicated cache/memory bandwidth and lower scheduling latency.

  Use Cases: Virtualization deployments that require strong isolation, multi-tenancy, and consistent performance while maximizing GPU utilization. Ideal for high-density AI workloads such as serving multiple concurrent inferencing endpoints, hosting AI models across multiple tenants, or running light training jobs on shared GPU resources.

Installing NVIDIA vGPU for Compute#

Prerequisites#

System Requirements#

Before proceeding, ensure the following system prerequisites are met:

  • At least one NVIDIA data center GPU in a single NVIDIA AI Enterprise compatible NVIDIA-Certified System. NVIDIA recommends using the following GPUs based on your infrastructure.

    System Requirements Use Cases#

    Use Case

    GPU

    Adding AI to mainstream servers (single to 4-GPU NVLink)

    • NVIDIA A30

    • 1-8x NVIDIA L4

    • NVIDIA L40S

    • NVIDIA H100 NVL

    • NVIDIA H200 NVL

    • NVIDIA RTX Pro 6000 Blackwell Server Edition

    AI Model Inference

    • NVIDIA A100

    • NVIDIA H200 NVL

    • NVIDIA RTX Pro 6000 Blackwell Server Edition

    AI Model Training (Large) and Inference (HGX Scale Up and Out Server)

    • NVIDIA H100 HGX

    • NVIDIA H200 HGX

    • NVIDIA B200 HGX

  • If using GPUs based on the NVIDIA Ampere architecture or later, ensure the following BIOS settings are enabled on your server platform:

    • Single Root I/O Virtualization (SR-IOV) - Enabled

    • VT-d/IOMMU - Enabled

  • NVIDIA AI Enterprise License

  • NVIDIA AI Enterprise Software:

    • NVIDIA Virtual GPU Manager

    • NVIDIA vGPU for Compute Guest Driver

You can leverage the NVIDIA System Management Interface (nvidia-smi) management and monitoring tool for testing and benchmarking.
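For example, the following nvidia-smi commands give a quick baseline of GPU health and utilization on the host or in a Guest VM. This is a sketch only; the exact fields reported vary by GPU and driver release.

$ nvidia-smi                           # summary of driver version, GPUs, memory, and utilization
$ nvidia-smi -q -d UTILIZATION,MEMORY  # detailed utilization and memory counters
$ nvidia-smi dmon -s pucm -c 5         # sample power, utilization, clocks, and memory, five samples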

The following server configuration details are considered best practices:

  • Hyperthreading - Enabled

  • Power Setting or System Profile - High Performance

  • CPU Performance - Enterprise or High Throughput (if available in the BIOS)

  • Memory Mapped I/O above 4 GB - Enabled (if available in the BIOS)

Installing NGC CLI#

To access the NVIDIA Virtual GPU Manager and the NVIDIA vGPU for Compute Guest Driver, you must first download and install the NGC Catalog CLI, and then use it to download the software.

To install the NGC Catalog CLI:

  1. Login to the NVIDIA NGC Catalog.

  2. In the top right corner, click Welcome and then select Setup from the menu.

  3. Click Downloads under Install NGC CLI from the Setup page.

  4. From the CLI Install page, click the Windows, Linux, or MacOS tab, according to the platform from which you will be running NGC Catalog CLI.

  5. Follow the instructions to install the CLI.

  6. Verify the installation by entering ngc --version in a terminal or command prompt. The output should be NGC Catalog CLI x.y.z where x.y.z indicates the version.

  7. You must configure the NGC CLI for your use so that you can run the commands. Enter the following command; you will be prompted for your NGC API key:

    $ ngc config set
    
    Enter API key [no-apikey]. Choices: [<VALID_APIKEY>, 'no-apikey']: (COPY/PASTE API KEY)
    
    Enter CLI output format type [ascii]. Choices: [ascii, csv, json]: ascii
    
    Enter org [no-org]. Choices: ['no-org']:
    
    Enter team [no-team]. Choices: ['no-team']:
    
    Enter ace [no-ace]. Choices: ['no-ace']:
    
    Successfully saved NGC configuration to /home/$username/.ngc/config
    
  8. After the NGC Catalog CLI is installed, you will need to launch a command window and run the following commands to download the software.

    • NVIDIA Virtual GPU Manager

      ngc registry resource download-version "nvidia/vgpu/vgpu-host-driver-X:X.X"
      
    • NVIDIA vGPU for Compute Guest Driver

      ngc registry resource download-version "nvidia/vgpu/vgpu-guest-driver-X:X.X"
      

For more information on configuring the NGC CLI, refer to the Getting Started with the NGC CLI documentation.

Installing NVIDIA Virtual GPU Manager#

The process of installing the NVIDIA Virtual GPU Manager depends on the hypervisor that you are using. This section assumes the following:

  • You have downloaded the Virtual GPU Manager software from the NVIDIA NGC Catalog

  • You want to deploy the NVIDIA vGPU for Compute on a single server node

Hypervisor Platform Installation Instructions for the NVIDIA Virtual GPU Manager#

Hypervisor Platform

Installation Instructions

Red Hat Enterprise Linux KVM

Installing and Configuring the NVIDIA Virtual GPU Manager for Red Hat Enterprise Linux KVM

Ubuntu KVM

Installing and Configuring the NVIDIA Virtual GPU Manager for Ubuntu

VMware vSphere

Installing and Configuring the NVIDIA Virtual GPU Manager for VMware vSphere

After you complete this process, you can install the vGPU Guest Driver on your Guest VM.
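Before installing the Guest Driver, you can sanity-check the Virtual GPU Manager from the hypervisor host. The following sketch is for a Linux KVM host; module names and command availability differ by hypervisor and release.

$ nvidia-smi              # the host driver loads and lists the physical GPUs
$ nvidia-smi vgpu         # lists vGPUs currently running on the host (empty until VMs are started)
$ nvidia-smi vgpu -s      # lists the vGPU types supported on each physical GPU
$ lsmod | grep -i nvidia  # on KVM hosts, confirms the NVIDIA vGPU kernel modules are loaded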

Installing NVIDIA Fabric Manager on HGX Servers#

On NVIDIA HGX platforms, NVIDIA Fabric Manager must be installed in addition to the Virtual GPU Manager to enable the multi-GPU VM configurations required for AI training, complex simulations, and processing massive datasets. Fabric Manager is responsible for enabling and managing high-bandwidth interconnect topologies between multiple GPUs on the same node.

On Ampere, Hopper, and Blackwell HGX systems equipped with NVSwitch, Fabric Manager configures the NVSwitch memory fabric to create a unified memory fabric among all participating GPUs and monitors the supporting NVLinks, enabling the deployment of multi-GPU VMs with 1, 2, 4, or 8 GPUs.

Note

  • For information about NVIDIA Fabric Manager integration or support for deploying 1-, 2-, 4-, or 8-GPU VMs on your hypervisor, consult the documentation from your hypervisor vendor.

  • The Fabric Manager service must be running before creating VMs with multi-GPU configurations. Failure to enable Fabric Manager on HGX platforms may result in incomplete or non-functional GPU topologies inside the VM. For details on capabilities, configuration, and usage, refer to the NVIDIA Fabric Manager User Guide.
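As an illustrative sketch only (package names and driver branches vary by release and distribution, and some hypervisors bundle Fabric Manager differently), installing and starting Fabric Manager on an Ubuntu-based KVM host typically looks like the following. Consult the NVIDIA Fabric Manager User Guide and your hypervisor vendor's documentation for the authoritative steps.

$ sudo apt-get install nvidia-fabricmanager-<branch>  # placeholder: use the branch matching your Virtual GPU Manager
$ sudo systemctl enable nvidia-fabricmanager
$ sudo systemctl start nvidia-fabricmanager
$ systemctl status nvidia-fabricmanager               # verify the service is active before creating multi-GPU VMs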

Installing NVIDIA vGPU Guest Driver#

The process for installing the driver is the same in a VM configured with vGPU, in a VM that is running a pass-through GPU, or on a physical host in a bare-metal deployment. This section assumes the following:

  • You have downloaded the vGPU for Compute Guest Driver from the NVIDIA NGC Catalog

  • The Guest VM has been created and booted on the hypervisor
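In outline, installing the Guest Driver in a Linux VM looks like the following sketch, assuming a .run package downloaded from the NGC Catalog. The file name is a placeholder, and prerequisites such as kernel headers and build tools depend on your guest OS distribution.

$ chmod +x NVIDIA-Linux-x86_64-<version>-grid.run     # placeholder file name for the downloaded guest driver package
$ sudo sh ./NVIDIA-Linux-x86_64-<version>-grid.run
$ nvidia-smi                                          # after installation, the vGPU assigned to the VM should be listed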

After you install the NVIDIA vGPU for Compute Guest Driver, you are required to license the Guest VM. After a license is obtained from the NVIDIA License System, the Guest VM operates at full capability and can be used to run AI/ML workloads.

Licensing a NVIDIA vGPU for Compute Guest VM#

Note

The NVIDIA AI Enterprise license is enforced through software when you deploy NVIDIA vGPU for Compute VMs.

When booted on a supported GPU, a vGPU for Compute VM initially operates at full capability but its performance degrades over time if the VM fails to obtain a license. In such a scenario, the full capability of the VM is restored when the license is acquired.

Once licensing is configured, a vGPU VM automatically obtains a license from the license server when booted on a supported GPU. The VM retains the license until it is shut down, at which point it releases the license back to the license server. Licensing settings persist across reboots and need only be modified if the license server address changes or the VM is switched to GPU pass-through.

For more information on how to license a vGPU for Compute VM from the NVIDIA License System, including step-by-step instructions, refer to the Virtual GPU Client Licensing User Guide.
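For a Linux Guest VM served by a Cloud License Service (CLS) or Delegated License Service (DLS) instance, client configuration typically amounts to placing a client configuration token and restarting the licensing service, as in the following sketch. The token file name is a placeholder; the default paths are described in the Virtual GPU Client Licensing User Guide.

$ sudo cp client_configuration_token_<date>.tok /etc/nvidia/ClientConfigToken/
$ sudo systemctl restart nvidia-gridd                 # the licensing service picks up the token and requests a license
$ nvidia-smi -q | grep -i -A 2 "licensed product"     # confirm that License Status shows Licensed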

Note

For vGPU for Compute deployments, one license per vGPU assigned to a VM is enforced through software. This license is valid for up to sixteen vGPU instances on a single GPU or for the assignment to a VM of one vGPU that is assigned all the physical GPU’s framebuffer. If multiple NVIDIA C‑series vGPUs are assigned to a single VM, a separate license must be obtained for each vGPU from the NVIDIA Licensing System, regardless of whether it is a Networked or Node‑Locked license.

Verifying the License Status of a Licensed NVIDIA vGPU for Compute Guest VM#

After configuring an NVIDIA vGPU for Compute client VM with a license, verify the license status by displaying the licensed product name and status.

To verify the license status of a licensed client, run nvidia-smi with the -q or --query option from within the client VM, not the hypervisor host. If the product is licensed, the expiration date is shown in the license status.

==============NVSMI LOG==============

Timestamp                                 : Tue Jun 17 16:49:09 2025
Driver Version                            : 580.46
CUDA Version                              : 13.0

Attached GPUs                             : 2
GPU 00000000:02:01.0
    Product Name                          : NVIDIA H100-80C
    Product Brand                         : NVIDIA Virtual Compute Server
    Product Architecture                  : Hopper
    Display Mode                          : Requested functionality has been deprecated
    Display Attached                      : Yes
    Display Active                        : Disabled
    Persistence Mode                      : Enabled
    Addressing Mode                       : HMM
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-a1833a31-1dd2-11b2-8e58-a589b8170988
    GPU PDI                               : N/A
    Minor Number                          : 0
    VBIOS Version                         : 00.00.00.00.00
    MultiGPU Board                        : No
    Board ID                              : 0x201
    Board Part Number                     : N/A
    GPU Part Number                       : 2331-882-A1
    FRU Part Number                       : N/A
    Platform Info
        Chassis Serial Number             : N/A
        Slot Number                       : N/A
        Tray Index                        : N/A
        Host ID                           : N/A
        Peer Type                         : N/A
        Module Id                         : N/A
        GPU Fabric GUID                   : N/A
    Inforom Version
        Image Version                     : N/A
        OEM Object                        : N/A
        ECC Object                        : N/A
        Power Management Object           : N/A
    Inforom BBX Object Flush
        Latest Timestamp                  : N/A
        Latest Duration                   : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GPU C2C Mode                          : Disabled
    GPU Virtualization Mode
        Virtualization Mode               : VGPU
        Host VGPU Mode                    : N/A
        vGPU Heterogeneous Mode           : N/A
    vGPU Software Licensed Product
        Product Name                      : NVIDIA Virtual Compute Server
        License Status                    : Licensed (Expiry: 2025-6-18 8:59:55 GMT)
...

Installing the NVIDIA GPU Operator Using a Bash Shell Script#

A bash shell script for installing the NVIDIA GPU Operator with the NVIDIA vGPU for Compute Driver is available for download from the NVIDIA AI Enterprise Infra Collection.

Note

This approach assumes there is no vGPU for Compute Driver installed on the Guest VM. The vGPU for Compute Guest Driver is installed by the GPU Operator.

Refer to the GPU Operator documentation for detailed instructions on deploying the NVIDIA vGPU for Compute Driver using the bash shell script.

Installing NVIDIA AI Enterprise Applications Software#

Installing NVIDIA AI Enterprise Applications Software using Docker and NVIDIA Container Toolkit#

Prerequisites#

Before you install any NVIDIA AI Enterprise container:

  • Ensure your vGPU for Compute Guest VM is running a supported OS distribution.

  • Ensure the VM has obtained a valid vGPU for Compute license from the NVIDIA License System.

  • Confirm that one or more NVIDIA GPUs are available and recognized by your system.

  • Make sure the vGPU for Compute Guest Driver is installed correctly. You can verify this by running nvidia-smi, as shown in the sketch below. If you see your GPU listed, you’re ready to proceed.
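A combined pre-flight check covering these prerequisites might look like the following sketch; adjust the commands to your distribution.

$ nvidia-smi                                  # the GPU and Guest Driver are visible in the VM
$ nvidia-smi -q | grep -i "license status"    # a license has been obtained from the NVIDIA License System
$ grep PRETTY_NAME /etc/os-release            # confirm the guest OS is a supported distribution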

Installing Docker Engine#

Refer to the official Docker Installation Guide for your vGPU for Compute Guest VM OS Linux distribution.

Installing the NVIDIA Container Toolkit#

The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includes a container runtime library and utilities to configure containers to leverage NVIDIA GPUs automatically. Complete documentation and frequently asked questions are available on the repository wiki. Refer to the Installing the NVIDIA Container Toolkit documentation to enable the Docker repository and install the NVIDIA Container Toolkit on the Guest VM.

Once the NVIDIA Container Toolkit is installed, to configure the Docker container runtime, refer to the Configuration documentation.
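With the toolkit installed, configuring Docker generally comes down to registering the NVIDIA runtime and restarting the daemon, as in this sketch; refer to the Configuration documentation for the options relevant to your environment.

$ sudo nvidia-ctk runtime configure --runtime=docker  # adds the NVIDIA runtime to /etc/docker/daemon.json
$ sudo systemctl restart docker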

Verifying the Installation: Run a Sample CUDA Container#

Refer to the Running a Sample Workload documentation to run a sample CUDA container test on your GPU.
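A common smoke test, shown here as a sketch, simply runs nvidia-smi inside a container with GPU access:

$ sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

If the vGPU assigned to the Guest VM appears in the output, the container runtime is configured correctly.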

Accessing NVIDIA AI Enterprise Containers on NGC#

NVIDIA AI Enterprise Application Software is available through the NVIDIA NGC Catalog and identifiable by the NVIDIA AI Enterprise Supported label.

The container image for each application or framework contains the entire user-space software stack required to run it, namely, the CUDA libraries, cuDNN, any required Magnum IO components, TensorRT, and the framework itself.

  1. Generate an NGC API key to access the NVIDIA AI Enterprise Software in the NGC Catalog using the URL provided to you by NVIDIA.

  2. Authenticate with Docker to NGC Registry. In your shell, run:

    docker login nvcr.io
    Username: $oauthtoken
    Password: <paste-your-NGC_API_key-here>
    
    A successful login (Login Succeeded) lets you pull containers from NGC.
    
  3. From the NVIDIA vGPU for Compute VM, browse the NGC Catalog for containers labeled NVIDIA AI Enterprise Supported.

  4. Copy the relevant docker pull command.

    sudo docker pull nvcr.io/nvaie/rapids-pb25h1:x.y.z-runtime
    

    Where x.y.z is the version of your container.

  5. Run the container with GPU access.

    sudo docker run --gpus all -it --rm nvcr.io/nvaie/rapids-pb25h1:x.y.z-runtime
    

    Where x.y.z is the version of your container.

    This command launches an interactive container using the vGPUs available on the Guest VM.

Installing the NVIDIA AI Enterprise Software Components Using Podman#

You can use Podman (an alternative container runtime to Docker) for running NVIDIA AI Enterprise containers. The installation flow is similar to Docker. For more information, refer to the NVIDIA AI Enterprise: RHEL with KVM Deployment Guide.
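With Podman, GPU access is typically provided through the Container Device Interface (CDI) generated by the NVIDIA Container Toolkit. The following is a sketch; refer to the deployment guide for the exact steps for your release.

$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml  # generate the CDI specification for the GPUs in the VM
$ podman run --rm --device nvidia.com/gpu=all ubuntu nvidia-smi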

Installing NVIDIA AI Enterprise Software Components Using Kubernetes and NVIDIA Cloud Native Stack#

NVIDIA provides the Cloud Native Stack (CNS), a collection of software to run cloud-native workloads on NVIDIA GPUs. NVIDIA Cloud Native Stack is based on Ubuntu/RHEL, Kubernetes, Helm, and the NVIDIA GPU Operator and Network Operator.

Refer to this repository for a series of installation guides with step-by-step instructions based on your OS distribution. The installation guides also offer instructions to deploy an application from the NGC Catalog to validate that GPU resources are accessible and functional.
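Once the cluster is up, a quick way to confirm that GPU resources are exposed is to check the node's allocatable resources and run a minimal GPU pod. The following is a sketch; the pod name is arbitrary and the image tag is a placeholder for a CUDA base image available to you.

$ kubectl describe nodes | grep -i nvidia.com/gpu     # GPUs advertised by the GPU Operator's device plugin
$ cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvcr.io/nvidia/cuda:<tag>       # placeholder: use a CUDA base image tag available in your registry
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
$ kubectl logs gpu-smoke-test                         # once the pod completes, the log shows the nvidia-smi output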

NVIDIA vGPU for Compute Key Features#

MIG Backed vGPU#

A Multi Instance GPU (MIG)-backed vGPU is a vGPU that resides on a GPU instance in a MIG-capable physical GPU. MIG-backed vGPUs are created from individual MIG slices and assigned to virtual machines. Each MIG-backed vGPU resident on a GPU has exclusive access to the GPU instance’s engines, including the compute and video decode engines. This model combines MIG’s hardware-level spatial partitioning with the temporal partitioning capabilities of vGPU, offering flexibility in how GPU resources are shared across workloads.

In a MIG-backed vGPU, processes running on one vGPU execute in parallel with processes running on other vGPUs on the same physical GPU. Each process runs only on its assigned vGPU, alongside processes on other vGPUs.

Note

  1. NVIDIA vGPU for Compute supports MIG-Backed vGPUs on all the GPU boards that support Multi Instance GPU (MIG).

  2. Universal MIG technology on Blackwell enables both compute and graphics workloads to be consolidated and securely isolated on the same physical GPU.

A MIG-backed vGPU is ideal when running multiple high-priority workloads that require guaranteed, consistent performance and strong isolation, such as in multi-tenant environments, MLOps platforms, or shared research clusters. By partitioning a GPU into dedicated hardware instances, teams can run training, inference, video analytics, and data processing jobs simultaneously with consistent performance, maximizing utilization while ensuring each workload meets its SLA.

Supported MIG-Backed vGPU Configurations on a Single GPU#

NVIDIA vGPU supports both homogeneous and mixed MIG-backed virtual GPU configurations, and on GPUs with MIG time-slicing support, each MIG instance supports multiple time-sliced vGPU VMs.

On the NVIDIA RTX PRO 6000 Blackwell Server Edition, up to 4 MIG slices can be created on a single GPU. Within each MIG slice, 1 to 3 time-sliced vGPUs for Compute, each with an 8 GB frame buffer, can be created, depending on workload requirements and user density goals. Each of these vGPU instances can be assigned to a separate VM, enabling up to 12 virtual machines to share a single physical GPU while still benefiting from the isolation boundaries provided by MIG.

MIG-Backed Time-Sliced vGPU

The figure above shows how each MIG slice on the NVIDIA RTX PRO 6000 Blackwell can be time-sliced across multiple VMs - supporting up to 3 NVIDIA vGPU for Compute VMs per slice - to maximize user density while maintaining performance isolation through hardware-level partitioning.

Note

You can determine whether time-sliced, MIG-backed vGPUs are supported with your GPU on your chosen hypervisor by running the nvidia-smi -q command.

$ nvidia-smi -q
vGPU Device Capability
    MIG Time-Slicing                  : Supported
    MIG Time-Slicing Mode             : Enabled
  • If MIG Time-Slicing is shown as Supported, the GPU supports time-sliced, MIG-backed vGPUs.

  • If MIG Time-Slicing Mode is shown as Enabled, your chosen hypervisor supports time-sliced, MIG-backed vGPUs on GPUs that also support this feature.

The NVIDIA A100 PCIe 40GB card, based on the NVIDIA Ampere architecture, has one physical GPU and can support several types of MIG-backed vGPU configurations. The following figure shows examples of valid homogeneous and mixed MIG-backed virtual GPU configurations on NVIDIA A100 PCIe 40GB.

  • A valid homogeneous configuration with 3 A100-2-10C vGPUs on 3 MIG 2g.10gb GPU instances

  • A valid homogeneous configuration with 2 A100-3-20C vGPUs on 2 MIG 3g.20gb GPU instances

  • A valid mixed configuration with 1 A100-4-20C vGPU on a MIG 4g.20gb GPU instance, 1 A100-2-10C vGPU on a MIG 2g.10gb GPU instance, and 1 A100-1-5C vGPU on a MIG 1g.5gb GPU instance

Valid MIG-Backed Virtual GPU Configurations on a Single GPU

Configuring MIG-Backed vGPU#

Configuring a GPU for MIG-Backed vGPUs#

To support GPU Instances with NVIDIA vGPU, a GPU must be configured with MIG mode enabled, and GPU Instances and Compute Instances must be created and configured on the physical GPU.

Prerequisites

  • The NVIDIA Virtual GPU Manager is installed on the hypervisor host.

  • You have root user privileges on your hypervisor host machine.

  • You have determined which GPU instances correspond to the vGPU types of the MIG-backed vGPUs you will create.

  • Other processes, such as CUDA applications, monitoring applications, or the nvidia-smi command, do not use the GPU.

Steps

  1. Enable MIG mode for a GPU.

    Note

    For VMware vSphere, only enabling MIG mode is required because VMware vSphere creates the GPU Instances and Compute Instances.

  2. Create GPU instances on a MIG-enabled GPU.

  3. Create Compute instances in a GPU instance.

After configuring a GPU for MIG-backed vGPUs, create the vGPUs you need and add them to their VMs.

Enabling MIG Mode for a GPU#

Perform this task in your hypervisor command shell.

  1. Open a command shell as the root user on your hypervisor host machine. You can use a secure shell (SSH) on all supported hypervisors. Individual hypervisors may provide additional means for logging in. For details, refer to the documentation for your hypervisor.

  2. Determine whether MIG mode is enabled. Use the nvidia-smi command for this purpose. By default, MIG mode is disabled. This example shows that MIG mode is disabled on GPU 0.

    Note

    In the output from nvidia-smi, the NVIDIA A100 HGX 40GB GPU is referred to as A100-SXM4-40GB.

    $ nvidia-smi -i 0
        +-----------------------------------------------------------------------------+
        | NVIDIA-SMI 550.54.16   Driver Version: 550.54.16    CUDA Version:  12.3     |
        |-------------------------------+----------------------+----------------------+
        | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
        | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
        |                               |                      |               MIG M. |
        |===============================+======================+======================|
        |   0  A100-SXM4-40GB      On   | 00000000:36:00.0 Off |                    0 |
        | N/A   29C    P0    62W / 400W |      0MiB / 40537MiB |      6%      Default |
        |                               |                      |             Disabled |
        +-------------------------------+----------------------+----------------------+
    
  3. If MIG mode is disabled, enable it.

    $ nvidia-smi -i [gpu-ids] -mig 1
    

    gpu-ids - A comma-separated list of GPU indexes, PCI bus IDs, or UUIDs specifying the GPUs on which you want to enable MIG mode. If gpu-ids is omitted, MIG mode is enabled on all GPUs on the system.

    This example enables MIG mode on GPU 0.

    $ nvidia-smi -i 0 -mig 1
    Enabled MIG Mode for GPU 00000000:36:00.0
    All done.
    

    Note

    If another process is using the GPU, this command fails and displays a warning message that MIG mode for the GPU is in the pending enable state. In this situation, stop all GPU processes and retry the command.

  4. VMware vSphere ESXi with GPUs based on the NVIDIA Ampere architecture only: Reboot the hypervisor host. If you are using a different hypervisor, or GPUs based on the NVIDIA Hopper GPU architecture or a later architecture, omit this step.

  5. Query the GPUs on which you enabled MIG mode to confirm that MIG mode is enabled. This example queries GPU 0 for the PCI bus ID and MIG mode in comma-separated values (CSV) format.

    $ nvidia-smi -i 0 --query-gpu=pci.bus_id,mig.mode.current --format=csv
    pci.bus_id, mig.mode.current
    00000000:36:00.0, Enabled
    
Creating GPU Instances on a MIG-Enabled GPU#

Note

If you are using VMware vSphere, omit this task. VMware vSphere creates the GPU instances automatically.

Perform this task in your hypervisor command shell.

  1. Open a command shell as the root user on your hypervisor host machine if necessary.

  2. List the GPU instance profiles that are available on your GPU. When you create a GPU instance, you must specify the profile by its ID, not its name.

    $ nvidia-smi mig -lgip
        +--------------------------------------------------------------------------+
        | GPU instance profiles:                                                   |
        | GPU   Name          ID    Instances   Memory     P2P    SM    DEC   ENC  |
        |                           Free/Total   GiB              CE    JPEG  OFA  |
        |==========================================================================|
        |   0  MIG 1g.5gb     19     7/7        4.95       No     14     0     0   |
        |                                                          1     0     0   |
        +--------------------------------------------------------------------------+
        |   0  MIG 2g.10gb    14     3/3        9.90       No     28     1     0   |
        |                                                          2     0     0   |
        +--------------------------------------------------------------------------+
        |   0  MIG 3g.20gb     9     2/2        19.79      No     42     2     0   |
        |                                                          3     0     0   |
        +--------------------------------------------------------------------------+
        |   0  MIG 4g.20gb     5     1/1        19.79      No     56     2     0   |
        |                                                          4     0     0   |
        +--------------------------------------------------------------------------+
        |   0  MIG 7g.40gb     0     1/1        39.59      No     98     5     0   |
        |                                                          7     1     1   |
        +--------------------------------------------------------------------------+
    
  3. Discover the GPU instance profiles that are mapped to the different vGPU types.

    nvidia-smi vgpu -s -v
    

    For example, for the H100, the listing attributes of certain vGPU profiles look something like this:

    # nvidia-smi vgpu -s -v -i 1
    GPU 00000000:1A:00.0
    vGPU Type ID : 0x335
    Name : NVIDIA H100-1-10C
    Class : Compute
    GPU Instance Profile ID : 19
    ...
    vGPU Type ID : 0x336
    Name : NVIDIA H100-2-20C
    Class : Compute
    GPU Instance Profile ID : 14
  4. Create the GPU instances with a default compute instance corresponding to the vGPU types of the MIG-backed vGPUs you will create.

    $ nvidia-smi mig -cgi gpu-instance-profile-ids -C
    

    gpu-instance-profile-ids - A comma-separated list of GPU instance profile IDs specifying the GPU instances you want to create.

    This example creates two GPU instances of type 2g.10gb with profile ID 14.

    $ nvidia-smi mig -cgi 14,14 -C
    Successfully created GPU instance ID  5 on GPU  2 using profile MIG 2g.10gb (ID 14)
    Successfully created GPU instance ID  3 on GPU  2 using profile MIG 2g.10gb (ID 14)
    

Note

If you are creating a GPU Instance to support a 1:1 MIG-backed vGPU on a platform other than VMware vSphere, you can optionally create non-default Compute Instances for that vGPU, by following the steps outlined in the Creating Non-Default Compute Instances in a GPU Instance for 1:1 vGPUs section.

Creating Non-Default Compute Instances in a GPU Instance for 1:1 vGPUs#

This task is required only if you plan to use a 1:1, MIG-backed vGPU on a GPU Instance and wish to create non-default Compute Instances for that vGPU. This option is only available on platforms other than VMware vSphere.

Perform this task in your hypervisor command shell.

  1. Open a command shell as the root user on your hypervisor host machine if necessary.

  2. List the available GPU instances.

    $ nvidia-smi mig -lgi
        +----------------------------------------------------+
        | GPU instances:                                     |
        | GPU   Name          Profile  Instance   Placement  |
        |                       ID       ID       Start:Size |
        |====================================================|
        |   2  MIG 2g.10gb      14        3          0:2     |
        +----------------------------------------------------+
        |   2  MIG 2g.10gb      14        5          4:2     |
        +----------------------------------------------------+
    
  3. Create the compute instances that you need within each GPU instance.

    $ nvidia-smi mig -cci -gi gpu-instance-ids
    

    gpu-instance-ids - A comma-separated list of GPU instance IDs that specifies the GPU instances within which you want to create the compute instances.

    Caution

    To avoid an inconsistent state between a guest VM and the hypervisor host, do not create compute instances from the hypervisor on a GPU instance on which an active guest VM is running. Runtime changes to the vGPU’s Compute Instance configuration may be done by the guest VM itself, as explained in Modifying a MIG-Backed vGPU’s Configuration.

    This example creates a compute instance on each of GPU instances 3 and 5.

    $ nvidia-smi mig -cci -gi 3,5
    Successfully created compute instance on GPU  0 GPU instance ID  1 using profile ID  2
    Successfully created compute instance on GPU  0 GPU instance ID  2 using profile ID  2
    
  4. Verify that the compute instances were created within each GPU instance.

    $ nvidia-smi
        +-----------------------------------------------------------------------------+
        | MIG devices:                                                                |
        +------------------+----------------------+-----------+-----------------------+
        | GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
        |      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
        |                  |                      |        ECC|                       |
        |==================+======================+===========+=======================|
        |  2    3   0   0  |      0MiB /  9984MiB | 28      0 |  2   0    1    0    0 |
        |                  |      0MiB / 16383MiB |           |                       |
        +------------------+----------------------+-----------+-----------------------+
        |  2    5   0   1  |      0MiB /  9984MiB | 28      0 |  2   0    1    0    0 |
        |                  |      0MiB / 16383MiB |           |                       |
        +------------------+----------------------+-----------+-----------------------+
    
        +-----------------------------------------------------------------------------+
        | Processes:                                                                  |
        |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
        |        ID   ID                                                   Usage      |
        |=============================================================================|
    

    Note

    • Additional Compute Instances created in a VM at runtime are destroyed when the VM is shut down or rebooted. After the shutdown or reboot, only one Compute Instance remains in the VM.

    • On NVIDIA B200 HGX, the following Compute Instance combinations are blocked in vGPU for Compute Guest VMs running on full-sized (7-slice) GPU Instances:

      • 4-slice

      • 2-slice + 2-slice + 2-slice

      • 3-slice + 2-slice + 2-slice

      • 2-slice + 2-slice + 3-slice

Disabling MIG Mode for One or More GPUs#

If a GPU you want to use for time-sliced vGPUs or GPU passthrough has previously been configured for MIG-backed vGPUs, disable MIG mode on the GPU.

Prerequisites

  • The NVIDIA Virtual GPU Manager is installed on the hypervisor host.

  • You have root user privileges on your hypervisor host machine.

  • Other processes, such as CUDA applications, monitoring applications, or the nvidia-smi command, do not use the GPU.

Steps

Perform this task in your hypervisor command shell.

  1. Open a command shell as the root user on your hypervisor host machine. You can use a secure shell (SSH) on all supported hypervisors. Individual hypervisors may provide additional means for logging in. For details, refer to the documentation for your hypervisor.

  2. Determine whether MIG mode is disabled. Use the nvidia-smi command for this purpose. By default, MIG mode is disabled but might have previously been enabled. This example shows that MIG mode is enabled on GPU 0.

    Note

    In the output from nvidia-smi, the NVIDIA A100 HGX 40GB GPU is referred to as A100-SXM4-40GB.

    $ nvidia-smi -i 0
        +-----------------------------------------------------------------------------+
        | NVIDIA-SMI 550.54.16    Driver Version: 550.54.16   CUDA Version:  12.3     |
        |-------------------------------+----------------------+----------------------+
        | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
        | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
        |                               |                      |               MIG M. |
        |===============================+======================+======================|
        |   0  A100-SXM4-40GB      Off  | 00000000:36:00.0 Off |                    0 |
        | N/A   29C    P0    62W / 400W |      0MiB / 40537MiB |      6%      Default |
        |                               |                      |              Enabled |
        +-------------------------------+----------------------+----------------------+
    
  3. If MIG mode is enabled, disable it.

    $ nvidia-smi -i [gpu-ids] -mig 0
    

    gpu-ids - A comma-separated list of GPU indexes, PCI bus IDs, or UUIDs specifying the GPUs on which you want to disable MIG mode. If gpu-ids is omitted, MIG mode is disabled for all GPUs in the system.

    This example disables MIG Mode on GPU 0.

    $ sudo nvidia-smi -i 0 -mig 0
    Disabled MIG Mode for GPU 00000000:36:00.0
    All done.
    
  4. Confirm that MIG mode was disabled. Use the nvidia-smi command for this purpose. This example shows that MIG mode is disabled on GPU 0.

    $ nvidia-smi -i 0
        +-----------------------------------------------------------------------------+
        | NVIDIA-SMI 550.54.16    Driver Version: 550.54.16   CUDA Version:  12.3     |
        |-------------------------------+----------------------+----------------------+
        | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
        | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
        |                               |                      |               MIG M. |
        |===============================+======================+======================|
        |   0  A100-SXM4-40GB      Off  | 00000000:36:00.0 Off |                    0 |
        | N/A   29C    P0    62W / 400W |      0MiB / 40537MiB |      6%      Default |
        |                               |                      |             Disabled |
        +-------------------------------+----------------------+----------------------+
    

Modifying a MIG-Backed vGPU’s Configuration From a Guest VM#

If you want to replace the compute instances that were created when the GPU was configured for MIG-backed vGPUs, you can delete them and then add new compute instances from within the guest VM.

Note

  • From within a guest VM, you can modify the configuration only of MIG-backed vGPUs that occupy an entire GPU instance. For time-sliced, MIG-backed vGPUs, you must create compute instances as explained in Create Compute Instances in a GPU Instance and in Creating Non-Default Compute Instances in a GPU Instance for 1:1 vGPUs.

  • On NVIDIA B200 HGX, the following Compute Instance combinations are blocked in vGPU for Compute Guest VMs running on full-sized (7-slice) GPU Instances:

    • 4-slice

    • 2-slice + 2-slice + 2-slice

    • 3-slice + 2-slice + 2-slice

    • 2-slice + 2-slice + 3-slice

A MIG-backed vGPU that occupies an entire GPU instance is assigned all of the instance’s framebuffer. For such vGPUs, the maximum vGPUs per GPU instance in the tables in Virtual GPU Types for Supported GPUs is always 1.

Prerequisites

  • You have root user privileges in the guest VM.

  • Other processes, such as CUDA applications, monitoring applications, or the nvidia-smi command, do not use the GPU instance.

Steps

Perform this task in a guest VM command shell.

  1. Open a command shell as the root user in the guest VM. You can use a secure shell (SSH) on all supported hypervisors. Individual hypervisors may provide additional means for logging in. For details, refer to the documentation for your hypervisor.

  2. List the available GPU instances.

    $ nvidia-smi mig -lgi
        +----------------------------------------------------+
        | GPU instances:                                     |
        | GPU   Name          Profile  Instance   Placement  |
        |                       ID       ID       Start:Size |
        |====================================================|
        |   0  MIG 2g.10gb       0        0          0:8     |
        +----------------------------------------------------+
    
  3. Optional: If compute instances were created when the GPU was configured for MIG-backed vGPUs that you no longer require, delete them.

    $ nvidia-smi mig -dci -ci compute-instance-id -gi gpu-instance-id
    

    compute-instance-id - The ID of the compute instance that you want to delete.

    gpu-instance-id - The ID of the GPU instance from which you want to delete the compute instance.

    Note

    This command fails if another process is using the GPU instance. In this situation, stop all processes using the GPU instance and retry the command.

    This example deletes compute instance 0 from GPU instance 0 on GPU 0.

    $ nvidia-smi mig -dci -ci 0 -gi 0
    Successfully destroyed compute instance ID  0 from GPU  0 GPU instance ID  0
    
  4. List the compute instance profiles that are available for your GPU instance.

    $ nvidia-smi mig -lcip
    

    This example shows that one MIG 2g.10gb compute instance or two MIG 1c.2g.10gb compute instances can be created within the GPU instance.

    $ nvidia-smi mig -lcip
        +-------------------------------------------------------------------------------+
        | Compute instance profiles:                                                    |
        | GPU    GPU      Name          Profile  Instances   Exclusive      Shared      |
        |      Instance                   ID     Free/Total     SM      DEC   ENC   OFA |
        |        ID                                                     CE    JPEG      |
        |===============================================================================|
        |   0     0       MIG 1c.2g.10gb   0      2/2           14       1     0     0  |
        |                                                                2     0        |
        +-------------------------------------------------------------------------------+
        |   0     0       MIG 2g.10gb      1*     1/1           28       1     0     0  |
        |                                                                2     0        |
        +-------------------------------------------------------------------------------+
    
  5. Create the compute instances that you need within the available GPU instance. Run the following command to create each compute instance individually.

    $ nvidia-smi mig -cci compute-instance-profile-id -gi gpu-instance-id
    

    compute-instance-profile-id - The compute instance profile ID that specifies the compute instance.

    gpu-instance-id - The GPU instance ID that specifies the GPU instance within which you want to create the compute instance.

    Note

    This command fails if another process is using the GPU instance. In this situation, stop all GPU processes and retry the command.

    This example creates a MIG 2g.10gb compute instance on GPU instance 0.

    $ nvidia-smi mig -cci 1 -gi 0
    Successfully created compute instance ID  0 on GPU  0 GPU instance ID  0 using profile MIG 2g.10gb (ID  1)
    

    This example creates two MIG 1c.2g.10gb compute instances on GPU instance 0 by running the same command twice.

    $ nvidia-smi mig -cci 0 -gi 0
    Successfully created compute instance ID  0 on GPU  0 GPU instance ID  0 using profile MIG 1c.2g.10gb (ID  0)
    $ nvidia-smi mig -cci 0 -gi 0
    Successfully created compute instance ID  1 on GPU  0 GPU instance ID  0 using profile MIG 1c.2g.10gb (ID  0)
    
  6. Verify that the compute instances were created within the GPU instance. Use the nvidia-smi command for this purpose. This example confirms that a MIG 2g.10gb compute instance was created on GPU instance 0.

    nvidia-smi
      Mon Mar 25 19:01:24 2024
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 550.54.16    Driver Version: 550.54.16   CUDA Version:  12.3     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  GRID A100X-2-10C     On  | 00000000:00:08.0 Off |                   On |
      | N/A   N/A    P0    N/A /  N/A |   1058MiB / 10235MiB |     N/A      Default |
      |                               |                      |              Enabled |
      +-------------------------------+----------------------+----------------------+
    
      +-----------------------------------------------------------------------------+
      | MIG devices:                                                                |
      +------------------+----------------------+-----------+-----------------------+
      | GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
      |      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
      |                  |                      |        ECC|                       |
      |==================+======================+===========+=======================|
      |  0    0   0   0  |   1058MiB / 10235MiB | 28      0 |  2   0    1    0    0 |
      |                  |      0MiB /  4096MiB |           |                       |
      +------------------+----------------------+-----------+-----------------------+
    
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |  No running processes found                                                 |
      +-----------------------------------------------------------------------------+
    

    This example confirms that two MIG 1c.2g.10gb compute instances were created on GPU instance 0.

    $ nvidia-smi
        Mon Mar 25 19:01:24 2024
        +-----------------------------------------------------------------------------+
        | NVIDIA-SMI 550.54.16    Driver Version: 550.54.16   CUDA Version:  12.3     |
        |-------------------------------+----------------------+----------------------+
        | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
        | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
        |                               |                      |               MIG M. |
        |===============================+======================+======================|
        |   0  GRID A100X-2-10C     On  | 00000000:00:08.0 Off |                   On |
        | N/A   N/A    P0    N/A /  N/A |   1058MiB / 10235MiB |     N/A      Default |
        |                               |                      |              Enabled |
        +-------------------------------+----------------------+----------------------+
    
        +-----------------------------------------------------------------------------+
        | MIG devices:                                                                |
        +------------------+----------------------+-----------+-----------------------+
        | GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
        |      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
        |                  |                      |        ECC|                       |
        |==================+======================+===========+=======================|
        |  0    0   0   0  |   1058MiB / 10235MiB | 14      0 |  2   0    1    0    0 |
        |                  |      0MiB /  4096MiB |           |                       |
        +------------------+                      +-----------+-----------------------+
        |  0    0   1   1  |                      | 14      0 |  2   0    1    0    0 |
        |                  |                      |           |                       |
        +------------------+----------------------+-----------+-----------------------+
    
        +-----------------------------------------------------------------------------+
        | Processes:                                                                  |
        |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
        |        ID   ID                                                   Usage      |
        |=============================================================================|
        |  No running processes found                                                 |
        +-----------------------------------------------------------------------------+
    

Monitoring MIG-backed vGPU Activity#

Note

  1. MIG-backed vGPU activity cannot be monitored on GPUs based on the NVIDIA Ampere GPU architecture because the required hardware feature is absent.

  2. On the NVIDIA RTX Pro 6000 Blackwell Server Edition, GPM metrics are supported only for 1:1 MIG-backed vGPUs and are not available for time-sliced, MIG-backed vGPUs.

  3. The --gpm-metrics option is supported only on MIG-backed vGPUs that are allocated all of the GPU instance’s frame buffer.

For more information, refer to the Monitoring MIG-backed vGPU Activity documentation.

Device Groups#

Device Groups provide an abstraction layer for multi-device virtual hardware provisioning. They enable platforms to automatically detect sets of physically connected devices (such as GPUs linked via NVLink or GPU-NIC pairs) at the hardware level and present them as a single logical unit to VMs. This abstraction is particularly vital for AI workloads that depend on low-latency, high-bandwidth communication, such as distributed model training, inference, and large-scale data processing, ensuring maximum utilization of the underlying hardware topology.

Device groups can consist of two or more hardware devices that share a common PCIe switch or a direct interconnect. This simplifies virtual hardware assignment and enables:

  • Optimized Multi-GPU and GPU-NIC communication: NVLink-connected GPUs can be provisioned together to maximize peer-to-peer bandwidth and minimize latency, which is ideal for large-batch training and NCCL all-reduce-heavy workloads. Similarly, GPU-NIC pairs located under the same PCIe switch or capable of delivering optimal GPUDirect RDMA performance are grouped together, enabling high-throughput data ingestion directly into GPU memory for training or inference workloads. Adjacent NICs that do not meet the required performance thresholds are automatically excluded to avoid bottlenecks.

  • Topology consistency: Unlike manual device assignment, Device Groups guarantee correct placement across PCIe switches and interconnects, even after reboots or events like live migration.

  • Simplified and reliable provisioning: By abstracting the PCIe/NVLink topology into logical units, device groups eliminate the need for scripting or topology mapping, reducing the risk of misconfiguration and enabling faster deployment of AI clusters.

Device Groups

This figure illustrates how devices (GPUs and NICs) that share a common PCIe switch or a direct GPU interconnect can be presented as a device group. On the right side, we can see that although two NICs are connected to the same PCIe switch as the GPU, only one NIC is included in the device group. This is because the NVIDIA driver identifies and exposes only the GPU-NIC pairings that meet the necessary criteria like GPUDirect RDMA. Adjacent NICs that do not satisfy these requirements are excluded.

For more information regarding Hypervisor Platform support for Device Groups, refer to the vGPU Device Groups documentation.

GPUDirect RDMA and GPUDirect Storage#

NVIDIA GPUDirect Remote Direct Memory Access (RDMA) is a technology in NVIDIA GPUs that enables direct data exchange between GPUs and a third-party peer device using PCIe. GPUDirect RDMA enables network devices to access the vGPU frame buffer directly, bypassing CPU host memory altogether. The third-party devices could be network interfaces such as NVIDIA ConnectX SmartNICs or BlueField DPUs, or video acquisition adapters.

GPUDirect Storage (GDS) enables a direct data path between local or remote storage, such as NFS servers or NVMe/NVMe over Fabric (NVMe-oF), and GPU memory. GDS performs direct memory access (DMA) transfers between GPU memory and storage. DMA avoids a bounce buffer through the CPU. This direct path increases system bandwidth and decreases the latency and utilization load on the CPU.

GPUDirect technology is supported only on a subset of vGPUs and guest OS releases.

GPUDirect RDMA and GPUDirect Storage Known Issues and Limitations#

Starting with GPUDirect Storage technology release 1.7.2, the following limitations apply:

  • GPUDirect Storage technology is not supported on GPUs based on the NVIDIA Ampere GPU architecture.

  • On GPUs based on the NVIDIA Ada Lovelace, Hopper, and Blackwell GPU architectures, GPUDirect Storage technology is supported only with the guest driver for Linux based on NVIDIA Linux open GPU kernel modules.

GPUDirect Storage technology releases before 1.7.2 are supported only with guest drivers for Linux kernel versions earlier than 6.6.

GPUDirect Storage technology is supported only on the following guest OS releases:

  • Red Hat Enterprise Linux 8.8+

  • Ubuntu 22.04 LTS

  • Ubuntu 24.04 LTS

Hypervisor Platform Support for GPUDirect RDMA and GPUDirect Storage#

Hypervisor Platform Support for GPUDirect RDMA and GPUDirect Storage#

Hypervisor Platform

Version

Red Hat Enterprise Linux with KVM

8.8+

Ubuntu

  • 22.04

  • 24.04

VMware vSphere

  • 8

  • 9

vGPU Support for GPUDirect RDMA and GPUDirect Storage#

GPUDirect RDMA and GPUDirect Storage technologies are supported on all time-sliced and MIG-backed NVIDIA vGPU for Compute vGPUs on physical GPUs that support single root I/O virtualization (SR-IOV).

For a list of supported GPUs, refer to the Supported NVIDIA GPUs and Networking section in the NVIDIA AI Enterprise Infra Support Matrix.

Guest OS Releases Support for GPUDirect RDMA and GPUDirect Storage#

Linux only. GPUDirect technology is not supported on Windows.

Network Interface Cards Support for GPUDirect RDMA and GPUDirect Storage#

GPUDirect technology is supported on the following network interface cards:

  • NVIDIA ConnectX-8 SmartNIC

  • NVIDIA ConnectX-7 SmartNIC

  • Mellanox ConnectX-6 SmartNIC

  • Mellanox ConnectX-5 Ethernet adapter card

Heterogeneous vGPU#

Heterogeneous vGPU allows a single physical GPU to simultaneously support multiple vGPU profiles with different memory allocations (framebuffer sizes). This configuration is particularly beneficial for environments where VMs have diverse GPU resource requirements. By enabling the same physical GPU to host vGPUs of varying sizes, heterogeneous vGPU optimizes overall resource usage, ensuring VMs access only the necessary GPU resources and preventing underutilization.

When a GPU is configured for heterogeneous vGPU, its behavior during events like a host reboot, NVIDIA Virtual GPU Manager reload, or GPU reset varies by hypervisor. This configuration only supports the Best Effort and Equal Share schedulers.

Heterogeneous vGPU is supported on Volta and later GPUs. For additional information and operational instructions across different hypervisors, refer to the Heterogeneous vGPU documentation.

Platform Support for Heterogeneous vGPUs#

Platform Support for Heterogeneous vGPUs#

Hypervisor Platform

NVIDIA AI Enterprise Infra Release

Documentation

Red Hat Enterprise Linux with KVM

  • NVIDIA AI Enterprise Infra 6.x

  • NVIDIA AI Enterprise Infra 7.0

Configuring a GPU for Heterogeneous vGPU on RHEL KVM

Canonical Ubuntu with KVM

  • NVIDIA AI Enterprise Infra 6.x

  • NVIDIA AI Enterprise Infra 7.0

Configuring a GPU for Heterogeneous vGPU on Linux KVM

VMware vSphere

  • NVIDIA AI Enterprise Infra 6.x

  • NVIDIA AI Enterprise Infra 7.0

Configuring a GPU for Heterogeneous vGPU on VMware vSphere

Live Migration#

Live migration enables the seamless transfer of VMs configured with NVIDIA vGPUs from one physical host to another without downtime. This capability allows enterprises to maintain continuous operations during infrastructure changes, workload balancing, or resource reallocation with minimal disruption. Live migration offers significant operational benefits, including enhanced business continuity, scalability, and agility.

For additional information about this feature and instructions on how to perform the operation across different hypervisors, refer to the vGPU Live Migration documentation.

Live Migration Known Issues and Limitations#

Platform Support for Live Migration#

Platform Support for Live Migration#

Hypervisor Platform

Version

NVIDIA AI Enterprise Infra Release

Documentation

Red Hat Enterprise Linux with KVM

  • 9.4

  • 9.6

  • 10.0

  • NVIDIA AI Enterprise Infra 6.x

  • NVIDIA AI Enterprise Infra 7.0

Migrating a VM Configured with NVIDIA vGPU for Compute on RHEL KVM

Ubuntu with KVM

24.04 LTS

  • NVIDIA AI Enterprise Infra 6.x

  • NVIDIA AI Enterprise Infra 7.0

Migrating a VM Configured with NVIDIA vGPU for Compute on Linux KVM

VMware vSphere

  • 8

  • 9

All active NVIDIA AI Enterprise Infra Releases

Migrating a VM Configured with NVIDIA vGPU for Compute on VMware vSphere

Note

Live Migration is not supported between RHEL 10 and RHEL 9.4.

vGPU Support for Live Migration#

For a list of supported GPUs, refer to the Supported NVIDIA GPUs and Networking section in the NVIDIA AI Enterprise Infra Support Matrix.

Note

  • Live Migration is not supported between the 80GB PCIe and 94GB NVL variants of GPU boards.

  • Live Migration is not supported between H200, H800, and H100 GPU boards.

Multi-vGPU and P2P#

Multi-vGPU technology allows a single VM to simultaneously leverage multiple vGPUs, significantly enhancing its computational capabilities. Unlike standard vGPU configurations that virtualize a single physical GPU for sharing across multiple VMs, Multi-vGPU presents resources from several vGPU devices to a single VM. These vGPU devices are not required to reside on the same physical GPU; they can be distributed across separate physical GPUs, pooling their collective power to meet the demands of high-performance workloads.

This technology is particularly advantageous for AI training and inference workloads that require extensive computational power. It optimizes resource allocation by enabling applications within a VM to access dedicated GPU resources. For instance, a VM configured with two NVIDIA A100 GPUs using Multi-vGPU can run large-scale AI models more efficiently than with a single GPU. This dedicated assignment eliminates resource contention between different AI processes within the same VM, ensuring optimal and predictable performance for critical tasks. The ability to aggregate computational power from multiple vGPUs makes Multi-vGPU an ideal solution for scaling complex AI model development and deployment.
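
For illustration only: inside a VM configured with multiple vGPUs, each vGPU is exposed to applications as a separate CUDA device. A minimal sketch that enumerates the devices an application would distribute work across:

```cpp
// Sketch: enumerate the CUDA devices (vGPUs) visible inside a Multi-vGPU VM.
// Build (for example): nvcc -o list_devices list_devices.cu
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    std::printf("CUDA devices visible in this VM: %d\n", count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop{};
        cudaGetDeviceProperties(&prop, dev);
        std::printf("  device %d: %s, %.1f GiB\n",
                    dev, prop.name,
                    prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```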

Peer-To-Peer (P2P) CUDA Transfers#

Peer-to-Peer (P2P) CUDA transfers enable device memory on vGPUs that reside on different physical GPUs, but are assigned to the same VM, to be accessed from within CUDA kernels. NVLink is a high-bandwidth interconnect that enables fast communication between such vGPUs.

P2P CUDA transfers over NVLink are supported only on a subset of vGPUs, hypervisor releases, and guest OS releases.
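
As an illustrative sketch (assuming two NVLink-connected vGPUs assigned to the same VM, per the support tables below), a CUDA application enables peer access in both directions and then copies directly between the two devices:

```cpp
// Sketch: enable bidirectional peer access between two vGPUs and perform a
// direct device-to-device copy over NVLink. Error handling is abbreviated.
// Build (for example): nvcc -o p2p_copy p2p_copy.cu
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);   // can device 0 access device 1?
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    if (!can01 || !can10) {
        std::printf("P2P not available between devices 0 and 1\n");
        return 0;
    }

    const size_t len = 64 << 20;             // 64 MiB
    void *buf0 = nullptr, *buf1 = nullptr;

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);        // flags must be 0
    cudaMalloc(&buf0, len);

    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);
    cudaMalloc(&buf1, len);

    // Direct device-to-device copy, with no staging through host memory.
    cudaMemcpyPeer(buf1, 1, buf0, 0, len);
    cudaDeviceSynchronize();
    std::printf("Copied %zu bytes from device 0 to device 1 peer-to-peer\n", len);

    cudaFree(buf1);
    cudaSetDevice(0);
    cudaFree(buf0);
    return 0;
}
```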

Peer-to-Peer CUDA Transfers Known Issues and Limitations#

  • Only time-sliced vGPUs are supported. MIG-backed vGPUs are not supported.

  • P2P transfers over PCIe are not supported.

Hypervisor Platform Support for Multi-vGPU and P2P#

Hypervisor Platform Support for Multi-vGPU and P2P#

Hypervisor Platform

NVIDIA AI Enterprise Infra Release

Supported vGPU Types

Documentation

Red Hat Enterprise Linux with KVM

All active NVIDIA AI Enterprise Infra Releases

All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported.

Setting up Multi-vGPU VMs on RHEL KVM

Ubuntu with KVM

All active NVIDIA AI Enterprise Infra Releases

All NVIDIA vGPU for Compute with PCIe GPUs; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported.

Setting up Multi-vGPU VMs on Ubuntu KVM

VMware vSphere

All active NVIDIA AI Enterprise Infra Releases

All NVIDIA vGPU for Compute; on supported GPUs, both time-sliced and MIG-backed vGPUs are supported.

Setting up Multi-vGPU on VMware vSphere 8

Note

P2P CUDA transfers are not supported on Windows. Only Linux distributions, as outlined in the NVIDIA AI Enterprise Infrastructure Support Matrix, are supported.

vGPU Support for Multi-vGPU#

You can assign multiple vGPUs with differing amounts of frame buffer to a single VM, provided the board type and the series of all the vGPUs are the same. For example, you can assign an A40-48C vGPU and an A40-16C vGPU to the same VM. However, you cannot assign an A30-8C vGPU and an A16-8C vGPU to the same VM.

vGPU Support for Multi-vGPU on the NVIDIA Blackwell Architecture#

Board

vGPU [1]

NVIDIA HGX B200 180GB

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

NVIDIA RTX PRO 6000 Blackwell SE 96GB

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

vGPU Support for Multi-vGPU on the NVIDIA Hopper GPU Architecture#

Board

vGPU [1]

NVIDIA H800 PCIe 94GB (H800 NVL)

All NVIDIA vGPU for Compute

NVIDIA H800 PCIe 80GB

All NVIDIA vGPU for Compute

NVIDIA H800 SXM5 80GB

NVIDIA vGPU for Compute [5]

NVIDIA H200 PCIe 141GB (H200 NVL)

All NVIDIA vGPU for Compute

NVIDIA H200 SXM5 141GB

NVIDIA vGPU for Compute [5]

NVIDIA H100 PCIe 94GB (H100 NVL)

All NVIDIA vGPU for Compute

NVIDIA H100 SXM5 94GB

NVIDIA vGPU for Compute [5]

NVIDIA H100 PCIe 80GB

All NVIDIA vGPU for Compute

NVIDIA H100 SXM5 80GB

NVIDIA vGPU for Compute [5]

NVIDIA H100 SXM5 64GB

NVIDIA vGPU for Compute [5]

NVIDIA H20 SXM5 141GB

NVIDIA vGPU for Compute [5]

NVIDIA H20 SXM5 96GB

NVIDIA vGPU for Compute [5]

vGPU Support for Multi-vGPU on the NVIDIA Ada Lovelace Architecture#

Board

vGPU

NVIDIA L40

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA L40S

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA L20

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA L4

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA L2

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX 6000 Ada

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX 5880 Ada

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX 5000 Ada

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

vGPU Support for Multi-vGPU on the NVIDIA Ampere GPU Architecture#

Board

vGPU [1]

  • NVIDIA A800 PCIe 80GB

  • NVIDIA A800 PCIe 80GB liquid-cooled

  • NVIDIA AX800

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A800 PCIe 40GB active-cooled

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A800 HGX 80GB

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

  • NVIDIA A100 PCIe 80GB

  • NVIDIA A100 PCIe 80GB liquid-cooled

  • NVIDIA A100X

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A100 HGX 80GB

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A100 PCIe 40GB

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A100 HGX 40GB

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A40

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

  • NVIDIA A30

  • NVIDIA A30X

  • NVIDIA A30 liquid-cooled

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A16

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA A10

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX A6000

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX A5500

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

NVIDIA RTX A5000

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

vGPU Support for Multi-vGPU on the NVIDIA Turing GPU Architecture#

Board

vGPU

Tesla T4

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Quadro RTX 6000 passive

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Quadro RTX 8000 passive

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

vGPU Support for Multi-vGPU on the NVIDIA Volta GPU Architecture#

Board

vGPU

Tesla V100 SXM2

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100 SXM2 32GB

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100 PCIe

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100 PCIe 32GB

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100S PCIe 32GB

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

Tesla V100 FHHL

  • Generic Linux with KVM hypervisors [6], Red Hat Enterprise Linux KVM, and Ubuntu:
    • All NVIDIA vGPU for Compute

  • Since VMware vSphere 8.0:
    • All NVIDIA vGPU for Compute

vGPU Support for P2P#

Only time-sliced NVIDIA vGPU for Compute vGPUs that are allocated all of the physical GPU’s framebuffer, on physical GPUs that support NVLink, are supported.

vGPU Support for P2P on the NVIDIA Blackwell GPU Architecture#

Board

vGPU

NVIDIA HGX B200 180GB

NVIDIA B200X-180C

vGPU Support for P2P on the NVIDIA Hopper GPU Architecture#

Board

vGPU

NVIDIA H800 PCIe 94GB (H800 NVL)

H800L-94C

NVIDIA H800 PCIe 80GB

H800-80C

NVIDIA H200 PCIe 141GB (H200 NVL)

H200-141C

NVIDIA H200 SXM5 141GB

H200X-141C

NVIDIA H100 PCIe 94GB (H100 NVL)

H100L-94C

NVIDIA H100 SXM5 94GB

H100XL-94C

NVIDIA H100 PCIe 80GB

H100-80C

NVIDIA H100 SXM5 80GB

H100XM-80C

NVIDIA H100 SXM5 64GB

H100XS-64C

NVIDIA H20 SXM5 141GB

H20X-141C

NVIDIA H20 SXM5 96GB

H20-96C

vGPU Support for P2P on the NVIDIA Ampere GPU Architecture#

Board

vGPU

  • NVIDIA A800 PCIe 80GB

  • NVIDIA A800 PCIe 80GB liquid-cooled

  • NVIDIA AX800

A800D-80C

NVIDIA A800 PCIe 40GB active-cooled

A800-40C

NVIDIA A800 HGX 80GB

A800DX-80C [2]

  • NVIDIA A100 PCIe 80GB

  • NVIDIA A100 PCIe 80GB liquid-cooled

  • NVIDIA A100X

A100D-80C

NVIDIA A100 HGX 80GB

A100DX-80C [2]

NVIDIA A100 PCIe 40GB

A100-40C

NVIDIA A100 HGX 40GB

A100X-40C [2]

NVIDIA A40

A40-48C

  • NVIDIA A30

  • NVIDIA A30X

  • NVIDIA A30 liquid-cooled

A30-24C

NVIDIA A16

A16-16C

NVIDIA A10

A10-24C

NVIDIA RTX A6000

A6000-48C

NVIDIA RTX A5500

A5500-24C

NVIDIA RTX A5000

A5000-24C

vGPU Support for P2P on the NVIDIA Turing GPU Architecture#

Board

vGPU

Quadro RTX 8000 passive

RTX8000P-48C

Quadro RTX 6000 passive

RTX6000P-24C

vGPU Support for P2P on the NVIDIA Volta GPU Architecture#

Board

vGPU

Tesla V100 SXM2

V100X-16C

Tesla V100 SXM2 32GB

V100DX-32C

NVIDIA NVSwitch#

NVIDIA NVSwitch provides a high-bandwidth, low-latency interconnect fabric that enables seamless, direct communication between multiple GPUs within a system. NVIDIA NVSwitch enables peer-to-peer vGPU communication within a single node over the NVLink fabric. The NVSwitch acts as a high-speed crossbar, allowing any GPU to communicate with any other GPU at full NVLink speed, significantly improving communication efficiency and bandwidth compared to traditional PCIe-based interconnections. It facilitates the creation of large GPU clusters, enabling AI and deep learning applications to efficiently utilize pooled GPU memory and compute resources for complex, computationally intensive tasks. It is supported only on a subset of hardware platforms, vGPUs, hypervisor software releases, and guest OS releases.

For information about using the NVSwitch, refer to the NVIDIA Fabric Manager documentation.
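
As a quick, illustrative check (not an NVSwitch management tool), a CUDA program inside a VM with several NVSwitch-connected vGPUs can print the peer-access matrix to confirm that every GPU pair can communicate directly:

```cpp
// Sketch: print the CUDA peer-access matrix for all devices in the VM; with
// NVSwitch/NVLink connectivity every off-diagonal entry is expected to be 'Y'.
// Build (for example): nvcc -o peer_matrix peer_matrix.cu
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    std::printf("Peer-access matrix for %d devices:\n", n);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            int ok = (i == j);
            if (i != j) cudaDeviceCanAccessPeer(&ok, i, j);
            std::printf(" %c", ok ? 'Y' : '.');
        }
        std::printf("\n");
    }
    return 0;
}
```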

Platform Support for NVIDIA NVSwitch#

  • NVIDIA HGX B200 8-GPU baseboard

  • NVIDIA HGX H200 8-GPU baseboard

  • NVIDIA HGX H100 8-GPU baseboard

  • NVIDIA HGX H800 8-GPU baseboard

  • NVIDIA HGX A100 8-GPU baseboard

NVIDIA NVSwitch Limitations#

  • Only time-sliced vGPUs are supported. MIG-backed vGPUs are not supported.

  • GPU passthrough is not supported on NVIDIA Systems that include NVSwitch when using VMware vSphere.

  • All vGPUs communicating peer-to-peer must be assigned to the same VM.

  • On GPUs based on the NVIDIA Hopper and Blackwell GPU architectures, multicast is supported when unified memory (UVM) is enabled.

  • VMware vSphere is not supported on NVIDIA HGX B200.

Hypervisor Platform Support for NVSwitch#

Consult the documentation from your hypervisor vendor for information about which generic Linux with KVM hypervisor software releases support NVIDIA NVSwitch.

  • All supported Red Hat Enterprise Linux KVM and Ubuntu KVM releases support NVIDIA NVSwitch.

  • The earliest VMware vSphere Hypervisor (ESXi) release that supports NVIDIA NVSwitch depends on the GPU architecture.

VMware vSphere Releases that Support NVIDIA NVSwitch#

GPU Architecture

Earliest Supported VMware vSphere Hypervisor (ESXi) Release

NVIDIA Blackwell

Not supported on VMware

NVIDIA Hopper

VMware vSphere Hypervisor (ESXi) 8 update 2

NVIDIA Ampere

VMware vSphere Hypervisor (ESXi) 8 update 1

vGPU Support for NVSwitch#

Only time-sliced NVIDIA vGPU for Compute vGPUs that are allocated all of the physical GPU’s framebuffer are supported, on the following GPU boards:

  • NVIDIA A800

  • NVIDIA A100 HGX

  • NVIDIA B200 HGX

  • NVIDIA H800

  • NVIDIA H200 HGX

  • NVIDIA H100 SXM5

  • NVIDIA H20

NVIDIA NVSwitch Support on the NVIDIA Ampere GPU Architecture#

Board

vGPU

NVIDIA A800 HGX 80GB

A800DX-80C

NVIDIA A100 HGX 80GB

A100DX-80C

NVIDIA A100 HGX 40GB

A100X-40C

NVIDIA NVSwitch Support on the NVIDIA Blackwell GPU Architecture#

Board

vGPU

NVIDIA B200 HGX 180GB

B200X-180C

NVIDIA NVSwitch Support on the NVIDIA Hopper GPU Architecture#

Board

vGPU

NVIDIA H800 SXM5 80GB

H800XM-80C

NVIDIA H200 SXM5 141GB

H200X-141C

NVIDIA H100 SXM5 80GB

H100XM-80C

NVIDIA H20 SXM5 141GB

H20X-141C

NVIDIA H20 SXM5 96GB

H20-96C

Guest OS Releases Support for NVSwitch#

Linux only. NVIDIA NVSwitch is not supported on Windows.

Scheduling Policies#

NVIDIA vGPU for Compute offers a range of scheduling policies that allow administrators to customize resource allocation based on workload intensity and organizational priorities, ensuring optimal resource utilization and alignment with business needs. These policies determine how GPU resources are shared across multiple VMs and directly impact factors like latency, throughput, and performance stability in multi-tenant environments.

For workloads with varying demands, time slicing plays a critical role in determining scheduling efficiency. The vGPU scheduler time slice represents the duration a VM’s work is allowed to run on the GPU before it is preempted. A longer time slice maximizes throughput for compute-heavy workloads, such as CUDA applications, by minimizing context switching. In contrast, a shorter time slice reduces latency, making it ideal for latency-sensitive tasks like graphics applications.

NVIDIA provides three scheduling modes: Best Effort, Equal Share, and Fixed Share, each designed for different workload requirements and environments. For more information, refer to the vGPU Schedulers documentation.

Refer to the Changing Scheduling Behavior for Time-Sliced vGPUs documentation for how to configure and adjust scheduling policies to meet specific resource distribution needs.

Suspend-Resume#

The suspend-resume feature allows NVIDIA vGPU-configured VMs to be temporarily paused and later resumed without losing their operational state. During suspension, the entire VM state, including GPU and compute resources, is saved to disk, thereby freeing these resources on the host. Upon resumption, the state is fully restored, enabling seamless workload continuation.

This capability provides operational flexibility and optimizes resource utilization. It is valuable for planned host maintenance, freeing up resources by pausing non-critical workloads, and ensuring consistent environments for development and testing.

Unlike live migration, suspend-resume involves downtime during both suspension and resumption. Cross-host operations require strict compatibility across hosts, encompassing GPU type, Virtual GPU Manager version, memory configuration, and NVLink topology.

Suspend-resume is supported on all GPUs that enable vGPU functionality; however, compatibility varies by hypervisor, NVIDIA vGPU software release, and guest operating system.

For additional information and operational instructions across different hypervisors, refer to the vGPU Suspend-Resume documentation.

Suspend-Resume Known Issues and Limitations#

Suspend-Resume Known Issues and Limitations#

Hypervisor Platform

Documentation

VMware vSphere

Known Issues and Limitations with Suspend Resume on VMware vSphere

Note

While live migration generally allows a suspended VM to be resumed on any host with a compatible NVIDIA Virtual GPU Manager, a current bug in Red Hat Enterprise Linux 9.4 and Ubuntu 24.04 LTS limits suspend, resume, and migration to hosts with an identical Virtual GPU Manager version. The issue has been resolved in Red Hat Enterprise Linux 9.6 and later.

Platform Support for Suspend-Resume#

Suspend-resume is supported on all GPUs that support NVIDIA vGPU for Compute, but compatibility varies by hypervisor, release version, and guest operating system.

Platform Support for Suspend-Resume#

Hypervisor Platform

Version

NVIDIA AI Enterprise Infra Release

Documentation

Red Hat Enterprise Linux with KVM

  • 9.4

  • 9.6

  • 10.0

  • NVIDIA AI Enterprise Infra 6.x

  • NVIDIA AI Enterprise Infra 7.0

Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on RHEL KVM

Ubuntu with KVM

24.04 LTS

  • NVIDIA AI Enterprise Infra 6.x

  • NVIDIA AI Enterprise Infra 7.0

Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on Ubuntu KVM

VMware vSphere

  • 8

  • 9

All active NVIDIA AI Enterprise Infra Releases

Suspending and Resuming a VM Configured with NVIDIA vGPU for Compute on VMware vSphere

vGPU Support for Suspend-Resume#

For a list of supported GPUs, refer to the Supported NVIDIA GPUs and Networking section in the NVIDIA AI Enterprise Infra Support Matrix.

Unified Virtual Memory (UVM)#

Unified Virtual Memory (UVM) provides a single, cohesive memory address space accessible by both the CPUs and GPUs within a system. This feature creates a managed memory pool, allowing data to be allocated and accessed by code executing on either processor. The primary benefit is the simplification of programming and enhanced performance for GPU-accelerated workloads, as it eliminates the need for applications to explicitly manage data transfers between CPU and GPU memory. For additional information about this feature, refer to the Unified Virtual Memory documentation.

Unified Virtual Memory (UVM)
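
For illustration, a minimal CUDA sketch of UVM: a single cudaMallocManaged allocation is touched by both CPU and GPU code with no explicit copies, which is what the managed memory pool described above provides. This assumes a vGPU profile with UVM enabled (see the limitations below).

```cpp
// Sketch: Unified Virtual Memory with cudaMallocManaged. One pointer is valid
// on both the CPU and the GPU, so no explicit cudaMemcpy calls are needed.
// Build (for example): nvcc -o uvm_example uvm_example.cu
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;
    cudaMallocManaged(&data, n * sizeof(float));     // one allocation, one address space

    for (int i = 0; i < n; ++i) data[i] = 1.0f;      // touched on the CPU...
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);  // ...and on the GPU, same pointer
    cudaDeviceSynchronize();

    std::printf("data[0] = %f\n", data[0]);          // CPU reads the GPU result directly
    cudaFree(data);
    return 0;
}
```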

UVM Known Issues and Limitations#

  • Unified Virtual Memory (UVM) is restricted to 1:1 time-sliced and MIG-backed vGPU for Compute profiles that allocate the entire framebuffer of a compatible physical GPU or GPU Instance. Fractional time-sliced vGPUs do not support UVM.

  • UVM is only supported on Linux Guest OS distros. Windows Guest OS is not supported.

  • Enabling UVM disables vGPU migration for the VM, which may reduce operational flexibility in environments reliant on live migration.

  • UVM is disabled by default and must be explicitly enabled for each vGPU that requires it by setting a specific vGPU plugin parameter for the VM.

  • When deploying NVIDIA NIM, if UVM is enabled and an optimized engine is available, the model will run on the TensorRT-LLM (TRT-LLM) backend. Otherwise, it will typically run on the vLLM backend.

Hypervisor Platform Support for UVM#

Unified Virtual Memory (UVM) is disabled by default. If required, you must enable unified memory individually for each NVIDIA vGPU for Compute VM that needs it by setting a vGPU plugin parameter. How to enable UVM for a vGPU VM depends on the hypervisor that you are using.

vGPU Support for UVM#

UVM is supported on 1:1 MIG-backed and time-sliced vGPUs. These vGPUs have the entire framebuffer of a MIG GPU Instance or physical GPU assigned to a single vGPU.

vGPU Support for UVM on the NVIDIA Blackwell GPU Architecture#

Board

vGPU

NVIDIA HGX B200 SXM

  • B200X-7-180C

  • MIG-backed 1:1 vGPUs

NVIDIA RTX PRO 6000 Blackwell SE

  • DC-4-96C

  • MIG-backed 1:1 vGPUs

vGPU Support for UVM on the NVIDIA Hopper GPU Architecture#

Board

vGPU

NVIDIA H800 PCIe 94GB (H800 NVL)

  • H800L-94C

  • All MIG-backed vGPUs

NVIDIA H800 PCIe 80GB

  • H800-80C

  • All MIG-backed vGPUs

NVIDIA H800 SXM5 80GB

  • H800XM-80C

  • All MIG-backed vGPUs

NVIDIA H200 SXM5

  • H200X-141C

  • All MIG-backed vGPUs

NVIDIA H200 NVL

  • H200-141C

  • All MIG-backed vGPUs

NVIDIA H100 PCIe 94GB (H100 NVL)

  • H100L-94C

  • All MIG-backed vGPUs

NVIDIA H100 SXM5 94GB

  • H100XL-94C

  • All MIG-backed vGPUs

NVIDIA H100 PCIe 80GB

  • H100-80C

  • All MIG-backed vGPUs

NVIDIA H100 SXM5 80GB

  • H100XM-80C

  • All MIG-backed vGPUs

NVIDIA H100 SXM5 64GB

  • H100XS-64C

  • All MIG-backed vGPUs

NVIDIA H20 SXM5 141GB

  • H20X-141C

  • All MIG-backed vGPUs

NVIDIA H20 SXM5 96GB

  • H20-96C

  • All MIG-backed vGPUs

vGPU Support for UVM on the NVIDIA Ada Lovelace GPU Architecture#

Board

vGPU

NVIDIA L40

L40-48C

NVIDIA L40S

L40S-48C

  • NVIDIA L20

  • NVIDIA L20 liquid-cooled

L20-48C

NVIDIA L4

L4-24C

NVIDIA L2

L2-24C

NVIDIA RTX 6000 Ada

RTX 6000 Ada-48C

NVIDIA RTX 5880 Ada

RTX 5880 Ada-48C

NVIDIA RTX 5000 Ada

RTX 5000 Ada-32C

vGPU Support for UVM on the NVIDIA Ampere GPU Architecture#

Board

vGPU

  • NVIDIA A800 PCIe 80GB

  • NVIDIA A800 PCIe 80GB liquid-cooled

  • NVIDIA AX800

  • A800D-80C

  • All MIG-backed vGPUs

NVIDIA A800 PCIe 40GB active-cooled

  • A800-40C

  • All MIG-backed vGPUs

NVIDIA A800 HGX 80GB

  • A800DX-80C

  • All MIG-backed vGPUs

  • NVIDIA A100 PCIe 80GB

  • NVIDIA A100 PCIe 80GB liquid-cooled

  • NVIDIA A100X

  • A100D-80C

  • All MIG-backed vGPUs

NVIDIA A100 HGX 80GB

  • A100DX-80C

  • All MIG-backed vGPUs

NVIDIA A100 PCIe 40GB

  • A100-40C

  • All MIG-backed vGPUs

NVIDIA A100 HGX 40GB

  • A100X-40C

  • All MIG-backed vGPUs

NVIDIA A40

A40-48C

  • NVIDIA A30

  • NVIDIA A30X

  • NVIDIA A30 liquid-cooled

  • A30-24C

  • All MIG-backed vGPUs

NVIDIA A16

A16-16C

NVIDIA A10

A10-24C

NVIDIA RTX A6000

A6000-48C

NVIDIA RTX A5500

A5500-24C

NVIDIA RTX A5000

A5000-24C

Product Limitations and Known Issues#

Red Hat Enterprise Linux with KVM Limitations and Known Issues#

Refer to the following lists of known Red Hat Enterprise Linux with KVM product limitations and product issues.

Ubuntu KVM Limitations and Known Issues#

Refer to the following lists of known Ubuntu KVM product limitations and product issues.

VMware vSphere Limitations and Known Issues#

Refer to the following lists of known VMware vSphere product limitations and product issues.

Requirements for Using vGPU for Compute on VMware vSphere for GPUs Requiring 64 GB+ of MMIO Space with Large-Memory VMs#

Some GPUs require 64 GB or more of MMIO space. When a vGPU on a GPU that requires 64 GB or more of MMIO space is assigned to a VM with 32 GB or more of memory on ESXi, the VM’s MMIO space must be increased to the amount of MMIO space that the GPU requires.

For detailed information about this limitation, refer to the Requirements for Using vGPU on GPUs Requiring 64 GB or More of MMIO Space with Large-Memory VMs documentation.

GPUs Requiring 64GB or More of MMIO Space with Large-Memory VMs#

GPU

MMIO Space Required

NVIDIA B200

768GB

NVIDIA H200 (all variants)

512GB

NVIDIA H100 (all variants)

256GB

NVIDIA H800 (all variants)

256GB

NVIDIA H20 141GB

512GB

NVIDIA H20 96GB

256GB

NVIDIA L40

128GB

NVIDIA L20

128GB

NVIDIA L4

64GB

NVIDIA L2

64GB

NVIDIA RTX 6000 Ada

128GB

NVIDIA RTX 5000 Ada

64GB

NVIDIA A40

128GB

NVIDIA A30

64GB

NVIDIA A10

64GB

NVIDIA A100 80GB (all variants)

256GB

NVIDIA A100 40GB (all variants)

128GB

NVIDIA RTX A6000

128GB

NVIDIA RTX A5500

64GB

NVIDIA RTX A5000

64GB

Quadro RTX 8000 Passive

64GB

Quadro RTX 6000 Passive

64GB

Tesla V100 (all variants)

64GB

Microsoft Windows Server Limitations and Known Issues#

Refer to the following lists of known Microsoft Windows Server product limitations and product issues.

NVIDIA AI Enterprise supports only the Tesla Compute Cluster (TCC) driver model for Windows guest drivers.

Windows guest OS support is limited to running applications natively in Windows VMs without containers. NVIDIA AI Enterprise features that depend on the containerization of applications are not supported on Windows guest operating systems.

If you are using a generic Linux with KVM hypervisor, consult the documentation from your hypervisor vendor for information about Windows releases supported as a guest OS.

For more information, refer to the Non-containerized Applications on Hypervisors and Guest Operating Systems Supported with vGPU table.

Virtual GPU Types for Supported GPUs#

NVIDIA Blackwell GPU Architecture#

MIG-Backed and Time-Sliced NVIDIA vGPU for Compute for the NVIDIA Blackwell Architecture#

Physical GPUs per board: 1

The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.

Required license edition: NVIDIA AI Enterprise

MIG-Backed NVIDIA vGPU for Compute

For details on GPU instance profiles, refer to the NVIDIA Multi-Instance GPU User Guide.

Time-Sliced NVIDIA vGPU for Compute

Intended use cases:

  • vGPUs with more than 40 GB of framebuffer: Training Workloads

  • vGPUs with 40 GB of framebuffer: Inference Workloads

These vGPU types support a single display with a fixed maximum resolution.

MIG-Backed NVIDIA vGPU for Compute for NVIDIA HGX B200 180GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

B200X-7-180C

180

1

7

7

MIG 7g.180gb

B200X-4-90C

90

1

4

4

MIG 4g.90gb

B200X-3-90C

90

2

3

3

MIG 3g.90gb

B200X-2-45C

45

3

2

2

MIG 2g.45gb

B200X-1-45C

45

4

1

1

MIG 1g.45gb

B200X-1-23C

22.5

7

1

1

MIG 1g.23gb

B200X-1-23CME

22.5

1

1

1

MIG 1g.23gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA HGX B200 180GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

B200X-180C

180

1

1

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA RTX PRO 6000 SE 96GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

DC-4-96C

96

1

4

4

MIG 4g.96gb

DC-4-48C

48

2

4

1

MIG 4g.48gb

DC-2-48C

48

1

2

2

MIG 2g.48gb

DC-4-32C

32

3

4

1

MIG 4g.32gb

DC-4-24C

24

4

4

1

MIG 4g.24gb

DC-2-24C

24

2

2

1

MIG 2g.24gb

DC-1-24C

24

1

1

1

MIG 1g.24gb

DC-2-16C

16

3

2

1

MIG 2g.16gb

DC-2-12C

12

4

2

1

MIG 2g.12gb

DC-1-12C

12

2

1

1

MIG 1g.12gb

DC-1-8C

8

3

1

1

MIG 1g.8gb

Time-Sliced NVIDIA vGPU for Compute for NVIDIA RTX PRO 6000 SE 96GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

DC-96C

96

1

1

3840x2400

1

DC-48C

48

2

2

3840x2400

1

DC-32C

32

3

3

3840x2400

1

DC-24C

24

4

4

3840x2400

1

DC-16C

16

6

6

3840x2400

1

DC-12C

12

8

8

3840x2400

1

DC-8C

8

12

12

3840x2400

1

NVIDIA Hopper GPU Architecture#

MIG-Backed and Time-Sliced NVIDIA vGPU for Compute for the NVIDIA Hopper GPU Architecture#

Physical GPUs per board: 1

The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.

Required license edition: NVIDIA AI Enterprise

MIG-Backed NVIDIA vGPU for Compute

For details on GPU instance profiles, refer to the NVIDIA Multi-Instance GPU User Guide.

Time-Sliced NVIDIA vGPU for Compute

Intended use cases:

  • vGPUs with more than 40 GB of framebuffer: Training Workloads

  • vGPUs with 40 GB of framebuffer: Inference Workloads

These vGPU types support a single display with a fixed maximum resolution.

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H800 PCIe 94GB (H800 NVL)#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H800L-7-94C

96

1

7

7

MIG 7g.94gb

H800L-4-47C

48

1

4

4

MIG 4g.47gb

H800L-3-47C

48

2

3

3

MIG 3g.47gb

H800L-2-24C

24

3

2

2

MIG 2g.24gb

H800L-1-24C

24

4

1

1

MIG 1g.24gb

H800L-1-12C

12

7

1

1

MIG 1g.12gb

H800L-1-12CME [3]

12

1

1

1

MIG 1g.12gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H800 PCIe 94GB (H800 NVL)#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H800L-94C

96

1

1

3840x2400

1

H800L-47C

48

2

2

3840x2400

1

H800L-23C

23

4

4

3840x2400

1

H800L-15C

15

6

4

3840x2400

1

H800L-11C

11

8

8

3840x2400

1

H800L-6C

6

15

8

3840x2400

1

H800L-4C

4

23

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H800 PCIe 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H800-7-80C

81

1

7

7

MIG 7g.80gb

H800-4-40C

40

1

4

4

MIG 4g.40gb

H800-3-40C

40

2

3

3

MIG 3g.40gb

H800-2-20C

20

3

2

2

MIG 2g.20gb

H800-1-20C [3]

20

4

1

1

MIG 1g.20gb

H800-1-10C

10

7

1

1

MIG 1g.10gb

H800-1-10CME [3]

10

1

1

1

MIG 1g.10gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H800 PCIe 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H800-80C

81

1

1

3840x2400

1

H800-40C

40

2

2

3840x2400

1

H800-20C

20

4

4

3840x2400

1

H800-16C

16

5

4

3840x2400

1

H800-10C

10

8

8

3840x2400

1

H800-8C

8

10

8

3840x2400

1

H800-5C

5

16

16

3840x2400

1

H800-4C

4

20

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H800 SXM5 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H800XM-7-80C

81

1

7

7

MIG 7g.80gb

H800XM-4-40C

40

1

4

4

MIG 4g.40gb

H800XM-3-40C

40

2

3

3

MIG 3g.40gb

H800XM-2-20C

20

3

2

2

MIG 2g.20gb

H800XM-1-20C [3]

20

4

1

1

MIG 1g.20gb

H800XM-1-10C

10

7

1

1

MIG 1g.10gb

H800XM-1-10CME [3]

10

1

1

1

MIG 1g.10gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H800 SXM5 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H800XM-80C

81

1

1

3840x2400

1

H800XM-40C

40

2

2

3840x2400

1

H800XM-20C

20

4

4

3840x2400

1

H800XM-16C

16

5

4

3840x2400

1

H800XM-10C

10

8

8

3840x2400

1

H800XM-8C

8

10

8

3840x2400

1

H800XM-5C

5

16

16

3840x2400

1

H800XM-4C

4

20

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H200 PCIe 141GB (H200 NVL)#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H200-7-141C

144

1

7

7

MIG 7g.141gb

H200-4-71C

72

1

4

4

MIG 4g.71gb

H200-3-71C

72

2

3

3

MIG 3g.71gb

H200-2-35C

35

3

2

2

MIG 2g.35gb

H200-1-35C [3]

35

4

1

1

MIG 1g.35gb

H200-1-18C

18

7

1

1

MIG 1g.18gb

H200-1-18CME [3]

18

1

1

1

MIG 1g.18gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H200 PCIe 141GB (H200 NVL)#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H200-141C

144

1

1

3840x2400

1

H200-70C

71

2

2

3840x2400

1

H200-35C

35

4

4

3840x2400

1

H200-28C

28

5

5

3840x2400

1

H200-17C

17

8

8

3840x2400

1

H200-14C

14

10

10

3840x2400

1

H200-8C

8

16

16

3840x2400

1

H200-7C

7

20

20

3840x2400

1

H200-4C

4

32

32

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H200 SXM5 141GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H200X-7-141C

144

1

7

7

MIG 7g.141gb

H200X-4-71C

72

1

4

4

MIG 4g.71gb

H200X-3-71C

72

2

3

3

MIG 3g.71gb

H200X-2-35C

35

3

2

2

MIG 2g.35gb

H200X-1-35C [3]

35

4

1

1

MIG 1g.35gb

H200X-1-18C

18

7

1

1

MIG 1g.18gb

H200X-1-18CME [3]

18

1

1

1

MIG 1g.18gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H200 SXM5 141GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H200X-141C

144

1

1

3840x2400

1

H200X-70C

71

2

2

3840x2400

1

H200X-35C

35

4

4

3840x2400

1

H200X-28C

28

5

5

3840x2400

1

H200X-17C

17

8

8

3840x2400

1

H200X-14C

14

10

10

3840x2400

1

H200X-8C

8

16

16

3840x2400

1

H200X-7C

7

20

20

3840x2400

1

H200X-4C

4

32

32

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H100 PCIe 94GB (H100 NVL)#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H100L-7-94C

96

1

7

7

MIG 7g.94gb

H100L-4-47C

48

1

4

4

MIG 4g.47gb

H100L-3-47C

48

2

3

3

MIG 3g.47gb

H100L-2-24C

24

3

2

2

MIG 2g.24gb

H100L-1-24C [3]

24

4

1

1

MIG 1g.24gb

H100L-1-12C

12

7

1

1

MIG 1g.12gb

H100L-1-12CME [3]

12

1

1

1

MIG 1g.12gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H100 PCIe 94GB (H100 NVL)#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H100L-94C

96

1

1

3840x2400

1

H100L-47C

48

2

2

3840x2400

1

H100L-23C

23

4

4

3840x2400

1

H100L-15C

15

6

4

3840x2400

1

H100L-11C

11

8

8

3840x2400

1

H100L-6C

6

15

8

3840x2400

1

H100L-4C

4

23

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H100 SXM5 94GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H100XL-7-94C

96

1

7

7

MIG 7g.94gb

H100XL-4-47C

48

1

4

4

MIG 4g.47gb

H100XL-3-47C

48

2

3

3

MIG 3g.47gb

H100XL-2-24C

24

3

2

2

MIG 2g.24gb

H100XL-1-24C [3]

24

4

1

1

MIG 1g.24gb

H100XL-1-12C

12

7

1

1

MIG 1g.12gb

H100XL-1-12CME [3]

12

1

1

1

MIG 1g.12gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H100 SXM5 94GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H100XL-94C

96

1

1

3840x2400

1

H100XL-47C

48

2

2

3840x2400

1

H100XL-23C

23

4

4

3840x2400

1

H100XL-15C

15

6

4

3840x2400

1

H100XL-11C

11

8

8

3840x2400

1

H100XL-6C

6

15

8

3840x2400

1

H100XL-4C

4

23

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H100 PCIe 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H100-7-80C

81

1

7

7

MIG 7g.80gb

H100-4-40C

40

1

4

4

MIG 4g.40gb

H100-3-40C

40

2

3

3

MIG 3g.40gb

H100-2-20C

20

3

2

2

MIG 2g.20gb

H100-1-20C [3]

20

4

1

1

MIG 1g.20gb

H100-1-10C

10

7

1

1

MIG 1g.10gb

H100-1-10CME [3]

10

1

1

1

MIG 1g.10gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H100 PCIe 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H100-80C

81

1

1

3840x2400

1

H100-40C

40

2

2

3840x2400

1

H100-20C

20

4

4

3840x2400

1

H100-16C

16

6

4

3840x2400

1

H100-10C

10

8

8

3840x2400

1

H100-8C

8

10

8

3840x2400

1

H100-5C

5

16

16

3840x2400

1

H100-4C

4

20

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H100 SXM5 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H100XM-7-80C

81

1

7

7

MIG 7g.80gb

H100XM-4-40C

40

1

4

4

MIG 4g.40gb

H100XM-3-40C

40

2

3

3

MIG 3g.40gb

H100XM-2-20C

20

3

2

2

MIG 2g.20gb

H100XM-1-20C [3]

20

4

1

1

MIG 1g.20gb

H100XM-1-10C

10

7

1

1

MIG 1g.10gb

H100XM-1-10CME [3]

10

1

1

1

MIG 1g.10gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H100 SXM5 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H100XM-80C

81

1

1

3840x2400

1

H100XM-40C

40

2

2

3840x2400

1

H100XM-20C

20

4

4

3840x2400

1

H100XM-16C

16

6

4

3840x2400

1

H100XM-10C

10

8

8

3840x2400

1

H100XM-8C

8

10

8

3840x2400

1

H100XM-5C

5

16

16

3840x2400

1

H100XM-4C

4

20

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H100 SXM5 64GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H100XS-7-64C

65

1

7

7

MIG 7g.64gb

H100XS-4-32C

32

1

4

4

MIG 4g.32gb

H100XS-3-32C

32

2

3

3

MIG 3g.32gb

H100XS-2-16C

16

3

2

2

MIG 2g.16gb

H100XS-1-16C [3]

16

4

1

1

MIG 1g.16gb

H100XS-1-8C

8

7

1

1

MIG 1g.8gb

H100XS-1-8CME [3]

8

1

1

1

MIG 1g.8gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H100 SXM5 64GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H100XS-64C

65

1

1

3840x2400

1

H100XS-32C

32

2

2

3840x2400

1

H100XS-16C

16

4

4

3840x2400

1

H100XS-8C

8

8

8

3840x2400

1

H100XS-4C

4

16

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H20 SXM5 141GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H20X-7-141C

144

1

7

7

MIG 7g.141gb

H20X-4-71C

72

1

4

4

MIG 4g.71gb

H20X-3-71C

72

2

3

3

MIG 3g.71gb

H20X-2-35C

35

3

2

2

MIG 2g.35gb

H20X-1-35C [3]

35

4

1

1

MIG 1g.35gb

H20X-1-18C

18

7

1

1

MIG 1g.18gb

H20X-1-18CME [3]

18

1

1

1

MIG 1g.18gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H20 SXM5 141GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H20X-141C

144

1

1

3840x2400

1

H20X-70C

71

2

2

3840x2400

1

H20X-35C

35

4

4

3840x2400

1

H20X-28C

28

5

5

3840x2400

1

H20X-17C

17

8

8

3840x2400

1

H20X-14C

14

10

10

3840x2400

1

H20X-8C

8

16

16

3840x2400

1

H20X-7C

7

20

20

3840x2400

1

H20X-4C

4

32

32

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA H20 SXM5 96GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

H20-7-96C

98

1

7

7

MIG 7g.96gb

H20-4-48C

49

1

4

4

MIG 4g.48gb

H20-3-48C

49

2

3

3

MIG 3g.48gb

H20-2-24C

24

3

2

2

MIG 2g.24gb

H20-1-24C [3]

24

4

1

1

MIG 1g.24gb

H20-1-12C

12

7

1

1

MIG 1g.12gb

H20-1-12CME [3]

12

1

1

1

MIG 1g.12gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA H20 SXM5 96GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

H20-96C

98

1

1

3840x2400

1

H20-48C

49

2

2

3840x2400

1

H20-24C

24

4

2

3840x2400

1

H20-16C

16

6

4

3840x2400

1

H20-12C

12

8

4

3840x2400

1

H20-6C

6

16

8

3840x2400

1

H20-4C

4

24

8

3840x2400

1

NVIDIA Ada Lovelace GPU Architecture#

Physical GPUs per board: 1

The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.

Required license edition: NVIDIA AI Enterprise

Intended use cases:

  • vGPUs with more than 40 GB of framebuffer: Training Workloads

  • vGPUs with 40 GB of framebuffer: Inference Workloads

These vGPU types support a single display with a fixed maximum resolution.

NVIDIA vGPU for Compute for NVIDIA L40#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

L40-48C

49

1

1

3840x2400

1

L40-24C

24

2

2

3840x2400

1

L40-16C

16

3

2

3840x2400

1

L40-12C

12

4

4

3840x2400

1

L40-8C

8

6

4

3840x2400

1

L40-6C

6

8

8

3840x2400

1

L40-4C

4

12

8

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA L40S#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

L40S-48C

49

1

1

3840x2400

1

L40S-24C

24

2

2

3840x2400

1

L40S-16C

16

3

2

3840x2400

1

L40S-12C

12

4

4

3840x2400

1

L40S-8C

8

6

4

3840x2400

1

L40S-6C

6

8

8

3840x2400

1

L40S-4C

4

12

8

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA L20#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

L20-48C

49

1

1

3840x2400

1

L20-24C

24

2

2

3840x2400

1

L20-16C

16

3

2

3840x2400

1

L20-12C

12

4

4

3840x2400

1

L20-8C

8

6

4

3840x2400

1

L20-6C

6

8

8

3840x2400

1

L20-4C

4

12

8

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA L20 Liquid-Cooled#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

L20-48C

49

1

1

3840x2400

1

L20-24C

24

2

2

3840x2400

1

L20-16C

16

3

2

3840x2400

1

L20-12C

12

4

4

3840x2400

1

L20-8C

8

6

4

3840x2400

1

L20-6C

6

8

8

3840x2400

1

L20-4C

4

12

8

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA L4#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

L4-24C

24

1

1

3840x2400

1

L4-12C

12

2

2

3840x2400

1

L4-8C

8

3

2

3840x2400

1

L4-6C

6

4

4

3840x2400

1

L4-4C

4

6

4

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA L2#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

L2-24C

24

1

1

3840x2400

1

L2-12C

12

2

2

3840x2400

1

L2-8C

8

3

2

3840x2400

1

L2-6C

6

4

4

3840x2400

1

L2-4C

4

6

4

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA RTX 6000 Ada#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

RTX 6000 Ada-48C

49

1

1

3840x2400

1

RTX 6000 Ada-24C

24

2

2

3840x2400

1

RTX 6000 Ada-16C

16

3

2

3840x2400

1

RTX 6000 Ada-12C

12

4

4

3840x2400

1

RTX 6000 Ada-8C

8

6

4

3840x2400

1

RTX 6000 Ada-6C

6

8

8

3840x2400

1

RTX 6000 Ada-4C

4

12

8

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA RTX 5880 Ada#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

RTX 5880 Ada-48C

49

1

1

3840x2400

1

RTX 5880 Ada-24C

24

2

2

3840x2400

1

RTX 5880 Ada-16C

16

3

2

3840x2400

1

RTX 5880 Ada-12C

12

4

4

3840x2400

1

RTX 5880 Ada-8C

8

6

4

3840x2400

1

RTX 5880 Ada-6C

6

8

8

3840x2400

1

RTX 5880 Ada-4C

4

12

8

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA RTX 5000 Ada#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

RTX 5000 Ada-32C

32

1

1

3840x2400

1

RTX 5000 Ada-16C

16

2

2

3840x2400

1

RTX 5000 Ada-8C

8

4

4

3840x2400

1

RTX 5000 Ada-4C

4

8

8

3840x2400

1

NVIDIA Ampere GPU Architecture#

Physical GPUs per board: 1 (with the exception of NVIDIA A16)

The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.

Required license edition: NVIDIA AI Enterprise

Intended use cases:

  • vGPUs with more than 40 GB of framebuffer: Training Workloads

  • vGPUs with 40 GB of framebuffer: Inference Workloads

These vGPU types support a single display with a fixed maximum resolution.

NVIDIA vGPU for Compute for NVIDIA A40#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

A40-48C

49

1

1

3840x2400

1

A40-24C

24

2

2

3840x2400

1

A40-16C

16

3

2

3840x2400

1

A40-12C

12

4

4

3840x2400

1

A40-8C

8

6

4

3840x2400

1

A40-6C

6

8

8

3840x2400

1

A40-4C

4

12

8

3840x2400

1

Physical GPUs per board (NVIDIA A16): 4

NVIDIA vGPU for Compute for NVIDIA A16#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

A16-16C

16

1

1

3840x2400

1

A16-8C

8

2

2

3840x2400

1

A16-4C

4

4

4

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA A10#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

A10-24C

24

1

1

3840x2400

1

A10-12C

12

2

2

3840x2400

1

A10-8C

8

3

2

3840x2400

1

A10-6C

6

4

4

3840x2400

1

A10-4C

4

6

4

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA RTX A6000#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

RTXA6000-48C

49

1

1

3840x2400

1

RTXA6000-24C

24

2

2

3840x2400

1

RTXA6000-16C

16

3

2

3840x2400

1

RTXA6000-12C

12

4

4

3840x2400

1

RTXA6000-8C

8

6

4

3840x2400

1

RTXA6000-6C

6

8

8

3840x2400

1

RTXA6000-4C

4

12

8

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA RTX A5500#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

RTXA5500-24C

24

1

1

3840x2400

1

RTXA5500-12C

12

2

2

3840x2400

1

RTXA5500-8C

8

3

2

3840x2400

1

RTXA5500-6C

6

4

4

3840x2400

1

RTXA5500-4C

4

6

4

3840x2400

1

NVIDIA vGPU for Compute for NVIDIA RTX A5000#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

RTXA5000-24C

24

1

1

3840x2400

1

RTXA5000-12C

12

2

2

3840x2400

1

RTXA5000-8C

8

3

2

3840x2400

1

RTXA5000-6C

6

4

4

3840x2400

1

RTXA5000-4C

4

6

4

3840x2400

1

MIG-Backed and Time-Sliced NVIDIA vGPU for Compute for the NVIDIA Ampere GPU Architecture#

Physical GPUs per board: 1

The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.

Required license edition: NVIDIA AI Enterprise

MIG-Backed NVIDIA vGPU for Compute

For details on GPU instance profiles, refer to the NVIDIA Multi-Instance GPU User Guide.

Time-Sliced NVIDIA vGPU for Compute

Intended use cases:

  • vGPUs with more than 40 GB of framebuffer: Training Workloads

  • vGPUs with 40 GB of framebuffer: Inference Workloads

These vGPU types support a single display with a fixed maximum resolution.

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A800 PCIe 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU

Slices per vGPU

Compute Instances per vGPU

Corresponding GPU Instance Profile

A800D-7-80C

81

1

7

7

MIG 7g.80gb

A800D-4-40C

40

1

4

4

MIG 4g.40gb

A800D-3-40C

40

2

3

3

MIG 3g.40gb

A800D-2-20C

20

3

2

2

MIG 2g.20gb

A800D-1-20C [3]

20

4

1

1

MIG 1g.20gb

A800D-1-10C

10

7

1

1

MIG 1g.10gb

A800D-1-10CME [3]

10

1

1

1

MIG 1g.10gb+me

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A800 PCIe 80GB#

Virtual GPU Type

Framebuffer (GB)

Maximum vGPUs per GPU in Equal-Size Mode

Maximum vGPUs per GPU in Mixed-Size Mode

Maximum Display Resolution [4]

Virtual Displays per vGPU

A800D-80C

81

1

1

3840x2400

1

A800D-40C

40

2

2

3840x2400

1

A800D-20C

20

4

4

3840x2400

1

A800D-16C

16

5

4

3840x2400

1

A800D-10C

10

8

8

3840x2400

1

A800D-8C

8

10

8

3840x2400

1

A800D-4C

4

20

16

3840x2400

1

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A800 PCIe 80GB Liquid-Cooled#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A800D-7-80C | 81 | 1 | 7 | 7 | MIG 7g.80gb |
| A800D-4-40C | 40 | 1 | 4 | 4 | MIG 4g.40gb |
| A800D-3-40C | 40 | 2 | 3 | 3 | MIG 3g.40gb |
| A800D-2-20C | 20 | 3 | 2 | 2 | MIG 2g.20gb |
| A800D-1-20C [3] | 20 | 4 | 1 | 1 | MIG 1g.20gb |
| A800D-1-10C | 10 | 7 | 1 | 1 | MIG 1g.10gb |
| A800D-1-10CME [3] | 10 | 1 | 1 | 1 | MIG 1g.10gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A800 PCIe 80GB Liquid-Cooled#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A800D-80C | 81 | 1 | 1 | 3840x2400 | 1 |
| A800D-40C | 40 | 2 | 2 | 3840x2400 | 1 |
| A800D-20C | 20 | 4 | 4 | 3840x2400 | 1 |
| A800D-16C | 16 | 5 | 4 | 3840x2400 | 1 |
| A800D-10C | 10 | 8 | 8 | 3840x2400 | 1 |
| A800D-8C | 8 | 10 | 8 | 3840x2400 | 1 |
| A800D-4C | 4 | 20 | 16 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA AX800#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A800D-7-80C | 81 | 1 | 7 | 7 | MIG 7g.80gb |
| A800D-4-40C | 40 | 1 | 4 | 4 | MIG 4g.40gb |
| A800D-3-40C | 40 | 2 | 3 | 3 | MIG 3g.40gb |
| A800D-2-20C | 20 | 3 | 2 | 2 | MIG 2g.20gb |
| A800D-1-20C [3] | 20 | 4 | 1 | 1 | MIG 1g.20gb |
| A800D-1-10C | 10 | 7 | 1 | 1 | MIG 1g.10gb |
| A800D-1-10CME [3] | 10 | 1 | 1 | 1 | MIG 1g.10gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA AX800#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A800D-80C | 81 | 1 | 1 | 3840x2400 | 1 |
| A800D-40C | 40 | 2 | 2 | 3840x2400 | 1 |
| A800D-20C | 20 | 4 | 4 | 3840x2400 | 1 |
| A800D-16C | 16 | 5 | 4 | 3840x2400 | 1 |
| A800D-10C | 10 | 8 | 8 | 3840x2400 | 1 |
| A800D-8C | 8 | 10 | 8 | 3840x2400 | 1 |
| A800D-4C | 4 | 20 | 16 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A800 PCIe 40GB Active Cooled#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A800-7-40C | 40 | 1 | 7 | 7 | MIG 7g.40gb |
| A800-4-20C | 20 | 1 | 4 | 4 | MIG 4g.20gb |
| A800-3-20C | 20 | 2 | 3 | 3 | MIG 3g.20gb |
| A800-2-10C | 10 | 3 | 2 | 2 | MIG 2g.10gb |
| A800-1-10C [3] | 10 | 4 | 1 | 1 | MIG 1g.10gb |
| A800-1-5C | 5 | 7 | 1 | 1 | MIG 1g.5gb |
| A800-1-5CME [3] | 5 | 1 | 1 | 1 | MIG 1g.5gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A800 PCIe 40GB Active Cooled#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A800-40C | 40 | 1 | 1 | 3840x2400 | 1 |
| A800-20C | 20 | 2 | 2 | 3840x2400 | 1 |
| A800-10C | 10 | 4 | 4 | 3840x2400 | 1 |
| A800-8C | 8 | 5 | 4 | 3840x2400 | 1 |
| A800-5C | 5 | 8 | 8 | 3840x2400 | 1 |
| A800-4C | 4 | 10 | 8 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A800 HGX 80GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A800DX-7-80C | 81 | 1 | 7 | 7 | MIG 7g.80gb |
| A800DX-4-40C | 40 | 1 | 4 | 4 | MIG 4g.40gb |
| A800DX-3-40C | 40 | 2 | 3 | 3 | MIG 3g.40gb |
| A800DX-2-20C | 20 | 3 | 2 | 2 | MIG 2g.20gb |
| A800DX-1-20C [3] | 20 | 4 | 1 | 1 | MIG 1g.20gb |
| A800DX-1-10C | 10 | 7 | 1 | 1 | MIG 1g.10gb |
| A800DX-1-10CME [3] | 10 | 1 | 1 | 1 | MIG 1g.10gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A800 HGX 80GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A800DX-80C | 81 | 1 | 1 | 3840x2400 | 1 |
| A800DX-40C | 40 | 2 | 2 | 3840x2400 | 1 |
| A800DX-20C | 20 | 4 | 4 | 3840x2400 | 1 |
| A800DX-16C | 16 | 5 | 4 | 3840x2400 | 1 |
| A800DX-10C | 10 | 8 | 8 | 3840x2400 | 1 |
| A800DX-8C | 8 | 10 | 8 | 3840x2400 | 1 |
| A800DX-4C | 4 | 20 | 16 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A100 PCIe 80GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A100D-7-80C | 81 | 1 | 7 | 7 | MIG 7g.80gb |
| A100D-4-40C | 40 | 1 | 4 | 4 | MIG 4g.40gb |
| A100D-3-40C | 40 | 2 | 3 | 3 | MIG 3g.40gb |
| A100D-2-20C | 20 | 3 | 2 | 2 | MIG 2g.20gb |
| A100D-1-20C [3] | 20 | 4 | 1 | 1 | MIG 1g.20gb |
| A100D-1-10C | 10 | 7 | 1 | 1 | MIG 1g.10gb |
| A100D-1-10CME [3] | 10 | 1 | 1 | 1 | MIG 1g.10gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A100 PCIe 80GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A100D-80C | 81 | 1 | 1 | 3840x2400 | 1 |
| A100D-40C | 40 | 2 | 2 | 3840x2400 | 1 |
| A100D-20C | 20 | 4 | 4 | 3840x2400 | 1 |
| A100D-16C | 16 | 5 | 4 | 3840x2400 | 1 |
| A100D-10C | 10 | 8 | 8 | 3840x2400 | 1 |
| A100D-8C | 8 | 10 | 8 | 3840x2400 | 1 |
| A100D-4C | 4 | 20 | 16 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A100 PCIe 80GB Liquid-Cooled#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A100D-7-80C | 81 | 1 | 7 | 7 | MIG 7g.80gb |
| A100D-4-40C | 40 | 1 | 4 | 4 | MIG 4g.40gb |
| A100D-3-40C | 40 | 2 | 3 | 3 | MIG 3g.40gb |
| A100D-2-20C | 20 | 3 | 2 | 2 | MIG 2g.20gb |
| A100D-1-20C [3] | 20 | 4 | 1 | 1 | MIG 1g.20gb |
| A100D-1-10C | 10 | 7 | 1 | 1 | MIG 1g.10gb |
| A100D-1-10CME [3] | 10 | 1 | 1 | 1 | MIG 1g.10gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A100 PCIe 80GB Liquid-Cooled#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A100D-80C | 81 | 1 | 1 | 3840x2400 | 1 |
| A100D-40C | 40 | 2 | 2 | 3840x2400 | 1 |
| A100D-20C | 20 | 4 | 4 | 3840x2400 | 1 |
| A100D-16C | 16 | 5 | 4 | 3840x2400 | 1 |
| A100D-10C | 10 | 8 | 8 | 3840x2400 | 1 |
| A100D-8C | 8 | 10 | 8 | 3840x2400 | 1 |
| A100D-4C | 4 | 20 | 16 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A100X#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A100D-7-80C | 81 | 1 | 7 | 7 | MIG 7g.80gb |
| A100D-4-40C | 40 | 1 | 4 | 4 | MIG 4g.40gb |
| A100D-3-40C | 40 | 2 | 3 | 3 | MIG 3g.40gb |
| A100D-2-20C | 20 | 3 | 2 | 2 | MIG 2g.20gb |
| A100D-1-20C [3] | 20 | 4 | 1 | 1 | MIG 1g.20gb |
| A100D-1-10C | 10 | 7 | 1 | 1 | MIG 1g.10gb |
| A100D-1-10CME [3] | 10 | 1 | 1 | 1 | MIG 1g.10gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A100X#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A100D-80C | 81 | 1 | 1 | 3840x2400 | 1 |
| A100D-40C | 40 | 2 | 2 | 3840x2400 | 1 |
| A100D-20C | 20 | 4 | 4 | 3840x2400 | 1 |
| A100D-16C | 16 | 5 | 4 | 3840x2400 | 1 |
| A100D-10C | 10 | 8 | 8 | 3840x2400 | 1 |
| A100D-8C | 8 | 10 | 8 | 3840x2400 | 1 |
| A100D-4C | 4 | 20 | 16 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A100 HGX 80GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A100DX-7-80C | 81 | 1 | 7 | 7 | MIG 7g.80gb |
| A100DX-4-40C | 40 | 1 | 4 | 4 | MIG 4g.40gb |
| A100DX-3-40C | 40 | 2 | 3 | 3 | MIG 3g.40gb |
| A100DX-2-20C | 20 | 3 | 2 | 2 | MIG 2g.20gb |
| A100DX-1-20C [3] | 20 | 4 | 1 | 1 | MIG 1g.20gb |
| A100DX-1-10C | 10 | 7 | 1 | 1 | MIG 1g.10gb |
| A100DX-1-10CME [3] | 10 | 1 | 1 | 1 | MIG 1g.10gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A100 HGX 80GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A100DX-80C | 81 | 1 | 1 | 3840x2400 | 1 |
| A100DX-40C | 40 | 2 | 2 | 3840x2400 | 1 |
| A100DX-20C | 20 | 4 | 4 | 3840x2400 | 1 |
| A100DX-16C | 16 | 5 | 4 | 3840x2400 | 1 |
| A100DX-10C | 10 | 8 | 8 | 3840x2400 | 1 |
| A100DX-8C | 8 | 10 | 8 | 3840x2400 | 1 |
| A100DX-4C | 4 | 20 | 16 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A100 PCIe 40GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A100-7-40C | 40 | 1 | 7 | 7 | MIG 7g.40gb |
| A100-4-20C | 20 | 1 | 4 | 4 | MIG 4g.20gb |
| A100-3-20C | 20 | 2 | 3 | 3 | MIG 3g.20gb |
| A100-2-10C | 10 | 3 | 2 | 2 | MIG 2g.10gb |
| A100-1-10C [3] | 10 | 4 | 1 | 1 | MIG 1g.10gb |
| A100-1-5C | 5 | 7 | 1 | 1 | MIG 1g.5gb |
| A100-1-5CME [3] | 5 | 1 | 1 | 1 | MIG 1g.5gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A100 PCIe 40GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A100-40C | 40 | 1 | 1 | 3840x2400 | 1 |
| A100-20C | 20 | 2 | 2 | 3840x2400 | 1 |
| A100-10C | 10 | 4 | 4 | 3840x2400 | 1 |
| A100-8C | 8 | 5 | 4 | 3840x2400 | 1 |
| A100-5C | 5 | 8 | 8 | 3840x2400 | 1 |
| A100-4C | 4 | 10 | 8 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A100 HGX 40GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A100X-7-40C | 40 | 1 | 7 | 7 | MIG 7g.40gb |
| A100X-4-20C | 20 | 1 | 4 | 4 | MIG 4g.20gb |
| A100X-3-20C | 20 | 2 | 3 | 3 | MIG 3g.20gb |
| A100X-2-10C | 10 | 3 | 2 | 2 | MIG 2g.10gb |
| A100X-1-10C [3] | 10 | 4 | 1 | 1 | MIG 1g.10gb |
| A100X-1-5C | 5 | 7 | 1 | 1 | MIG 1g.5gb |
| A100X-1-5CME [3] | 5 | 1 | 1 | 1 | MIG 1g.5gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A100 HGX 40GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A100X-40C | 40 | 1 | 1 | 3840x2400 | 1 |
| A100X-20C | 20 | 2 | 2 | 3840x2400 | 1 |
| A100X-10C | 10 | 4 | 4 | 3840x2400 | 1 |
| A100X-8C | 8 | 5 | 4 | 3840x2400 | 1 |
| A100X-5C | 5 | 8 | 8 | 3840x2400 | 1 |
| A100X-4C | 4 | 10 | 8 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A30#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A30-4-24C | 24 | 1 | 4 | 4 | MIG 4g.24gb |
| A30-2-12C | 12 | 2 | 2 | 2 | MIG 2g.12gb |
| A30-2-12CME [3] | 12 | 1 | 2 | 2 | MIG 2g.12gb+me |
| A30-1-6C | 6 | 4 | 1 | 1 | MIG 1g.6gb |
| A30-1-6CME [3] | 6 | 1 | 1 | 1 | MIG 1g.6gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A30#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A30-24C | 24 | 1 | 1 | 3840x2400 | 1 |
| A30-12C | 12 | 2 | 2 | 3840x2400 | 1 |
| A30-8C | 8 | 3 | 2 | 3840x2400 | 1 |
| A30-6C | 6 | 4 | 4 | 3840x2400 | 1 |
| A30-4C | 4 | 6 | 4 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A30X#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A30-4-24C | 24 | 1 | 4 | 4 | MIG 4g.24gb |
| A30-2-12C | 12 | 2 | 2 | 2 | MIG 2g.12gb |
| A30-2-12CME [3] | 12 | 1 | 2 | 2 | MIG 2g.12gb+me |
| A30-1-6C | 6 | 4 | 1 | 1 | MIG 1g.6gb |
| A30-1-6CME [3] | 6 | 1 | 1 | 1 | MIG 1g.6gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A30X#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A30-24C | 24 | 1 | 1 | 3840x2400 | 1 |
| A30-12C | 12 | 2 | 2 | 3840x2400 | 1 |
| A30-8C | 8 | 3 | 2 | 3840x2400 | 1 |
| A30-6C | 6 | 4 | 4 | 3840x2400 | 1 |
| A30-4C | 4 | 6 | 4 | 3840x2400 | 1 |

MIG-Backed NVIDIA vGPU for Compute for NVIDIA A30 Liquid-Cooled#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Slices per vGPU | Compute Instances per vGPU | Corresponding GPU Instance Profile |
|---|---|---|---|---|---|
| A30-4-24C | 24 | 1 | 4 | 4 | MIG 4g.24gb |
| A30-2-12C | 12 | 2 | 2 | 2 | MIG 2g.12gb |
| A30-2-12CME [3] | 12 | 1 | 2 | 2 | MIG 2g.12gb+me |
| A30-1-6C | 6 | 4 | 1 | 1 | MIG 1g.6gb |
| A30-1-6CME [3] | 6 | 1 | 1 | 1 | MIG 1g.6gb+me |

Time-Sliced NVIDIA vGPU for Compute for NVIDIA A30 Liquid-Cooled#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|
| A30-24C | 24 | 1 | 1 | 3840x2400 | 1 |
| A30-12C | 12 | 2 | 2 | 3840x2400 | 1 |
| A30-8C | 8 | 3 | 2 | 3840x2400 | 1 |
| A30-6C | 6 | 4 | 4 | 3840x2400 | 1 |
| A30-4C | 4 | 6 | 4 | 3840x2400 | 1 |

NVIDIA Turing GPU Architecture#

Physical GPUs per board: 1

The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.

These GPUs do not support mixed-size mode.

Intended use cases:

  • vGPUs with more than 40 GB of framebuffer: Training Workloads

  • vGPUs with 40 GB of framebuffer or less: Inference Workloads

Required license edition: NVIDIA AI Enterprise

These vGPU types support a single display with a fixed maximum resolution.

NVIDIA vGPU for Compute for NVIDIA Tesla T4#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|
| T4-16C | 16 | 1 | 3840x2400 | 1 |
| T4-8C | 8 | 2 | 3840x2400 | 1 |
| T4-4C | 4 | 4 | 3840x2400 | 1 |

NVIDIA vGPU for Compute for NVIDIA Quadro RTX 6000 Passive#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|
| RTX6000P-24C | 24 | 1 | 3840x2400 | 1 |
| RTX6000P-12C | 12 | 2 | 3840x2400 | 1 |
| RTX6000P-8C | 8 | 3 | 3840x2400 | 1 |
| RTX6000P-6C | 6 | 4 | 3840x2400 | 1 |
| RTX6000P-4C | 4 | 6 | 3840x2400 | 1 |

NVIDIA vGPU for Compute for NVIDIA Quadro RTX 8000 Passive#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|
| RTX8000P-48C | 49 | 1 | 3840x2400 | 1 |
| RTX8000P-24C | 24 | 2 | 3840x2400 | 1 |
| RTX8000P-16C | 16 | 3 | 3840x2400 | 1 |
| RTX8000P-12C | 12 | 4 | 3840x2400 | 1 |
| RTX8000P-8C | 8 | 6 | 3840x2400 | 1 |
| RTX8000P-6C | 6 | 8 | 3840x2400 | 1 |
| RTX8000P-4C | 4 | 8 | 3840x2400 | 1 |

NVIDIA Volta GPU Architecture#

Physical GPUs per board: 1

The maximum number of vGPUs per board is the product of the maximum number of vGPUs per GPU and the number of physical GPUs per board.

These GPUs do not support mixed-size mode.

Intended use cases:

  • vGPUs with more than 40 GB of framebuffer: Training Workloads

  • vGPUs with 40 GB of framebuffer or less: Inference Workloads

Required license edition: NVIDIA AI Enterprise

These vGPU types support a single display with a fixed maximum resolution.

NVIDIA vGPU for Compute for NVIDIA Tesla V100 SXM2#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|
| V100X-16C | 16 | 1 | 3840x2400 | 1 |
| V100X-8C | 8 | 2 | 3840x2400 | 1 |
| V100X-4C | 4 | 4 | 3840x2400 | 1 |

NVIDIA vGPU for Compute for NVIDIA Tesla V100 SXM2 32GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|
| V100DX-32C | 32 | 1 | 3840x2400 | 1 |
| V100DX-16C | 16 | 2 | 3840x2400 | 1 |
| V100DX-8C | 8 | 4 | 3840x2400 | 1 |
| V100DX-4C | 4 | 8 | 3840x2400 | 1 |

NVIDIA vGPU for Compute for NVIDIA Tesla V100 PCIe#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|
| V100-16C | 16 | 1 | 3840x2400 | 1 |
| V100-8C | 8 | 2 | 3840x2400 | 1 |
| V100-4C | 4 | 4 | 3840x2400 | 1 |

NVIDIA vGPU for Compute for NVIDIA Tesla V100 PCIe 32GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|
| V100D-32C | 32 | 1 | 3840x2400 | 1 |
| V100D-16C | 16 | 2 | 3840x2400 | 1 |
| V100D-8C | 8 | 4 | 3840x2400 | 1 |
| V100D-4C | 4 | 8 | 3840x2400 | 1 |

NVIDIA vGPU for Compute for NVIDIA Tesla V100S PCIe 32GB#

| Virtual GPU Type | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|
| V100S-32C | 32 | 1 | 3840x2400 | 1 |
| V100S-16C | 16 | 2 | 3840x2400 | 1 |
| V100S-8C | 8 | 4 | 3840x2400 | 1 |
| V100S-4C | 4 | 8 | 3840x2400 | 1 |

NVIDIA vGPU for Compute for NVIDIA Tesla V100 FHHL#

| Virtual GPU Type | Intended Use Case | Framebuffer (GB) | Maximum vGPUs per GPU | Maximum vGPUs per Board | Maximum Display Resolution [4] | Virtual Displays per vGPU |
|---|---|---|---|---|---|---|
| V100L-16C | Training Workloads | 16 | 1 | 1 | 3840x2400 | 1 |
| V100L-8C | Training Workloads | 8 | 2 | 2 | 3840x2400 | 1 |
| V100L-4C | Inference Workloads | 4 | 4 | 4 | 3840x2400 | 1 |

vGPU for Compute FAQs#

Q. What are the differences between NVIDIA vGPU for Compute and GPU passthrough?

  A. NVIDIA vGPU for Compute and GPU passthrough are two different approaches to deploying NVIDIA GPUs in a virtualized environment supported by NVIDIA AI Enterprise. NVIDIA vGPU for Compute enables multiple VMs to share a single physical GPU concurrently. This approach is highly cost-effective and scalable because GPU resources are efficiently distributed among various workloads. It also delivers excellent compute performance, with each VM running the NVIDIA vGPU for Compute guest driver. vGPU deployments offer live migration and suspend/resume capabilities, providing greater flexibility in VM management. In contrast, GPU passthrough dedicates an entire physical GPU to a single VM. While this provides maximum performance because the VM has exclusive access to the GPU, it does not support live migration or suspend/resume features. Since the GPU cannot be shared with other VMs, passthrough is less scalable and is typically more suitable for workloads that demand dedicated GPU power.

Q. Where do I download the NVIDIA vGPU for Compute from?

  A. NVIDIA vGPU for Compute is available for download from the NVIDIA AI Enterprise Infra Collection, which you can access by logging in to the NVIDIA NGC Catalog. If you have not already purchased NVIDIA AI Enterprise and want to try it, you can obtain an NVIDIA AI Enterprise 90-Day Trial License.

Q. What is the difference between vGPU and MIG?

  A. The fundamental distinction between vGPU and MIG lies in their approach to GPU resource partitioning.

MIG (Multi-Instance GPU) employs spatial partitioning, dividing a single GPU into several independent, isolated instances. Each MIG instance possesses its own dedicated compute cores, memory, and resources, operating simultaneously and independently. This architecture guarantees predictable performance by eliminating resource contention. While an entire MIG-enabled GPU can be passed through to a single VM, individual MIG instances cannot be directly assigned to multiple VMs without the integration of vGPU. For multi-tenancy across VMs utilizing MIG, vGPU is essential. It empowers the hypervisor to manage and allocate distinct MIG-backed vGPUs to different virtual machines. Once assigned, each MIG instance functions as a separate, isolated GPU, delivering strict resource isolation and consistent performance for workloads. For more information on using vGPU with MIG, refer to the technical brief.

vGPU (Virtual GPU) utilizes temporal partitioning. This method allows multiple virtual machines to share GPU resources by alternating access through a time-slicing mechanism. The GPU scheduler dynamically assigns time slices to each VM, effectively balancing workload demands. While this approach offers greater flexibility and higher GPU utilization, performance can vary based on the specific demands of the concurrent workloads. To enable multi-tenancy, where multiple VMs share a single physical GPU, vGPU is a prerequisite. Without vGPU, a GPU can only be assigned to one VM at a time, thereby limiting scalability and overall resource efficiency.
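To check which of the two partitioning approaches a given GPU is currently configured for, one option is to query the driver through NVML. The sketch below is illustrative only; it assumes the nvidia-ml-py package (imported as pynvml) is installed and an NVIDIA driver is present, and it simply reports whether MIG mode is enabled on each GPU.

```python
# Hedged sketch: report whether each visible GPU currently has MIG mode enabled
# (spatial partitioning) or disabled (sharing would rely on time-slicing).
# Assumes the nvidia-ml-py package (pynvml) and an NVIDIA driver are installed.
import pynvml

pynvml.nvmlInit()
try:
    for index in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):  # older bindings return bytes
            name = name.decode()
        try:
            current, _pending = pynvml.nvmlDeviceGetMigMode(handle)
            mode = ("MIG enabled (spatial partitioning)"
                    if current == pynvml.NVML_DEVICE_MIG_ENABLE
                    else "MIG disabled (time-sliced sharing)")
        except pynvml.NVMLError:
            mode = "MIG not supported on this GPU"
        print(f"GPU {index} ({name}): {mode}")
finally:
    pynvml.nvmlShutdown()
```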

Q. What is the difference between time-sliced vGPUs and MIG-backed vGPUs?

  A. Time-sliced vGPUs and MIG-backed vGPUs are two different approaches to sharing GPU resources in virtualized environments. Here are the key differences:

Differences Between Time-Sliced and MIG-Backed vGPUs#

| Time-sliced vGPUs | MIG-backed vGPUs |
|---|---|
| Share the entire GPU among multiple VMs. | Partition the GPU into smaller, dedicated instances. |
| Each vGPU gets full access to all streaming multiprocessors (SMs) and engines, but only for a specific time slice. | Each vGPU gets exclusive access to a portion of the GPU’s memory and compute resources. |
| Processes run in series, with each vGPU waiting while others use the GPU. | Processes run in parallel on dedicated hardware slices. |
| The number of VMs per GPU is limited only by framebuffer size. | Depending on the number of MIG instances supported on a GPU, this can range from 4 to 7 VMs per GPU. |
| Better for workloads that require occasional bursts of full GPU power. | Provides better performance isolation and more consistent latency. |
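As a rough, non-authoritative way to encode the trade-offs in the table above, a helper like the following could guide an initial choice; the inputs are simplified assumptions for the sketch, not an NVIDIA-defined policy.

```python
# Illustrative decision helper based on the comparison table above.
def suggest_vgpu_mode(needs_strict_isolation: bool, needs_full_gpu_bursts: bool) -> str:
    if needs_strict_isolation:
        return "MIG-backed vGPU (dedicated slices, more consistent latency)"
    if needs_full_gpu_bursts:
        return "Time-sliced vGPU (full GPU during each time slice)"
    return "Either mode; weigh utilization against isolation requirements"

print(suggest_vgpu_mode(needs_strict_isolation=True, needs_full_gpu_bursts=False))
```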

Q. Where can I find more information on the NVIDIA License System (NLS), which is the licensing solution for vGPU for Compute?

  A. You can refer to the NVIDIA License System documentation and the NLS FAQ.

Footnotes

  • NVIDIA HGX A100 4-GPU baseboard with four fully connected GPUs

  • NVIDIA HGX A100 8-GPU baseboards with eight fully connected GPUs

Fully connected means that each GPU is connected to every other GPU on the baseboard.