NVIDIA vGPU for Compute Installation#

Install NVIDIA vGPU for Compute to enable GPU virtualization.

Installation Overview

  1. Verify Prerequisites - Confirm hardware, BIOS settings, and licensing requirements

  2. Install :term:`NGC CLI` - Download software from NVIDIA NGC Catalog

  3. Install :term:`Virtual GPU Manager` - Deploy on hypervisor host (VMware, KVM, Nutanix)

  4. Verify :term:`Fabric Manager` - Included in the NVIDIA AI Enterprise drivers for HGX multi-GPU configurations

  5. Install :term:`vGPU Guest Driver` - Deploy in each virtual machine

  6. Configure Licensing - Connect VMs to NVIDIA License System

Refer to the NVIDIA AI Enterprise Product Support Matrix for supported platforms and versions.

Prerequisites#

Confirm the following before you install NVIDIA vGPU for Compute.

System Requirements#

  • At least one NVIDIA data center GPU in an NVIDIA AI Enterprise-compatible NVIDIA-Certified System. NVIDIA recommends the following GPUs based on your infrastructure.

    Table 47 System Requirements Use Cases#

    Use Case: AI Inference and Mainstream AI Servers

      • NVIDIA A30
      • NVIDIA A100
      • 1 - 8x NVIDIA L4
      • NVIDIA L40S
      • NVIDIA H100 NVL
      • NVIDIA H200 NVL
      • NVIDIA RTX Pro 6000 Blackwell Server Edition
      • NVIDIA RTX Pro 4500 Blackwell Server Edition

    Use Case: AI Model Training (Large) and Inference (HGX Scale Up and Out Server)

      • NVIDIA H100 HGX
      • NVIDIA H200 HGX
      • NVIDIA B200 HGX
      • NVIDIA B300 HGX

  • If you are using GPUs based on the NVIDIA Ampere architecture or later, ensure that the following BIOS settings are enabled on your server platform:

    • Single Root I/O Virtualization (SR-IOV) - Enabled

    • VT-d/IOMMU - Enabled

  • NVIDIA AI Enterprise License

  • NVIDIA AI Enterprise Software:

    • NVIDIA Virtual GPU Manager

    • NVIDIA vGPU for Compute Guest Driver

Use nvidia-smi for testing, monitoring, and benchmarking.
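For example, these are common nvidia-smi invocations for those tasks (a sketch using standard nvidia-smi options; the output depends on your GPUs and driver version):

```shell
# One-shot status: driver version, GPUs, utilization, and memory
nvidia-smi

# Machine-readable query of selected fields
nvidia-smi --query-gpu=name,driver_version,utilization.gpu,memory.used --format=csv

# Refresh the full status every 5 seconds for monitoring
nvidia-smi -l 5
```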

Recommended server settings:

  • Hyperthreading - Enabled

  • Power Setting or System Profile - High Performance

  • CPU Performance - Enterprise or High Throughput (if available in the BIOS)

  • Memory Mapped I/O greater than 4 GB - Enabled (if available in the BIOS)
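After enabling SR-IOV and VT-d/IOMMU in the BIOS, you can sanity-check from the Linux hypervisor host that the kernel actually activated the IOMMU. This is a hedged sketch using standard Linux sysfs paths, not an NVIDIA-specific tool:

```shell
# With VT-d/IOMMU active, the kernel populates /sys/kernel/iommu_groups.
if [ -d /sys/kernel/iommu_groups ] && [ -n "$(ls -A /sys/kernel/iommu_groups 2>/dev/null)" ]; then
    echo "IOMMU is active"
    iommu_state=on
else
    echo "IOMMU not active: verify BIOS VT-d/IOMMU and kernel boot parameters"
    iommu_state=off
fi
```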

Installing NGC CLI#

Use the NGC Catalog CLI to download NVIDIA Virtual GPU Manager and the vGPU for Compute Guest Driver from the NVIDIA NGC Catalog.

To install the NGC Catalog CLI:

  1. Log in to the NVIDIA NGC Catalog.

  2. In the top right corner, click Welcome and then select Setup from the menu.

  3. Click Downloads under Install NGC CLI from the Setup page.

  4. From the CLI Install page, click the Windows, Linux, or macOS tab, according to the platform from which you will run the NGC Catalog CLI.

  5. Follow the instructions to install the CLI.

  6. Verify the installation by entering ngc --version in a terminal or command prompt. The output should be NGC Catalog CLI x.y.z where x.y.z indicates the version.

  7. Configure the NGC CLI so that you can run commands. Enter the following command; when prompted, paste your NGC API key:

    $ ngc config set
    
    Enter API key [no-apikey]. Choices: [<VALID_APIKEY>, 'no-apikey']: (COPY/PASTE API KEY)
    
    Enter CLI output format type [ascii]. Choices: [ascii, csv, json]: ascii
    
    Enter org [no-org]. Choices: ['no-org']:
    
    Enter team [no-team]. Choices: ['no-team']:
    
    Enter ace [no-ace]. Choices: ['no-ace']:
    
    Successfully saved NGC configuration to /home/$username/.ngc/config
    
  8. In a terminal or command window, run:

    • NVIDIA Virtual GPU Manager

      ngc registry resource download-version "nvidia/vgpu/vgpu-host-driver-X:X.X"
      
    • NVIDIA vGPU for Compute Guest Driver

      ngc registry resource download-version "nvidia/vgpu/vgpu-guest-driver-X:X.X"
      

For more information on configuring the NGC CLI, refer to the Getting Started with the NGC CLI documentation.

Installing NVIDIA Virtual GPU Manager#

Install the Virtual GPU Manager on the hypervisor to enable GPU virtualization. The installation steps depend on your hypervisor platform. This section assumes that:

  • You have downloaded the Virtual GPU Manager software from NVIDIA NGC Catalog

  • You want to deploy the NVIDIA vGPU for Compute on a single server node

Table 48 Hypervisor Platform Installation Instructions for the NVIDIA Virtual GPU Manager#

  • Red Hat Enterprise Linux KVM: Installing and Configuring the NVIDIA Virtual GPU Manager for Red Hat Enterprise Linux KVM

  • Ubuntu KVM: Installing and Configuring the NVIDIA Virtual GPU Manager for Ubuntu

  • VMware vSphere: Installing and Configuring the NVIDIA Virtual GPU Manager for VMware vSphere

Next, install the vGPU Guest Driver in each guest VM.

NVIDIA Fabric Manager on HGX Servers#

NVIDIA Fabric Manager coordinates NVSwitch and NVLink on NVIDIA HGX platforms for multi-GPU VMs.

Starting with NVIDIA AI Enterprise Infra 8.0 (vGPU 20.0), Fabric Manager and Fabric Manager development binaries are integrated into the NVIDIA AI Enterprise drivers. A separate Fabric Manager installation is no longer required. NVIDIA NVLink System Monitor (NVLSM) continues to be provided as a standalone utility.

When Fabric Manager Is Required

  • Required for VMs with 1, 2, 4, or 8 GPUs on HGX platforms

  • Necessary for Ampere, Hopper, and Blackwell HGX systems with NVSwitch

  • Enables high-bandwidth interconnect topologies for AI training and large-scale workloads

Fabric Manager provides a unified GPU memory fabric, monitors NVLink connections, and supports high-bandwidth communication among GPUs in the same VM.

Note

  • Fabric Manager is available after you install the NVIDIA Virtual GPU Manager or NVIDIA Data Center GPU Driver. No separate package installation is required.

  • Start the Fabric Manager service before creating VMs with multi-GPU configurations. Without it on HGX, GPU topologies inside the VM may be incomplete or non-functional. For capabilities, configuration, and usage, refer to the NVIDIA Fabric Manager User Guide.

  • For Fabric Manager integration or 1-, 2-, 4-, or 8-GPU VM deployment on your hypervisor, refer to your hypervisor vendor documentation.
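On a typical systemd-based host, the Fabric Manager service can be started and its state verified as follows. This is a sketch: the service name nvidia-fabricmanager matches current NVIDIA packaging, but confirm it against your driver release and the NVIDIA Fabric Manager User Guide.

```shell
# Enable and start the Fabric Manager service before creating multi-GPU VMs
sudo systemctl enable --now nvidia-fabricmanager

# Confirm the service is running
systemctl status nvidia-fabricmanager --no-pager

# On HGX systems with NVSwitch, nvidia-smi reports a fabric state per GPU
nvidia-smi -q | grep -A 2 "Fabric"
```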

Installing NVIDIA vGPU Guest Driver#

Install the NVIDIA vGPU Guest Driver in each virtual machine to enable GPU access. The process is the same for vGPU, passthrough, and bare-metal deployments. This section assumes:

  • You have downloaded the vGPU for Compute Guest Driver from NVIDIA NGC Catalog

  • The Guest VM has been created and booted on the hypervisor

After installation, license each guest VM through the NVIDIA License System to unlock full capabilities. Refer to Licensing vGPU VMs. Then configure vGPU profiles as described in Configuration.
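Once the guest driver is installed and the VM is licensed, you can confirm the license state from inside the VM (a sketch; the exact wording of the license fields varies by driver branch):

```shell
# Inside the guest VM: show the vGPU software license status reported
# by the driver (e.g. "Licensed" once the License System check succeeds)
nvidia-smi -q | grep -i "license"
```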

Installing the NVIDIA GPU Operator Using a Bash Shell Script#

A bash script that installs the NVIDIA GPU Operator with the NVIDIA vGPU for Compute Driver is available in the NVIDIA AI Enterprise Infra 8 collection.

Note

Use this path only when the Guest VM does not already have the vGPU for Compute Driver; the GPU Operator installs that driver.

Refer to the GPU Operator for deploying the vGPU for Compute Driver with the script.

Installing NVIDIA AI Enterprise Applications Software#

Prerequisites#

Before you install any NVIDIA AI Enterprise container, confirm the following:

  • The guest OS is supported.

  • The VM has a valid vGPU for Compute license (refer to Licensing vGPU VMs).

  • At least one NVIDIA GPU is visible to the system.

  • The vGPU for Compute Guest Driver is installed and nvidia-smi lists the GPU.
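The driver and GPU-visibility checks above can be partially automated. This sketch only verifies that the driver is present and GPUs are visible; guest OS support and licensing still need the referenced documentation:

```shell
# Pre-install check: is the vGPU for Compute Guest Driver present,
# and how many GPUs does it expose to this VM?
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_count=$(nvidia-smi --query-gpu=name --format=csv,noheader | wc -l)
    echo "GPUs visible to the VM: ${gpu_count}"
    status=ok
else
    echo "nvidia-smi not found: install the vGPU for Compute Guest Driver first"
    status=missing-driver
fi
```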

Installing Docker Engine

Install Docker for your Guest VM Linux distribution using the official Docker Installation Guide.

Installing the NVIDIA Container Toolkit

The NVIDIA Container Toolkit adds a runtime and helpers so Docker containers use NVIDIA GPUs automatically. Enable the Docker repo and install the toolkit on the Guest VM per Installing the NVIDIA Container Toolkit.

Then configure the Docker runtime using Configuration.
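With the toolkit installed, the Docker runtime is typically registered like this (nvidia-ctk ships with the NVIDIA Container Toolkit; verify the exact steps against the toolkit documentation for your version):

```shell
# Register the NVIDIA runtime in /etc/docker/daemon.json
sudo nvidia-ctk runtime configure --runtime=docker

# Restart Docker so it picks up the new runtime
sudo systemctl restart docker
```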

Verifying the Installation: Run a Sample CUDA Container

Run a sample CUDA container test on the GPU per Running a Sample Workload.
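A minimal sample-workload test of this kind, following the NVIDIA Container Toolkit documentation (the base image is illustrative):

```shell
# Run nvidia-smi inside a container to confirm the GPU is exposed
# through the NVIDIA runtime
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```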

Accessing NVIDIA AI Enterprise Containers on NGC

NVIDIA AI Enterprise application images live in the NVIDIA NGC Catalog under the NVIDIA AI Enterprise Supported label.

Each image ships the user-space stack for that workload: CUDA libraries, cuDNN, Magnum IO where needed, TensorRT, and the framework.

  1. Create an NGC API key using the catalog URL NVIDIA provides.

  2. Authenticate with Docker to NGC Registry. In your shell, run:

    docker login nvcr.io
    Username: $oauthtoken
    Password: <paste-your-NGC_API_key-here>
    
    A successful login (``Login Succeeded``) lets you pull containers from NGC.
    
  3. From the NVIDIA vGPU for Compute VM, browse the NGC Catalog for containers labeled NVIDIA AI Enterprise Supported.

  4. Copy the relevant docker pull command.

    sudo docker pull nvcr.io/nvaie/rapids-pb25h1:x.y.z-runtime
    

    Where x.y.z is the version of your container.

  5. Run the container with GPU access.

    sudo docker run --gpus all -it --rm nvcr.io/nvaie/rapids-pb25h1:x.y.z-runtime
    

    Where x.y.z is the version of your container.

    This starts an interactive session with all vGPUs on the Guest VM exposed to the container.

Podman (a Docker alternative) follows a similar install flow for NVIDIA AI Enterprise containers. See NVIDIA AI Enterprise: RHEL with KVM Deployment Guide.

Cloud Native Stack (CNS) bundles Ubuntu or RHEL, Kubernetes, Helm, and the NVIDIA GPU and Network Operator for cloud-native GPU workloads.

Use the repository installation guides for OS-specific steps and for deploying an NGC Catalog app to validate GPU access.

Next Steps#

After installing the vGPU Manager and guest drivers: