NVIDIA vGPU Types Reference#

This reference provides complete vGPU type specifications for all supported NVIDIA GPU architectures.

Quick Navigation by GPU Architecture

🆕 Blackwell Architecture

Latest generation GPUs

  • NVIDIA B300 HGX

  • NVIDIA B200 HGX

  • NVIDIA RTX Pro 4500 Blackwell SE

  • NVIDIA RTX Pro 6000 Blackwell SE

Blackwell Architecture vGPU Types
Hopper Architecture

High-performance AI training and inference

  • NVIDIA H100, H200, H800, H20

Hopper Architecture vGPU Types
Ada Lovelace Architecture

Advanced ray tracing and AI

  • NVIDIA L4, L20, L40, L40S

  • NVIDIA RTX 6000 Ada

Ada Lovelace Architecture vGPU Types
Ampere Architecture

Proven AI and HPC performance

  • NVIDIA A100, A30, A40, A10, A16

  • NVIDIA RTX A-series

Ampere Architecture vGPU Types
Turing Architecture

First-generation ray tracing

  • NVIDIA T4

  • NVIDIA Quadro RTX

Turing Architecture vGPU Types
Volta Architecture

First Tensor Core generation

  • NVIDIA V100

Volta Architecture vGPU Types

Understanding vGPU Types

vGPU types define the GPU resources allocated to virtual machines. Each type specifies:

  • Framebuffer Size - Amount of GPU memory

  • Maximum vGPUs - Number of vGPUs supported per physical GPU

  • Compute Resources - SMs, encoders, decoders

  • License Edition - Required NVIDIA AI Enterprise license
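As a concrete illustration of how these attributes relate, a vGPU type can be modeled as a small record whose maximum vGPU count follows from how many framebuffer allocations fit in physical GPU memory. This is a minimal sketch; the profile name and memory sizes below are illustrative, not values from NVIDIA's published tables.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VgpuType:
    """Minimal model of a vGPU type (illustrative fields only)."""
    name: str                 # profile name, e.g. "L4-6Q" (hypothetical here)
    framebuffer_gb: int       # GPU memory allocated to each vGPU
    physical_memory_gb: int   # total memory of the physical GPU

    @property
    def max_vgpus_per_gpu(self) -> int:
        # The vGPU count per physical GPU is bounded by how many
        # framebuffer allocations fit in physical GPU memory.
        return self.physical_memory_gb // self.framebuffer_gb

# A hypothetical 6 GB profile on a 24 GB GPU supports 4 vGPUs per GPU.
profile = VgpuType("L4-6Q", framebuffer_gb=6, physical_memory_gb=24)
print(profile.max_vgpus_per_gpu)  # 4
```

Consult the architecture-specific tables linked above for the actual profile names and limits of each GPU.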

vGPU Configuration Modes

Table 50 vGPU Configuration Comparison#

| Mode | Isolation | Use Case | Supported Architectures |
|---|---|---|---|
| Time-Sliced | Temporal | General-purpose, cost-effective | All architectures |
| MIG-Backed | Spatial (hardware) | Multi-tenant, guaranteed performance | Ampere, Hopper, Blackwell |
| Time-Sliced MIG-Backed | Spatial + Temporal | Maximum density with isolation | Blackwell (RTX Pro 4500, RTX Pro 6000) |
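The temporal-versus-spatial distinction in the table can be sketched as a toy model: time-slicing rotates the whole GPU among VMs round-robin, while MIG hands each VM a fixed hardware slice for its whole lifetime. The VM names, quantum length, and memory sizes are made up for illustration; this is not how the real scheduler is implemented.

```python
from itertools import islice

def time_sliced(vms, quantum_ms=2):
    """Temporal partitioning: the whole GPU rotates among VMs
    round-robin, each receiving a scheduling quantum in turn."""
    while True:
        for vm in vms:
            yield vm, quantum_ms

def mig_backed(vms, total_mem_gb=80, num_slices=4):
    """Spatial partitioning: each VM receives a dedicated hardware
    slice (memory and compute) up front, held for its lifetime."""
    per_slice_gb = total_mem_gb // num_slices
    return {vm: per_slice_gb for vm in vms[:num_slices]}

# Time-sliced: vm0 and vm1 alternate on the full GPU.
schedule = [vm for vm, _ in islice(time_sliced(["vm0", "vm1"]), 4)]
# MIG-backed: four VMs each hold a fixed 20 GB slice in parallel.
partitions = mig_backed(["vm0", "vm1", "vm2", "vm3"])
```

The Time-Sliced MIG-Backed mode in the last table row combines both: MIG carves the spatial slices, and each slice is then time-multiplexed among multiple vGPUs.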

For detailed configuration guidance, refer to vGPU Configuration.


Frequently Asked Questions#

Q. What are the differences between NVIDIA vGPU for Compute and GPU passthrough?

A. Both are supported ways to use NVIDIA GPUs with NVIDIA AI Enterprise in a virtualized environment.

Table 51 vGPU for Compute vs GPU Passthrough#

| Feature | vGPU for Compute | GPU Passthrough |
|---|---|---|
| GPU sharing | One physical GPU shared by multiple VMs | One physical GPU dedicated to one VM |
| Memory isolation | Time-sliced vGPU: strong hardware-based memory isolation enforced by the IOMMU, configured through the hypervisor. MIG-backed vGPU: dedicated L2 cache, memory controllers, and DRAM address buses per MIG instance provide strong hardware-level spatial isolation | Complete; a single VM has exclusive GPU access |
| Fault isolation | Faults in one VM do not propagate to others | Complete; a GPU fault or VM crash affects only that VM and its dedicated GPU |
| Live migration | Supported on compatible hypervisors and vGPU types | Not supported for the GPU device |
| Suspend/resume | Supported on compatible hypervisors and vGPU types | Not supported for the GPU device |
| Framebuffer per VM | Fraction of physical GPU memory (configured per vGPU profile) | Full physical GPU memory |
| Scheduling | Hypervisor-managed (Best Effort, Equal Share, or Fixed Share) | N/A (VM has exclusive GPU access) |
| Heterogeneous profiles | Supported (mixed framebuffer sizes on one GPU) | N/A (single VM per GPU) |
| Best for | Multi-tenant, shared infrastructure, density optimization | Single-VM workloads requiring full GPU performance |
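The three scheduling policies in the table differ in how idle capacity is treated, which an idealized model makes concrete: Fixed Share reserves each configured vGPU's fraction even while it idles, whereas Equal Share and Best Effort divide time among the vGPUs that currently have work (Best Effort additionally makes no latency guarantee). This is a toy approximation, not NVIDIA's scheduler implementation, and the VM names are hypothetical.

```python
def gpu_time_shares(policy, configured, active):
    """Idealized per-vGPU share of GPU time under each scheduling policy.
    `configured` = vGPUs defined on the GPU; `active` = those with work."""
    if policy == "fixed":
        # Fixed Share: each configured vGPU owns 1/N of GPU time even
        # while idle, so an idle vGPU's share simply goes unused.
        return {v: 1 / len(configured) for v in configured}
    if policy in ("equal", "best_effort"):
        # Equal Share and Best Effort both divide time among active
        # vGPUs; Best Effort additionally offers no latency guarantee.
        return {v: (1 / len(active) if v in active else 0.0) for v in configured}
    raise ValueError(f"unknown policy: {policy}")

vms = ["vm0", "vm1", "vm2", "vm3"]
busy = {"vm0", "vm1"}
fixed = gpu_time_shares("fixed", vms, busy)  # 0.25 each, idle shares wasted
equal = gpu_time_shares("equal", vms, busy)  # 0.5 for busy VMs, 0.0 for idle
```

In this model, Fixed Share trades utilization for predictability, which is why it suits workloads needing consistent performance, while Best Effort maximizes utilization on shared infrastructure.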

Q. Where do I download the NVIDIA vGPU for Compute from?

A. Download from the NVIDIA AI Enterprise Infra 8 collection after signing in to the NVIDIA NGC Catalog. For evaluation without a purchase, use the NVIDIA AI Enterprise 90 Day Trial License.

Q. What is the difference between vGPU and MIG?

A. They partition the GPU differently: MIG uses fixed hardware slices (spatial partitioning); vGPU time-multiplexes access (temporal partitioning).

MIG (Multi-Instance GPU) splits one GPU into isolated instances, each with dedicated memory and compute. Instances run in parallel with minimal cross-tenant interference. You can pass through a whole MIG-capable GPU to one VM; assigning distinct MIG instances to different VMs in a multi-tenant setup uses MIG-backed vGPU so the hypervisor can hand each VM its slice. Refer to the technical brief.

vGPU (Virtual GPU) schedules time slices so multiple VMs share one physical GPU. Utilization is often high, but latency and throughput depend on what else is running. Sharing one GPU across several VMs in this way is what vGPU software is for; without it, a single GPU is typically bound to one VM at a time.

Q. What is the difference between time-sliced vGPUs and MIG-backed vGPUs?

A. Both share GPU capacity across VMs. Time-sliced vGPUs multiplex the full GPU in time (processes run in series), while MIG-backed vGPUs assign dedicated hardware partitions for true parallel execution with stronger isolation. For a detailed comparison, refer to NVIDIA vGPU for Compute Features.

Q. Where can I find more information on the NVIDIA License System (NLS), the licensing solution for vGPU for Compute?

A. Refer to the NVIDIA License System and the NLS FAQ.


Reference Pages by Architecture