Unified Virtual Memory (UVM)#
Unified Virtual Memory (UVM) gives the guest a single virtual address space visible to both CPU and GPU. Allocations in that space can be touched from host and device code without the application issuing an explicit copy for every transfer, which simplifies some CUDA programs.
For behavior and enablement steps, refer to the Unified Virtual Memory documentation.
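As an illustration of the programming model, a single managed allocation can be touched from both host and device code. This is a minimal CUDA sketch, assuming a UVM-capable vGPU and the CUDA toolkit inside the guest; error handling is omitted for brevity:

```cuda
// Minimal UVM sketch: one allocation visible to both CPU and GPU.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;

    // One managed allocation; no explicit cudaMemcpy in either direction.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;     // written on the host
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f); // updated on the device
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);              // host reads the GPU result
    cudaFree(data);
    return 0;
}
```

The same pointer is dereferenced on the host and passed to the kernel; the UVM driver migrates or maps pages on demand instead of requiring an explicit copy.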
UVM Known Issues and Limitations#
Unified Virtual Memory (UVM) is supported only on 1:1 time-sliced and MIG-backed vGPU for Compute profiles, which allocate the entire framebuffer of a compatible physical GPU or GPU Instance. Fractional time-sliced vGPU profiles do not support UVM.
UVM is supported only on Linux guest OS distributions. Windows guest OSes are not supported.
Enabling UVM disables vGPU migration for the VM, which may reduce operational flexibility in environments reliant on live migration.
UVM is disabled by default and must be explicitly enabled for each vGPU that requires it by setting a specific vGPU plugin parameter for the VM.
When deploying NVIDIA NIM, if UVM is enabled and an optimized engine is available, the model will run on the TensorRT-LLM (TRT-LLM) backend. Otherwise, it will typically run on the vLLM backend.
Hypervisor Platform Support for UVM#
Unified Virtual Memory (UVM) is disabled by default. To use it, you must enable unified memory individually for each vGPU for Compute VM that requires it by setting a vGPU plugin parameter. How you enable UVM for a vGPU VM depends on the hypervisor that you are using.
| Hypervisor Platform | Documentation |
|---|---|
| Red Hat Enterprise Linux with KVM | Enabling Unified Memory for NVIDIA vGPU for Compute VM on Red Hat Enterprise Linux KVM |
| Ubuntu with KVM | Enabling Unified Memory for NVIDIA vGPU for Compute VM on Ubuntu KVM |
| VMware vSphere | Enabling Unified Memory for NVIDIA vGPU for Compute VM on VMware vSphere |
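As an illustrative sketch of the per-vGPU plugin parameter (not a substitute for the hypervisor-specific pages above; the mdev UUID below is a placeholder, and the `enable_uvm` parameter name follows NVIDIA's vGPU software documentation):

```shell
# On Red Hat Enterprise Linux KVM or Ubuntu KVM: write the plugin
# parameter to the sysfs node of the mdev device that backs the vGPU.
# The UUID below is a placeholder for your vGPU's mdev device UUID.
echo "enable_uvm=1" > \
  /sys/bus/mdev/devices/aa618089-8b16-4d01-a136-25a0f3c73123/nvidia/vgpu_params

# On VMware vSphere: with the VM powered off, add an advanced VM
# configuration parameter, where 0 indexes the vGPU PCI device:
#   pciPassthru0.cfg.enable_uvm = "1"
```

The parameter must be set before the VM is powered on; confirm the exact syntax against the documentation for your hypervisor release.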
vGPU Support for UVM#
UVM is supported on MIG-backed vGPUs and on 1:1 time-sliced vGPUs, both of which assign the entire framebuffer of a MIG GPU Instance or physical GPU to a single vGPU.
| Board | vGPU |
|---|---|
| NVIDIA HGX B300 SXM | |
| NVIDIA HGX B200 SXM | |
| NVIDIA RTX Pro 6000 Blackwell Server Edition | |
| NVIDIA RTX Pro 4500 Blackwell SE | |
| Board | vGPU |
|---|---|
| NVIDIA H800 PCIe 94 GB (H800 NVL) | |
| NVIDIA H800 PCIe 80 GB | |
| NVIDIA H800 SXM5 80 GB | |
| NVIDIA H200 SXM5 | |
| NVIDIA H200 NVL | |
| NVIDIA H100 PCIe 94 GB (H100 NVL) | |
| NVIDIA H100 SXM5 94 GB | |
| NVIDIA H100 PCIe 80 GB | |
| NVIDIA H100 SXM5 80 GB | |
| NVIDIA H100 SXM5 64 GB | |
| NVIDIA H20 SXM5 141 GB | |
| NVIDIA H20 SXM5 96 GB | |
| Board | vGPU |
|---|---|
| NVIDIA L40 | L40-48C |
| NVIDIA L40S | L40S-48C |
| NVIDIA L20 | L20-48C |
| NVIDIA L4 | L4-24C |
| NVIDIA L2 | L2-24C |
| NVIDIA RTX 6000 Ada | RTX 6000 Ada-48C |
| NVIDIA RTX 5880 Ada | RTX 5880 Ada-48C |
| NVIDIA RTX 5000 Ada | RTX 5000 Ada-32C |
| Board | vGPU |
|---|---|
| NVIDIA A800 PCIe 40 GB active-cooled | |
| NVIDIA A800 HGX 80 GB | |
| NVIDIA A100 HGX 80 GB | |
| NVIDIA A100 PCIe 40 GB | |
| NVIDIA A100 HGX 40 GB | |
| NVIDIA A40 | A40-48C |
| NVIDIA A16 | A16-16C |
| NVIDIA A10 | A10-24C |
| NVIDIA RTX A6000 | A6000-48C |
| NVIDIA RTX A5500 | A5500-24C |
| NVIDIA RTX A5000 | A5000-24C |