Recommended NVIDIA GPUs for NVIDIA vPC#
For NVIDIA vPC deployments targeting enterprise knowledge workers, the primary considerations are user density, cost effectiveness, and a responsive desktop experience. The GPUs recommended in Table 1 are designed to help maximize user density per server while delivering cost-efficient performance for large-scale VDI deployments.
The NVIDIA RTX PRO 4500 Blackwell Server Edition is the primary recommended GPU for NVIDIA vPC, providing a strong balance of user experience, quality of service, and cost per user. It also supports Multi-Instance GPU (MIG), enabling additional deployment flexibility for vPC environments.
Table 1. Recommended NVIDIA GPUs for NVIDIA vPC

|  | RTX PRO 4500 Blackwell Server Edition | NVIDIA L4 | NVIDIA A16 |
|---|---|---|---|
| GPU Architecture | Blackwell | Ada Lovelace | Ampere |
| Number of GPUs / Board | 1 | 1 | 4 |
| Memory Size | 32 GB GDDR7 | 24 GB GDDR6 | 64 GB GDDR6 (16 GB per GPU) |
| MIG Support | Yes, up to 2 MIG slices | No | No |
| Form Factor | PCIe 5.0 Single Slot, FHFL | PCIe 4.0 Single Slot, Low Profile | PCIe 4.0 Dual Slot, FHFL |
| Power | 165 W | 72 W | 250 W |
| Thermal | Passive | Passive | Passive |
| Positioning | Primary recommendation for vPC deployments prioritizing strong user experience, QoS, and cost per user | Compact single-slot GPU for graphics, video, and inference workloads | Multi-GPU board for high-density vPC deployments |
| Maximum 1B Users per Board | Not supported on Blackwell | 24 | 64 (16 users per GPU) |
| Maximum 2B Users per Board | 16 | 12 | 32 (8 users per GPU) |
| Maximum 3B Users per Board | 10 | 8 | 20 (5 users per GPU) |
Note
To run mixed workloads on NVIDIA GPUs, refer to the NVIDIA Virtual GPU Software Packaging, Pricing, and Licensing Guide for the appropriate software licenses.
NVIDIA vGPU supports error correcting code (ECC) on capable GPUs. ECC memory improves data integrity by detecting and handling double-bit errors. However, not all GPUs, vGPU types, and hypervisor software versions support ECC memory with NVIDIA vGPU.
On some GPU architectures, enabling ECC reduces the amount of frame buffer that is usable by vGPUs. As a result, less VRAM is available to workloads. Disabling ECC increases the amount of usable VRAM, but removes ECC protection.
When planning deployments or migrating between GPU architectures, verify whether ECC affects usable VRAM on the target GPU and size the environment accordingly. This consideration is especially important for Maxwell, Pascal, and Turing GPUs. By contrast, Blackwell-generation GPUs use ECC that is built into GDDR7 memory and is always enabled without reducing usable VRAM. More information about ECC memory for NVIDIA vGPUs is available here.
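On architectures where enabling ECC reserves part of the frame buffer, the sizing impact can be sketched in a few lines of Python. The reservation fraction used here is a placeholder assumption for illustration only; verify the actual usable frame buffer on the target GPU (for example with nvidia-smi) before sizing profiles:

```python
def usable_framebuffer_gb(total_gb: float, ecc_enabled: bool,
                          ecc_overhead: float = 0.0625) -> float:
    """Usable frame buffer after an (assumed) ECC reservation.

    ecc_overhead is a hypothetical fraction, not a published figure.
    On Blackwell GPUs, GDDR7 on-die ECC reserves no frame buffer,
    so the effective overhead there is 0.0.
    """
    return total_gb * (1.0 - ecc_overhead) if ecc_enabled else total_gb

def max_vgpus(total_gb: float, profile_gb: float, ecc_enabled: bool,
              ecc_overhead: float = 0.0625) -> int:
    """How many fixed-size vGPU profiles fit on one physical GPU."""
    return int(usable_framebuffer_gb(total_gb, ecc_enabled,
                                     ecc_overhead) // profile_gb)

# Example: with the assumed reservation, a 16 GB GPU loses one
# 1 GB profile slot when ECC is enabled.
print(max_vgpus(16, 1, ecc_enabled=False))  # 16
print(max_vgpus(16, 1, ecc_enabled=True))   # 15
```

The point of the sketch is that a small frame buffer reservation can remove an entire profile slot, which is why migrations between architectures should be re-sized rather than assumed equivalent.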
GPUs Supporting MIG-Backed vGPU#
The NVIDIA RTX PRO 4500 Blackwell Server Edition supports MIG-backed vGPU, enabling virtual GPUs to be created from individual MIG slices and assigned to virtual machines. This model combines MIG’s hardware-level spatial partitioning with the temporal partitioning capabilities of vGPU, providing stronger isolation, more predictable performance, better quality of service (QoS), and greater flexibility in how GPU resources are shared across workloads.
For multi-user vPC environments, standard time-sliced vGPU improves utilization through temporal sharing, but MIG-backed vGPU addresses a different requirement: predictability and isolation at scale. By providing hardware-isolated GPU instances, MIG-backed vGPU helps reduce contention between workloads and supports more consistent per-VM performance. This makes MIG-backed vGPU especially valuable in enterprise knowledge worker deployments, where organizations want to scale efficiently without compromising the consistency of the virtual desktop experience. More information on MIG-backed vGPU is available here.
NVIDIA RTX PRO 4500 Blackwell Server Edition#
The NVIDIA RTX PRO 4500 Blackwell Server Edition is the primary recommended GPU for NVIDIA vPC. With 32 GB of GDDR7 memory and a 165 W single-slot design, it provides a strong balance of performance, user density, and deployment efficiency for modern virtual desktop environments. It also delivers higher NVENC throughput, enabling more efficient remote display streaming and helping improve image quality and responsiveness for end users.
For enterprise knowledge worker deployments, the RTX PRO 4500 Blackwell Server Edition delivers four core advantages: cost effectiveness, user density, responsive end-user experience, and power-efficient rack deployment. Its Blackwell architecture introduces Universal MIG, enabling up to two isolated GPU instances for compute and graphics workloads with predictable performance and strict resource guarantees.
The RTX PRO 4500 Blackwell Server Edition is also a strong modernization path for customers moving from earlier generations of data center GPUs, including NVIDIA Ampere (A-series) or Ada Lovelace (L-series) architectures. Its architectural improvements and deployment-ready integration with vGPU 20.0 make it a compelling choice for organizations refreshing their virtual desktop infrastructure.
NVIDIA L4#
The NVIDIA L4, based on the NVIDIA Ada Lovelace architecture, is a low-profile, single-slot card with 24 GB of GDDR6 memory. It is optimized for video and inference at scale across a broad range of AI applications, enabling high-quality, personalized user experiences. The L4 offers a cost-effective and energy-efficient solution for delivering high throughput and low latency, making it well suited for deployment in any server—from the edge, to the data center, to the cloud.
NVIDIA A16#
The NVIDIA A16, based on the NVIDIA Ampere architecture, is a dual-slot FHFL card with 64 GB of total memory (4 GPUs with 16 GB each) and a 250 W passive design. It is optimized for high user density and delivers exceptional value in vPC environments targeting task and knowledge workers. The A16 enables IT teams to maximize data center efficiency by consolidating user sessions at scale.
Knowledge Worker VDI#
For knowledge worker VDI workloads, the principal factors in determining cost effectiveness are performance per dollar, user density, and end-user experience. As more users are added to a server, CPU utilization increases. Adding an NVIDIA GPU for this workload offloads graphics rendering from the CPU, helping improve responsiveness and overall desktop experience for end users.
In this context, the NVIDIA RTX PRO 4500 Blackwell Server Edition is a modern platform for enterprise vPC deployments that require a strong balance of density, user experience, and deployment efficiency. In particular, an 8x RTX PRO 4500 Blackwell Server Edition per server configuration provides a compelling alternative to the NVIDIA A16 for organizations seeking modern architecture, predictable QoS through MIG-backed vGPU, and strong rack-level efficiency.
Table 2. Maximum User Density per 2U Server

| GPU | Maximum Users per GPU Board | Maximum Boards per 2U Server | Maximum Users per 2U Server |
|---|---|---|---|
| RTX PRO 6000 Blackwell Server Edition (with 2 GB Profile Size) | 48 [1] | 4 | 192 [2] |
| RTX PRO 4500 Blackwell Server Edition (with 2 GB Profile Size) | 16 | 8 | 128 |
| L40S (with 1 GB Profile Size) | 32 | 8 | 256 |
| L4 (with 1 GB Profile Size) | 24 | 16 | 384 |
| A40 (with 1 GB Profile Size) | 32 | 8 | 256 |
| A10 (with 1 GB Profile Size) | 24 | 16 | 384 |
| A16 (with 1 GB Profile Size) | 64 (16 x 4) | 4 | 256 |
Table 2 assumes that each user requires a vGPU profile with 1 GB or 2 GB of frame buffer, and therefore reflects maximum supported density rather than a recommended deployment point. Higher user density does not necessarily translate to the best end-user experience, because the optimal profile size depends on the applications, display configuration, and user behavior in a given environment. Modern operating systems, including Windows 11, and evolving application workloads require larger frame buffer sizes. As GPU demands continue to increase, larger profile sizes (2B or 3B) are often needed to maintain a consistent user experience. To determine the profile size and user density that provide the best end-user experience, NVIDIA recommends conducting a proof of concept (POC).
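The per-server figures in Table 2 follow from simple arithmetic: each physical GPU hosts floor(frame buffer / profile size) sessions, multiplied by the number of GPUs per board and boards per server. A minimal sketch of that sizing math, using board specifications from the tables in this section:

```python
def users_per_board(gpus_per_board: int, gb_per_gpu: int,
                    profile_gb: int) -> int:
    """Maximum vGPU sessions per board: each physical GPU hosts
    floor(frame buffer / profile size) sessions."""
    return gpus_per_board * (gb_per_gpu // profile_gb)

def users_per_server(gpus_per_board: int, gb_per_gpu: int,
                     profile_gb: int, boards: int) -> int:
    """Per-server density is per-board density times boards per server."""
    return users_per_board(gpus_per_board, gb_per_gpu, profile_gb) * boards

# NVIDIA A16: 4 GPUs x 16 GB, 1 GB profiles, 4 boards per 2U server
print(users_per_server(4, 16, 1, boards=4))   # 256
# NVIDIA L4: 1 GPU x 24 GB, 1 GB profiles, 16 boards per 2U server
print(users_per_server(1, 24, 1, boards=16))  # 384
# RTX PRO 4500: 1 GPU x 32 GB, 2 GB profiles, 8 boards per 2U server
print(users_per_server(1, 32, 2, boards=8))   # 128
```

This reproduces the table's maximums; as the note above explains, a POC should still determine whether those maximums deliver an acceptable experience.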
Impact of GPU Sharing#
NVIDIA vGPU software enables multiple virtual machines (VMs) to share a single physical GPU. This improves overall GPU utilization, but the way resources are shared depends on the underlying virtualization technology: either time-sliced vGPU or MIG-backed vGPU.
Time-Sliced vGPU Sharing#
With time-sliced vGPU, multiple VMs share GPU access over time. NVIDIA vGPU software uses the best effort scheduler by default, which aims to balance performance across vGPUs.
Scheduling Options for GPU Sharing#
To accommodate a variety of Quality of Service (QoS) levels for sharing a GPU, NVIDIA vGPU software provides multiple GPU scheduling options. For more information about these GPU scheduling options, refer to vGPU Schedulers.
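As a toy illustration of how scheduling policy affects QoS, the sketch below contrasts a best-effort style split, where GPU time is divided among the vGPUs that currently have work, with a fixed-share style split, where every vGPU keeps its reserved fraction even when idle. This is a simplified model for intuition only, not the scheduler's actual algorithm:

```python
def best_effort_shares(demands: dict) -> dict:
    """Toy best-effort model: GPU time is divided among busy vGPUs,
    so idle vGPUs cost nothing but busy neighbors affect each other."""
    active = [v for v, d in demands.items() if d > 0]
    if not active:
        return {v: 0.0 for v in demands}
    return {v: (1 / len(active) if v in active else 0.0) for v in demands}

def fixed_share_shares(demands: dict) -> dict:
    """Toy fixed-share model: each vGPU keeps its reserved fraction
    of GPU time whether or not it is busy."""
    n = len(demands)
    return {v: 1 / n for v in demands}

demands = {"vm1": 1.0, "vm2": 1.0, "vm3": 0.0}  # vm3 is idle
print(best_effort_shares(demands))  # vm1 and vm2 each get 0.5
print(fixed_share_shares(demands))  # every VM keeps 1/3
```

The trade-off the sketch illustrates is utilization versus predictability: best effort reclaims idle cycles, while a fixed share guarantees each VM a floor regardless of its neighbors.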
MIG-Backed vGPU Sharing#
With MIG (Multi-Instance GPU), a single physical GPU is partitioned at the hardware level into multiple fully isolated GPU instances. This provides guaranteed performance isolation between VMs.
Performance Allocation#
Unlike standard time-sliced vGPU, MIG partitions a physical GPU into hardware-isolated GPU instances. Each MIG instance is assigned a dedicated slice of GPU resources, with its own streaming multiprocessors (SMs) and memory subsystem, providing stronger performance isolation.
Multiple vGPUs can also be created within a single MIG slice. This capability is referred to as MIG-backed time-sliced vGPU. In this case, the vGPUs remain bounded by the hardware-isolated resources of the MIG slice to which they are assigned. For example:

- When MIG instances are created, each instance delivers consistent and isolated performance to its assigned VM.
- Within each MIG slice, vGPUs can be created and time-sliced within that isolated slice. These vGPUs can be assigned to separate VMs, which continue to benefit from MIG’s hardware-level isolation boundaries.
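The containment model described above, where hardware-isolated MIG slices each host their own time-sliced vGPUs, can be sketched as a small Python data model. The class, the equal 16 GB split, and the capacity check are illustrative assumptions, not the vGPU software's actual behavior:

```python
from dataclasses import dataclass, field

@dataclass
class MigSlice:
    """A hardware-isolated GPU instance with its own SMs and memory."""
    memory_gb: int
    vgpus: list = field(default_factory=list)

    def add_vgpu(self, name: str, profile_gb: int) -> None:
        # Time-sliced vGPUs inside a slice are bounded by the
        # slice's own frame buffer, not by the whole GPU's.
        used = sum(gb for _, gb in self.vgpus)
        if used + profile_gb > self.memory_gb:
            raise ValueError("profile does not fit in this MIG slice")
        self.vgpus.append((name, profile_gb))

# RTX PRO 4500 Blackwell SE: 32 GB split into two MIG slices
# (an equal split is assumed here for illustration); each slice
# time-slices its own set of VMs.
slices = [MigSlice(16), MigSlice(16)]
for i in range(8):
    slices[i % 2].add_vgpu(f"vm{i}", profile_gb=2)
print([len(s.vgpus) for s in slices])  # [4, 4]
```

Contention inside one slice stays inside that slice: a VM in the first slice cannot consume frame buffer or SMs reserved for the second.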