
Recommended NVIDIA GPUs for NVIDIA RTX vWS

Table 2 lists the hardware specifications of the most recent generation of NVIDIA data center GPUs recommended for NVIDIA RTX Virtual Workstation.

Table 2 NVIDIA GPUs Recommended for RTX vWS

	L40S ²	A16 ³	L4	A40	A10
GPUs/ Board (Architecture)	Ada Lovelace	Ampere	Ada Lovelace	Ampere	Ampere
Memory Size	48GB GDDR6 with ECC	4x 16 GB GDDR6 with ECC	24GB GDDR6 with ECC	48GB GDDR6 with ECC	24GB GDDR6 with ECC
vGPU Profiles	1GB, 2GB, 3GB, 4GB, 6GB, 8GB, 12GB, 16GB, 24GB, 48GB	1GB, 2GB, 4GB, 8GB, 16GB	1GB, 2GB, 3GB, 4GB, 6GB, 8GB, 12GB, 24GB	1GB, 2GB, 3GB, 4GB, 6GB, 8GB, 12GB, 16GB, 24GB, 48GB	1GB, 2GB, 3GB, 4GB, 6GB, 8GB, 12GB, 24GB
Form Factor	PCIe Full Height Full Length adapter 4.4” (H) x 10.5” (L), dual slot	PCIe 4.0 Dual Slot Full Height Full Length (FHFL)	PCIe low-profile, 1-slot	PCIe 4.0 Dual Slot Full Height Full Length (FHFL)	PCIe 4.0 Single Slot Full Height Full Length (FHFL)
Power	350W	250W	72W	300W	140W
Thermal	Passive	Passive	Passive	Passive	Passive
Use Case	Accelerates deep learning and machine learning training and inference, along with light to high-end 3D design and creative workflows. Flexibly runs mixed workloads for both virtual workstations and compute workloads.	Entry-level virtual workstations. Upgrade path for M10.	End-to-end acceleration for the next generation of AI-enabled applications, from generative AI, LLM inference, small-model training, and fine-tuning to 3D graphics, rendering, and video applications. Upgrade path for T4.	Light to high-end 3D design and creative workflows. Flexibly runs mixed workloads for both virtual workstations and compute workloads.	Ideal for mainstream professional visualization applications running on high-performance mid-range virtual workstations.

Note

It is essential to resize your environment when switching from Maxwell GPUs to newer GPUs such as Pascal, Turing, and Ampere. For example, the NVIDIA T4 uses ECC memory, which is enabled by default. When enabled, ECC carries a 1/15 overhead because extra VRAM is needed to store the ECC bits themselves; the frame buffer usable by vGPU is reduced accordingly. Additional information for each hypervisor can be found in the respective NVIDIA documentation.
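
The ECC overhead described in this note can be worked through with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, the 1/15 fraction is taken from the note, and 16 GB is the nominal memory size of the T4. The frame buffer actually reported by a given driver or hypervisor may differ.

```python
from fractions import Fraction

# ECC overhead as stated in the note: 1/15 of VRAM stores the ECC bits.
ECC_OVERHEAD = Fraction(1, 15)


def usable_frame_buffer_gb(total_gb, ecc_enabled=True):
    """Approximate frame buffer (GB) left for vGPU after ECC overhead.

    Hypothetical helper for sizing estimates, not an NVIDIA API.
    """
    if not ecc_enabled:
        return float(total_gb)
    return float(total_gb * (1 - ECC_OVERHEAD))


# NVIDIA T4: 16 GB nominal, ECC enabled by default.
t4_usable = usable_frame_buffer_gb(16)
print(f"T4 usable frame buffer with ECC: {t4_usable:.2f} GB")
```

For a 16 GB board this works out to roughly 14.93 GB, which is why an environment sized against the full nominal memory of an older GPU may need to be resized.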

Increased Frame Buffer Requirements with Ada and Ampere Architectures

With the Ada and Ampere architectures, increased frame buffer (FB) requirements are crucial to consider. Profiles of 1 GB or 2 GB are not recommended because they cannot meet modern workload demands. On GPUs such as the L40S or A40, small profiles can quickly run into channel limitations. Opting for larger FB profiles is therefore essential for optimal performance.

Important points to consider:

Modern Workload Demands: Applications such as high-resolution graphics, AI, and data-intensive tasks require significant GPU memory. Small FB profiles (1-2 GB) are inadequate for these applications, leading to frequent memory overflows and degraded performance.
Channel Limitations: GPUs have a limited number of channels. Smaller FB profiles quickly exhaust these channels, which prevents efficient parallel processing and can cause application errors.
Performance Optimization: Larger FB profiles provide the necessary memory bandwidth and capacity to handle complex workloads efficiently. With larger profiles, fewer vGPUs run simultaneously on a GPU, but each vGPU receives more channels, which makes channel limitations less likely and helps ensure smooth, consistent performance.
Scalability: Investing in larger FB profiles not only meets current demands but also provides a buffer for future workload increases, reducing the need for frequent upgrades.

For more details on GPU channel calculations, see Understanding GPU Channels.

Table 3 Some Recommended Configurations

Virtual GPU Type	Frame Buffer (GB)	Maximum vGPUs per GPU in Equal-Size Mode	Maximum vGPUs per GPU in Mixed-Size Mode	Use Case
A40-4Q	4	12	8	Virtual Workstations (vWS)
A40-8Q	8	6	4	Virtual Workstations (vWS)
L40S-8Q	8	6	4	Virtual Workstations (vWS)
L40S-16Q	16	3	2	Virtual Workstations (vWS)

For more details on Equal-Size and Mixed-Size modes, see vGPU Profiles.
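
As a rough planning aid, the per-GPU limits in Table 3 can be turned into a GPU count for a given number of users. The sketch below is an assumption-laden illustration: the dictionary simply copies the Table 3 values, the function name and the 30-user figure are hypothetical, and real deployments also need to account for CPU, system memory, and host limits.

```python
import math

# Values copied from Table 3:
# profile -> (frame buffer GB, max vGPUs equal-size, max vGPUs mixed-size)
TABLE_3 = {
    "A40-4Q": (4, 12, 8),
    "A40-8Q": (8, 6, 4),
    "L40S-8Q": (8, 6, 4),
    "L40S-16Q": (16, 3, 2),
}

# Both the A40 and the L40S have 48 GB of memory (Table 2).
GPU_MEMORY_GB = 48


def gpus_needed(profile, users, mixed_size=False):
    """Physical GPUs required to give every user one vGPU of this profile."""
    _, equal_max, mixed_max = TABLE_3[profile]
    per_gpu = mixed_max if mixed_size else equal_max
    return math.ceil(users / per_gpu)


# In equal-size mode the Table 3 limits match memory size / profile size.
for name, (fb_gb, equal_max, _) in TABLE_3.items():
    assert GPU_MEMORY_GB // fb_gb == equal_max

# Hypothetical example: 30 users on A40-4Q.
print(gpus_needed("A40-4Q", 30))                   # equal-size mode
print(gpus_needed("A40-4Q", 30, mixed_size=True))  # mixed-size mode
```

Because mixed-size mode allows fewer vGPUs per GPU, the same user count can require more physical GPUs than equal-size mode does.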

For detailed configurations and additional guidelines, please refer to the NVIDIA vGPU User Guide.

[2] For the L40S and A40 GPUs, 1GB profiles are not recommended. Instead, larger profile sizes are advised to fully utilize the capabilities of these GPUs. This is particularly important because these GPUs are designed to handle high-performance workloads that demand more substantial resources.

[3] NVIDIA A16 is recommended only for entry-level virtual workstations with lightweight users. A minimum 8GB (8Q) profile is recommended when deploying NVIDIA RTX Virtual Workstations with A16.
