NVIDIA RTX vWS: Sizing and GPU Selection Guide for Virtualized Workloads

Recommended NVIDIA GPUs for NVIDIA RTX vWS

Table 2 lists the hardware specifications for the most recent generation of NVIDIA data center GPUs recommended for NVIDIA RTX Virtual Workstation.

Table 2 - NVIDIA GPUs Recommended for RTX vWS

| | L40S [2] | A16 [3] | L4 | A40 | A10 |
|---|---|---|---|---|---|
| Architecture | Ada Lovelace | Ampere | Ada Lovelace | Ampere | Ampere |
| Memory Size | 48 GB GDDR6 with ECC | 4x 16 GB GDDR6 with ECC | 24 GB GDDR6 with ECC | 48 GB GDDR6 with ECC | 24 GB GDDR6 with ECC |
| vGPU Profiles | 1GB, 2GB, 3GB, 4GB, 6GB, 8GB, 12GB, 16GB, 24GB, 48GB | 1GB, 2GB, 4GB, 8GB, 16GB | 1GB, 2GB, 3GB, 4GB, 6GB, 8GB, 12GB, 24GB | 1GB, 2GB, 3GB, 4GB, 6GB, 8GB, 12GB, 16GB, 24GB, 48GB | 1GB, 2GB, 3GB, 4GB, 6GB, 8GB, 12GB, 24GB |
| Form Factor | PCIe Full Height Full Length adapter, 4.4" (H) x 10.5" (L), dual slot | PCIe 4.0, dual slot, Full Height Full Length (FHFL) | PCIe low-profile, single slot | PCIe 4.0, dual slot, Full Height Full Length (FHFL) | PCIe 4.0, single slot, Full Height Full Length (FHFL) |
| Power | 350W | 250W | 72W | 300W | 140W |
| Thermal | Passive | Passive | Passive | Passive | Passive |
| Use Case | Accelerates deep learning and machine learning training and inference, along with light to high-end 3D design and creative workflows. Flexibly runs mixed workloads for both virtual workstations and compute workloads. | Entry-level virtual workstations. Upgrade path for M10. | End-to-end acceleration for the next generation of AI-enabled applications, from gen AI, LLM inference, small-model training, and fine-tuning to 3D graphics, rendering, and video applications. Upgrade path for T4. | Light to high-end 3D design and creative workflows. Flexibly runs mixed workloads for both virtual workstations and compute workloads. | Ideal for mainstream professional visualization applications running on high-performance mid-range virtual workstations. |

Note

It is essential to resize your environment when switching from Maxwell GPUs to newer GPUs such as Pascal, Turing, and Ampere. For example, the NVIDIA T4 uses ECC memory, which is enabled by default. When enabled, ECC has a 1/15 overhead cost because extra VRAM is needed to store the ECC bits themselves; therefore, the amount of frame buffer usable by vGPU is reduced. Additional information for each hypervisor can be found in the respective NVIDIA documentation.
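The effect of the ECC overhead on usable frame buffer can be estimated with a short calculation. The sketch below assumes the 1/15 figure applies to the whole physical frame buffer; the function name `usable_frame_buffer_gb` and the 16 GB board size are illustrative choices, not values taken from NVIDIA documentation.

```python
from fractions import Fraction

# Assumption: ECC reserves 1/15 of the physical frame buffer, per the note above.
ECC_OVERHEAD = Fraction(1, 15)


def usable_frame_buffer_gb(physical_gb, ecc_enabled=True):
    """Return the frame buffer (GB) left for vGPU after ECC overhead."""
    if not ecc_enabled:
        return float(physical_gb)
    return float(physical_gb * (1 - ECC_OVERHEAD))


if __name__ == "__main__":
    # 16 GB is an illustrative board size used only for this example.
    print(round(usable_frame_buffer_gb(16), 2))
```

Under this assumption, roughly 1 GB of a 16 GB board is unavailable to vGPU when ECC is on, which is worth accounting for when counting how many profiles fit.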

With the Ada and Ampere architectures, it is crucial to account for increased frame buffer (FB) requirements. 1 GB and 2 GB profiles are not recommended because they cannot meet modern workload demands. On GPUs like the L40S or A40, small profiles can quickly lead to channel limitations, so larger FB profiles are essential for optimal performance.

Important points to consider:

  • Modern Workload Demands: Applications such as high-resolution graphics, AI, and data-intensive tasks require significant GPU memory. Small FB profiles (1-2 GB) are inadequate for these applications, leading to frequent memory overflows and degraded performance.

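  • Channel Limitations: Each GPU has a limited number of channels, which are divided among the vGPUs it hosts. Smaller FB profiles allow more vGPUs per GPU, so each vGPU receives fewer channels and can exhaust them quickly, preventing efficient parallel processing and leading to application errors.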

  • Performance Optimization: Larger FB profiles provide the necessary memory bandwidth and capacity to handle complex workloads efficiently. With larger profiles, you can run fewer vGPUs simultaneously, but they receive more channels, making it less likely to encounter channel limitations and ensuring smooth and consistent performance.

  • Scalability: Investing in larger FB profiles not only meets current demands but also provides a buffer for future workload increases, reducing the need for frequent upgrades.

For more details on GPU channel calculations, see Understanding GPU Channels.
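The relationship between profile size and channels per vGPU can be illustrated with a rough model. The sketch below assumes a GPU's channels are split evenly across the maximum number of vGPUs that fit in equal-size mode; the real allocation rule is the one described in Understanding GPU Channels. `HYPOTHETICAL_TOTAL_CHANNELS` is a placeholder value, not a published specification, and `channels_per_vgpu` is a name invented for this example.

```python
def channels_per_vgpu(total_channels, gpu_memory_gb, profile_gb):
    """Evenly split a GPU's channels across the vGPUs that fit in equal-size mode."""
    max_vgpus = gpu_memory_gb // profile_gb
    if max_vgpus == 0:
        raise ValueError("profile is larger than the GPU's frame buffer")
    return total_channels // max_vgpus


# Placeholder channel count for illustration only; not a hardware specification.
HYPOTHETICAL_TOTAL_CHANNELS = 2048

if __name__ == "__main__":
    # 48 GB matches the L40S and A40 memory size listed in Table 2.
    for profile_gb in (4, 8, 16):
        share = channels_per_vgpu(HYPOTHETICAL_TOTAL_CHANNELS, 48, profile_gb)
        print(f"{profile_gb} GB profile on a 48 GB GPU: {share} channels per vGPU")
```

Whatever the true channel count is, the trend holds in this model: doubling the profile size halves the number of vGPUs and roughly doubles each vGPU's share of channels.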

Table 3 - Some Recommended Configurations

| Virtual GPU Type | Frame Buffer (GB) | Maximum vGPUs per GPU in Equal-Size Mode | Maximum vGPUs per GPU in Mixed-Size Mode | Use Case |
|---|---|---|---|---|
| A40-4Q | 4 | 12 | 8 | Virtual Workstations (vWS) |
| A40-8Q | 8 | 6 | 4 | Virtual Workstations (vWS) |
| L40S-8Q | 8 | 6 | 4 | Virtual Workstations (vWS) |
| L40S-16Q | 16 | 3 | 2 | Virtual Workstations (vWS) |

For more details on Equal-Size and Mixed-Size modes, see vGPU Profiles.
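The per-GPU limits in Table 3 translate directly into a first-pass GPU count for a given number of users. The following is a minimal sketch under that assumption only: it ignores CPU, system memory, licensing, and headroom, and the function name `gpus_needed` and the user counts are hypothetical.

```python
import math

# Maximum vGPUs per GPU, copied from Table 3: (equal-size mode, mixed-size mode).
TABLE_3 = {
    "A40-4Q": (12, 8),
    "A40-8Q": (6, 4),
    "L40S-8Q": (6, 4),
    "L40S-16Q": (3, 2),
}


def gpus_needed(profile, users, mixed_size=False):
    """Return the minimum number of GPUs needed to give each user one vGPU."""
    equal_max, mixed_max = TABLE_3[profile]
    per_gpu = mixed_max if mixed_size else equal_max
    return math.ceil(users / per_gpu)


if __name__ == "__main__":
    # 20 users is an arbitrary example workload.
    print(gpus_needed("A40-8Q", 20))                   # equal-size mode
    print(gpus_needed("A40-8Q", 20, mixed_size=True))  # mixed-size mode
```

Because mixed-size mode supports fewer vGPUs per GPU for the same profile, the same user count can require more GPUs than in equal-size mode.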

For detailed configurations and additional guidelines, please refer to the NVIDIA vGPU User Guide.

[2] For the L40S and A40 GPUs, 1GB profiles are not recommended. Instead, larger profile sizes are advised to fully utilize the capabilities of these GPUs. This is particularly important because these GPUs are designed to handle high-performance workloads that demand more substantial resources.

[3] NVIDIA A16 is recommended only for entry-level virtual workstations with lightweight users. A minimum 8 GB (8Q) profile is recommended when deploying NVIDIA RTX Virtual Workstations with A16.

© Copyright 2024, NVIDIA Corporation. Last updated on Oct 3, 2024.