Overview

This document provides insights into leveraging NVIDIA Virtual PC (vPC) for knowledge workers. It gives recommendations based on NVIDIA’s nVector knowledge worker benchmarking and covers common questions such as:

Which NVIDIA® GPU should I use for my business needs?
How do I select the right NVIDIA virtual GPU (vGPU) profile(s) for the types of users I will have?
What are the advantages of running NVIDIA vPC versus traditional CPU-only virtual desktop infrastructure (VDI)?

Knowledge worker workloads vary by user depending on many factors, including the applications used, file sizes, and the number of monitors and their resolution. The tests used NVIDIA nVector, a tool for capturing real-world metrics, as the framework for running a typical knowledge worker workload that simulates application workflows. Because the number of monitors and their resolution directly affect sizing, the testing behind this document covered multi-monitor setups at various screen resolutions. Tests were executed on CPU-only virtual machines (VMs) as well as VMs accelerated with NVIDIA vGPU.

Test your own workloads to determine which NVIDIA virtual GPU solution best meets your needs. The most successful customer deployments start with a proof of concept (POC) and are “tuned” throughout the lifecycle of the deployment. Beginning with a POC enables customers to understand the expectations and behavior of their users and to optimize their deployment for the best user density while maintaining required performance levels. Continued maintenance is essential because user behavior, and each individual’s role within the organization, can change over the course of a project. For example, a user who was once a light graphics user might become a heavy graphics user after changing teams or being assigned to a different project. Management and monitoring tools enable administrators and IT staff to optimize their deployment for each user.

About NVIDIA nVector Benchmark

NVIDIA’s performance engineering team developed a methodology and benchmarking tool that simulate a knowledge worker workflow at scale. The workflow exercises a representative set of commonly used software applications:

Microsoft Word
Microsoft Excel
Microsoft PowerPoint
Google Chrome (32-bit) web browser and video streaming
PDF document viewing

These applications perform various functions throughout the test, replicating an end user’s tasks. Microsoft Word, Excel, and PowerPoint create new content, modify existing content, and move content between applications. Tasks within these applications include scrolling, zooming, menu navigation, and PDF creation. Google Chrome streams live videos and visits interactive websites. Microsoft Edge acts as a PDF viewer.

When running the nVector Knowledge Worker workload at scale, nVector randomizes workloads across multiple virtual machines.

What is NVIDIA vPC?

The NVIDIA virtual PC (vPC) software edition enables the delivery of graphics-rich virtual desktops accelerated by NVIDIA GPUs. NVIDIA vPC allows sharing the same GPU across multiple virtual machines, delivering a native-PC experience to knowledge workers while improving user density. Because tasks typically done on the CPU are offloaded to the GPU, the user has a much better experience, and more users can be supported.

Virtual GPU profiles determine the amount of frame buffer allocated to your virtual machine. The vGPU profiles supported on NVIDIA GPUs with NVIDIA vPC software are 1B (with 1024 MB of frame buffer) and 2B (with 2048 MB of frame buffer). The 1B profile typically supports single and dual HD configurations and is the usual choice when density is the priority. The 2B profile suits multi-monitor setups and higher-resolution configurations. Using NVIDIA’s nVector testing framework, we conducted extensive testing on various configurations with both profiles to give IT admins an idea of what to expect when they scale within their own environments. Because users work in applications with varying levels of utilization, we recommend running a POC with your own workload and comparing it with the test results in this document.
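The profile guidance above can be expressed as a simple rule of thumb. The following is a minimal sketch, not an NVIDIA tool: the function name `suggest_vpc_profile` and the exact cutoffs (at most two monitors, each no larger than 1920x1080) are assumptions derived from the phrase "single and dual HD configurations". A POC with your own workload should make the final call.

```python
HD_PIXELS = 1920 * 1080  # pixel count of one HD (1080p) display


def suggest_vpc_profile(monitors: int, width: int, height: int) -> str:
    """Return a starting-point vPC profile name ("1B" or "2B").

    Assumed rule of thumb: one or two displays at HD resolution or lower
    start on 1B (1024 MB); anything larger starts on 2B (2048 MB).
    """
    if monitors < 1 or width < 1 or height < 1:
        raise ValueError("monitors, width, and height must be positive")
    if monitors <= 2 and width * height <= HD_PIXELS:
        return "1B"
    return "2B"


print(suggest_vpc_profile(2, 1920, 1080))  # dual HD
print(suggest_vpc_profile(1, 3840, 2160))  # single 4K
```

Treat the result as a first guess for POC planning only; application mix and file sizes can push a dual-HD user onto 2B.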

⁴ Support starts with the NVIDIA virtual GPU software March 2018 release (version 6.0).

¹³ 5K resolution support starts with the NVIDIA virtual GPU software December 2019 release (10.0).

NVIDIA vPC delivers an engaging user experience for the digital workplace. Users can be most productive using modern applications and work the way they want, from anywhere. Delivering up to 50% better performance¹ than CPU-only VDI, NVIDIA vPC combined with the NVIDIA A16 enables IT departments to cost-effectively scale virtualization to every user with performance that rivals a physical PC.

Recommended NVIDIA GPU for NVIDIA vPC

Density-optimized GPUs are typically recommended for virtual desktop users running office productivity applications, streaming video, Windows 10, and Windows 11. They are designed to maximize the number of VDI users supported in a server.

Specification	NVIDIA A16
# GPUs / Boards [Architecture]	4 [Ampere]
RT Cores	40 [4 x 10 per GPU]
Memory Size	64 GB GDDR6 [4 x 16 GB per GPU]
Form Factor	PCIe 4.0 Dual Slot FHFL
Power	250W
Thermal	Passive
Optimized for	Density
1B Users per Board	64

The NVIDIA A16 is based on the NVIDIA Ampere™ architecture. The A16 is a 64 GB (four GPUs with 16 GB each) dual-slot FHFL board that draws up to 250 W and is passively cooled.
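The "1B Users per Board" figure follows from the frame buffer arithmetic. The sketch below is illustrative only: `users_per_board` is a hypothetical helper, and it assumes each vGPU is carved from a single physical GPU's frame buffer and ignores any memory lost to ECC or other overhead, so it gives an upper bound rather than a guaranteed density.

```python
def users_per_board(gpus_per_board: int, memory_per_gpu_mb: int, profile_mb: int) -> int:
    """Frame-buffer-only upper bound on vGPU users for one board."""
    if gpus_per_board < 1 or memory_per_gpu_mb < 1 or profile_mb < 1:
        raise ValueError("all arguments must be positive")
    # Divide per GPU first: leftover memory on one GPU cannot be
    # combined with leftover memory on another (assumption stated above).
    users_per_gpu = memory_per_gpu_mb // profile_mb
    return gpus_per_board * users_per_gpu


A16_GPUS = 4                 # from the specification table
A16_MB_PER_GPU = 16 * 1024   # 16 GB per GPU

print(users_per_board(A16_GPUS, A16_MB_PER_GPU, 1024))  # 1B profile -> 64
print(users_per_board(A16_GPUS, A16_MB_PER_GPU, 2048))  # 2B profile -> 32
```

The 1B result matches the 64 users per board listed in the table. Real density also depends on CPU, system memory, and the performance your users require, which is why a POC is still needed.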

The A16 provides the best value for knowledge worker deployments. It also lets IT departments maximize data center resources by running virtual workstations, deep learning inferencing, rendering, and other graphics- and compute-intensive workloads on the same data center infrastructure. This ability to run mixed workloads can increase productivity by keeping your data center busy day and night. For example, IT can allocate resources to knowledge workers by day and run rendering and compute workloads at night. This approach maximizes utilization and reduces costs in the data center. For additional information about the A16, see the A16 product features.

Note

To run mixed workloads with the A16, refer to the NVIDIA Virtual GPU Software Packaging, Pricing, and Licensing Guide for the appropriate software licenses.

All current GPUs support ECC memory, and it is enabled by default, which reduces the amount of usable VRAM compared with what is available when ECC is disabled. A physical GPU (that is, one not running vGPU) sees the same VRAM reduction when ECC is enabled. It is essential to resize your environment when switching from Maxwell, Pascal, and Turing GPUs to newer GPUs like the A16. Additional information can be found here.

Note

To maximize available VRAM on newer GPUs like the A16, ensure ECC memory is disabled. This step is crucial when transitioning from Maxwell, Pascal, and Turing architectures.
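One way to verify the ECC state before sizing is to query it with `nvidia-smi`. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH of the GPU host and reports one line per GPU for the `ecc.mode.current` query field. The helper names are hypothetical, and nothing here changes the ECC setting.

```python
import subprocess


def parse_ecc_modes(csv_output: str) -> list[str]:
    """Split nvidia-smi CSV output (no header) into one mode string per GPU."""
    return [line.strip() for line in csv_output.splitlines() if line.strip()]


def gpus_with_ecc_enabled(modes: list[str]) -> list[int]:
    """Return the indices of GPUs whose current ECC mode is 'Enabled'."""
    return [index for index, mode in enumerate(modes) if mode == "Enabled"]


def query_ecc_modes() -> list[str]:
    """Ask nvidia-smi for the current ECC mode of every GPU on this host."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=ecc.mode.current", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if nvidia-smi fails
    )
    return parse_ecc_modes(result.stdout)


# Example with canned output, so the sketch runs without a GPU:
sample = "Enabled\nDisabled\nEnabled\nEnabled\n"
print(gpus_with_ecc_enabled(parse_ecc_modes(sample)))  # [0, 2, 3]
```

On a real host, call `gpus_with_ecc_enabled(query_ecc_modes())`. ECC can then be turned off with `nvidia-smi -e 0`, which typically requires administrator rights and a reboot or GPU reset to take effect; confirm the procedure for your hypervisor in the vGPU documentation.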

NVIDIA vPC VDI Cost per User

The figure assumes an estimated street price of the GPU plus the cost of NVIDIA vPC software with a four-year subscription, divided by the number of users. NVIDIA recommends the A16 for vPC, highlighting its optimization for density and cost-effectiveness per user. However, in scenarios demanding more graphics-intensive tasks like ray tracing, and where scaling is not a focus, the L40 or L4 may be more suitable. The L40, equipped with 142 RT Cores per GPU, or the L4, with 60 RT Cores per GPU, could deliver superior performance in such situations.
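The cost-per-user calculation described above can be sketched as follows. The function `cost_per_user` is a hypothetical helper, and the dollar amounts in the example are placeholders chosen for easy arithmetic, not NVIDIA prices. The sketch also assumes the software cost is entered as the total four-year subscription cost for all users on the board.

```python
def cost_per_user(gpu_street_price: float, vpc_software_cost_4yr: float, users: int) -> float:
    """(GPU street price + four-year vPC software cost) / number of users."""
    if users < 1:
        raise ValueError("users must be at least 1")
    return (gpu_street_price + vpc_software_cost_4yr) / users


# Placeholder inputs only; substitute quotes from your own reseller.
example = cost_per_user(
    gpu_street_price=3000.0,        # hypothetical board price
    vpc_software_cost_4yr=6400.0,   # hypothetical total for 64 users
    users=64,                       # 1B users per A16 board
)
print(example)  # 146.875
```

Because the GPU price is spread across every user on the board, higher density lowers the hardware share of the per-user cost, which is the reasoning behind recommending a density-optimized GPU for vPC.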

The complete list of NVIDIA GPUs that support vPC can be found here.

¹ Performance measured in frames per second using the NVIDIA nVector benchmark with knowledge worker workloads (Excel, Word, PowerPoint, Chrome, Media Player, PDF) on dual 1920x1080 displays with NVIDIA vPC (vGPU 13.0) and NVIDIA A16-1B.