Overview

NVIDIA virtual GPU (vGPU) allows multiple virtual machines (VMs) to have simultaneous, direct access to a single physical GPU using the same NVIDIA graphics drivers deployed on nonvirtualized operating systems. It can also aggregate multiple GPUs and allocate them to a single virtual machine to power the most demanding workloads. Sharing a GPU among multiple workloads gives VMs strong graphics performance and application compatibility while improving cost-effectiveness and scalability.

This section covers how NVIDIA vGPU solutions change desktop virtualization and make it practical for users and applications at every level of complexity and graphics requirement. It also describes the NVIDIA vGPU architecture, the GPUs recommended for virtualization, the three virtual GPU software editions, and key standards supported by NVIDIA virtual GPU technology.

Why NVIDIA vGPU?

The promise of desktop virtualization, realized for server workloads years ago, is flexibility and manageability. Because of cost considerations, desktop virtualization was historically used only where flexibility and security were the primary drivers. Over the years, falling technology costs, along with advances in storage and multi-core processors, have reduced the total cost of ownership of desktop virtualization to a reasonable and advantageous level.

The biggest challenge for desktop virtualization is providing a cost-effective yet rich user experience. Earlier attempts to solve this problem relied on software graphics or shared GPU technologies, but those technologies do not support the rich applications needed to ensure end-user adoption. Dedicated GPU pass-through, by contrast, provides 100% application compatibility, but only for the highest-end use cases, because of its high cost and the limited density of virtual machines per host server.

Without scalable, sharable, and cost-effective per-user GPUs that provide 100% application compatibility, delivering a cost-effective, rich user experience has been challenging for broad use cases in desktop virtualization. Meanwhile, high-end 3D applications either did not work in a virtualized environment or were so expensive to implement with pass-through that they were reserved for only the most limited circumstances.

This is no longer true, thanks to the NVIDIA vGPU solution combined with Citrix Virtual Apps and Desktops. NVIDIA vGPU technology allows multiple virtual desktops or applications to share a single physical GPU, which may reside on a single PCI card. This approach delivers the 100% application compatibility of dedicated GPU pass-through graphics at a lower cost, because multiple virtual session hosts can share a single graphics card, providing a rich yet more cost-effective user experience. With Citrix Virtual Apps and Desktops, you can more efficiently centralize, pool, and manage traditionally complex and expensive distributed workstations and desktops. This allows all your user groups to benefit fully from the advantages of virtualization.

NVIDIA vGPU Architecture

Figure 1.1 illustrates the high-level architecture of an NVIDIA virtual GPU. NVIDIA GPUs are installed in the server, and the NVIDIA Virtual GPU Manager software is installed on the host server. This software facilitates the sharing of a single GPU among multiple VMs. Alternatively, vGPU technology allows a single VM to use multiple vGPUs from one or more physical GPUs.

Physical NVIDIA GPUs can support multiple virtual GPUs (vGPUs), which are allocated directly to guest VMs under the control of the NVIDIA Virtual GPU Manager running in the hypervisor. Guest VMs use NVIDIA vGPUs in the same way they would use a physical GPU that the hypervisor has passed through directly.

Figure 1.1 - NVIDIA vGPU Solution Architecture

NVIDIA vGPUs are comparable to conventional GPUs in that they have a fixed amount of GPU memory and one or more virtual display outputs or heads. Multiple heads support multiple displays. Managed by the NVIDIA vGPU Manager installed in the hypervisor, the vGPU memory is allocated out of the physical GPU frame buffer when the vGPU is created. The vGPU retains exclusive use of that GPU memory until it is destroyed.
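Because each vGPU holds a fixed slice of the physical frame buffer for its whole lifetime, the number of same-size vGPUs that can fit on one GPU is bounded by simple division. The sketch below illustrates only that arithmetic. The 24 GB frame buffer and the profile sizes are hypothetical values chosen for illustration, and the function name is not part of any NVIDIA tool. The vGPU types and maximum counts actually supported on a given GPU are defined by NVIDIA and may be lower than this bound.

```python
def max_vgpus_per_gpu(frame_buffer_mb: int, profile_mb: int) -> int:
    """Upper bound on same-size vGPUs that fit in one GPU's frame buffer.

    Assumes every vGPU reserves exactly profile_mb for its lifetime and
    that the whole frame buffer is available for vGPUs (an idealization).
    """
    if frame_buffer_mb <= 0 or profile_mb <= 0:
        raise ValueError("sizes must be positive")
    return frame_buffer_mb // profile_mb


# Hypothetical 24 GB physical GPU, expressed in MB.
FRAME_BUFFER_MB = 24 * 1024

# Hypothetical per-vGPU frame buffer sizes in GB.
for profile_gb in (1, 2, 4, 8):
    count = max_vgpus_per_gpu(FRAME_BUFFER_MB, profile_gb * 1024)
    print(f"{profile_gb} GB per vGPU -> at most {count} vGPUs")
```

Larger per-vGPU frame buffers trade user density for per-user capacity, which is the sizing decision this bound makes visible.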

All vGPUs resident on a physical GPU share access to the GPU's engines, including the graphics (3D), video decode, and video encode engines. A VM's guest OS uses direct access to the GPU for performance-critical fast paths. Management operations that are not performance-critical use a paravirtualized interface to the NVIDIA Virtual GPU Manager.

NVIDIA vGPU Software Licensed Products

NVIDIA virtual GPU software divides NVIDIA GPU resources so the GPU can be shared across multiple virtual machines running any application.

The portfolio of NVIDIA virtual GPU software products for desktop virtualization is as follows:

NVIDIA RTX Virtual Workstation (vWS)
NVIDIA Virtual PC (NVIDIA vPC)
NVIDIA Virtual Apps (NVIDIA vApps)

To run these software products, you need an NVIDIA GPU and a software license that addresses your specific use case. With Citrix, use Citrix Virtual Apps for NVIDIA Virtual Apps (NVIDIA vApps), and Citrix Virtual Desktops for NVIDIA Virtual PC (NVIDIA vPC) and NVIDIA RTX Virtual Workstation (vWS).

For further details on vGPU licensing, please refer to the vGPU Client Licensing User Guide.

Supported Graphics Protocols

This version of NVIDIA vGPU software includes support for:

Full DirectX 11, DirectX 12, Direct2D, and DirectX Video Acceleration (DXVA)
OpenGL 4.6
NVIDIA vGPU software SDK (remote graphics acceleration)
Vulkan 1.3
NVIDIA RTX (on GPUs based on the NVIDIA Volta GPU architecture and later architectures)
OpenCL and CUDA applications without Unified Memory, on supported GPUs. For details, see NVIDIA CUDA Toolkit and OpenCL Support on NVIDIA vGPU Software.

Note

Unified Memory and CUDA tools are NOT supported on NVIDIA vGPU.
The graphics APIs listed above are backward compatible: older versions of each API are also supported.

Before You Begin

This section describes the general prerequisites and preparatory steps that must be completed before deployment.

Note

This deployment guide assumes you are building a proof-of-concept environment, not a production deployment. As a result, the choices made here are intended to speed up and simplify the process. Before building your production environment, see the corresponding guides for each technology and make choices appropriate for your needs.

Server BIOS Settings

Configure the BIOS as appropriate for your physical hosts, as described below:

Hyperthreading - Enabled
Power Setting or System Profile - High Performance
CPU Performance (if applicable) - Enterprise or High Throughput
Memory Mapped I/O above 4 GB (if applicable) - Enabled
VT-d or AMD IOMMU - Enabled
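When preparing several hosts, it can help to record each host's BIOS values and compare them against the list above. The following is a minimal sketch of such a comparison. It does not read BIOS settings from hardware: the observed values are typed in by hand, the setting names are labels taken from the list rather than any vendor's BIOS menu names, and `check_bios` is a hypothetical helper. Settings marked "if applicable" are treated as optional.

```python
# Recommended values from the list above.
# Each entry maps a label to (accepted values, required?).
RECOMMENDED = {
    "Hyperthreading": ({"Enabled"}, True),
    "Power Setting or System Profile": ({"High Performance"}, True),
    "CPU Performance": ({"Enterprise", "High Throughput"}, False),
    "Memory Mapped I/O above 4 GB": ({"Enabled"}, False),
    "VT-d or AMD IOMMU": ({"Enabled"}, True),
}


def check_bios(observed: dict) -> list:
    """Return a list of human-readable mismatches (empty if none)."""
    problems = []
    for name, (accepted, required) in RECOMMENDED.items():
        value = observed.get(name)
        if value is None:
            # "If applicable" settings may be absent on some platforms.
            if required:
                problems.append(f"{name}: not recorded")
            continue
        if value not in accepted:
            expected = " or ".join(sorted(accepted))
            problems.append(f"{name}: found {value!r}, expected {expected}")
    return problems


# Hypothetical values noted down from one host's BIOS setup screen.
host_settings = {
    "Hyperthreading": "Enabled",
    "Power Setting or System Profile": "Balanced",
    "VT-d or AMD IOMMU": "Enabled",
}

for problem in check_bios(host_settings):
    print(problem)
```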

Citrix GPU Utilization Patch

KB4586839 addresses an issue with incorrect Canonical Display Driver (CDD) buffer flushing, which degrades performance in Remote Desktop Protocol (RDP) Windows 2000 Display Driver Model (XDDM) scenarios. This issue affects applications that use graphics processing units (GPUs) to operate, such as Microsoft Teams, Microsoft Office, and web browsers.

Server 2019 – KB4586839
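One way to see whether the update is recorded on a Windows Server 2019 machine is to list the installed hotfix IDs with the PowerShell `Get-HotFix` cmdlet and look for the KB number. The sketch below wraps that in Python. The helper names are hypothetical, `query_hotfixes` only works on Windows with PowerShell available, and the sample text is made up for illustration. A missing entry is not proof that the fix is absent, since a later cumulative update may already include it.

```python
import subprocess

KB_ID = "KB4586839"


def hotfix_listed(hotfix_ids: str, kb_id: str = KB_ID) -> bool:
    """Return True if kb_id appears as a whole token in the given text."""
    tokens = {token.upper() for token in hotfix_ids.split()}
    return kb_id.upper() in tokens


def query_hotfixes() -> str:
    """Return installed hotfix IDs, one per line (Windows only)."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", "(Get-HotFix).HotFixID"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Made-up sample text standing in for the output of query_hotfixes().
sample_output = "KB0000001\nKB4586839\nKB0000002\n"

if hotfix_listed(sample_output):
    print(f"{KB_ID} is listed")
else:
    print(f"{KB_ID} is not listed; check for a superseding cumulative update")
```

On the target machine, replace `sample_output` with the result of `query_hotfixes()`.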