Azure Stack HCI (ASHCI) Deployment Guide (Latest)

Overview

Azure Stack HCI is a powerful hyper-converged infrastructure solution that combines on-premises infrastructure with cloud services, providing a comprehensive and flexible environment for deploying virtualized workloads. This guide explores the integration of NVIDIA virtual GPU (vGPU) software within Azure Stack HCI, enabling multiple virtual machines (VMs) to have simultaneous, direct access to a single physical GPU. This capability is essential for enhancing graphics performance, ensuring application compatibility, and optimizing cost-effectiveness in environments requiring GPU acceleration.

This chapter describes how NVIDIA vGPU solutions change the landscape of desktop virtualization and GPU-accelerated servers, enabling these deployments to run workloads across a wide range of complexity and graphics requirements. It also covers the NVIDIA vGPU architecture, the NVIDIA GPUs recommended for virtualization, the NVIDIA vGPU software licensed products for desktop virtualization, and the key standards supported by NVIDIA virtual GPU technology.

The promise of desktop and data center virtualization lies in its flexibility and manageability. Initially driven by the need for flexibility and security, desktop and data center virtualization has become more accessible due to the democratization of technology, which has significantly reduced costs. This has expanded market accessibility and driven growth, with NVIDIA playing a key role as a facilitator. Advances in storage and multi-core processors further enhance the competitive advantage regarding the total cost of ownership.

One of the biggest challenges in desktop virtualization is providing a cost-effective yet rich user experience.

NVIDIA virtual GPU (vGPU) software addresses this challenge by enabling powerful GPU performance for a wide range of workloads, from graphics-rich virtual workstations to data science and AI. This allows IT to harness the management and security benefits of virtualization alongside the performance of NVIDIA GPUs needed for modern workloads. Installed on a physical GPU in a cloud or enterprise data center server, NVIDIA vGPU software creates virtual GPUs that can be shared across multiple virtual machines and accessed from any device, anywhere.

The high-level architecture of an NVIDIA virtual GPU-enabled VDI environment is illustrated in Figure 1.1 below. The software enables multiple VMs to share a single GPU and, if there are multiple GPUs in the server, a single VM to access multiple GPUs. This GPU-enabled environment provides unprecedented performance and enables support for more users on a server, because work typically done by the CPU, such as graphics rendering and video encoding, can now be offloaded to the GPU. Physical NVIDIA GPUs can support multiple virtual GPUs (vGPUs) that are assigned directly to guest VMs under the control of NVIDIA’s Virtual GPU Manager running in the hypervisor.

Guest VMs use the NVIDIA GPUs in the same manner as a physical GPU passed through by the hypervisor, with an NVIDIA GPU driver installed directly in the guest VM.

Figure 1.1 - NVIDIA vGPU Architecture
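As a quick check of the arrangement shown in Figure 1.1, the following Python sketch queries the vGPU that the guest driver exposes inside a VM. It is an illustrative example, not an official deployment step: it assumes the NVIDIA guest driver is already installed in the VM and that nvidia-smi is available on the guest's PATH.

# Illustrative check: query the vGPU visible inside a guest VM via nvidia-smi.
# Assumes the NVIDIA guest driver is installed and nvidia-smi is on the PATH.
import subprocess

def query_guest_vgpu() -> str:
    """Return the GPU name, driver version, and frame buffer size the guest sees."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=name,driver_version,memory.total",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    # Example (illustrative) output: NVIDIA A16-8Q, 551.61, 8192 MiB
    print(query_guest_vgpu())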

NVIDIA vGPUs are comparable to conventional GPUs in that they have a fixed amount of GPU memory, but they feature virtual displays instead of physical ports. Managed by the NVIDIA Virtual GPU Manager installed in the hypervisor, the vGPU memory is allocated out of the physical GPU frame buffer when the vGPU is created, and the vGPU retains exclusive use of that memory until it is destroyed.

Note

These are virtual displays, meaning there is no physical connection point for external displays on the GPU.
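To make the fixed frame-buffer allocation concrete, the short sketch below computes how many vGPUs of a given size fit on one physical GPU. The 16 GiB frame buffer and the profile names and sizes are assumed, illustrative values only; the supported profiles for a specific GPU are listed in the NVIDIA vGPU documentation.

# Illustrative sizing arithmetic for fixed frame-buffer partitioning.
# The frame-buffer size and profile sizes below are assumptions, not a
# statement about any particular GPU or vGPU profile.
PHYSICAL_FRAME_BUFFER_MIB = 16 * 1024                      # one physical GPU
PROFILE_SIZES_MIB = {"1Q": 1024, "2Q": 2048, "4Q": 4096, "8Q": 8192}

for profile, size_mib in PROFILE_SIZES_MIB.items():
    # Each vGPU receives an exclusive, fixed slice of the frame buffer,
    # so the count per GPU is the integer quotient.
    vgpus_per_gpu = PHYSICAL_FRAME_BUFFER_MIB // size_mib
    print(f"{profile}: {vgpus_per_gpu} vGPUs per physical GPU")

Because each vGPU holds its frame-buffer slice exclusively until it is destroyed, the per-profile quotient above is also the maximum number of concurrently running vGPUs of that size on the GPU.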

All vGPUs that reside on a physical GPU share GPU engines, including the graphics (3D) and video decode and encode engines. The right side of Figure 1.1 shows the vGPU internal architecture. The VM’s guest OS leverages direct access to the GPU for performance and fast critical paths. Noncritical performance management operations use a para-virtualization interface to the NVIDIA Virtual GPU Manager.
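This sharing can be observed from the host. The snippet below, again illustrative rather than prescriptive, calls nvidia-smi in its vgpu query mode to print a per-vGPU report; it assumes the NVIDIA Virtual GPU Manager is installed on the Azure Stack HCI host and that the host's nvidia-smi build provides the vgpu subcommand.

# Illustrative host-side view of the vGPUs sharing a physical GPU.
# "nvidia-smi vgpu -q" is the detailed query mode: it prints one section per
# active vGPU, including the physical GPU it resides on and its utilization.
import subprocess

report = subprocess.run(
    ["nvidia-smi", "vgpu", "-q"],
    capture_output=True,
    text=True,
    check=True,
)
print(report.stdout)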

© Copyright 2024, NVIDIA Corporation. Last updated on Aug 2, 2024.