Date: 2025-10-15

Figure 1 illustrates the high-level architecture of NVIDIA virtual GPU technology. NVIDIA GPUs are installed in the server, and the NVIDIA Virtual GPU Manager software is installed on the host server. This software lets multiple VMs share a single physical GPU. Conversely, a single VM can use multiple vGPUs drawn from one or more physical GPUs.

A physical NVIDIA GPU can support multiple virtual GPUs (vGPUs), which the NVIDIA Virtual GPU Manager running in the hypervisor assigns directly to guest VMs. Guest VMs use NVIDIA vGPUs in the same way they would use a physical GPU that the hypervisor has passed through.

Figure 1. NVIDIA vGPU System Architecture

In NVIDIA vGPU deployments, the appropriate vGPU license is determined by the vGPU profile assigned to each VM. Each NVIDIA vGPU behaves like a conventional GPU: it has a fixed amount of GPU memory and one or more virtual display outputs, or heads, so a vGPU with multiple heads can drive multiple displays. The NVIDIA Virtual GPU Manager installed in the hypervisor allocates each vGPU's memory from the physical GPU's frame buffer when the vGPU is created, and the vGPU retains exclusive use of that memory until it is destroyed.
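The allocation rule described above (a fixed slice of frame buffer reserved at creation and held exclusively until the vGPU is destroyed) can be illustrated with a small toy model. This is only a sketch: `PhysicalGPU`, `create_vgpu`, `destroy_vgpu`, and the sizes used are hypothetical names and values chosen for illustration, not part of any NVIDIA API, and the model ignores any constraints a real deployment places on which profiles can coexist on one GPU.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PhysicalGPU:
    """Toy stand-in for one physical GPU (hypothetical, not an NVIDIA API)."""
    name: str
    frame_buffer_mb: int
    # Maps a vGPU identifier to the frame buffer (in MB) reserved for it.
    reservations: Dict[str, int] = field(default_factory=dict)

    @property
    def free_mb(self) -> int:
        """Frame buffer not currently reserved by any vGPU."""
        return self.frame_buffer_mb - sum(self.reservations.values())

    def create_vgpu(self, vgpu_id: str, size_mb: int) -> None:
        """Reserve a fixed slice of frame buffer at creation time."""
        if vgpu_id in self.reservations:
            raise ValueError(f"{vgpu_id} already exists on {self.name}")
        if size_mb > self.free_mb:
            raise MemoryError(
                f"{self.name}: requested {size_mb} MB, only {self.free_mb} MB free"
            )
        self.reservations[vgpu_id] = size_mb

    def destroy_vgpu(self, vgpu_id: str) -> None:
        """Release the slice; it is never shared or resized before this."""
        del self.reservations[vgpu_id]


# Two VMs share one physical GPU, each with its own exclusive slice.
gpu = PhysicalGPU("gpu0", frame_buffer_mb=16384)
gpu.create_vgpu("vm1-vgpu0", 4096)
gpu.create_vgpu("vm2-vgpu0", 4096)
print(gpu.free_mb)  # 8192

# Memory returns to the pool only when a vGPU is destroyed.
gpu.destroy_vgpu("vm1-vgpu0")
print(gpu.free_mb)  # 12288
```

The point of the model is that a vGPU's memory is a static reservation rather than a dynamically shared pool: a request that exceeds the remaining frame buffer fails at creation instead of taking memory from an existing vGPU.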

All vGPUs that share a physical GPU have access to its engines, including the graphics (3D), video decode, and video encode engines. A VM's guest OS accesses the GPU directly on performance-critical paths, while non-critical management operations go through a para-virtualized interface to the NVIDIA Virtual GPU Manager.