NVIDIA RTX vWS: Sizing and GPU Selection Guide for Virtualized Workloads

Performance Metrics

The previous chapter introduced the tools used to capture critical performance metrics, which are detailed in the following sections. It is crucial to gather these metrics during your Proof of Concept (POC) and at regular intervals in production to optimize VDI delivery.

In a VDI environment, performance metrics are categorized into two tiers: server-level and VM-level. Each tier has distinct metrics that must be validated to ensure optimal performance and scalability.

As discussed in the previous chapter, both the GPU Profiler and VMware Aria Operations are invaluable tools for monitoring resource usage metrics within VMs. The upcoming sections detail these metrics, essential for conducting a POC or monitoring an existing deployment to identify and address potential performance bottlenecks effectively.

Framebuffer Usage

In a virtualized environment, the frame buffer represents the amount of vGPU memory available to the guest operating system. A good rule of thumb is that a VM’s frame buffer usage should not frequently exceed 90% or average over 70%. If high utilization is noted, the vGPU-backed VM is more prone to suboptimal user experiences, including performance degradation and potential crashes. Given the varied user interactions and workflows in software applications, conducting a POC with your specific workload is recommended to determine appropriate frame buffer thresholds for your environment.
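The rule of thumb above can be checked mechanically against sampled usage data. The sketch below is a minimal illustration, assuming frame buffer usage percentages have already been collected (for example, from periodic `nvidia-smi --query-gpu=memory.used,memory.total --format=csv` queries inside the VM); the function name and return fields are our own, not part of any NVIDIA tool.

```python
def check_framebuffer(samples_pct):
    """Flag a VM whose frame buffer usage breaks the rule of thumb:
    peaks above 90% or an average above 70% over the sample window."""
    peak = max(samples_pct)
    avg = sum(samples_pct) / len(samples_pct)
    return {
        "peak_pct": peak,
        "avg_pct": avg,
        "undersized": peak > 90 or avg > 70,
    }

# Example: frame buffer usage (%) sampled over a work session
print(check_framebuffer([55, 62, 71, 68, 93]))
```

In practice the sampling interval matters: brief spikes during scene loads are normal, so the averages should be taken over a representative work session, not a few seconds.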

vCPU Usage

When deploying NVIDIA RTX vWS, monitoring vCPU usage is equally critical alongside vGPU frame buffer utilization. As all workloads depend on CPU resources, ensuring vCPU usage doesn’t become a bottleneck is essential for maintaining optimal performance. Even when processes are accelerated using vGPU, vCPU resources are still integral to their operation. Thus, balancing and monitoring both vGPU and vCPU resources is key to optimizing system performance.
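Because both resources must be watched together, a simple triage helper can make the balance explicit. The sketch below is hypothetical: the frame buffer limit follows the 70% average guideline above, while the vCPU limit of 85% is an illustrative assumption, not NVIDIA guidance.

```python
def classify_bottleneck(avg_fb_pct, avg_vcpu_pct,
                        fb_limit=70, vcpu_limit=85):
    """Rough triage: which resource, if any, is likely constraining
    the VM? vcpu_limit is an assumed threshold for illustration."""
    findings = []
    if avg_fb_pct > fb_limit:
        findings.append("frame buffer")
    if avg_vcpu_pct > vcpu_limit:
        findings.append("vCPU")
    return findings or ["none"]

# A VM averaging 75% frame buffer and 90% vCPU is constrained on both
print(classify_bottleneck(75, 90))
```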

Video Encode/Decode

NVIDIA GPUs feature hardware-based encoders and decoders, specifically:

  • NVENC (NVIDIA Video Encoder): This hardware-accelerated encoder offloads computationally intensive video encoding tasks from the CPU to the GPU, significantly improving performance and efficiency.

  • NVDEC (NVIDIA Video Decoder): This hardware-accelerated decoder provides fast real-time decoding for various video codecs, enhancing video playback performance by reducing CPU load.

Metrics for encoder and decoder usage can be captured whenever these NVIDIA hardware components are actively utilized. The Video Encoder Usage metric measures how heavily the remoting protocol or application exercises the GPU’s encoder, which is crucial for monitoring performance in virtualized environments.
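As a sketch of how these counters might be collected, nvidia-smi can report per-GPU encoder and decoder utilization via `--query-gpu=utilization.encoder,utilization.decoder --format=csv`. The parser below works on captured CSV output; the sample string is invented for illustration.

```python
import csv
import io

# Invented sample of: nvidia-smi
#   --query-gpu=utilization.encoder,utilization.decoder --format=csv
SAMPLE = """\
utilization.encoder [%], utilization.decoder [%]
34 %, 12 %
"""

def parse_enc_dec(text):
    """Turn nvidia-smi CSV output into a list of per-GPU dicts."""
    rows = list(csv.reader(io.StringIO(text)))
    out = []
    for row in rows[1:]:  # skip the header row
        enc, dec = (int(field.strip().rstrip(" %")) for field in row)
        out.append({"encoder_pct": enc, "decoder_pct": dec})
    return out

print(parse_enc_dec(SAMPLE))  # [{'encoder_pct': 34, 'decoder_pct': 12}]
```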

In the previous chapter, we introduced the NVIDIA System Management Interface (nvidia-smi) and VMware esxtop as valuable tools for monitoring resource usage metrics on a physical host. The upcoming sections delve into these metrics, essential for conducting a POC or maintaining an operational deployment to identify and address performance bottlenecks effectively.

CPU Core Utilization

VMware’s esxtop utility monitors essential physical host state information for each CPU processor. The % Total CPU Core Utilization metric is crucial for analyzing and maintaining optimal VM performance. As previously noted, every process within a VM runs on a vCPU, utilizing physical cores on the host for execution. When host threads are fully utilized, processes in a VM may bottleneck, leading to considerable performance degradation.
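To illustrate the host-side check, the sketch below assumes per-core utilization percentages have already been exported (for example, from esxtop batch mode) and flags sustained saturation; both thresholds are illustrative assumptions.

```python
def host_cpu_saturated(core_util_pct, threshold=90, min_fraction=0.5):
    """Illustrative check: report saturation when at least half of the
    physical cores sit above the threshold, a state in which processes
    inside VMs start to queue for CPU time."""
    hot = sum(1 for u in core_util_pct if u > threshold)
    return hot / len(core_util_pct) >= min_fraction

# Six of eight cores above 90%: the host is likely a bottleneck
print(host_cpu_saturated([95, 97, 88, 92, 60, 99, 94, 91]))  # True
```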

GPU Utilization

The NVIDIA System Management Interface (nvidia-smi) monitors GPU utilization rates, indicating the workload each GPU handles over time. It provides insights into how extensively vGPU-backed VMs utilize NVIDIA GPUs on the host server.
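One common way to capture this over time is a looped query such as `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits -l 1`. The summary sketch below condenses such a sampled trace into sizing figures; the sample values are invented.

```python
def summarize_gpu_util(samples_pct):
    """Condense a sampled GPU-utilization trace into the figures
    typically used for sizing: average load and the fraction of the
    window during which the GPU was doing any work at all."""
    avg = sum(samples_pct) / len(samples_pct)
    busy = sum(1 for s in samples_pct if s > 0) / len(samples_pct)
    return {"avg_pct": round(avg, 1), "busy_fraction": round(busy, 2)}

# Invented one-second samples of utilization.gpu for one GPU
print(summarize_gpu_util([0, 10, 80, 95, 40, 0]))
```

A low average with high peaks suggests bursty, interactive work that shares a GPU well; a consistently high average suggests the GPU is sized at or near its limit for the current VM density.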

© Copyright 2024, NVIDIA Corporation. Last updated on Oct 3, 2024.