Typical VDI deployments have two conflicting goals: achieving the best possible user experience and maximizing user density on server hardware. Problems can arise as density is scaled up, because beyond a certain point higher density degrades the user experience. Scalability testing used nVector to execute tests at scale on 64 and 128 VMs using dual HD (1920x1080) monitors. Capacity planning for the server often depends on both server resource utilization metrics and user experience. This testing phase examined both, and the following sections summarize their importance and how to analyze these metrics. The GPU Profiler is a commonly used tool that can quickly capture resource utilization while a workload is being executed on a virtual machine. More information on the GPU Profiler is available in this section.

The utilization of the GPU compute engine, the frame buffer, the encoder, and the decoder can all be monitored and logged through the NVIDIA System Management Interface (nvidia-smi), a command-line tool. In addition, NVIDIA vGPU metrics are integrated through management packs like VMware vRealize Operations. For our testing purposes, nVector automated the capture of the following server metrics. It is strongly advised to test your specific workloads during a proof of concept (POC). You can run nvidia-smi commands on the hypervisor to monitor the utilization of the physical GPU. Please refer to Deployment Best Practices for further syntax information.
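As a rough illustration of logging these metrics, the sketch below parses CSV output of the kind produced by an nvidia-smi query and derives frame buffer usage as a percentage. The query fields shown are standard nvidia-smi fields, but the sample rows, the helper name `parse_gpu_samples`, and the derived `fb_used_pct` key are assumptions made for this example. This is not how nVector captures its metrics.

```python
import csv
import io

# Order must match the --query-gpu list in the command below.
FIELDS = ["timestamp", "index", "utilization.gpu", "utilization.memory",
          "memory.used", "memory.total"]

# A command that would produce rows in this shape (one row per GPU every 5 s):
#   nvidia-smi --query-gpu=timestamp,index,utilization.gpu,utilization.memory,memory.used,memory.total \
#              --format=csv,noheader,nounits -l 5

# Made-up sample rows for illustration only (memory values are in MiB).
SAMPLE_OUTPUT = """\
2024/01/15 10:00:00.000, 0, 35, 12, 8192, 16384
2024/01/15 10:00:05.000, 0, 45, 18, 12288, 16384
"""


def parse_gpu_samples(csv_text):
    """Turn nvidia-smi CSV rows into dictionaries with numeric fields."""
    samples = []
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        if len(row) != len(FIELDS):
            continue  # skip blank or malformed lines
        sample = dict(zip(FIELDS, (value.strip() for value in row)))
        sample["index"] = int(sample["index"])
        for key in FIELDS[2:]:
            sample[key] = float(sample[key])
        # Frame buffer usage as a percentage of total GPU memory.
        sample["fb_used_pct"] = 100.0 * sample["memory.used"] / sample["memory.total"]
        samples.append(sample)
    return samples


samples = parse_gpu_samples(SAMPLE_OUTPUT)
for s in samples:
    print(s["timestamp"], s["utilization.gpu"], s["fb_used_pct"])
```

In practice the text would come from a log file written by the command in the comment rather than from an inline string.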

Observing overall server utilization will allow you to assess the trade-offs between end-user experience and resource utilization. To do this, monitoring tools periodically sample CPU core and GPU utilization during a single workload session. To determine the ‘steady state’ portion of the workload, samples are filtered to exclude the periods when users are still logging on and when the workload ramps up and down. Once a steady state has been established, the remaining samples are aggregated to get the total CPU core utilization on the server.
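The filtering and aggregation described above can be sketched as follows. This is a minimal illustration, assuming each sample carries a time offset `t` in seconds and a list of per-core utilization percentages, and that the steady-state window is supplied by the caller. The field names, the window boundaries, and the choice of averaging are all hypothetical. The source does not specify how nVector detects the window or aggregates samples.

```python
from statistics import mean


def steady_state_cpu_utilization(samples, steady_start, steady_end):
    """Average CPU utilization (%) across cores and steady-state samples."""
    # Keep only samples inside the steady-state window, dropping
    # the logon/ramp-up phase before it and the ramp-down phase after it.
    steady = [s for s in samples if steady_start <= s["t"] <= steady_end]
    if not steady:
        raise ValueError("no samples fall inside the steady-state window")
    # Average across cores for each sample, then across samples.
    per_sample = [mean(s["cpu_cores"]) for s in steady]
    return mean(per_sample)


# Made-up samples: ramp-up at t=0, steady state at t=60..120, ramp-down at t=180.
example_samples = [
    {"t": 0, "cpu_cores": [10, 20]},
    {"t": 60, "cpu_cores": [50, 70]},
    {"t": 120, "cpu_cores": [60, 80]},
    {"t": 180, "cpu_cores": [20, 10]},
]

steady_cpu = steady_state_cpu_utilization(example_samples, 60, 120)
print(steady_cpu)
```

The same filter can be applied to GPU samples so that CPU and GPU figures describe the same window.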

User Experience Metrics

NVIDIA’s nVector benchmarking tool has built-in mechanisms to measure user experience. The following sections describe in more detail how the end-user experience is measured and how results are obtained.

Latency Metrics

Latency determines how responsive applications in the VDI feel to the end user. Increased latency degrades the experience and can cause mouse cursor delay, text display issues when typing, and audio/video sync issues. The lower the latency, the better! Imagine that you are working on a PowerPoint presentation, adding a shape and resizing it. On the first attempt, the operation is instantaneous. On the second attempt, however, it is sluggish or delayed by several seconds. With such inconsistency, the user tends to overshoot or has trouble getting the mouse into the correct position. This lack of a consistent experience can be very frustrating. It often leads to high error rates as users click too fast or too slow while trying to pace themselves against an unpredictable response time. NVIDIA’s nVector benchmarking tool measures the variation in end-user latency and how frequently it is experienced.
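To make "variation" and "how frequently" concrete, the sketch below summarizes a list of latency samples by their mean, their standard deviation, and the fraction that exceed a threshold. The function name, the threshold, and the sample values are assumptions for illustration. The source does not describe the statistics nVector actually computes.

```python
from statistics import mean, pstdev


def latency_summary(latencies_ms, threshold_ms):
    """Summarize latency samples (milliseconds) for one session."""
    slow = sum(1 for value in latencies_ms if value > threshold_ms)
    return {
        "mean_ms": mean(latencies_ms),
        # Population standard deviation as a simple measure of variation.
        "stdev_ms": pstdev(latencies_ms),
        # How often the user sees a response slower than the threshold.
        "slow_fraction": slow / len(latencies_ms),
    }


# Made-up samples: three quick responses and one sluggish one.
summary = latency_summary([100, 100, 100, 500], threshold_ms=200)
print(summary)
```

Two sessions with the same mean latency can feel very different. The one with the larger standard deviation or slow fraction is the inconsistent experience described above.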

Remoted Frames Metrics

Frame rate metrics are captured on the endpoint and are a good indicator of the likely end-user experience. The average frame rate is captured and calculated across the simulated workload. A lower frame rate can cause slow response during screen refresh and stuttering during scrolling or zooming. The higher the frame rate, the better! Remoted frames are a standard measure of user experience. NVIDIA’s nVector benchmarking tool collects data on the ‘frames per second’ reported by the remote protocol vendor for the entire workload duration. The tool then tallies the data for all VDI sessions to get the total number of frames remoted for all users. Hypervisor vendors likewise measure total remoted frames as an indicator of the quality of user experience. The greater this number, the more fluid the user experience.
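The tally of remoted frames can be sketched as below, assuming each session reports a frames-per-second reading at a fixed sampling interval, so that each reading contributes `fps * interval` frames. The session names, readings, and interval are made up, and the exact tallying method used by nVector is not given in the source.

```python
from statistics import mean


def total_remoted_frames(fps_by_session, sample_interval_s):
    """Sum estimated frames remoted across all sessions."""
    return sum(
        fps * sample_interval_s
        for series in fps_by_session.values()
        for fps in series
    )


def average_fps_per_session(fps_by_session):
    """Average frame rate of each session over the workload."""
    return {name: mean(series) for name, series in fps_by_session.items()}


# Made-up readings for two hypothetical sessions, sampled once per second.
fps_readings = {
    "vm-01": [30, 28, 32],
    "vm-02": [25, 25, 25],
}

total_frames = total_remoted_frames(fps_readings, sample_interval_s=1.0)
avg_fps = average_fps_per_session(fps_readings)
print(total_frames, avg_fps)
```

Comparing the total across runs at different user densities shows whether adding sessions still increases the frames delivered or only spreads them more thinly.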