Date: 2017-06-01

GRID supports monitoring and control within a guest VM of vGPUs or pass-through GPUs that are assigned to the VM. The scope of management interfaces and tools used within a guest VM is limited to the guest VM within which they are used. They cannot monitor any other GPUs in the virtualization platform.

For monitoring from a guest VM, certain properties do not apply to vGPUs. The values that the GRID management interfaces report for these properties indicate that the properties do not apply to a vGPU.



The GRID server interfaces that are available for GPU management from a guest VM depend on the guest operating system that is running in the VM.



Usage of GPU engines is reported for vGPUs as a percentage of the vGPU’s maximum possible capacity on each engine. The GPU engines are as follows:

Graphics/SM

Memory controller

Video encoder

Video decoder

GRID vGPUs are permitted to occupy the full capacity of each physical engine if no other vGPUs are contending for the same engine. Therefore, if a vGPU occupies 20% of the entire graphics engine in a particular sampling period, its graphics usage as reported inside the VM is 20%.

GRID supports monitoring and control within a guest VM by using NVML.
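The per-engine usage figures described above can be read through NVML. The following is a minimal sketch, assuming the Python NVML bindings (the third-party `pynvml` module, which is not mentioned in this guide) and assuming that these particular utilization calls are the ones that surface the four engine percentages. The helper name `read_engine_usage` and the dictionary keys are hypothetical.

```python
def read_engine_usage(nvml, handle):
    """Return the usage percentage of each GPU engine for one device.

    `nvml` is any object that exposes the pynvml-style calls used below,
    for example the `pynvml` module itself after nvmlInit() has been called.
    `handle` is a device handle obtained from nvmlDeviceGetHandleByIndex().
    """
    # Graphics/SM and memory controller usage come from one call.
    rates = nvml.nvmlDeviceGetUtilizationRates(handle)
    # Encoder and decoder calls return (utilization, sampling period).
    encoder_util, _encoder_period = nvml.nvmlDeviceGetEncoderUtilization(handle)
    decoder_util, _decoder_period = nvml.nvmlDeviceGetDecoderUtilization(handle)
    return {
        "graphics_sm": rates.gpu,
        "memory_controller": rates.memory,
        "video_encoder": encoder_util,
        "video_decoder": decoder_util,
    }
```

Passing the NVML bindings in as a parameter keeps the helper independent of whether a GPU is present, which makes it easy to exercise with a stand-in object.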



GRID vGPUs are presented in guest VM management interfaces in the same fashion as pass-through GPUs.



To determine whether a GPU device in a guest VM is a vGPU or a pass-through GPU, call the NVML function nvmlDeviceGetVirtualizationMode().

A GPU reports its virtualization mode as follows:

A GPU operating in pass-through mode reports its virtualization mode as NVML_GPU_VIRTUALIZATION_MODE_PASSTHROUGH.

A vGPU reports its virtualization mode as NVML_GPU_VIRTUALIZATION_MODE_VGPU.
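As an illustration of the check described above, the following sketch classifies every device visible in the guest VM. It assumes the Python NVML bindings (`pynvml`, a third-party package not covered by this guide), whose nvmlDeviceGetVirtualizationMode() wrapper returns the mode as an integer. The numeric constants mirror the nvmlGpuVirtualizationMode_t enumeration in nvml.h. The helper names are hypothetical.

```python
# Values of nvmlGpuVirtualizationMode_t as declared in nvml.h.
NVML_GPU_VIRTUALIZATION_MODE_NONE = 0
NVML_GPU_VIRTUALIZATION_MODE_PASSTHROUGH = 1
NVML_GPU_VIRTUALIZATION_MODE_VGPU = 2


def describe_virtualization_mode(mode):
    """Map a virtualization mode value to a short description."""
    if mode == NVML_GPU_VIRTUALIZATION_MODE_VGPU:
        return "vGPU"
    if mode == NVML_GPU_VIRTUALIZATION_MODE_PASSTHROUGH:
        return "pass-through GPU"
    return "other"


def classify_devices(nvml):
    """Return one description per device, in device index order.

    `nvml` is any object exposing the pynvml-style calls used below,
    for example the `pynvml` module after nvmlInit() has been called.
    """
    kinds = []
    for index in range(nvml.nvmlDeviceGetCount()):
        handle = nvml.nvmlDeviceGetHandleByIndex(index)
        mode = nvml.nvmlDeviceGetVirtualizationMode(handle)
        kinds.append(describe_virtualization_mode(mode))
    return kinds
```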

Properties and metrics other than GPU engine usage are reported for a vGPU in much the same way as for a physical GPU. However, some properties do not apply to vGPUs. The NVML device query functions for these properties return a value indicating that the property does not apply to a vGPU. For details of NVML device query functions, see Device Queries in the NVML API Reference Manual.



| GPU Property | NVML Device Query Function | NVML return code on vGPU | Comments |
| --- | --- | --- | --- |
| Serial Number | nvmlDeviceGetSerial() | NOT_SUPPORTED | vGPUs are not assigned serial numbers. |
| GPU UUID | nvmlDeviceGetUUID() | SUCCESS | vGPUs are allocated random UUIDs. |
| VBIOS Version | nvmlDeviceGetVbiosVersion() | SUCCESS | vGPU VBIOS version is hard-wired to zero. |
| GPU Part Number | nvmlDeviceGetBoardPartNumber() | NOT_SUPPORTED | |

The InfoROM object is not exposed on vGPUs. All the functions in the following table return NOT_SUPPORTED.

| GPU Property | NVML Device Query Function |
| --- | --- |
| Image Version | nvmlDeviceGetInforomImageVersion() |
| OEM Object | nvmlDeviceGetInforomVersion() |
| ECC Object | nvmlDeviceGetInforomVersion() |
| Power Management Object | nvmlDeviceGetInforomVersion() |

| GPU Property | NVML Device Query Function | NVML return code on vGPU | Comments |
| --- | --- | --- | --- |
| GPU Operation Mode (Current) | nvmlDeviceGetGpuOperationMode() | NOT_SUPPORTED | Tesla GPU operating modes are not supported on vGPUs. |
| GPU Operation Mode (Pending) | nvmlDeviceGetGpuOperationMode() | NOT_SUPPORTED | Tesla GPU operating modes are not supported on vGPUs. |
| Compute Mode | nvmlDeviceGetComputeMode() | SUCCESS | A vGPU always returns NVML_COMPUTEMODE_PROHIBITED. |
| Driver Model | nvmlDeviceGetDriverModel() | SUCCESS (Windows) | A vGPU supports WDDM mode only in Windows VMs. |

PCI Express characteristics are not exposed on vGPUs. All the functions in the following table return NOT_SUPPORTED.

| GPU Property | NVML Device Query Function |
| --- | --- |
| Generation Max | nvmlDeviceGetMaxPcieLinkGeneration() |
| Generation Current | nvmlDeviceGetCurrPcieLinkGeneration() |
| Link Width Max | nvmlDeviceGetMaxPcieLinkWidth() |
| Link Width Current | nvmlDeviceGetCurrPcieLinkWidth() |
| Bridge Chip Type | nvmlDeviceGetBridgeChipInfo() |
| Bridge Chip Firmware | nvmlDeviceGetBridgeChipInfo() |
| Replays | nvmlDeviceGetPcieReplayCounter() |
| TX Throughput | nvmlDeviceGetPcieThroughput() |
| RX Throughput | nvmlDeviceGetPcieThroughput() |

All the functions in the following table return NOT_SUPPORTED.

| GPU Property | NVML Device Query Function |
| --- | --- |
| Fan Speed | nvmlDeviceGetFanSpeed() |
| Clocks Throttle Reasons | nvmlDeviceGetSupportedClocksThrottleReasons(), nvmlDeviceGetCurrentClocksThrottleReasons() |
| Current Temperature | nvmlDeviceGetTemperature(), nvmlDeviceGetTemperatureThreshold() |
| Shutdown Temperature | nvmlDeviceGetTemperature(), nvmlDeviceGetTemperatureThreshold() |
| Slowdown Temperature | nvmlDeviceGetTemperature(), nvmlDeviceGetTemperatureThreshold() |

vGPUs do not expose physical power consumption of the underlying GPU. All the functions in the following table return NOT_SUPPORTED.

| GPU Property | NVML Device Query Function |
| --- | --- |
| Management Mode | nvmlDeviceGetPowerManagementMode() |
| Draw | nvmlDeviceGetPowerUsage() |
| Limit | nvmlDeviceGetPowerManagementLimit() |
| Default Limit | nvmlDeviceGetPowerManagementDefaultLimit() |
| Enforced Limit | nvmlDeviceGetEnforcedPowerLimit() |
| Min Limit | nvmlDeviceGetPowerManagementLimitConstraints() |
| Max Limit | nvmlDeviceGetPowerManagementLimitConstraints() |

Error-correcting code (ECC) is not supported on vGPUs. All the functions in the following table return NOT_SUPPORTED.

| GPU Property | NVML Device Query Function |
| --- | --- |
| Mode | nvmlDeviceGetEccMode() |
| Error Counts | nvmlDeviceGetMemoryErrorCounter(), nvmlDeviceGetTotalEccErrors() |
| Retired Pages | nvmlDeviceGetRetiredPages(), nvmlDeviceGetRetiredPagesPendingStatus() |

All the functions in the following table return NOT_SUPPORTED.

| GPU Property | NVML Device Query Function |
| --- | --- |
| Application Clocks | nvmlDeviceGetApplicationsClock() |
| Default Application Clocks | nvmlDeviceGetDefaultApplicationsClock() |
| Max Clocks | nvmlDeviceGetMaxClockInfo() |
| Policy: Auto Boost | nvmlDeviceGetAutoBoostedClocksEnabled() |
| Policy: Auto Boost Default | nvmlDeviceGetAutoBoostedClocksEnabled() |
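Because many of the queries in the tables above return NOT_SUPPORTED on a vGPU, monitoring code that runs on both vGPUs and pass-through GPUs may want to treat that return code as "no value" rather than as a failure. The following is a minimal sketch, assuming the Python NVML bindings (`pynvml`, a third-party package not covered by this guide), which raise an exception carrying the NVML return code in a `value` attribute. The helper names `query_or_none` and `identity_properties` are hypothetical.

```python
NVML_ERROR_NOT_SUPPORTED = 3  # nvmlReturn_t value as declared in nvml.h


def query_or_none(query, handle):
    """Run one NVML device query and return its result.

    Returns None when the query reports NOT_SUPPORTED, which is what the
    properties that do not apply to vGPUs report. Any other error is
    re-raised unchanged.
    """
    try:
        return query(handle)
    except Exception as err:
        # pynvml exceptions expose the NVML return code as `err.value`.
        if getattr(err, "value", None) == NVML_ERROR_NOT_SUPPORTED:
            return None
        raise


def identity_properties(nvml, handle):
    """Collect identity properties; entries that do not apply are None."""
    return {
        "serial": query_or_none(nvml.nvmlDeviceGetSerial, handle),
        "uuid": query_or_none(nvml.nvmlDeviceGetUUID, handle),
        "vbios_version": query_or_none(nvml.nvmlDeviceGetVbiosVersion, handle),
        "board_part_number": query_or_none(nvml.nvmlDeviceGetBoardPartNumber, handle),
    }
```

Catching only NOT_SUPPORTED keeps genuine failures, such as an invalid handle, visible to the caller.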

To build an NVML-enabled application, refer to the sample code included in the SDK.

In Windows VMs, GPU metrics are available as Windows Performance Counters through the NVIDIA GPU object.

For access to Windows Performance Counters through programming interfaces, refer to the performance counter sample code included with the NVIDIA Windows Management Instrumentation SDK.



On vGPUs, the following GPU performance counters read as 0 because they are not applicable to vGPUs:

% Bus Usage

% Cooler rate

Core Clock MHz

Fan Speed

Memory Clock MHz

PCI-E current speed to GPU Mbps

PCI-E current width to GPU

PCI-E downstream width to GPU

Power Consumption mW

Temperature C
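Since the counters listed above always read as 0 on a vGPU, a monitoring script may prefer to drop them instead of recording misleading zeros. The sketch below assumes that counter samples have already been collected into a dictionary keyed by counter name. How the samples are collected is outside its scope, and the function name `applicable_counters` is hypothetical.

```python
# Counter names from the list above that do not apply to vGPUs.
NOT_APPLICABLE_ON_VGPU = frozenset({
    "% Bus Usage",
    "% Cooler rate",
    "Core Clock MHz",
    "Fan Speed",
    "Memory Clock MHz",
    "PCI-E current speed to GPU Mbps",
    "PCI-E current width to GPU",
    "PCI-E downstream width to GPU",
    "Power Consumption mW",
    "Temperature C",
})


def applicable_counters(samples):
    """Return only the counter samples that are meaningful on a vGPU.

    `samples` maps a counter name to its sampled value.
    """
    return {
        name: value
        for name, value in samples.items()
        if name not in NOT_APPLICABLE_ON_VGPU
    }
```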

In Windows VMs, Windows Management Instrumentation (WMI) exposes GPU metrics in the ROOT\CIMV2\NV namespace through NVWMI. NVWMI is included with the NVIDIA driver package. After the driver is installed, NVWMI help information in Windows Help format is available as follows:

C:\Program Files\NVIDIA Corporation\NVIDIA WMI Provider>nvwmi.chm

For access to NVWMI through programming interfaces, use the NVWMI SDK. The NVWMI SDK, with white papers and sample programs, is included in the NVIDIA Windows Management Instrumentation SDK.



Some instance properties of the following classes do not apply to vGPUs:

Ecc

Gpu

PcieLink

Ecc instance properties that do not apply to vGPUs

| Ecc Instance Property | Value reported on vGPU |
| --- | --- |
| isSupported | False |
| isWritable | False |
| isEnabled | False |
| isEnabledByDefault | False |
| aggregateDoubleBitErrors | 0 |
| aggregateSingleBitErrors | 0 |
| currentDoubleBitErrors | 0 |
| currentSingleBitErrors | 0 |





Gpu instance properties that do not apply to vGPUs

| Gpu Instance Property | Value reported on vGPU |
| --- | --- |
| gpuCoreClockCurrent | -1 |
| memoryClockCurrent | -1 |
| pciDownstreamWidth | 0 |
| pcieGpu.curGen | 0 |
| pcieGpu.curSpeed | 0 |
| pcieGpu.curWidth | 0 |
| pcieGpu.maxGen | 1 |
| pcieGpu.maxSpeed | 2500 |
| pcieGpu.maxWidth | 0 |
| power | -1 |
| powerSampleCount | -1 |
| powerSamplingPeriod | -1 |
| verVBIOS.orderedValue | 0 |
| verVBIOS.strValue | - |
| verVBIOS.value | 0 |
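The values in the Gpu table act as placeholders, so code that consumes Gpu instance properties may want to blank them out once the device is known to be a vGPU. The following sketch assumes the properties have already been read into a dictionary keyed by the dotted names used in the table. That flattened key layout and the function name `mask_placeholders` are hypothetical and are not part of NVWMI.

```python
# Placeholder values that a vGPU reports, taken from the Gpu table above.
VGPU_PLACEHOLDERS = {
    "gpuCoreClockCurrent": -1,
    "memoryClockCurrent": -1,
    "pciDownstreamWidth": 0,
    "pcieGpu.curGen": 0,
    "pcieGpu.curSpeed": 0,
    "pcieGpu.curWidth": 0,
    "pcieGpu.maxGen": 1,
    "pcieGpu.maxSpeed": 2500,
    "pcieGpu.maxWidth": 0,
    "power": -1,
    "powerSampleCount": -1,
    "powerSamplingPeriod": -1,
    "verVBIOS.orderedValue": 0,
    "verVBIOS.strValue": "-",
    "verVBIOS.value": 0,
}


def mask_placeholders(properties, is_vgpu):
    """Replace vGPU placeholder values with None.

    Masking is applied only when `is_vgpu` is true, because on a physical
    GPU the same values (for example a PCIe generation of 1) can be real.
    """
    if not is_vgpu:
        return dict(properties)
    masked = {}
    for name, value in properties.items():
        is_placeholder = (
            name in VGPU_PLACEHOLDERS and value == VGPU_PLACEHOLDERS[name]
        )
        masked[name] = None if is_placeholder else value
    return masked
```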





PcieLink instance properties that do not apply to vGPUs

No instances of PcieLink are reported for vGPU.