NVML API Reference Guide :: GPU Deployment and Management Documentation

NVML API Reference Guide - vR575 (older) - Last updated June 05, 2025

4.26. vGPU Utilization and Accounting

This chapter describes operations that are associated with vGPU Utilization and Accounting.

Functions

nvmlReturn_t nvmlDeviceGetVgpuInstancesUtilizationInfo ( nvmlDevice_t device, nvmlVgpuInstancesUtilizationInfo_t* vgpuUtilInfo )
nvmlReturn_t nvmlDeviceGetVgpuProcessUtilization ( nvmlDevice_t device, unsigned long long lastSeenTimeStamp, unsigned int* vgpuProcessSamplesCount, nvmlVgpuProcessUtilizationSample_t* utilizationSamples )
nvmlReturn_t nvmlDeviceGetVgpuProcessesUtilizationInfo ( nvmlDevice_t device, nvmlVgpuProcessesUtilizationInfo_t* vgpuProcUtilInfo )
nvmlReturn_t nvmlDeviceGetVgpuUtilization ( nvmlDevice_t device, unsigned long long lastSeenTimeStamp, nvmlValueType_t* sampleValType, unsigned int* vgpuInstanceSamplesCount, nvmlVgpuInstanceUtilizationSample_t* utilizationSamples )
nvmlReturn_t nvmlVgpuInstanceClearAccountingPids ( nvmlVgpuInstance_t vgpuInstance )
nvmlReturn_t nvmlVgpuInstanceGetAccountingMode ( nvmlVgpuInstance_t vgpuInstance, nvmlEnableState_t* mode )
nvmlReturn_t nvmlVgpuInstanceGetAccountingPids ( nvmlVgpuInstance_t vgpuInstance, unsigned int* count, unsigned int* pids )
nvmlReturn_t nvmlVgpuInstanceGetAccountingStats ( nvmlVgpuInstance_t vgpuInstance, unsigned int pid, nvmlAccountingStats_t* stats )
nvmlReturn_t nvmlVgpuInstanceGetLicenseInfo_v2 ( nvmlVgpuInstance_t vgpuInstance, nvmlVgpuLicenseInfo_t* licenseInfo )

Functions

nvmlReturn_t nvmlDeviceGetVgpuInstancesUtilizationInfo ( nvmlDevice_t device, nvmlVgpuInstancesUtilizationInfo_t* vgpuUtilInfo )

Parameters

device: The identifier for the target device
vgpuUtilInfo: Pointer to the caller-provided structure of nvmlVgpuInstancesUtilizationInfo_t

Returns

NVML_SUCCESS If utilization samples are successfully retrieved
NVML_ERROR_UNINITIALIZED If the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT If device is invalid, vgpuUtilInfo is NULL, or vgpuUtilInfo->vgpuInstanceCount is 0
NVML_ERROR_NOT_SUPPORTED If vGPU is not supported by the device
NVML_ERROR_GPU_IS_LOST If the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_ARGUMENT_VERSION_MISMATCH If the version of vgpuUtilInfo is invalid
NVML_ERROR_INSUFFICIENT_SIZE If vgpuUtilInfo->vgpuUtilArray is NULL, or the buffer size of vgpuUtilInfo->vgpuInstanceCount is too small. The caller should check the current vGPU instance count from the returned vgpuUtilInfo->vgpuInstanceCount, and call the function again with a buffer of size vgpuUtilInfo->vgpuInstanceCount * sizeof(nvmlVgpuInstanceUtilizationInfo_t)
NVML_ERROR_NOT_FOUND If sample entries are not found
NVML_ERROR_UNKNOWN On any unexpected error

Description

Retrieves recent utilization for vGPU instances running on a physical GPU (device).

For Kepler or newer fully supported devices.

Reads recent utilization of GPU SM (3D/Compute), framebuffer, video encoder, video decoder, jpeg decoder, and OFA for vGPU instances running on a device. Utilization values are returned as an array of utilization sample structures in the caller-supplied buffer pointed at by vgpuUtilInfo->vgpuUtilArray. One utilization sample structure is returned per vGPU instance, and includes the CPU timestamp at which the samples were recorded. Individual utilization values are returned as "unsigned int" values in nvmlValue_t unions. The function sets the caller-supplied vgpuUtilInfo->sampleValType to NVML_VALUE_TYPE_UNSIGNED_INT to indicate the returned value type.

To read utilization values, first determine the size of buffer required to hold the samples by invoking the function with vgpuUtilInfo->vgpuUtilArray set to NULL. The function will return NVML_ERROR_INSUFFICIENT_SIZE, with the current vGPU instance count in vgpuUtilInfo->vgpuInstanceCount, or NVML_SUCCESS if the current vGPU instance count is zero. The caller should allocate a buffer of size vgpuUtilInfo->vgpuInstanceCount * sizeof(nvmlVgpuInstanceUtilizationInfo_t). Invoke the function again with the allocated buffer passed in vgpuUtilInfo->vgpuUtilArray, and vgpuUtilInfo->vgpuInstanceCount set to the number of entries the buffer is sized for.

On successful return, the function updates vgpuUtilInfo->vgpuInstanceCount with the number of vGPU utilization sample structures that were actually written. This may differ from a previously read value as vGPU instances are created or destroyed.

vgpuUtilInfo->lastSeenTimeStamp represents the CPU timestamp in microseconds at which utilization samples were last read. Set it to 0 to read utilization based on all the samples maintained by the driver's internal sample buffer. Set vgpuUtilInfo->lastSeenTimeStamp to a timeStamp retrieved from a previous query to read utilization since the previous query.
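The size-query-then-fetch sequence described above can be sketched as follows. This is a minimal illustration, not NVML code: so that it compiles without the NVML headers, `util_info_t`, `util_entry_t`, `ret_t` and `fake_get_instances_utilization` are hypothetical stand-ins that mirror only the documented contract (a NULL array yields an insufficient-size result plus the required count, and zero instances yields success). The real structure also carries a version field, which this sketch omits.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the NVML types; only the fields this sketch touches. */
typedef enum { RET_SUCCESS, RET_INSUFFICIENT_SIZE, RET_OTHER } ret_t;
typedef struct {
    unsigned long long timeStamp;
    unsigned int vgpuInstance;
    unsigned int smUtil;
} util_entry_t;
typedef struct {
    unsigned int vgpuInstanceCount;
    unsigned long long lastSeenTimeStamp;
    util_entry_t *vgpuUtilArray;
} util_info_t;

/* Hypothetical stub: pretends this many vGPU instances are running. */
static unsigned int fake_active_instances = 3;

static ret_t fake_get_instances_utilization(util_info_t *info)
{
    unsigned int n = fake_active_instances;
    if (info == NULL) return RET_OTHER;
    if (n == 0) { info->vgpuInstanceCount = 0; return RET_SUCCESS; }
    if (info->vgpuUtilArray == NULL || info->vgpuInstanceCount < n) {
        info->vgpuInstanceCount = n;          /* report the required count */
        return RET_INSUFFICIENT_SIZE;
    }
    for (unsigned int i = 0; i < n; i++) {
        info->vgpuUtilArray[i].timeStamp = 1000ULL + i;
        info->vgpuUtilArray[i].vgpuInstance = i + 1;
        info->vgpuUtilArray[i].smUtil = 10 * (i + 1);
    }
    info->vgpuInstanceCount = n;              /* entries actually written */
    return RET_SUCCESS;
}

/* Two-call pattern: query the count, allocate, then fetch.
 * Returns a malloc'd array (caller frees) and stores its length in *count.
 * Returns NULL with *count == 0 when no instances exist or on error. */
static util_entry_t *read_instance_utilization(unsigned int *count)
{
    util_info_t info = { 0, 0, NULL };        /* lastSeenTimeStamp 0: all samples */
    assert(count != NULL);
    *count = 0;

    ret_t r = fake_get_instances_utilization(&info);
    if (r != RET_INSUFFICIENT_SIZE) return NULL;  /* success here means zero instances */

    info.vgpuUtilArray = malloc(info.vgpuInstanceCount * sizeof *info.vgpuUtilArray);
    if (info.vgpuUtilArray == NULL) return NULL;

    r = fake_get_instances_utilization(&info);
    if (r != RET_SUCCESS) { free(info.vgpuUtilArray); return NULL; }

    *count = info.vgpuInstanceCount;          /* may differ from the first query */
    return info.vgpuUtilArray;
}
```

With the real library, the same shape would apply to nvmlDeviceGetVgpuInstancesUtilizationInfo and nvmlVgpuInstancesUtilizationInfo_t, with the return codes listed above in place of the stand-in enum.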

nvmlReturn_t nvmlDeviceGetVgpuProcessUtilization ( nvmlDevice_t device, unsigned long long lastSeenTimeStamp, unsigned int* vgpuProcessSamplesCount, nvmlVgpuProcessUtilizationSample_t* utilizationSamples )

Parameters

device: The identifier for the target device
lastSeenTimeStamp: Return only samples with timestamp greater than lastSeenTimeStamp.
vgpuProcessSamplesCount: Pointer to caller-supplied array size, and returns number of processes running on vGPU instances
utilizationSamples: Pointer to caller-supplied buffer in which vGPU sub process utilization samples are returned

Returns

NVML_SUCCESS if utilization samples are successfully retrieved
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid, vgpuProcessSamplesCount is NULL, or a sample count of 0 is passed with a non-NULL utilizationSamples
NVML_ERROR_INSUFFICIENT_SIZE if supplied vgpuProcessSamplesCount is too small to return samples for all vGPU instances currently executing on the device
NVML_ERROR_NOT_SUPPORTED if vGPU is not supported by the device
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_NOT_FOUND if sample entries are not found
NVML_ERROR_UNKNOWN on any unexpected error

Description

Retrieves current utilization for processes running on vGPUs on a physical GPU (device).

For Maxwell or newer fully supported devices.

Reads recent utilization of GPU SM (3D/Compute), framebuffer, video encoder, and video decoder for processes running on vGPU instances active on a device. Utilization values are returned as an array of utilization sample structures in the caller-supplied buffer pointed at by utilizationSamples. One utilization sample structure is returned per process running on vGPU instances, that had some non-zero utilization during the last sample period. It includes the CPU timestamp at which the samples were recorded. Individual utilization values are returned as "unsigned int" values.

To read utilization values, first determine the size of buffer required to hold the samples by invoking the function with utilizationSamples set to NULL. The function will return NVML_ERROR_INSUFFICIENT_SIZE, with the current vGPU instance count in vgpuProcessSamplesCount. The caller should allocate a buffer of size vgpuProcessSamplesCount * sizeof(nvmlVgpuProcessUtilizationSample_t). Invoke the function again with the allocated buffer passed in utilizationSamples, and vgpuProcessSamplesCount set to the number of entries the buffer is sized for.

On successful return, the function updates vgpuProcessSamplesCount with the number of vGPU sub process utilization sample structures that were actually written. This may differ from a previously read value depending on the number of processes that are active in any given sample period.

lastSeenTimeStamp represents the CPU timestamp in microseconds at which utilization samples were last read. Set it to 0 to read utilization based on all the samples maintained by the driver's internal sample buffer. Set lastSeenTimeStamp to a timeStamp retrieved from a previous query to read utilization since the previous query.
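Because the number of active processes can change between the size query and the fetch, a caller may want to retry when the second call still reports an insufficient size. The sketch below illustrates that idea under stated assumptions: `proc_sample_t`, `ret_t` and `fake_get_process_utilization` are hypothetical stand-ins rather than NVML definitions, and the stub deliberately reports one more process on later calls to exercise the retry path. The retry limit of four attempts is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the NVML types; only the fields this sketch touches. */
typedef enum { RET_SUCCESS, RET_INSUFFICIENT_SIZE, RET_NOT_FOUND } ret_t;
typedef struct {
    unsigned int vgpuInstance;
    unsigned int pid;
    unsigned int smUtil;
} proc_sample_t;

/* Hypothetical stub: reports 2 active processes on the first call and 3 on
 * every later call, modelling a process that starts between two queries. */
static unsigned int fake_calls = 0;

static ret_t fake_get_process_utilization(unsigned long long lastSeenTimeStamp,
                                          unsigned int *count,
                                          proc_sample_t *samples)
{
    (void)lastSeenTimeStamp;                  /* the stub ignores the filter */
    unsigned int active = (fake_calls++ == 0) ? 2u : 3u;
    if (samples == NULL || *count < active) {
        *count = active;                      /* report the required count */
        return RET_INSUFFICIENT_SIZE;
    }
    for (unsigned int i = 0; i < active; i++) {
        samples[i].vgpuInstance = 1;
        samples[i].pid = 100 + i;
        samples[i].smUtil = 5 * (i + 1);
    }
    *count = active;                          /* entries actually written */
    return RET_SUCCESS;
}

/* Size query followed by fetch, growing the buffer and retrying while the
 * call keeps reporting an insufficient size. Returns a malloc'd array
 * (caller frees) with its length in *count, or NULL on failure. */
static proc_sample_t *read_process_utilization(unsigned long long lastSeenTimeStamp,
                                               unsigned int *count)
{
    proc_sample_t *buf = NULL;
    unsigned int n = 0;
    assert(count != NULL);
    *count = 0;

    for (int attempt = 0; attempt < 4; attempt++) {
        ret_t r = fake_get_process_utilization(lastSeenTimeStamp, &n, buf);
        if (r == RET_SUCCESS) { *count = n; return buf; }
        if (r != RET_INSUFFICIENT_SIZE) break;

        proc_sample_t *grown = realloc(buf, n * sizeof *grown);
        if (grown == NULL) break;
        buf = grown;                          /* now sized for n entries */
    }
    free(buf);
    return NULL;
}
```

In this run the first call reports 2 entries, the second reports 3 because the stub's process count grew, and the third succeeds with the enlarged buffer.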

nvmlReturn_t nvmlDeviceGetVgpuProcessesUtilizationInfo ( nvmlDevice_t device, nvmlVgpuProcessesUtilizationInfo_t* vgpuProcUtilInfo )

Parameters

device: The identifier for the target device
vgpuProcUtilInfo: Pointer to the caller-provided structure of nvmlVgpuProcessesUtilizationInfo_t

Returns

NVML_SUCCESS If utilization samples are successfully retrieved
NVML_ERROR_UNINITIALIZED If the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT If device is invalid, or vgpuProcUtilInfo is NULL
NVML_ERROR_ARGUMENT_VERSION_MISMATCH If the version of vgpuProcUtilInfo is invalid
NVML_ERROR_INSUFFICIENT_SIZE If vgpuProcUtilInfo->vgpuProcUtilArray is NULL, or supplied vgpuProcUtilInfo->vgpuProcessCount is too small to return samples for all processes on vGPU instances currently executing on the device. The caller should check the current process count from the returned vgpuProcUtilInfo->vgpuProcessCount, and call the function again with a buffer of size vgpuProcUtilInfo->vgpuProcessCount * sizeof(nvmlVgpuProcessUtilizationSample_t)
NVML_ERROR_NOT_SUPPORTED If vGPU is not supported by the device
NVML_ERROR_GPU_IS_LOST If the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_NOT_FOUND If sample entries are not found
NVML_ERROR_UNKNOWN On any unexpected error

Description

Retrieves recent utilization for processes running on vGPU instances on a physical GPU (device).

For Maxwell or newer fully supported devices.

Reads recent utilization of GPU SM (3D/Compute), framebuffer, video encoder, video decoder, jpeg decoder, and OFA for processes running on vGPU instances active on a device. Utilization values are returned as an array of utilization sample structures in the caller-supplied buffer pointed at by vgpuProcUtilInfo->vgpuProcUtilArray. One utilization sample structure is returned per process running on vGPU instances, that had some non-zero utilization during the last sample period. It includes the CPU timestamp at which the samples were recorded. Individual utilization values are returned as "unsigned int" values.

To read utilization values, first determine the size of buffer required to hold the samples by invoking the function with vgpuProcUtilInfo->vgpuProcUtilArray set to NULL. The function will return NVML_ERROR_INSUFFICIENT_SIZE, with the current count of processes running on vGPU instances in vgpuProcUtilInfo->vgpuProcessCount. The caller should allocate a buffer of size vgpuProcUtilInfo->vgpuProcessCount * sizeof(nvmlVgpuProcessUtilizationSample_t). Invoke the function again with the allocated buffer passed in vgpuProcUtilInfo->vgpuProcUtilArray, and vgpuProcUtilInfo->vgpuProcessCount set to the number of entries the buffer is sized for.

On successful return, the function updates vgpuProcUtilInfo->vgpuProcessCount with the number of vGPU sub process utilization sample structures that were actually written. This may differ from a previously read value depending on the number of processes that are active in any given sample period.

vgpuProcUtilInfo->lastSeenTimeStamp represents the CPU timestamp in microseconds at which utilization samples were last read. Set it to 0 to read utilization based on all the samples maintained by the driver's internal sample buffer. Set vgpuProcUtilInfo->lastSeenTimeStamp to a timeStamp retrieved from a previous query to read utilization since the previous query.
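For periodic polling, one way to pick the next lastSeenTimeStamp is to carry forward the newest timestamp seen in the previous batch. The helper below is a sketch under that assumption; the text above only says to use a timeStamp from a previous query, so taking the maximum is an illustrative choice. `proc_util_t` is a hypothetical stand-in for the sample structure, keeping only the fields used here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a per-process utilization sample; not the NVML definition. */
typedef struct {
    unsigned int pid;
    unsigned long long timeStamp;   /* CPU timestamp in microseconds */
    unsigned int smUtil;
} proc_util_t;

/* Returns the value to pass as lastSeenTimeStamp on the next query: the
 * newest timeStamp in this batch, or the previous value when the batch is
 * empty or holds nothing newer. */
static unsigned long long next_last_seen(const proc_util_t *samples,
                                         unsigned int count,
                                         unsigned long long previous)
{
    unsigned long long newest = previous;
    assert(samples != NULL || count == 0);
    for (unsigned int i = 0; i < count; i++) {
        if (samples[i].timeStamp > newest)
            newest = samples[i].timeStamp;
    }
    return newest;
}
```

A polling loop would start with 0 to read everything in the driver's sample buffer, then feed each call's result back into the next query so that only newer samples are returned.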

nvmlReturn_t nvmlDeviceGetVgpuUtilization ( nvmlDevice_t device, unsigned long long lastSeenTimeStamp, nvmlValueType_t* sampleValType, unsigned int* vgpuInstanceSamplesCount, nvmlVgpuInstanceUtilizationSample_t* utilizationSamples )

Parameters

device: The identifier for the target device
lastSeenTimeStamp: Return only samples with timestamp greater than lastSeenTimeStamp.
sampleValType: Pointer to caller-supplied buffer to hold the type of returned sample values
vgpuInstanceSamplesCount: Pointer to caller-supplied array size, and returns number of vGPU instances
utilizationSamples: Pointer to caller-supplied buffer in which vGPU utilization samples are returned

Returns

NVML_SUCCESS if utilization samples are successfully retrieved
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if device is invalid, vgpuInstanceSamplesCount or sampleValType is NULL, or a sample count of 0 is passed with a non-NULL utilizationSamples
NVML_ERROR_INSUFFICIENT_SIZE if supplied vgpuInstanceSamplesCount is too small to return samples for all vGPU instances currently executing on the device
NVML_ERROR_NOT_SUPPORTED if vGPU is not supported by the device
NVML_ERROR_GPU_IS_LOST if the target GPU has fallen off the bus or is otherwise inaccessible
NVML_ERROR_NOT_FOUND if sample entries are not found
NVML_ERROR_UNKNOWN on any unexpected error

Description

Retrieves current utilization for vGPUs on a physical GPU (device).

For Kepler or newer fully supported devices.

Reads recent utilization of GPU SM (3D/Compute), framebuffer, video encoder, and video decoder for vGPU instances running on a device. Utilization values are returned as an array of utilization sample structures in the caller-supplied buffer pointed at by utilizationSamples. One utilization sample structure is returned per vGPU instance, and includes the CPU timestamp at which the samples were recorded. Individual utilization values are returned as "unsigned int" values in nvmlValue_t unions. The function sets the caller-supplied sampleValType to NVML_VALUE_TYPE_UNSIGNED_INT to indicate the returned value type.

To read utilization values, first determine the size of buffer required to hold the samples by invoking the function with utilizationSamples set to NULL. The function will return NVML_ERROR_INSUFFICIENT_SIZE, with the current vGPU instance count in vgpuInstanceSamplesCount, or NVML_SUCCESS if the current vGPU instance count is zero. The caller should allocate a buffer of size vgpuInstanceSamplesCount * sizeof(nvmlVgpuInstanceUtilizationSample_t). Invoke the function again with the allocated buffer passed in utilizationSamples, and vgpuInstanceSamplesCount set to the number of entries the buffer is sized for.

On successful return, the function updates vgpuInstanceSamplesCount with the number of vGPU utilization sample structures that were actually written. This may differ from a previously read value as vGPU instances are created or destroyed.
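Since each utilization value arrives in a union, a caller may want to check the reported sample value type before reading a member. The following sketch shows that guard with hypothetical stand-ins (`value_t`, `value_type_t`, `read_utilization`) so that it compiles without the NVML headers; the real types are nvmlValue_t and nvmlValueType_t, and the member name `uiVal` is assumed here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the value type tag and value union; not the NVML definitions. */
typedef enum { VAL_DOUBLE, VAL_UNSIGNED_INT, VAL_UNSIGNED_LONG_LONG } value_type_t;
typedef union {
    double dVal;
    unsigned int uiVal;
    unsigned long long ullVal;
} value_t;

/* Copies the utilization value into *out only when the reported type is the
 * unsigned-int type that the function is documented to set. Returns 1 on
 * success and 0 when the type is unexpected, so the union is never read
 * through the wrong member. */
static int read_utilization(value_t value, value_type_t type, unsigned int *out)
{
    assert(out != NULL);
    if (type != VAL_UNSIGNED_INT)
        return 0;
    *out = value.uiVal;
    return 1;
}
```

The check costs little and keeps the caller correct if a different value type is ever reported.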

nvmlReturn_t nvmlVgpuInstanceClearAccountingPids ( nvmlVgpuInstance_t vgpuInstance )

Parameters

vgpuInstance: The identifier of the target vGPU instance

Returns

NVML_SUCCESS if accounting information has been cleared
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if vgpuInstance is invalid
NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
NVML_ERROR_NOT_SUPPORTED if the vGPU doesn't support this feature or accounting mode is disabled
NVML_ERROR_UNKNOWN on any unexpected error

Description

Clears accounting information about processes on the vGPU instance that have already terminated.

For Maxwell or newer fully supported devices. Requires root/admin permissions.

Note:

Accounting Mode needs to be on. See nvmlVgpuInstanceGetAccountingMode.
Only compute and graphics application stats are reported and can be cleared, since monitoring application stats don't contribute to GPU utilization.

nvmlReturn_t nvmlVgpuInstanceGetAccountingMode ( nvmlVgpuInstance_t vgpuInstance, nvmlEnableState_t* mode )

Parameters

vgpuInstance: The identifier of the target vGPU instance
mode: Reference in which to return the current accounting mode

Returns

NVML_SUCCESS if the mode has been successfully retrieved
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if vgpuInstance is 0, or mode is NULL
NVML_ERROR_NOT_FOUND if vgpuInstance does not match a valid active vGPU instance on the system
NVML_ERROR_NOT_SUPPORTED if the vGPU doesn't support this feature
NVML_ERROR_DRIVER_NOT_LOADED if NVIDIA driver is not running on the vGPU instance
NVML_ERROR_UNKNOWN on any unexpected error

Description

Queries the state of per-process accounting mode on the vGPU.

For Maxwell or newer fully supported devices.

nvmlReturn_t nvmlVgpuInstanceGetAccountingPids ( nvmlVgpuInstance_t vgpuInstance, unsigned int* count, unsigned int* pids )

Parameters

vgpuInstance: The identifier of the target vGPU instance
count: Reference in which to provide the pids array size, and to return the number of elements ready to be queried
pids: Reference in which to return list of process ids

Returns

NVML_SUCCESS if pids were successfully retrieved
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if vgpuInstance is 0, or count is NULL
NVML_ERROR_NOT_FOUND if vgpuInstance does not match a valid active vGPU instance on the system
NVML_ERROR_NOT_SUPPORTED if the vGPU doesn't support this feature or accounting mode is disabled
NVML_ERROR_INSUFFICIENT_SIZE if count is too small (count is set to expected value)
NVML_ERROR_UNKNOWN on any unexpected error

Description

Queries list of processes running on vGPU that can be queried for accounting stats. The list of processes returned can be in running or terminated state.

For Maxwell or newer fully supported devices.

To query only the maximum number of processes that can be queried, call this function with *count = 0 and pids = NULL. The return code will be NVML_ERROR_INSUFFICIENT_SIZE, or NVML_SUCCESS if the list is empty.

For more details see nvmlVgpuInstanceGetAccountingStats.

Note:

In case of PID collision some processes might not be accessible before the circular buffer is full.

See also:

nvmlVgpuInstanceGetAccountingPids
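The count query described above (*count = 0 with pids = NULL) followed by the actual fetch might look like the sketch below. It is illustrative only: `ret_t`, `fake_pids` and `fake_get_accounting_pids` are hypothetical stand-ins that follow the documented contract, including success on an empty list, so that the example compiles without the NVML headers.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { RET_SUCCESS, RET_INSUFFICIENT_SIZE, RET_OTHER } ret_t;

/* Hypothetical stub data: PIDs that have accounting stats available. */
static unsigned int fake_pids[] = { 4312, 4377, 5120 };
static unsigned int fake_pid_count = 3;

static ret_t fake_get_accounting_pids(unsigned int *count, unsigned int *pids)
{
    if (count == NULL) return RET_OTHER;
    if (fake_pid_count == 0) { *count = 0; return RET_SUCCESS; }
    if (pids == NULL || *count < fake_pid_count) {
        *count = fake_pid_count;              /* count is set to the expected value */
        return RET_INSUFFICIENT_SIZE;
    }
    for (unsigned int i = 0; i < fake_pid_count; i++)
        pids[i] = fake_pids[i];
    *count = fake_pid_count;
    return RET_SUCCESS;
}

/* Queries the count first, then fetches the list.
 * On success returns 0 and stores a malloc'd array in *pids_out (NULL when
 * the list is empty) and its length in *count. Returns -1 on any error. */
static int list_accounting_pids(unsigned int **pids_out, unsigned int *count)
{
    unsigned int n = 0;
    assert(pids_out != NULL && count != NULL);
    *pids_out = NULL;
    *count = 0;

    ret_t r = fake_get_accounting_pids(&n, NULL);
    if (r == RET_SUCCESS) return 0;           /* empty list */
    if (r != RET_INSUFFICIENT_SIZE) return -1;

    unsigned int *pids = malloc(n * sizeof *pids);
    if (pids == NULL) return -1;

    r = fake_get_accounting_pids(&n, pids);
    if (r != RET_SUCCESS) { free(pids); return -1; }

    *pids_out = pids;
    *count = n;
    return 0;
}
```

Returning a status separately from the array lets the caller tell an empty list apart from a failure, which a bare NULL return could not express.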

nvmlReturn_t nvmlVgpuInstanceGetAccountingStats ( nvmlVgpuInstance_t vgpuInstance, unsigned int pid, nvmlAccountingStats_t* stats )

Parameters

vgpuInstance: The identifier of the target vGPU instance
pid: Process Id of the target process to query stats for
stats: Reference in which to return the process's accounting stats

Returns

NVML_SUCCESS if stats have been successfully retrieved
NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
NVML_ERROR_INVALID_ARGUMENT if vgpuInstance is 0, or stats is NULL
NVML_ERROR_NOT_FOUND if vgpuInstance does not match a valid active vGPU instance on the system or stats is not found
NVML_ERROR_NOT_SUPPORTED if the vGPU doesn't support this feature or accounting mode is disabled
NVML_ERROR_UNKNOWN on any unexpected error

Description

Queries process's accounting stats.

For Maxwell or newer fully supported devices.

Accounting stats capture GPU utilization and other statistics across the lifetime of a process, and can be queried during the lifetime of the process or after its termination. The time field in nvmlAccountingStats_t is reported as 0 during the lifetime of the process and updated to the actual running time after its termination. Accounting stats are kept in a circular buffer; newly created processes overwrite information about old processes.

See nvmlAccountingStats_t for description of each returned metric. List of processes that can be queried can be retrieved from nvmlVgpuInstanceGetAccountingPids.

Note:

Accounting Mode needs to be on. See nvmlVgpuInstanceGetAccountingMode.
Only compute and graphics application stats can be queried. Monitoring application stats can't be queried since they don't contribute to GPU utilization.
In case of PID collision, only the stats of the latest process (the one that terminated last) will be reported.

nvmlReturn_t nvmlVgpuInstanceGetLicenseInfo_v2 ( nvmlVgpuInstance_t vgpuInstance, nvmlVgpuLicenseInfo_t* licenseInfo )

Parameters

vgpuInstance: Identifier of the target vGPU instance
licenseInfo: Pointer to vGPU license information structure

Returns

NVML_SUCCESS if information is successfully retrieved
NVML_ERROR_INVALID_ARGUMENT if vgpuInstance is 0, or licenseInfo is NULL
NVML_ERROR_NOT_FOUND if vgpuInstance does not match a valid active vGPU instance on the system
NVML_ERROR_DRIVER_NOT_LOADED if NVIDIA driver is not running on the vGPU instance
NVML_ERROR_UNKNOWN on any unexpected error

Description

Queries the license information of the vGPU instance.

For Maxwell or newer fully supported devices.
