Process Statistics
- group DCGMAPI_PROCESS_STATS
Describes APIs to investigate statistics such as accounting, performance and errors during the lifetime of a GPU process.
Functions
-
dcgmReturn_t dcgmWatchPidFields(dcgmHandle_t pDcgmHandle, dcgmGpuGrp_t groupId, long long updateFreq, double maxKeepAge, int maxKeepSamples)
Request that DCGM start recording stats for fields that can be queried with dcgmGetPidInfo().
Note that the first update of these fields will not occur until the next field update cycle. To force a field update cycle, call dcgmUpdateAllFields() with waitForUpdate set to 1.
- Parameters
pDcgmHandle – IN: DCGM Handle
groupId – IN: Group ID representing collection of one or more GPUs. Look at dcgmGroupCreate for details on creating the group. Alternatively, pass in the group id as DCGM_GROUP_ALL_GPUS to perform operation on all the GPUs.
updateFreq – IN: How often to update these fields, in microseconds (usec)
maxKeepAge – IN: How long to keep data for these fields, in seconds
maxKeepSamples – IN: Maximum number of samples to keep. 0=no limit
- Returns
DCGM_ST_OK if the call was successful
DCGM_ST_BADPARAM if a parameter is invalid
DCGM_ST_REQUIRES_ROOT if the host engine is being run as non-root, and accounting mode could not be enabled (requires root). Run “nvidia-smi -am 1” as root on the node before starting DCGM to fix this.
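Because updateFreq is given in microseconds while maxKeepAge is given in seconds, it is easy to pass a value in the wrong unit. The following is a minimal sketch of a helper that makes the units explicit before the call. The type pid_watch_params and the function make_pid_watch_params are hypothetical names introduced here for illustration and are not part of DCGM; the input validation rules are likewise an assumption, not documented DCGM behavior.

```c
#include <stddef.h>

/* Hypothetical helper type, NOT part of DCGM. It groups the three sampling
 * arguments of dcgmWatchPidFields() with their units spelled out. */
typedef struct {
    long long updateFreqUsec;  /* updateFreq: microseconds between updates */
    double    maxKeepAgeSec;   /* maxKeepAge: seconds of history to keep   */
    int       maxKeepSamples;  /* maxKeepSamples: 0 means no limit         */
} pid_watch_params;

/* Build the parameters from a sampling interval in milliseconds and a
 * retention window in minutes.
 * Returns 0 on success, -1 if an argument is rejected. The rejection rules
 * are an assumption made for this sketch. */
static int make_pid_watch_params(long long intervalMs, double keepMinutes,
                                 int maxSamples, pid_watch_params *out)
{
    if (out == NULL || intervalMs <= 0 || keepMinutes < 0.0 || maxSamples < 0)
        return -1;

    out->updateFreqUsec = intervalMs * 1000LL;  /* ms  -> usec    */
    out->maxKeepAgeSec  = keepMinutes * 60.0;   /* min -> seconds */
    out->maxKeepSamples = maxSamples;
    return 0;
}

/* With a valid DCGM handle, the values would then be passed as:
 *
 *   dcgmWatchPidFields(handle, DCGM_GROUP_ALL_GPUS,
 *                      p.updateFreqUsec, p.maxKeepAgeSec, p.maxKeepSamples);
 */
```

For example, make_pid_watch_params(100, 60.0, 0, &p) yields an updateFreq of 100000 usec, a maxKeepAge of 3600 seconds, and no limit on the number of samples.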
-
dcgmReturn_t dcgmGetPidInfo(dcgmHandle_t pDcgmHandle, dcgmGpuGrp_t groupId, dcgmPidInfo_t *pidInfo)
Get information about all GPUs while the provided pid was running.
For this request to work, you must first call dcgmWatchPidFields() so that DCGM is watching the field IDs used to populate pidInfo.
- Parameters
pDcgmHandle – IN: DCGM Handle
groupId – IN: Group ID representing collection of one or more GPUs. Look at dcgmGroupCreate for details on creating the group. Alternatively, pass in the group id as DCGM_GROUP_ALL_GPUS to perform operation on all the GPUs.
pidInfo – IN/OUT: Structure in which information about the PID is returned. pidInfo->pid must be set to the PID in question. pidInfo->version should be set to dcgmPidInfo_version.
- Returns
DCGM_ST_OK if the call was successful
DCGM_ST_NO_DATA if the PID did not run on any GPU