2.2.1. Discovery

[System]

The following APIs discover the GPUs on a node and retrieve their attributes.

Functions

dcgmReturn_t dcgmGetAllDevices ( dcgmHandle_t pDcgmHandle, unsigned int  gpuIdList[DCGM_MAX_NUM_DEVICES], int* count )
dcgmReturn_t dcgmGetAllSupportedDevices ( dcgmHandle_t pDcgmHandle, unsigned int  gpuIdList[DCGM_MAX_NUM_DEVICES], int* count )
dcgmReturn_t dcgmGetDeviceAttributes ( dcgmHandle_t pDcgmHandle, unsigned int  gpuId, dcgmDeviceAttributes_t* pDcgmAttr )

Functions

dcgmReturn_t dcgmGetAllDevices ( dcgmHandle_t pDcgmHandle, unsigned int  gpuIdList[DCGM_MAX_NUM_DEVICES], int* count )
Parameters
pDcgmHandle
IN : DCGM Handle
gpuIdList
OUT : Array to be filled with the GPU Ids of all GPUs present on the system.
count
OUT : Number of GPUs returned in gpuIdList.
Returns

Description

This method returns the identifiers of all the devices on the system. Each identifier is the DCGM GPU Id of one GPU on the system and is immutable during the lifespan of the engine. The list should be queried again if the engine is restarted.

The list returned by this function includes the gpuIds of GPUs that are not supported by DCGM. To get only the gpuIds of GPUs that are supported by DCGM, use dcgmGetAllSupportedDevices().
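The calling pattern described above can be sketched as follows. This is a minimal illustration, not a reference implementation. The helper print_all_gpu_ids and the build flag USE_REAL_DCGM are hypothetical names introduced here. When the flag is not defined, the sketch uses stand-in declarations (including a fake dcgmGetAllDevices that reports two GPUs) so that it compiles without DCGM installed; those stand-ins are not the real definitions. The header names dcgm_agent.h and dcgm_structs.h are assumed to be the ones shipped with your DCGM installation.

```c
#include <stdio.h>

#ifdef USE_REAL_DCGM /* hypothetical flag: define it when DCGM headers are installed */
#include <dcgm_agent.h>
#include <dcgm_structs.h>
#else
/* Stand-ins so the sketch compiles without DCGM. NOT the real definitions. */
typedef int dcgmReturn_t;
typedef unsigned long dcgmHandle_t;
#define DCGM_ST_OK 0
#define DCGM_MAX_NUM_DEVICES 32
static dcgmReturn_t dcgmGetAllDevices(dcgmHandle_t h,
                                      unsigned int gpuIdList[DCGM_MAX_NUM_DEVICES],
                                      int *count)
{
    (void)h;
    gpuIdList[0] = 0; /* pretend the node has two GPUs */
    gpuIdList[1] = 1;
    *count = 2;
    return DCGM_ST_OK;
}
#endif

/* Prints every GPU Id known to the engine, supported or not.
 * Returns the number of GPUs, or -1 if the query failed. */
int print_all_gpu_ids(dcgmHandle_t handle)
{
    unsigned int gpuIds[DCGM_MAX_NUM_DEVICES]; /* caller-owned output array */
    int count = 0;

    dcgmReturn_t rc = dcgmGetAllDevices(handle, gpuIds, &count);
    if (rc != DCGM_ST_OK) {
        fprintf(stderr, "dcgmGetAllDevices failed: %d\n", (int)rc);
        return -1;
    }
    for (int i = 0; i < count; i++)
        printf("GPU Id %u\n", gpuIds[i]);
    return count;
}
```

The handle is assumed to have been obtained earlier, when the engine was started or connected to. Because the Ids are only stable for the lifespan of the engine, a caller that caches them would call this again after an engine restart.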

dcgmReturn_t dcgmGetAllSupportedDevices ( dcgmHandle_t pDcgmHandle, unsigned int  gpuIdList[DCGM_MAX_NUM_DEVICES], int* count )
Parameters
pDcgmHandle
IN : DCGM Handle
gpuIdList
OUT : Array to be filled with the GPU Ids of the DCGM-supported GPUs present on the system.
count
OUT : Number of GPUs returned in gpuIdList.
Returns

Description

This method returns the identifiers of all the DCGM-supported devices on the system. Each identifier is the DCGM GPU Id of one GPU on the system and is immutable during the lifespan of the engine. The list should be queried again if the engine is restarted.

The list returned by this function includes ONLY the gpuIds of GPUs that are supported by DCGM. To get the gpuIds of all GPUs in the system, use dcgmGetAllDevices().
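Since the two enumeration calls differ only in whether unsupported GPUs are included, the unsupported ones can be derived by comparing the two lists. The sketch below is plain C and makes no DCGM calls. It assumes the input arrays were already filled by dcgmGetAllDevices() and dcgmGetAllSupportedDevices(). The helper names contains_gpu_id and unsupported_gpu_ids are hypothetical and are not part of the DCGM API.

```c
/* Returns 1 if id occurs in list[0..count-1], otherwise 0. */
int contains_gpu_id(const unsigned int *list, int count, unsigned int id)
{
    for (int i = 0; i < count; i++) {
        if (list[i] == id)
            return 1;
    }
    return 0;
}

/* Copies into out every Id that is in all[] but not in supported[].
 * out must have room for allCount entries.
 * Returns the number of Ids written to out. */
int unsupported_gpu_ids(const unsigned int *all, int allCount,
                        const unsigned int *supported, int supportedCount,
                        unsigned int *out)
{
    int n = 0;
    for (int i = 0; i < allCount; i++) {
        if (!contains_gpu_id(supported, supportedCount, all[i]))
            out[n++] = all[i];
    }
    return n;
}
```

A linear scan is used because both lists are bounded by DCGM_MAX_NUM_DEVICES, so nothing more elaborate is needed.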

dcgmReturn_t dcgmGetDeviceAttributes ( dcgmHandle_t pDcgmHandle, unsigned int  gpuId, dcgmDeviceAttributes_t* pDcgmAttr )
Parameters
pDcgmHandle
IN : DCGM Handle
gpuId
IN : GPU Id for which the attributes should be fetched.
pDcgmAttr
IN/OUT : Device attributes corresponding to gpuId. pDcgmAttr->version should be set to dcgmDeviceAttributes_version before this call.
Returns

Description

Gets the device attributes for the given gpuId. If the operation is not successful for any of the requested fields, that field is populated with one of the DCGM_BLANK_VALUES defined in dcgm_structs.h.
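The two points above, setting the version before the call and checking fields for blank values afterwards, can be sketched as follows. This is an illustration under stated assumptions, not the definitive usage. The helper get_slowdown_temp and the build flag USE_REAL_DCGM are hypothetical. The field thermalSettings.slowdownTemp and the macro DCGM_INT32_IS_BLANK are assumed to match what dcgm_structs.h declares, so verify them against your installed header. Without the flag, the sketch falls back to stand-in declarations (not the real definitions) in which GPU 0 reports 90 and every other GPU reports a blank value.

```c
#include <string.h>

#ifdef USE_REAL_DCGM /* hypothetical flag: define it when DCGM headers are installed */
#include <dcgm_agent.h>
#include <dcgm_structs.h>
#else
/* Stand-ins so the sketch compiles without DCGM. NOT the real definitions. */
typedef int dcgmReturn_t;
typedef unsigned long dcgmHandle_t;
#define DCGM_ST_OK 0
#define DCGM_INT32_BLANK 0x7ffffff0
#define DCGM_INT32_IS_BLANK(v) ((v) >= DCGM_INT32_BLANK)
#define dcgmDeviceAttributes_version 1u
typedef struct {
    unsigned int version;
    struct { unsigned int slowdownTemp; } thermalSettings;
} dcgmDeviceAttributes_t;
static dcgmReturn_t dcgmGetDeviceAttributes(dcgmHandle_t h, unsigned int gpuId,
                                            dcgmDeviceAttributes_t *pDcgmAttr)
{
    (void)h;
    /* Pretend GPU 0 could be read and any other GPU could not. */
    pDcgmAttr->thermalSettings.slowdownTemp = (gpuId == 0) ? 90u : DCGM_INT32_BLANK;
    return DCGM_ST_OK;
}
#endif

/* Fetches the slowdown temperature of one GPU.
 * Returns 1 and stores the value in *tempC if the field was populated,
 * 0 if the field came back blank, and -1 if the call itself failed. */
int get_slowdown_temp(dcgmHandle_t handle, unsigned int gpuId, unsigned int *tempC)
{
    dcgmDeviceAttributes_t attrs;
    memset(&attrs, 0, sizeof(attrs));
    attrs.version = dcgmDeviceAttributes_version; /* must be set before the call */

    if (dcgmGetDeviceAttributes(handle, gpuId, &attrs) != DCGM_ST_OK)
        return -1;
    if (DCGM_INT32_IS_BLANK(attrs.thermalSettings.slowdownTemp))
        return 0; /* field could not be read; do not treat it as a temperature */

    *tempC = attrs.thermalSettings.slowdownTemp;
    return 1;
}
```

Separating "call failed" from "field is blank" reflects the description above: the call can succeed overall while individual fields are still unavailable.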