Field Constants

group dcgmFieldConstants

Constants that represent contents of individual field values.

Defines

DCGM_CUDA_COMPUTE_CAPABILITY_MAJOR(x)

DCGM_FI_DEV_CUDA_COMPUTE_CAPABILITY is 16 bits of major version followed by 16 bits of the minor version.

These macros separate the two.
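As a minimal sketch of how the packed value can be unpacked, the helpers below assume the major version occupies bits 16..31 and the minor version bits 0..15. The names cc_major, cc_minor and cc_format are hypothetical and are not the DCGM macros; check dcgm_fields.h to see whether the macros shift the result or only mask it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers (not the DCGM macros). Assumed layout: major version
   in bits 16..31, minor version in bits 0..15 of the field value. */
static unsigned cc_major(int64_t value)
{
    return (unsigned)(((uint64_t)value >> 16) & 0xFFFFu);
}

static unsigned cc_minor(int64_t value)
{
    return (unsigned)((uint64_t)value & 0xFFFFu);
}

/* Formats the value as "major.minor" and returns snprintf's result. */
static int cc_format(int64_t value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%u.%u", cc_major(value), cc_minor(value));
}
```

Under the assumed layout, a raw value of 0x00080006 would decode as compute capability 8.6.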

DCGM_CUDA_COMPUTE_CAPABILITY_MINOR(x)

DCGM_CLOCKS_THROTTLE_REASON_GPU_IDLE

DCGM_FI_DEV_CLOCK_THROTTLE_REASONS is a bitmap of the reasons why the clock is throttled.

These macros are masks for the relevant throttle reasons and map 1:1 to the NVML reasons documented in nvml.h. The notes from that header are copied below.

Nothing is running on the GPU and the clocks are dropping to Idle state.

Note

This limiter may be removed in a later release

DCGM_CLOCKS_THROTTLE_REASON_CLOCKS_SETTING

GPU clocks are limited by the current setting of applications clocks.

DCGM_CLOCKS_THROTTLE_REASON_SW_POWER_CAP

SW Power Scaling algorithm is reducing the clocks below requested clocks.

DCGM_CLOCKS_THROTTLE_REASON_HW_SLOWDOWN

HW Slowdown (reducing the core clocks by a factor of 2 or more) is engaged.

This is an indicator of one of the following:

  • Temperature is too high

  • External Power Brake Assertion is triggered (e.g. by the system power supply)

  • Power draw is too high and Fast Trigger protection is reducing the clocks

It may also be reported during a PState or clock change. This behavior may be removed in a later release.
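Because HW_SLOWDOWN covers several causes, a reader of the bitmap may want to narrow it down using the more specific HW_THERMAL and HW_POWER_BRAKE bits documented in this group. The sketch below is one hedged way to do that. The bit values are mirrored locally on the assumption that they match nvml.h, the struct and function names are hypothetical, and it is not confirmed here that the driver always sets a specific bit alongside HW_SLOWDOWN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrored so the sketch compiles on its own; assumed to match the values in
   dcgm_fields.h / nvml.h. Drop these when including the DCGM header. */
#define DCGM_CLOCKS_THROTTLE_REASON_HW_SLOWDOWN    0x0000000000000008LL
#define DCGM_CLOCKS_THROTTLE_REASON_HW_THERMAL     0x0000000000000040LL
#define DCGM_CLOCKS_THROTTLE_REASON_HW_POWER_BRAKE 0x0000000000000080LL

/* Hypothetical result type for this example. */
typedef struct {
    bool engaged;     /* HW_SLOWDOWN bit is set */
    bool thermal;     /* HW_THERMAL is set as well */
    bool power_brake; /* HW_POWER_BRAKE is set as well */
} hw_slowdown_info;

/* Splits a throttle-reasons bitmap into the HW slowdown flag and its two
   more specific companions. The specific flags are only reported when
   HW_SLOWDOWN itself is set. */
static void classify_hw_slowdown(int64_t reasons, hw_slowdown_info *out)
{
    assert(out != NULL);
    out->engaged = (reasons & DCGM_CLOCKS_THROTTLE_REASON_HW_SLOWDOWN) != 0;
    out->thermal = out->engaged &&
                   (reasons & DCGM_CLOCKS_THROTTLE_REASON_HW_THERMAL) != 0;
    out->power_brake = out->engaged &&
                       (reasons & DCGM_CLOCKS_THROTTLE_REASON_HW_POWER_BRAKE) != 0;
}
```

If engaged is true but neither specific flag is set, the remaining causes listed above (Fast Trigger protection, or a PState or clock change) are the candidates.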

DCGM_CLOCKS_THROTTLE_REASON_SYNC_BOOST

Sync Boost.

This GPU has been added to a sync boost group with nvidia-smi or DCGM in order to maximize performance per watt. All GPUs in the sync boost group will boost to the minimum possible clocks across the entire group. Look at the throttle reasons of the other GPUs in the system to see why they are holding this one at lower clocks.
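The advice to inspect the other GPUs can be sketched as a small helper. It assumes the caller has already collected the latest DCGM_FI_DEV_CLOCK_THROTTLE_REASONS value for each GPU in the group into an array; how those values are fetched is out of scope here. The function name is hypothetical and the SYNC_BOOST bit value is mirrored locally on the assumption that it matches nvml.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrored so the sketch compiles on its own; assumed to match the value in
   dcgm_fields.h / nvml.h. Drop it when including the DCGM header. */
#define DCGM_CLOCKS_THROTTLE_REASON_SYNC_BOOST 0x0000000000000010LL

/* Returns the OR of the throttle reasons reported by every GPU in the group
   except `self`, with the SYNC_BOOST bit cleared. What remains are the
   reasons that may be holding the group, and therefore `self`, at lower
   clocks. */
static int64_t sync_boost_peer_reasons(const int64_t *reasons, size_t count,
                                       size_t self)
{
    assert(reasons != NULL && self < count);
    int64_t peers = 0;
    for (size_t i = 0; i < count; ++i) {
        if (i != self)
            peers |= reasons[i];
    }
    return peers & ~DCGM_CLOCKS_THROTTLE_REASON_SYNC_BOOST;
}
```

A caller would typically invoke this only when the SYNC_BOOST bit is set in reasons[self].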

DCGM_CLOCKS_THROTTLE_REASON_SW_THERMAL

SW Thermal Slowdown.

This is an indicator of one or more of the following:

  • Current GPU temperature above the GPU Max Operating Temperature

  • Current memory temperature above the Memory Max Operating Temperature

DCGM_CLOCKS_THROTTLE_REASON_HW_THERMAL

HW Thermal Slowdown (reducing the core clocks by a factor of 2 or more) is engaged.

This is an indicator of the temperature being too high.

DCGM_CLOCKS_THROTTLE_REASON_HW_POWER_BRAKE

HW Power Brake Slowdown (reducing the core clocks by a factor of 2 or more) is engaged.

This is an indicator of External Power Brake Assertion being triggered (e.g. by the system power supply).

DCGM_CLOCKS_THROTTLE_REASON_DISPLAY_CLOCKS

GPU clocks are limited by the current setting of display clocks.
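Since DCGM_FI_DEV_CLOCK_THROTTLE_REASONS is a bitmap, several of the masks above can be set at once. The following sketch turns a raw value into a comma-separated list of reason names. The mask values are mirrored locally on the assumption that they match nvml.h (verify them against dcgm_fields.h), and the table, type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrored so the sketch compiles on its own; assumed to match the values in
   dcgm_fields.h / nvml.h. Drop these when including the DCGM header. */
#define DCGM_CLOCKS_THROTTLE_REASON_GPU_IDLE       0x0000000000000001LL
#define DCGM_CLOCKS_THROTTLE_REASON_CLOCKS_SETTING 0x0000000000000002LL
#define DCGM_CLOCKS_THROTTLE_REASON_SW_POWER_CAP   0x0000000000000004LL
#define DCGM_CLOCKS_THROTTLE_REASON_HW_SLOWDOWN    0x0000000000000008LL
#define DCGM_CLOCKS_THROTTLE_REASON_SYNC_BOOST     0x0000000000000010LL
#define DCGM_CLOCKS_THROTTLE_REASON_SW_THERMAL     0x0000000000000020LL
#define DCGM_CLOCKS_THROTTLE_REASON_HW_THERMAL     0x0000000000000040LL
#define DCGM_CLOCKS_THROTTLE_REASON_HW_POWER_BRAKE 0x0000000000000080LL
#define DCGM_CLOCKS_THROTTLE_REASON_DISPLAY_CLOCKS 0x0000000000000100LL

typedef struct {
    int64_t mask;
    const char *name;
} throttle_reason_name;

static const throttle_reason_name k_reasons[] = {
    {DCGM_CLOCKS_THROTTLE_REASON_GPU_IDLE, "GPU_IDLE"},
    {DCGM_CLOCKS_THROTTLE_REASON_CLOCKS_SETTING, "CLOCKS_SETTING"},
    {DCGM_CLOCKS_THROTTLE_REASON_SW_POWER_CAP, "SW_POWER_CAP"},
    {DCGM_CLOCKS_THROTTLE_REASON_HW_SLOWDOWN, "HW_SLOWDOWN"},
    {DCGM_CLOCKS_THROTTLE_REASON_SYNC_BOOST, "SYNC_BOOST"},
    {DCGM_CLOCKS_THROTTLE_REASON_SW_THERMAL, "SW_THERMAL"},
    {DCGM_CLOCKS_THROTTLE_REASON_HW_THERMAL, "HW_THERMAL"},
    {DCGM_CLOCKS_THROTTLE_REASON_HW_POWER_BRAKE, "HW_POWER_BRAKE"},
    {DCGM_CLOCKS_THROTTLE_REASON_DISPLAY_CLOCKS, "DISPLAY_CLOCKS"},
};

/* Writes the names of all set reasons into buf, separated by commas, and
   returns how many names were written in full. Appending stops at the first
   name that does not fit. */
static int describe_throttle_reasons(int64_t reasons, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    size_t used = 0;
    int found = 0;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof k_reasons / sizeof k_reasons[0]; ++i) {
        if (!(reasons & k_reasons[i].mask))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         found ? "," : "", k_reasons[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* the name was truncated; stop appending */
        used += (size_t)n;
        ++found;
    }
    return found;
}
```

A table keeps the mapping in one place, so a mask added in a later release needs only one new row.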

Enums

enum dcgmGpuVirtualizationMode_t

GPU virtualization mode types for DCGM_FI_DEV_VIRTUAL_MODE.

Values:

enumerator DCGM_GPU_VIRTUALIZATION_MODE_NONE

Represents a bare-metal GPU.

enumerator DCGM_GPU_VIRTUALIZATION_MODE_PASSTHROUGH

Device is associated with GPU-Passthrough.

enumerator DCGM_GPU_VIRTUALIZATION_MODE_VGPU

Device is associated with vGPU inside a virtual machine.

enumerator DCGM_GPU_VIRTUALIZATION_MODE_HOST_VGPU

Device is associated with VGX hypervisor in vGPU mode.

enumerator DCGM_GPU_VIRTUALIZATION_MODE_HOST_VSGA

Device is associated with VGX hypervisor in vSGA mode.
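A value read from DCGM_FI_DEV_VIRTUAL_MODE can be mapped to a readable label with a switch over the enumerators. The sketch below mirrors the enum locally so that it compiles on its own; the numeric values (0 through 4, in declaration order) are an assumption to verify against dcgm_fields.h, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local mirror of dcgmGpuVirtualizationMode_t. The numeric values are
   assumed; drop this when including the DCGM header. */
typedef enum {
    DCGM_GPU_VIRTUALIZATION_MODE_NONE        = 0,
    DCGM_GPU_VIRTUALIZATION_MODE_PASSTHROUGH = 1,
    DCGM_GPU_VIRTUALIZATION_MODE_VGPU        = 2,
    DCGM_GPU_VIRTUALIZATION_MODE_HOST_VGPU   = 3,
    DCGM_GPU_VIRTUALIZATION_MODE_HOST_VSGA   = 4
} dcgmGpuVirtualizationMode_t;

/* Maps a raw field value to a short label; unrecognized values yield
   "unknown" rather than failing. */
static const char *virtualization_mode_name(int64_t value)
{
    switch (value) {
    case DCGM_GPU_VIRTUALIZATION_MODE_NONE:        return "bare metal";
    case DCGM_GPU_VIRTUALIZATION_MODE_PASSTHROUGH: return "passthrough";
    case DCGM_GPU_VIRTUALIZATION_MODE_VGPU:        return "vGPU guest";
    case DCGM_GPU_VIRTUALIZATION_MODE_HOST_VGPU:   return "host, vGPU mode";
    case DCGM_GPU_VIRTUALIZATION_MODE_HOST_VSGA:   return "host, vSGA mode";
    default:                                       return "unknown";
    }
}

/* Prints the label together with the raw value for logging. */
static void print_virtualization_mode(FILE *out, int64_t value)
{
    assert(out != NULL);
    fprintf(out, "virtualization mode: %s (%lld)\n",
            virtualization_mode_name(value), (long long)value);
}
```

Returning "unknown" for unrecognized values keeps the helper usable if a later release adds modes that this mirror does not list.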