Managing Power Capping

The GPU has three sources of power limits:

  • VBIOS: defines the maximum possible TGP (Total Graphics Power) value.

  • The nvidia-smi tool: lets users set the power limit of the GPU in-band from the host.

  • SMBPBI: sets the power limit of the GPU via an out-of-band channel.

The GPU Performance Monitoring Unit (PMU) applies the most conservative (lowest) of these limits to cap power consumption on a system.
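
As a rough illustration of this selection rule, the sketch below picks the lowest of several limits. It assumes that "most conservative" means the numerically lowest limit. The function name effective_power_limit and the wattage values are hypothetical and are not taken from any NVIDIA tool. The actual selection is performed by the PMU, not by host software.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return the lowest of the limits passed as arguments.
# Assumption: the most conservative policy is the smallest value.
effective_power_limit() {
    local min="$1" limit
    for limit in "$@"; do
        if (( limit < min )); then
            min="$limit"
        fi
    done
    echo "$min"
}

# Illustrative values only, in watts: VBIOS maximum, nvidia-smi limit, SMBPBI limit.
effective_power_limit 700 650 600   # prints 600
```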

Managing N+N Configuration (IPMI)

By default, a system will boot with three power supplies. To achieve safe operation of an N+N configuration, you need to enable the power capping feature to limit the power consumed by the system.

  1. Get the system power limit.

    ipmitool raw 0x3c 0x80 0x05
    

    The response is two hexadecimal bytes with the low byte first, for example c8 32. To convert this value to decimal:

    0xc8 + (0x32 << 8) = 0x32c8 = 13000
    

    If the feature is disabled, a value greater than 12,000 is returned.

  2. Enable PSU redundancy support.

    To enable the PSU redundancy feature, set the power budget limit outside the actual system budget. The following example sets the power budget to 12 kW.

    ipmitool raw 0x3c 0x81 0x05 0xE0 0x2E  # Set 12 kW (0x2EE0 = 12000)
    

    Note

    This feature is disabled by default starting with version 24.07.1.
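
The byte conversions used in steps 1 and 2 can be sketched as two small shell helpers. This is a minimal sketch, assuming the value is in watts and that both commands use two bytes with the low byte first, as the examples in the steps suggest. The function names decode_power_limit and encode_power_limit are hypothetical and are not part of ipmitool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode two response bytes (low byte first, hex digits
# without a 0x prefix, as printed by ipmitool) into a decimal value.
decode_power_limit() {
    local lo="$1" hi="$2"
    echo $(( 0x$lo + (0x$hi << 8) ))
}

# Hypothetical helper: encode a decimal value as two hex bytes, low byte
# first, in the form expected as arguments to the set command.
encode_power_limit() {
    local watts="$1"
    printf '0x%02X 0x%02X\n' $(( watts & 0xFF )) $(( (watts >> 8) & 0xFF ))
}

decode_power_limit c8 32    # prints 13000
encode_power_limit 12000    # prints 0xE0 0x2E
```

On a live system, the output of the query in step 1 could be passed unquoted so that the shell splits it into two arguments, for example decode_power_limit $(ipmitool raw 0x3c 0x80 0x05).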

Managing Power Capping Using Redfish API

To cap the maximum power consumption of a system through the Redfish API, refer to Querying GPU Power Limit and Power Capping.