Intelligent Platform Management Interface

1.0

The NVIDIA® BlueField®-2 DPU provides management interfaces to the BMC and the BlueField device.

The BMC is based on the Intelligent Platform Management Interface (IPMI) standard. It supports both dedicated out-of-band (OOB) interfaces and a serial port for accessing the BMC CLI.

The BMC is connected to an external host server via LAN. IPMItool commands may be issued from the external server to retrieve information from the BMC as follows:

ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN <ipmitool_arguments>
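
Because every remote command repeats the same connection options, a small wrapper function can keep them in one place. The following is a minimal sketch, not part of the BMC software: the function name bmc_ipmi and the variables BMC_HOST, BMC_USER, BMC_PASS, and IPMITOOL are hypothetical names chosen for illustration.

```shell
# Hypothetical wrapper around the lanplus invocation shown above.
# BMC_HOST, BMC_USER, BMC_PASS and IPMITOOL are illustrative names only.
bmc_ipmi() {
    # Refuse to run without a BMC address.
    if [ -z "${BMC_HOST:-}" ]; then
        echo "bmc_ipmi: BMC_HOST is not set" >&2
        return 1
    fi
    # IPMITOOL lets the binary be swapped out (for example, for a dry run).
    "${IPMITOOL:-ipmitool}" -C 17 -I lanplus \
        -H "$BMC_HOST" -U "${BMC_USER:-ADMIN}" -P "${BMC_PASS:-ADMIN}" "$@"
}
```

With BMC_HOST set to the BMC address, bmc_ipmi sel list then expands to the full command line. Setting IPMITOOL=echo prints the arguments instead of contacting the BMC.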

The sections below describe the supported IPMItool commands in more detail.

FRU Reading

To retrieve FRU info, run:

ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN fru print <fru-id>

The <fru-id> argument is optional. The FRU ID of the BMC FRU EEPROM can be found in the output of the fru print command.

To dump the binary FRU data into a file, run:

ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN fru read <fru-id> <filename>

Warning

The <filename> parameter must be the absolute path to the file.
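
A script that dumps FRU data can check the path before calling IPMItool. The sketch below is illustrative only: fru_dump, BMC_HOST, BMC_USER, BMC_PASS, and IPMITOOL are hypothetical names, and the check simply tests for a leading slash.

```shell
# Hypothetical helper: dump FRU <id> into <file>, refusing relative paths.
fru_dump() {
    fru_id=$1
    out_file=$2
    case $out_file in
        /*) ;;  # absolute path, as the warning above requires
        *)  echo "fru_dump: '$out_file' is not an absolute path" >&2
            return 2 ;;
    esac
    # IPMITOOL is an illustrative override used for dry runs.
    "${IPMITOOL:-ipmitool}" -C 17 -I lanplus -H "$BMC_HOST" \
        -U "${BMC_USER:-ADMIN}" -P "${BMC_PASS:-ADMIN}" \
        fru read "$fru_id" "$out_file"
}
```

For example, fru_dump 0 /tmp/fru0.bin runs the read, while fru_dump 0 fru0.bin returns status 2 without contacting the BMC.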


System Event Log

The system event log (SEL) is a non-volatile repository for system events and certain system configuration information. Each SEL entry has a unique "record ID" field, which is used to retrieve log entries from the SEL. Record IDs are not required to be sequential or consecutive, so applications should not assume that SEL record IDs follow any particular numeric ordering.

Event logs record chassis events in the BMC software and can be read using IPMI commands.

If the SEL is full and a new event is raised, the oldest record is removed and the new one is placed at the end of the SEL.

The SEL can be accessed from the server through IPMI LAN access, even after a BlueField failure.
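
To illustrate treating record IDs as opaque keys, the sketch below extracts the ID column from the text printed by the sel list command without sorting or converting it. It assumes the pipe-separated layout that IPMItool normally prints, with the record ID in the first column. The function name sel_record_ids is hypothetical.

```shell
# Hypothetical filter: read "sel list" output on stdin and print the
# record IDs in log order, as plain strings.
# Assumption: fields are separated by "|" and the ID is the first field.
sel_record_ids() {
    awk -F'|' 'NF > 1 {
        id = $1
        gsub(/[ \t]/, "", id)   # trim the padding around the ID
        if (id != "") print id  # no sorting, no numeric conversion
    }'
}
```

The function is meant to sit at the end of a pipeline that starts with the sel list command. The IDs come out in the order the log presents them.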

The following table lists the commands used to view and manage event logs:

| Command | Description |
|---|---|
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel | Displays information about SEL |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel list | Displays list of events |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel elist | Displays extended info list of events |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel save <filename> | Saves SEL events to a file |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel clear | Clears SEL |
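
Since a full SEL drops its oldest record when a new event arrives, one possible routine is to save the log and clear it only if the save succeeded. This is a sketch under assumptions: ipmi_lan, sel_archive_and_clear, BMC_HOST, BMC_USER, BMC_PASS, and IPMITOOL are hypothetical names, and the archive path is chosen by the caller.

```shell
# Hypothetical helper that adds the lanplus options used on this page.
ipmi_lan() {
    "${IPMITOOL:-ipmitool}" -C 17 -I lanplus -H "$BMC_HOST" \
        -U "${BMC_USER:-ADMIN}" -P "${BMC_PASS:-ADMIN}" "$@"
}

# Save the SEL to a file, then clear it only if the save succeeded.
sel_archive_and_clear() {
    archive=$1
    ipmi_lan sel save "$archive" && ipmi_lan sel clear
}
```

If the save step fails, the function returns its non-zero status and the log is left untouched.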


Sensor Data Record (SDR) Repository

Supported SDR Commands

The BMC software supports reading chassis sensor information using IPMItool.

The following table lists commands which allow reading SDR data:

| Command | Description |
|---|---|
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sdr list | Displays sensor data repository entry readings and their status |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sdr elist | Displays extended sensor information |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sensor list | Displays sensors and thresholds in a wide table format |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sdr get <name> | Displays information for sensor data records specified by sensor ID |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sdr type <type> | Displays all records from the SDR repository of a specific type |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sensor get <sensor_name> | Displays information for sensors specified by name |
| ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sensor reading <name>…<name> | Displays readings for sensors specified by name |
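
A script can turn a single reading into a pass/fail check. The sketch below assumes that the sensor reading command prints one line of the form "name | value", which is the usual IPMItool layout but is not shown on this page. The function names sensor_value and over_limit, and any limit passed to them, are hypothetical.

```shell
# Hypothetical filter: read one "name | value" line on stdin, print the value.
sensor_value() {
    awk -F'|' 'NF == 2 {
        v = $2
        gsub(/[ \t]/, "", v)   # trim padding around the value
        print v
        exit
    }'
}

# Succeeds (status 0) when the numeric value is above the limit.
over_limit() {
    awk -v v="$1" -v lim="$2" 'BEGIN { exit !(v + 0 > lim + 0) }'
}
```

The output of the sensor reading command can be piped into sensor_value, and the result passed to over_limit together with a limit chosen by the operator.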


SDR Entry List

The SDR repository contains information about the type and number of sensors. The following table lists the available SDR information:

| Managed Entity | List | Tools/Commands |
|---|---|---|
| NIC thermal sensors | bluefield_temp – thermal sensors on BlueField SoC | ipmitool sensor list; ipmitool -I ipmb sensor list; ipmitool -I ipmb sensor get bluefield_temp |
| NIC voltage sensors | ADC voltage sensors: 0P6V_VIT, 1P2V, 1P2V_BMC_DDR, 1P2V_DDR, 1P8V, 1p15V_BMC, 2P5V_DDR, 3P3V, 3P3V_AUX, 5V, 12V_CORE_IN, 12V_PCIe, VCORE | ipmitool sensor list |
| SFP temperature sensors | DPU port temperature sensors: p0_temp, p1_temp | ipmitool -I ipmb sensor list; ipmitool -I ipmb sensor get p0_temp |
| SFP link status | QSFP port link status: p0_link, p1_link | ipmitool -I ipmb sensor get p0_link |
| Arm DDR thermal sensors | ddr0_0_temp | ipmitool -I ipmb sensor get ddr0_0_temp |

Rebooting BlueField with BMC

The BMC software can reset the BlueField.

To reset the main CPU, run:

ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN chassis power reset


The BMC can retrieve information on BlueField's sensors and FRUs via IPMI over the IPMB protocol. IPMItool commands can be issued from the BMC using the following format:

ipmitool -I ipmb <ipmitool_arguments>

List of IPMI Supported Sensors

| Sensor | Sensor ID | Description |
|---|---|---|
| bluefield_temp | 0 | Support NIC monitoring of BlueField’s temperature |
| ddr0_0_temp | 1 | Support monitoring of DDR0 temp (on memory controller 0) |
| ddr0_1_temp | 2 | Support monitoring of DDR1 temp (on memory controller 0) |
| ddr1_0_temp | 3 | Support monitoring of DDR0 temp (on memory controller 1) |
| ddr1_1_temp | 4 | Support monitoring of DDR1 temp (on memory controller 1) |
| p0_temp | 5 | Port 0 temperature |
| p1_temp | 6 | Port 1 temperature |
| p0_link | 7 | Port0 link status |
| p1_link | 8 | Port1 link status |
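
The sensors in this table can be polled from the BMC in a single pass. The loop below is a sketch only: it uses the ipmitool -I ipmb format described above with the sensor reading command, and the names BF_SENSORS, poll_bf_sensors, and IPMITOOL are hypothetical.

```shell
# Sensor names taken from the table above.
BF_SENSORS="bluefield_temp ddr0_0_temp ddr0_1_temp ddr1_0_temp ddr1_1_temp p0_temp p1_temp p0_link p1_link"

# Hypothetical helper: read every listed sensor once over IPMB.
poll_bf_sensors() {
    for name in $BF_SENSORS; do
        # A failed read is reported on stderr and the loop continues.
        "${IPMITOOL:-ipmitool}" -I ipmb sensor reading "$name" ||
            echo "warning: could not read $name" >&2
    done
}
```

Running poll_bf_sensors on the BMC issues one request per sensor. Setting IPMITOOL=echo prints the commands instead of sending them.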


List of IPMI Supported FRUs

| FRU | ID | Description |
|---|---|---|
| update_timer | 0 | set_emu_param.service is responsible for collecting data on sensors and FRUs every 3 seconds. This regular update is required for sensors but not for FRUs whose content is less susceptible to change. update_timer is used to sample the FRUs every hour instead. Users may need this timer in the case where they are issuing several raw IPMItool FRU read commands. This helps in assessing how much time users have to retrieve large FRU data before the next FRU update. update_timer is a hexadecimal number. |
| fw_info | 1 | ConnectX firmware information, Arm firmware version, and MLNX_OFED version. The fw_info is in ASCII format. |
| nic_pci_dev_info | 2 | NIC vendor ID, device ID, subsystem vendor ID, and subsystem device ID. The nic_pci_dev_info is in ASCII format. |
| cpuinfo | 3 | CPU information reported in lscpu and /proc/cpuinfo. The cpuinfo is in ASCII format. |
| ddr0_0_spd | 4 | FRU for SPD MC0 DIMM 0 (MC = memory controller). The ddr0_0_spd is in binary format. |
| ddr0_1_spd | 5 | FRU for SPD MC0 DIMM1. The ddr0_1_spd is in binary format. |
| ddr1_0_spd | 6 | FRU for SPD MC1 DIMM0. The ddr1_0_spd is in binary format. |
| ddr1_1_spd | 7 | FRU for SPD MC1 DIMM1. The ddr1_1_spd is in binary format. |
| emmc_info | 8 | eMMC size, list of its partitions, and partitions usage (in ASCII format). eMMC CID, CSD, and extended CSD registers (in binary format). The ASCII data is separated from the binary data with ‘StartBinary’ marker. |
| qsfp0_eeprom | 9 | FRU for QSFP 0 EEPROM page 0 content (256 bytes in binary format) |
| qsfp1_eeprom | 10 | FRU for QSFP 1 EEPROM page 0 content (256 bytes in binary format) |
| ip_addresses | 11 | This FRU is empty at start time. It can be used to write the BMC port 0 and port 1 IP addresses to the BlueField. They follow this format: "BMC: XXX.XXX.XXX.XXX P0: XXX.XXX.XXX.XXX P1: XXX.XXX.XXX.XXX". The size of the written file should be 61 bytes exactly. |
| dimms_ce_ue | 12 | FRU reporting the number of correctable and uncorrectable errors in the DIMMs. This FRU is updated once every 3 seconds. |
| eth0 | 13 | Network interface 0 information. Updated once every minute. |
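
Because update_timer is a hexadecimal number, a script that uses it usually needs a decimal value. The sketch below only converts a hex string to decimal. This page does not state the exact text layout of the FRU content or the unit of the value, so the function (hypothetically named hex_to_dec) accepts an optional 0x prefix and surrounding whitespace and leaves interpretation to the caller.

```shell
# Hypothetical converter: print the decimal value of a hexadecimal string.
# Assumption: the input is plain hex text, optionally prefixed with 0x.
hex_to_dec() {
    # Drop whitespace and line endings that a file dump may carry.
    digits=$(printf '%s' "$1" | tr -d ' \t\r\n')
    digits=${digits#0x}
    digits=${digits#0X}
    case $digits in
        ''|*[!0-9a-fA-F]*)
            echo "hex_to_dec: not a hexadecimal number: $1" >&2
            return 1 ;;
    esac
    printf '%d\n' "0x$digits"
}
```

For example, hex_to_dec e10 prints 3600. The input would typically come from a file written by the fru read command for FRU ID 0.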


Supported IPMI Commands

All of the following commands are prepended with ipmitool on the command line.

| Command | IPMItool Command | Relevant IPMI 2.0 Rev1.1 Spec Section |
|---|---|---|
| Get Device ID | mc info | 20.1 |
| Broadcast "Get Device ID" | Part of "mc info" | 20.9 |
| Get BMC Global Enables | mc getenables | 22.2 |
| Get Device SDR Info | sdr info | 35.2 |
| Get Device SDR | "sdr get", "sdr list" or "sdr elist" | 35.3 |
| Get Sensor Hysteresis | sdr get <sensor-id> | 35.7 |
| Set Sensor Threshold | sensor thresh <sensor-id> <threshold> <setting> | 35.8 |
| Get Sensor Threshold | sdr get <sensor-id> | 35.9 |
| Get Sensor Event Enable | sdr get <sensor-id> | 35.11 |
| Get Sensor Reading | sensor reading <sensor-id> | 35.14 |
| Get Sensor Type | sdr type <type> | 35.16 |
| Read FRU Data | fru read <fru-number> <file-to-write-to> – provides FRU data | 34.2 |
| Get SDR Repository Info | sdr info | 33.9 |
| Get SEL Info | "sel" or "sel info" | 40.2 |
| Get SEL Allocation Info | "sel" or "sel info" | 40.3 |
| Get SEL Entry | "sel list" or "sel elist" | 40.5 |
| Delete SEL Entry | sel delete <id> | 40.8 |
| Clear SEL | sel clear | 40.9 |


The BMC has two IPMB modes: it can act as a requester or as a responder.

  • Requester Mode
    When used as a requester, the BMC sends IPMB request messages to the BlueField via SMBus 0. The BlueField then processes the request and sends a message back to the BMC.

  • Responder Mode
    When used as a responder, the BMC receives IPMB request messages from the BlueField on SMBus 0. It then processes the message and sends a response back to the BlueField.

Both modes are enabled automatically at boot time.

For more information on how to use IPMI, please refer to the IPMI 2.0 standard.

The BMC supports the IPMI boot option selection commands. UEFI on BlueField-2 can query the boot options through an IPMI command over IPMB. Currently, UEFI on BlueField-2 supports only changing the boot device selector flag, with two options: PXE boot, or the default boot device as selected in the boot menu on BlueField-2.

  • Get current setting – ipmitool chassis bootparam get 5

  • Force PXE boot – ipmitool chassis bootparam set bootflag force_pxe

  • Default boot device – ipmitool chassis bootparam set bootflag none

The DPU boot override setting from the BMC persists until it is set to none or the BFB image is updated again.
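
The two supported boot options can be wrapped in one helper that sets the flag and then reads back the current setting. This is a minimal sketch using only the three commands listed above. The function name set_bf_boot, its pxe and default keywords, and the IPMITOOL variable are hypothetical.

```shell
# Hypothetical helper: select PXE or the default boot device, then show
# the resulting boot parameter.
set_bf_boot() {
    case $1 in
        pxe)     flag=force_pxe ;;
        default) flag=none ;;
        *)  echo "usage: set_bf_boot pxe|default" >&2
            return 2 ;;
    esac
    # Read the setting back only if the set command succeeded.
    "${IPMITOOL:-ipmitool}" chassis bootparam set bootflag "$flag" &&
        "${IPMITOOL:-ipmitool}" chassis bootparam get 5
}
```

Calling set_bf_boot pxe forces PXE boot, and set_bf_boot default returns to the boot device chosen in the boot menu.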

The BMC supports reset control of BlueField-2 through the GPIOs connected to the BMC.

Issue the following command from the BMC to get the power status of the DPU:

ipmitool chassis power status
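
A script can branch on the power state by matching the text this command prints. The sketch assumes the usual IPMItool wording, "Chassis Power is on" or "Chassis Power is off", which is not shown on this page. The function name bf_is_powered_on and the IPMITOOL variable are hypothetical.

```shell
# Hypothetical check: status 0 if the DPU is on, 1 if off, 2 otherwise.
# Assumption: the output contains "is on" or "is off".
bf_is_powered_on() {
    status=$("${IPMITOOL:-ipmitool}" chassis power status) || return 2
    case $status in
        *"is on"*)  return 0 ;;
        *"is off"*) return 1 ;;
        *)  echo "bf_is_powered_on: unexpected output: $status" >&2
            return 2 ;;
    esac
}
```

Status 2 is kept separate so that a failed query is not mistaken for a powered-off DPU.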

OEM command 0xA1 is defined for various reset controls of BlueField-2 from BMC under the OEM NetFn group 0x30.

NVIDIA OEM command to reset BlueField DPU:

Request:

  • NetFn: 0x32

  • Command: 0xA1

  • Req_data1 (Reset Option): 0x00

Response (completion code):

  • Success: 0x00

  • Failure: <IPMI error code>

Reset options:

  • 0x00 – hard reset of BlueField-2 DPU

    Warning

    HARD_RST is allowed only when the host asserts the PERST signal

  • 0x01 – hard reset of BlueField-2 Arm cores

  • 0x02 – soft reset of BlueField-2 Arm cores

  • 0x03 – reset of TOR eSwitch
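
One way to send this OEM request is through the raw command of IPMItool, which takes a NetFn, a command byte, and data bytes. The sketch below is an assumption rather than a documented procedure. This page gives the NetFn as 0x30 in one place and 0x32 in another, so the hypothetical function bf_oem_reset takes it as an argument instead of choosing one. The IPMITOOL variable is also illustrative.

```shell
# Hypothetical helper: send OEM command 0xA1 with a reset option.
# Usage: bf_oem_reset <netfn> <reset-option>
bf_oem_reset() {
    netfn=$1
    option=$2
    # Accept only the reset options listed above.
    case $option in
        0x00|0x01|0x02|0x03) ;;
        *)  echo "bf_oem_reset: unknown reset option '$option'" >&2
            return 2 ;;
    esac
    "${IPMITOOL:-ipmitool}" raw "$netfn" 0xA1 "$option"
}
```

Setting IPMITOOL=echo prints the request bytes without sending them, which is a safe way to review the command before resetting anything.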

© Copyright 2023, NVIDIA. Last updated on Jan 16, 2024.