Intelligent Platform Management Interface
The NVIDIA® BlueField®-2 DPU provides management interfaces to the BMC and the BlueField device.
The BMC, which is based on the Intelligent Platform Management Interface (IPMI) standard, supports both dedicated out-of-band (OOB) interfaces and a serial port for accessing the BMC CLI.
The BMC is connected to an external host server via LAN. IPMItool commands may be issued from the external server to retrieve information from the BMC as follows:
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN <ipmitool_arguments>
The sections below provide more details about the IPMItool commands which are supported.
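As a rough illustration of how the command format above can be scripted, the following Python sketch assembles the same argument list. The helper name lanplus_argv and the address 192.0.2.10 are placeholders introduced here and are not part of the BMC software; the ADMIN credentials simply mirror the example above.

```python
import shlex


def lanplus_argv(bmc_ip, subcommand, user="ADMIN", password="ADMIN"):
    """Build the ipmitool argument list for a lanplus session to the BMC.

    `subcommand` is the trailing part of the command line, e.g. "fru print".
    """
    return [
        "ipmitool", "-C", "17", "-I", "lanplus",
        "-H", bmc_ip, "-U", user, "-P", password,
        *shlex.split(subcommand),
    ]


# 192.0.2.10 is a documentation-only placeholder address.
argv = lanplus_argv("192.0.2.10", "fru print")
print(shlex.join(argv))
```

On a host that has ipmitool installed and LAN access to the BMC, the resulting list could be passed to subprocess.run(argv, capture_output=True, text=True).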
FRU Reading
To retrieve FRU info, run:
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN fru print <fru-id>
The <fru-id> argument is optional. The FRU ID of the BMC FRU EEPROM can be found by running the fru print command without an ID.
It is possible to dump the binary FRU data into a file. Run:
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN fru read <fru-id> <filename>
The parameter <filename> is the absolute path to the file.
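Because <filename> must be an absolute path, a script that wraps the dump command may want to check the path before calling ipmitool. The sketch below is one way to do that; the function name fru_read_args and the example path are hypothetical, and POSIX path rules are assumed.

```python
import posixpath


def fru_read_args(fru_id, filename):
    """Return the 'fru read' subcommand arguments.

    Raises ValueError if `filename` is not an absolute POSIX path,
    since the dump command expects an absolute path.
    """
    if not posixpath.isabs(filename):
        raise ValueError(f"filename must be an absolute path: {filename!r}")
    return ["fru", "read", str(fru_id), filename]


# Hypothetical FRU ID and output path, for illustration only.
print(fru_read_args(0, "/tmp/fru0.bin"))
```

These arguments would be appended to the lanplus options shown in the command above.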
System Event Log
The system event log (SEL) is a non-volatile repository for system events and certain system configuration information. SEL entries have a unique "record ID" field. This field is used for retrieving log entries from the SEL. Record IDs are not required to be sequential or consecutive. Applications should not assume that the SEL record ID follows any particular numeric ordering.
Event logs are chassis events recorded by the BMC software. They can be read using IPMI commands.
If the SEL is full and a new event is raised, the oldest record is removed and the new one is placed at the end of the SEL.
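The replacement behavior described above matches a fixed-size queue that drops its oldest entry when a new one arrives. The following is a minimal model of that behavior only, not of the BMC implementation; the capacity of 4 and the event labels are made up for the example.

```python
from collections import deque

# Hypothetical capacity; the real SEL size is not stated in this section.
SEL_CAPACITY = 4

# A deque with maxlen discards the oldest item when a new one is appended
# to a full queue, which mirrors the SEL behavior described above.
sel = deque(maxlen=SEL_CAPACITY)
for n in range(1, 7):
    sel.append(f"event {n}")

print(list(sel))  # → ['event 3', 'event 4', 'event 5', 'event 6']
```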
The SEL can be accessed from the server through IPMI LAN access, even after a BlueField failure.
The following table lists the commands used to view event logs:
Command |
Description |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel |
Displays information about SEL |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel list |
Displays list of events |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel elist |
Displays extended info list of events |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel save <filename> |
Saves SEL events to a file |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sel clear |
Clears SEL |
Sensor Data Record (SDR) Repository
Supported SDR Commands
BMC software supports reading chassis sensor information using the IPMItool.
The following table lists commands which allow reading SDR data:
Command |
Description |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sdr list |
Displays sensor data repository entry readings and their status |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sdr elist |
Displays extended sensor information |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sensor list |
Displays sensors and thresholds in a wide table format |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sdr get <name> |
Displays information for sensor data records specified by sensor ID |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sdr type <type> |
Displays all records from the SDR repository of a specific type |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sensor get <sensor_name> |
Displays information for sensors specified by name |
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN sensor reading <name>…<name> |
Displays readings for sensors specified by name |
SDR Entry List
SDR contains information about the type and number of sensors. The following is a list of the available SDR information:
Managed Entity |
List |
Tools/Commands |
NIC thermal sensors |
|
ipmitool sensor list; ipmitool -I ipmb sensor list; ipmitool -I ipmb sensor get bluefield_temp |
NIC voltage sensors |
ADC voltage sensors
|
ipmitool sensor list |
SFP temperature sensors |
DPU port temperature sensors:
|
ipmitool -I ipmb sensor list; ipmitool -I ipmb sensor get p0_temp |
SFP link status |
QSFP port link status:
|
ipmitool -I ipmb sensor get p0_link |
Arm DDR thermal sensors |
ddr0_0_temp |
ipmitool -I ipmb sensor get ddr0_0_temp |
Rebooting BlueField with BMC
BMC software enables resetting the BlueField.
To reset the main CPU, run:
ipmitool -C 17 -I lanplus -H <bmc_ip_addr> -U ADMIN -P ADMIN chassis power reset
The BMC can retrieve information on BlueField's sensors and FRUs via IPMI over IPMB protocol. IPMItool commands can be issued from the BMC using the following format:
ipmitool -I ipmb <ipmitool_arguments>
List of IPMI Supported Sensors
Sensor |
Sensor ID |
Description |
bluefield_temp |
0 |
Supports NIC monitoring of the BlueField temperature |
ddr0_0_temp |
1 |
Supports monitoring of DDR0 temperature (on memory controller 0) |
ddr0_1_temp |
2 |
Supports monitoring of DDR1 temperature (on memory controller 0) |
ddr1_0_temp |
3 |
Supports monitoring of DDR0 temperature (on memory controller 1) |
ddr1_1_temp |
4 |
Supports monitoring of DDR1 temperature (on memory controller 1) |
p0_temp |
5 |
Port 0 temperature |
p1_temp |
6 |
Port 1 temperature |
p0_link |
7 |
Port 0 link status |
p1_link |
8 |
Port 1 link status |
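For scripts running on the BMC, the sensor table above can be kept as a small lookup so that only listed sensor names are queried. This is an illustrative sketch; the dictionary and function names are introduced here, while the names, IDs, and the `-I ipmb sensor get` form come from this page.

```python
# Sensor name -> sensor ID, copied from the table above.
SENSOR_IDS = {
    "bluefield_temp": 0,
    "ddr0_0_temp": 1,
    "ddr0_1_temp": 2,
    "ddr1_0_temp": 3,
    "ddr1_1_temp": 4,
    "p0_temp": 5,
    "p1_temp": 6,
    "p0_link": 7,
    "p1_link": 8,
}


def ipmb_sensor_get_argv(sensor_name):
    """Build 'ipmitool -I ipmb sensor get <name>' for a sensor in the table."""
    if sensor_name not in SENSOR_IDS:
        raise ValueError(f"unknown sensor: {sensor_name!r}")
    return ["ipmitool", "-I", "ipmb", "sensor", "get", sensor_name]


# Temperature sensors are the entries whose names end in "_temp".
temp_sensors = [name for name in SENSOR_IDS if name.endswith("_temp")]
print(temp_sensors)
print(ipmb_sensor_get_argv("p0_temp"))
```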
List of IPMI Supported FRUs
FRU |
ID |
Description |
update_timer |
0 |
set_emu_param.service is responsible for collecting data on sensors and FRUs every 3 seconds. This regular update is required for sensors but not for FRUs whose content is less susceptible to change. update_timer is used to sample the FRUs every hour instead. Users may need this timer in the case where they are issuing several raw IPMItool FRU read commands. This helps in assessing how much time users have to retrieve large FRU data before the next FRU update. update_timer is a hexadecimal number. |
fw_info |
1 |
ConnectX firmware information, Arm firmware version, and MLNX_OFED version. The fw_info is in ASCII format. |
nic_pci_dev_info |
2 |
NIC vendor ID, device ID, subsystem vendor ID, and subsystem device ID. The nic_pci_dev_info is in ASCII format. |
cpuinfo |
3 |
CPU information reported in lscpu and /proc/cpuinfo. The cpuinfo is in ASCII format. |
ddr0_0_spd |
4 |
FRU for SPD MC0 DIMM0 (MC = memory controller). The ddr0_0_spd is in binary format. |
ddr0_1_spd |
5 |
FRU for SPD MC0 DIMM1. The ddr0_1_spd is in binary format. |
ddr1_0_spd |
6 |
FRU for SPD MC1 DIMM0. The ddr1_0_spd is in binary format. |
ddr1_1_spd |
7 |
FRU for SPD MC1 DIMM1. The ddr1_1_spd is in binary format. |
emmc_info |
8 |
eMMC size, list of its partitions, and partition usage (in ASCII format). eMMC CID, CSD, and extended CSD registers (in binary format). The ASCII data is separated from the binary data by the ‘StartBinary’ marker. |
qsfp0_eeprom |
9 |
FRU for QSFP 0 EEPROM page 0 content (256 bytes in binary format) |
qsfp1_eeprom |
10 |
FRU for QSFP 1 EEPROM page 0 content (256 bytes in binary format) |
ip_addresses |
11 |
This FRU is empty at start time. It can be used to write the BMC port 0 and port 1 IP addresses to the BlueField. They follow these formats:
The size of the written file should be 61 bytes exactly. |
dimms_ce_ue |
12 |
FRU reporting the number of correctable and uncorrectable errors in the DIMMs. This FRU is updated once every 3 seconds. |
eth0 |
13 |
Network interface 0 information. Updated once every minute. |
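Two of the FRUs above have layout rules that are easy to check in a script: emmc_info mixes ASCII and binary data around the ‘StartBinary’ marker, and a file written to ip_addresses must be exactly 61 bytes. The sketch below shows one possible way to handle both. The function names and the sample bytes are invented for the example, and whether whitespace surrounds the marker is not specified here, so none is stripped.

```python
EMMC_MARKER = b"StartBinary"   # marker named in the emmc_info description
IP_ADDRESSES_SIZE = 61         # required size of the ip_addresses file


def split_emmc_info(raw):
    """Split an emmc_info dump into (ascii_text, binary_blob) at the marker."""
    text, marker, blob = raw.partition(EMMC_MARKER)
    if not marker:
        raise ValueError("StartBinary marker not found")
    return text.decode("ascii", errors="replace"), blob


def is_valid_ip_addresses_payload(payload):
    """Return True if the payload has the exact size required for ip_addresses."""
    return len(payload) == IP_ADDRESSES_SIZE


# Made-up sample bytes, only to exercise the split; not real eMMC data.
sample = b"(ascii report)\n" + EMMC_MARKER + bytes([0x01, 0x02])
text, blob = split_emmc_info(sample)
print(repr(text), blob)
print(is_valid_ip_addresses_payload(b"\x00" * 61))
```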
Supported IPMI Commands
All of the following commands are prepended with ipmitool on the command line.
Commands |
IPMItool Command |
Relevant IPMI 2.0 Rev1.1 Spec Section |
Get Device ID |
mc info |
20.1 |
Broadcast "Get Device ID" |
Part of "mc info" |
20.9 |
Get BMC Global Enables |
mc getenables |
22.2 |
Get Device SDR Info |
sdr info |
35.2 |
Get Device SDR |
"sdr get", "sdr list" or "sdr elist" |
35.3 |
Get Sensor Hysteresis |
sdr get <sensor-id> |
35.7 |
Set Sensor Threshold |
sensor thresh <sensor-id> <threshold> <setting> |
35.8 |
Get Sensor Threshold |
sdr get <sensor-id> |
35.9 |
Get Sensor Event Enable |
sdr get <sensor-id> |
35.11 |
Get Sensor Reading |
sensor reading <sensor-id> |
35.14 |
Get Sensor Type |
sdr type <type> |
35.16 |
Read FRU Data |
fru read <fru-number> <file-to-write-to> – provides FRU data |
34.2 |
Get SDR Repository Info |
sdr info |
33.9 |
Get SEL Info |
"sel" or "sel info" |
40.2 |
Get SEL Allocation Info |
"sel" or "sel info" |
40.3 |
Get SEL Entry |
"sel list" or "sel elist" |
40.5 |
Delete SEL Entry |
sel delete <id> |
40.8 |
Clear SEL |
sel clear |
40.9 |
The BMC has two IPMB modes: it can act as a requester or as a responder.
Requester Mode
When used as a requester, the BMC sends IPMB request messages to the BlueField via SMBus 0. The BlueField then processes the request and sends a message back to the BMC.
Responder Mode
When used as a responder, the BMC receives IPMB request messages from the BlueField on SMBus 0. It then processes the message and sends a response back to the BlueField.
Both modes are enabled automatically at boot time.
For more information on how to use IPMI, please refer to the IPMI 2.0 standard.
BMC supports IPMI boot option selection commands. UEFI on BlueField-2 can query for the boot options through an IPMI command over IPMB. Currently the UEFI on BlueField-2 supports only the option to change the boot device selector flag with the following supported options: PXE boot or the default boot device as selected in the boot menu on BlueField-2.
Get current setting – ipmitool chassis bootparam get 5
Force PXE boot – ipmitool chassis bootparam set bootflag force_pxe
Default boot device – ipmitool chassis bootparam set bootflag none
The DPU boot override setting from BMC is persistent until it is set to none or the BFB image is updated again.
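The two supported boot overrides can be wrapped in a small helper so that a script cannot pass an unsupported flag. This is only a sketch; the keys "pxe" and "default" and the function name are chosen here for illustration, while the bootflag values are the ones listed above.

```python
# Friendly name -> bootflag value, limited to the two options listed above.
BOOTFLAGS = {
    "pxe": "force_pxe",   # force PXE boot
    "default": "none",    # default boot device from the boot menu
}


def bootflag_argv(choice):
    """Build 'ipmitool chassis bootparam set bootflag <flag>' for a supported option."""
    try:
        flag = BOOTFLAGS[choice]
    except KeyError:
        raise ValueError(f"unsupported boot option: {choice!r}") from None
    return ["ipmitool", "chassis", "bootparam", "set", "bootflag", flag]


print(bootflag_argv("pxe"))
print(bootflag_argv("default"))
```

Since the override persists until it is set to none or the BFB image is updated, a script that forces PXE boot for a single provisioning run would typically call bootflag_argv("default") afterwards.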
BMC supports reset control of BlueField-2 through the GPIOs connected to the BMC.
Issue the following command from the BMC to get the power status of the DPU:
ipmitool chassis power status
OEM command 0xA1 is defined for various reset controls of BlueField-2 from BMC under the OEM NetFn group 0x30.
NVIDIA OEM command to reset BlueField DPU:
Request |
Response |
Reset Option |
|
Completion code:
|
|
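An OEM command such as this one is normally sent with ipmitool's raw subcommand, which takes a NetFn, a command code, and optional data bytes. The sketch below builds such an invocation under the assumption that the reset option is passed as a single data byte. The reset option values are not listed in this section, so the function takes the value as a parameter rather than naming one. The function name is hypothetical, and any interface options needed on the BMC are left out.

```python
OEM_NETFN = 0x30       # OEM NetFn group given in the text
OEM_RESET_CMD = 0xA1   # OEM reset-control command given in the text


def oem_reset_argv(reset_option):
    """Build 'ipmitool raw 0x30 0xa1 <option>'.

    Assumes the reset option is a single request data byte; take the
    actual value from the request table for the desired reset type.
    """
    if not 0 <= reset_option <= 0xFF:
        raise ValueError("reset option must fit in one byte")
    return [
        "ipmitool", "raw",
        f"0x{OEM_NETFN:02x}",
        f"0x{OEM_RESET_CMD:02x}",
        f"0x{reset_option:02x}",
    ]
```

The first byte of the raw response would be expected to carry the completion code, as in the response column above.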