The NVIDIA® BlueField® DPU provides management interfaces to the BMC and the BlueField device.
The BMC, based on the Intelligent Platform Management Interface (IPMI) standard, supports both dedicated out-of-band (OOB) interfaces and a serial port for accessing the BMC CLI.
External Host Retrieving Data From BMC Via LAN
The BMC is connected to an external host server via LAN. IPMItool commands may be issued from the external server to retrieve information from the BMC as follows:
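A minimal sketch of such a remote invocation, assuming the BMC accepts IPMI v2.0 sessions over LAN (-I lanplus). The -I, -H, -U, and -P options are standard ipmitool options. The function name bmc_ipmi and the BMC_IP, BMC_USER, and BMC_PASS variables are placeholders introduced for illustration, and your BMC may require additional options.

```shell
# Hypothetical wrapper: forwards any ipmitool subcommand to the BMC over LAN.
# BMC_IP, BMC_USER and BMC_PASS are assumed to be set by the caller.
bmc_ipmi() {
    ipmitool -I lanplus -H "$BMC_IP" -U "$BMC_USER" -P "$BMC_PASS" "$@"
}
```

With the variables set, bmc_ipmi fru print or bmc_ipmi sel list would run the corresponding subcommand against the BMC from the external host.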
The sections below describe the supported IPMItool commands in more detail.
To retrieve FRU info, run:
The FRU ID of the BMC FRU EEPROM is optional and can be found using the fru print command.
It is possible to dump the binary FRU data into a file. Run:
The parameter <filename> is the absolute path to the file.
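The two FRU operations above can be sketched with the stock ipmitool fru print and fru read subcommands. The helper names fru_show and fru_dump are hypothetical, and the sketch assumes these stock subcommands are the ones exposed by the BMC.

```shell
# Print one FRU, or all FRUs when no ID is given (the FRU ID is optional).
fru_show() {
    if [ -n "$1" ]; then
        ipmitool fru print "$1"
    else
        ipmitool fru print
    fi
}

# Dump the raw binary contents of FRU <id> into <filename>.
# Rejects relative paths, since <filename> must be an absolute path.
fru_dump() {
    case "$2" in
        /*) ipmitool fru read "$1" "$2" ;;
        *)  echo "fru_dump: filename must be an absolute path" >&2
            return 1 ;;
    esac
}
```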
System Event Log
The system event log (SEL) is a non-volatile repository for system events and certain system configuration information. SEL entries have a unique "record ID" field. This field is used for retrieving log entries from the SEL. Record IDs are not required to be sequential or consecutive. Applications should not assume that the SEL record ID follows any particular numeric ordering.
Event logs are chassis events recorded by the BMC software, which can be read using IPMI commands.
If the SEL is full and a new event is raised, the oldest record is removed and the new one is placed at the end of the SEL.
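The wrap-around policy described above can be illustrated with a toy model. This is only a sketch of the behavior, not the BMC implementation. The capacity of 3 and the record labels are made up to keep the example short.

```shell
SEL_CAPACITY=3   # hypothetical capacity, chosen only for illustration
sel=""           # space-separated record labels, oldest first

sel_add() {
    sel="${sel:+$sel }$1"      # place the new record at the end of the log
    set -- $sel                # split the log into positional parameters
    while [ "$#" -gt "$SEL_CAPACITY" ]; do
        shift                  # remove the oldest record
    done
    sel="$*"
}
```

Adding a fourth record to the full log drops the first one: after sel_add boot, sel_add link_up, sel_add temp_high, and sel_add link_down, the log holds link_up temp_high link_down.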
The SEL remains accessible from the server through IPMI LAN access, even after a BlueField failure.
The following table lists the commands used to view event logs:
|Displays information about SEL|
|Displays list of events|
|Displays extended info list of events|
|Saves SEL events to a file|
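As a sketch, the four operations in the table can be combined into a single snapshot helper. It assumes they correspond to the stock ipmitool subcommands sel info, sel list, sel elist, and sel save. The function name sel_snapshot and the output file names are hypothetical.

```shell
# Collect a SEL snapshot into directory $1:
#   sel_info.txt   - summary information about the SEL
#   sel_elist.txt  - extended list of events
#   sel_events.txt - events saved by ipmitool itself
sel_snapshot() {
    dir=$1
    mkdir -p "$dir" || return 1
    ipmitool sel info  > "$dir/sel_info.txt"  || return 1
    ipmitool sel elist > "$dir/sel_elist.txt" || return 1
    ipmitool sel save "$dir/sel_events.txt"
}
```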
The following subsections detail the messages which are added to the BMC SEL and the scenarios that trigger them.
While the DPU UEFI is booting, messages describing the status of the UEFI boot are added to the BMC SEL.
PCI resource configuration
System boot initiated
Messages are added to the SEL in case of a change in the status of the QSFP cables. The messages describe the event and status of the sensor.
List of QSFP sensors:
P0_link – the QSFP 0 cable status
P1_link – the QSFP 1 cable status
Config Error – the QSFP cable is down
Connected – the QSFP cable is up
Messages are added to the SEL if temperature sensors detect a value higher than the sensor thresholds. The messages include a description of the event, DPU FRU device description, DPU BMC device description, and the status of the sensor.
List of temperature sensors:
bluefield_temp – BlueField temperature
p0_temp – QSFP 0 cable temperature
p1_temp – QSFP 1 cable temperature
Upper Critical going high – crossing an upper critical threshold
Upper Non-critical going high – crossing an upper non-critical threshold
Lower Critical going low – crossing a lower critical threshold
Lower Non-critical going low – crossing a lower non-critical threshold
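The relationship between the four threshold events can be sketched as a simple classifier. This is a simplification: the BMC records an event when a threshold is crossed, whereas the sketch only reports which band an integer reading currently falls in. The function name and all threshold values are hypothetical.

```shell
# Usage: classify_reading <value> <lower_crit> <lower_noncrit> <upper_noncrit> <upper_crit>
# All arguments are integers; thresholds are expected in ascending order.
classify_reading() {
    value=$1 lcr=$2 lnc=$3 unc=$4 ucr=$5
    if   [ "$value" -ge "$ucr" ]; then echo "Upper Critical going high"
    elif [ "$value" -ge "$unc" ]; then echo "Upper Non-critical going high"
    elif [ "$value" -le "$lcr" ]; then echo "Lower Critical going low"
    elif [ "$value" -le "$lnc" ]; then echo "Lower Non-critical going low"
    else echo "ok"
    fi
}
```

The critical bands are checked before the non-critical ones, so a reading beyond a critical threshold is never reported as merely non-critical.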
Messages are added to the SEL if the sensor voltage crosses the sensor's thresholds. The messages include a description of the event, DPU FRU device description, DPU BMC device description, and the status of the sensor.
List of ADC sensors:
Upper Non-critical going high – crossing an upper non-critical threshold
Lower Non-critical going low – crossing a lower non-critical threshold
Sensor Data Record (SDR) Repository
Supported SDR Commands
The BMC software supports reading chassis sensor information using IPMItool.
The following table lists commands which allow reading SDR data:
|Displays sensor data repository entry readings and their status|
|Displays extended sensor information|
|Displays sensors and thresholds in a wide table format|
|Displays information for sensor data records specified by sensor ID|
|Displays all records from the SDR repository of a specific type|
|Displays information for sensors specified by name|
|Displays readings for sensors specified by name (only for numeric sensors)|
If a threshold is crossed, a message is added to the Redfish event log, SEL, and journal.
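A short sketch of reading sensors by name and by type, assuming the stock ipmitool sensor reading and sdr type subcommands. The helper names are hypothetical, and the parsing assumes that sensor reading prints one line of the form "name | value", which may differ between ipmitool versions.

```shell
# Print only the numeric reading of the sensor named $1 (numeric sensors only),
# for example bluefield_temp.
sensor_value() {
    ipmitool sensor reading "$1" | awk -F'|' '{ gsub(/ /, "", $2); print $2 }'
}

# List all SDR records of type Temperature.
temperature_sensors() {
    ipmitool sdr type Temperature
}
```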
SDR Entry List
SDR contains information about the type and number of sensors. The following is a list of the available SDR information:
|Managed Entity||ID||Sensor Name|
SFP link status
NIC thermal sensors
SFP temperature sensors
NIC voltage sensors
ADC voltage sensors:
Rebooting BlueField with BMC
BMC software enables resetting the BlueField.
To reset the main CPU, run:
BMC Retrieving Data from BlueField Via IPMB
The BMC can retrieve information on BlueField's sensors and FRUs via IPMI over the IPMB protocol. IPMItool commands can be issued from the BMC using the following format:
List of IPMI Supported Sensors
Support NIC monitoring of BlueField’s temperature
Support monitoring of DDR0 temp (on memory controller 0)
Support monitoring of DDR1 temp (on memory controller 0)
Support monitoring of DDR0 temp (on memory controller 1)
Support monitoring of DDR1 temp (on memory controller 1)
Port 0 temperature
Port 1 temperature
Port 0 link status
Port 1 link status
List of IPMI Supported FRUs
set_emu_param.service collects sensor and FRU data every 3 seconds. This frequent update is needed for sensors but not for FRUs, whose content changes less often, so update_timer is used to sample the FRUs once an hour instead. This timer matters when issuing several raw IPMItool FRU read commands in a row: it helps assess how much time is left to retrieve large FRU data before the next FRU update.
ConnectX firmware information, Arm firmware version, and MLNX_OFED version
NIC vendor ID, device ID, subsystem vendor ID, and subsystem device ID
CPU information reported in lscpu and /proc/cpuinfo
FRU for SPD MC0 DIMM 0 (MC = memory controller)
FRU for SPD MC0 DIMM1
FRU for SPD MC1 DIMM0
FRU for SPD MC1 DIMM1
eMMC size, list of its partitions, and partitions usage (in ASCII format).
|qsfp0_eeprom||9||FRU for QSFP 0 EEPROM page 0 content (256 bytes in binary format)|
|qsfp1_eeprom||10||FRU for QSFP 1 EEPROM page 0 content (256 bytes in binary format)|
This FRU is empty at start time. It can be used to write the BMC port 0 and port 1 IP addresses to the BlueField. They follow these formats:
The size of the written file should be 61 bytes exactly.
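Because the size requirement is strict, it can be worth checking the file before writing it to the FRU. The following sketch only validates the size. The function name is hypothetical, and the file content format is not checked.

```shell
# Succeeds only if $1 is a regular file of exactly 61 bytes.
has_expected_fru_size() {
    [ -f "$1" ] || return 1
    size=$(wc -c < "$1")
    [ "$((size))" -eq 61 ]
}
```

A typical use is has_expected_fru_size /path/to/file || echo "unexpected file size", run before issuing the FRU write.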
|dimms_ce_ue||12||FRU reporting the number of correctable and uncorrectable errors in the DIMMs.|
This FRU is updated once every 3 seconds.
|eth0||13||Network interface 0 information. Updated once every minute.|
Supported IPMI Commands
All of the following commands are prepended with ipmitool on the command line.
|Commands||IPMItool Command||Relevant IPMI 2.0 Rev1.1 Spec Section|
Get Device ID
Broadcast "Get Device ID"
Get BMC Global Enables
Get Device SDR Info
Get Device SDR
Get Sensor Hysteresis
Set Sensor Threshold
Get Sensor Threshold
Get Sensor Event Enable
Get Sensor Reading
Get Sensor Type
Read FRU Data
Get SDR Repository Info
Get SEL Info
Get SEL Allocation Info
Get SEL Entry
Delete SEL Entry
BlueField Retrieving Data From BMC Via IPMB
The BMC has two IPMB modes: it can act as a requester or as a responder.
- Requester Mode
When used as a requester, the BMC sends IPMB request messages to the BlueField via SMBus 0. The BlueField then processes the request and sends a message back to the BMC.
- Responder Mode
When used as a responder, the BMC receives IPMB request messages from the BlueField on SMBus 0. It then processes the message and sends a response back to the BlueField.
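Every IPMB request and response exchanged in either mode carries two checksums. As defined by the IPMB specification (not by this document), each checksum is the 2's complement of the 8-bit sum of the bytes it covers. The sketch below computes it. The function name is hypothetical, and the byte values in the usage note are generic illustrative values, not BlueField-specific addresses.

```shell
# Print the IPMB checksum of the given bytes (decimal or 0x-prefixed hex).
# The covered bytes plus the checksum always sum to 0 modulo 256.
ipmb_checksum() {
    sum=0
    for byte in "$@"; do
        sum=$(( (sum + $byte) & 0xFF ))
    done
    printf '0x%02X\n' $(( (0x100 - sum) & 0xFF ))
}
```

For a header made of responder address 0x20 and NetFn/LUN byte 0x18, ipmb_checksum 0x20 0x18 prints 0xC8.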
Both modes are enabled automatically at boot time.
For more information on how to use IPMI, please refer to the IPMI 2.0 standard.
Boot Order Config
The BMC supports the IPMI boot option selection commands. The UEFI on BlueField-2 can query the boot options through an IPMI command over IPMB. Currently, the UEFI on BlueField-2 supports only changing the boot device selector flag, with two options: PXE boot, or the default boot device as selected in the boot menu on BlueField-2.
- Get current setting –
ipmitool chassis bootparam get 5
- Force PXE boot –
ipmitool chassis bootparam set bootflag force_pxe
- Default boot device –
ipmitool chassis bootparam set bootflag none
The DPU boot override setting from the BMC is persistent until it is set to none or the BFB image is updated again.
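The three boot-option commands above can be wrapped in a small helper so that only the two supported choices are accepted. The function names and the pxe/default keywords are hypothetical. The ipmitool invocations are the ones listed above.

```shell
# Select the boot device override: "pxe" or "default".
set_boot_override() {
    case "$1" in
        pxe)     ipmitool chassis bootparam set bootflag force_pxe ;;
        default) ipmitool chassis bootparam set bootflag none ;;
        *)       echo "usage: set_boot_override pxe|default" >&2
                 return 2 ;;
    esac
}

# Show the current boot flags (boot parameter 5).
get_boot_override() {
    ipmitool chassis bootparam get 5
}
```

Since the override is persistent, calling set_boot_override default after a one-off PXE boot restores the boot device selected in the boot menu.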
The BMC supports reset control of BlueField-2 through the GPIOs connected to the BMC.
Issue the following command from the BMC to get the power status of the DPU:
To perform a reset of the DPU, use the following commands:
|Hard reset of BlueField DPU (Arm cores and NIC)|
|Hard reset of BlueField Arm cores|
Hard reset of the BlueField DPU is allowed only when the host asserts:
PERST signal on BlueField-2
All_STANDBY signal on BlueField-3
0xA1 is defined under the OEM NetFn group for additional non-standard reset controls of BlueField-2 from the BMC.
NVIDIA OEM command to reset BlueField DPU: