BMC
OpenBMC is a Linux-based system for managing and monitoring hardware, primarily in data centers and enterprises. It offers tools for power management, hardware monitoring, and firmware updates. As an open-source project, it is continuously updated by a community of developers and industry leaders.
For NVIDIA® BlueField® networking platforms (SuperNIC/DPU), OpenBMC streamlines management, boosts security, and optimizes performance. BlueField, designed to accelerate data center infrastructure and offload workloads from the CPU, relies on OpenBMC for its advanced management needs.
The integration of OpenBMC with BlueField offers key benefits:
Enhanced security – OpenBMC protects BlueField with features like secure boot, firmware verification, and role-based access controls
Comprehensive hardware management – OpenBMC enables detailed monitoring and management of hardware, including temperature and power, to optimize performance and prevent failures
Simplified firmware updates – OpenBMC streamlines firmware updates via the Redfish REST interface, making it scalable across systems
Scalability – OpenBMC's modular design supports large data centers and can be customized for specific deployment needs
Community support – OpenBMC’s open-source community ensures continuous improvements and adaptation to new technologies
This guide provides troubleshooting tips for OpenBMC in BlueField, offering system administrators, developers, and IT managers the tools to debug and resolve issues effectively.
Refer to the Table of Common Commands page of the BMC documentation.
The system event log (SEL) and event log in OpenBMC provide robust mechanisms for monitoring, diagnosing, and troubleshooting hardware and system issues.
SEL
Functionality – the SEL captures and records significant system events related to hardware and firmware. This includes events such as hardware failures, temperature thresholds, power anomalies, and other critical system changes.
Access – the SEL can be accessed via IPMI/Redfish commands, allowing administrators to query and retrieve logs for analysis
Management – administrators can clear, save, and manage SEL entries to maintain system health and ensure critical events are recorded accurately
Event log
Functionality – the event log provides a comprehensive record of both hardware and software events, offering detailed insights into system operations and potential issues. This includes firmware updates, configuration changes, security alerts, and more.
Access – the event log is accessible via the Redfish interface, enabling easy retrieval and management of event data.
Management – users can filter, sort, and analyze events to identify patterns, diagnose problems, and improve system reliability. The event log supports exporting logs for offline analysis and archiving.
Key Features
Scalability – both SEL and event log are designed to handle a high volume of events, ensuring no critical information is lost
Integration – these logs integrate seamlessly with existing management tools, providing a unified view of system health and events
Usability – user-friendly interfaces and command-line tools make it easy to access and manage logs, ensuring administrators can quickly respond to issues
Overall, the SEL and event log in OpenBMC are essential tools for maintaining system integrity, improving reliability, and ensuring swift resolution of any issues that arise.
Event Log Redfish Commands
Displaying Event Log Information
curl -k -u root:'<password>' -H 'Content-Type: application/json' -X GET https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/EventLog/
Example output:
{
  "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog",
  "@odata.type": "#LogService.v1_1_0.LogService",
  "Actions": {
    "#LogService.ClearLog": {
      "target": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Actions/LogService.ClearLog"
    }
  },
  "DateTime": "2023-09-27T14:28:50+00:00",
  "DateTimeLocalOffset": "+00:00",
  "Description": "System Event Log Service",
  "Entries": {
    "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries"
  },
  "Id": "EventLog",
  "Name": "Event Log Service",
  "Oem": {
    "Nvidia": {
      "@odata.type": "#NvidiaLogService.v1_0_0.NvidiaLogService",
      "LatestEntryID": "4",
      "LatestEntryTimeStamp": "2023-09-27T14:19:30+00:00"
    }
  },
  "OverWritePolicy": "WrapsWhenFull"
}
Displaying List of Events
curl -k -u root:'<password>' -H 'Content-Type: application/json' -X GET https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries
Example output:
{
  "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries",
  "@odata.type": "#LogEntryCollection.LogEntryCollection",
  "Description": "Collection of System Event Log Entries",
  "Members": [
    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/1",
      "@odata.type": "#LogEntry.v1_9_0.LogEntry",
      "AdditionalDataURI": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/1/attachment",
      "Created": "2023-09-27T14:18:39+00:00",
      "EntryType": "Event",
      "Id": "1",
      "Message": "12V_ATX sensor crossed a warning low threshold going low. Reading=6.048000 Threshold=10.400000.",
      "MessageArgs": [
        "12V_ATX",
        "6.048000",
        "10.400000"
      ],
      "MessageId": "OpenBMC.0.1.SensorThresholdWarningLowGoingLow",
      "Name": "System Event Log Entry",
      "Resolution": "",
      "Resolved": false,
      "Severity": "OK"
    }
    …
  ],
  "Members@odata.count": 1,
  "Name": "System Event Log Entries"
}
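As a minimal sketch of how a client might post-process this response, the following Python snippet summarizes the Members array of an entries collection. The sample payload is trimmed from the example output above, and summarize_entries is a hypothetical helper name, not part of any NVIDIA or OpenBMC tool.

```python
import json

# Trimmed copy of the example response above (one member, fewer fields).
SAMPLE = """
{
  "Members": [
    {
      "Id": "1",
      "Created": "2023-09-27T14:18:39+00:00",
      "MessageId": "OpenBMC.0.1.SensorThresholdWarningLowGoingLow",
      "Resolved": false,
      "Severity": "OK"
    }
  ],
  "Members@odata.count": 1
}
"""


def summarize_entries(collection, unresolved_only=False):
    """Return (Id, Created, Severity, MessageId) tuples for each log entry."""
    rows = []
    for entry in collection.get("Members", []):
        # Optionally skip entries that are already marked as resolved.
        if unresolved_only and entry.get("Resolved", False):
            continue
        rows.append((entry.get("Id"), entry.get("Created"),
                     entry.get("Severity"), entry.get("MessageId")))
    return rows


if __name__ == "__main__":
    for row in summarize_entries(json.loads(SAMPLE), unresolved_only=True):
        print(" | ".join(str(field) for field in row))
```

The same helper should work on the SEL entries collection, since both responses share the Members layout shown in the examples.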
SEL Redfish Commands
Displaying SEL Information
curl -k -u root:'<password>' -H 'Content-Type: application/json' -X GET https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/SEL/
Example output:
{
  "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/SEL",
  "@odata.type": "#LogService.v1_1_0.LogService",
  "Actions": {
    "#LogService.ClearLog": {
      "target": "/redfish/v1/Systems/Bluefield/LogServices/SEL/Actions/LogService.ClearLog"
    }
  },
  "DateTime": "2024-07-18T10:54:52+00:00",
  "DateTimeLocalOffset": "+00:00",
  "Description": "IPMI SEL Service",
  "Entries": {
    "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/SEL/Entries"
  },
  "Id": "SEL",
  "Name": "SEL Log Service",
  "OverWritePolicy": "WrapsWhenFull"
}
Displaying List of Events
curl -k -u root:'<password>' -H 'Content-Type: application/json' -X GET https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/SEL/Entries
Example output:
{
  "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/SEL/Entries",
  "@odata.type": "#LogEntryCollection.LogEntryCollection",
  "Description": "Collection of System Event Log Entries",
  "Members": [
    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/SEL/Entries/1",
      "@odata.type": "#LogEntry.v1_13_0.LogEntry",
      "Created": "2024-07-16T15:34:32+00:00",
      "EntryType": "SEL",
      "Id": "1",
      "Message": "12V_ATX sensor crossed a warning low threshold going low. Reading=6.048000 Threshold=10.400000.",
      "MessageArgs": [
        "12V_ATX",
        "6.048000",
        "10.400000"
      ],
      "MessageId": "OpenBMC.0.1.SensorThresholdWarningLowGoingLow",
      "Name": "System Event Log Entry",
      "Resolution": "Check the sensor or subsystem for errors.",
      "Resolved": false,
      "Severity": "OK"
    },
    …
  ],
  "Members@odata.count": 22,
  "Name": "System Event Log Entries"
}
BlueField BMC OOB Network Configuration
The out-of-band (OOB) management network is shared between the BlueField BMC and BlueField via a single RJ-45 port, using an internal 3-port switch. The BlueField BMC controls this switch through a dedicated I2C interface.

3-port Switch
The 3-port switch operates as an L2 switch and has all ports enabled by default, granting both devices access to the OOB management network. Its non-volatile state retains configurations across power cycles.
To restore default settings, the BMC must perform a factory reset via the I2C interface.
Functionality | Command | Notes |
3-port switch reset | ipmitool raw 0x32 0xA1 0x3 | Executing this command resets the 3-port switch while keeping the configuration intact |
Disable BlueField OOB port | ipmitool raw 0x32 0x98 0x1 | This operation is executed on the 3-port switch by closing port 2, effectively eliminating the ability to send or receive any traffic to the BlueField OOB port |
3-port switch factory reset | | Executing this command resets all configurations on the 3-port switch to factory default |
To enhance network state debugging, the BlueField BMC now includes an internal tool. This tool enables capturing the following information:
Switch global state
State of each port
MIB counters for each port
Description | Command |
Dump 3-port switch state on BlueField-3 | |
Dump 3-port switch state on BlueField-2 | |
Command output example:
Machine: Bluefield-3
BMC --> SWITCH I2C Line ID = 10
======================================
Global Chip ID 0 Register................ 0x00
Global Chip ID 1 Register................ 0x98
Global Chip ID 2 Register................ 0x93
Global Chip ID 3 Register................ 0x60
Global Chip ID 4 Register................ 0x0c
Switch Operation Register................ 0x01
Switch MAC Address 0 Register............ 0x00
Switch MAC Address 1 Register............ 0x10
Switch MAC Address 2 Register............ 0xa1
Switch MAC Address 3 Register............ 0xff
Switch MAC Address 4 Register............ 0xff
Switch MAC Address 5 Register............ 0xff
Switch Maximum Transmit Unit Register (H) 0x07
Switch Maximum Transmit Unit Register (L) 0xd0
Switch MAC Control 0 Register............ 0x0e
Switch MAC Control 1 Register............ 0xd0
Switch MAC Control 4 Register............ 0x00
Switch MAC Control 5 Register............ 0x10
KSZ9893 port 1 - RJ45
-------------------------
XMII Port Control0 Register........ 0x00
XMII Port Control1 Register........ 0x10
Port PME_WoL Event................. 0x03
Port PME_WoL Enable................ 0x00
Port Interrupt Status Register..... 0x00
Port Interrupt Mask Register....... 0x00
Port Operation Control 0........... 0x00
Port Status Register............... 0x14
PHY Basic Control Register......... 0x11 0x40
PHY Basic Status Register.......... 0x79 0x6d
PHY Auto-Neg Advertisement......... 0x0d 0xe1
PHY Auto-Neg Link Partner Ability.. 0xc1 0xe1
PHY Auto-Neg Expansion Status...... 0x00 0x6f
PHY Auto-Neg Next Page............. 0x20 0x01
PHY Auto-Neg Link Partner Next Page 0x60 0x01
PHY 1000Base-T Control Register.... 0x06 0x00
PHY 1000Base-T Status Register..... 0x78 0x00
PHY Extended Status Register....... 0x20 0x00
PHY Digital PMA/PCS Status......... 0x0f 0x7e
Port RXER Count Register........... 0x00 0x00
Port Interrupt Control/Status...... 0x00 0x29
PHY Auto MDI/MDI-X Register........ 0x24 0x00
PHY Control Register............... 0x03 0x4c
======================================
KSZ9893 port 2 - BF OOB
-------------------------
XMII Port Control0 Register........ 0xf8
XMII Port Control1 Register........ 0x00
Port PME_WoL Event................. 0x03
Port PME_WoL Enable................ 0x00
Port Interrupt Status Register..... 0x00
Port Interrupt Mask Register....... 0x00
Port Operation Control 0........... 0x00
Port Status Register............... 0x14
PHY Basic Control Register......... 0x11 0x40
PHY Basic Status Register.......... 0x79 0x69
PHY Auto-Neg Advertisement......... 0x0d 0xe1
PHY Auto-Neg Link Partner Ability.. 0xc1 0x41
PHY Auto-Neg Expansion Status...... 0x00 0x6f
PHY Auto-Neg Next Page............. 0x20 0x01
PHY Auto-Neg Link Partner Next Page 0x60 0x01
PHY 1000Base-T Control Register.... 0x06 0x00
PHY 1000Base-T Status Register..... 0x78 0x00
PHY Extended Status Register....... 0x20 0x00
PHY Digital PMA/PCS Status......... 0xbc 0x7e
Port RXER Count Register........... 0x00 0x00
Port Interrupt Control/Status...... 0x00 0x2f
PHY Auto MDI/MDI-X Register........ 0x24 0x00
PHY Control Register............... 0x03 0x4c
======================================
KSZ9893 port 3 - BMC Eth
-------------------------
XMII Port Control0 Register........ 0x50
XMII Port Control1 Register........ 0x4b
Port PME_WoL Event................. 0x02
Port PME_WoL Enable................ 0x00
Port Interrupt Status Register..... 0x00
Port Interrupt Mask Register....... 0x00
Port Operation Control 0........... 0x00
Port Status Register............... 0x14
PHY Basic Control Register......... 0x14 0x00
PHY Basic Status Register.......... 0x00 0x00
PHY Auto-Neg Advertisement......... 0x14 0x00
PHY Auto-Neg Link Partner Ability.. 0x00 0x00
PHY Auto-Neg Expansion Status...... 0x14 0x00
PHY Auto-Neg Next Page............. 0x00 0x00
PHY Auto-Neg Link Partner Next Page 0x14 0x00
PHY 1000Base-T Control Register.... 0x00 0x00
PHY 1000Base-T Status Register..... 0x14 0x00
PHY Extended Status Register....... 0x00 0x00
PHY Digital PMA/PCS Status......... 0x00 0x00
Port RXER Count Register........... 0x00 0x00
Port Interrupt Control/Status...... 0x00 0x00
PHY Auto MDI/MDI-X Register........ 0x14 0x00
PHY Control Register............... 0x00 0x00
======================================
KSZ9893 port 1 - RJ45
-------------------------
RxHiPriorityByte.... 0x00 0x00 0x00 0x00
RxUndersizePkt...... 0x00 0x00 0x00 0x00
RxFragments......... 0x00 0x00 0x00 0x00
RxOversize.......... 0x00 0x00 0x00 0x00
RxJabbers........... 0x00 0x00 0x00 0x00
RxSymbolError....... 0x00 0x00 0x00 0x00
RxCRCerror.......... 0x00 0x00 0x00 0x00
RxAlignmentError.... 0x00 0x00 0x00 0x00
RxControl8808Pkts... 0x00 0x00 0x00 0x00
RxPausePkts......... 0x00 0x00 0x00 0x00
RxBroadcast......... 0x00 0x58 0xe3 0x17
RxMulticast......... 0x03 0xbd 0x55 0xa6
RxUnicast........... 0x00 0x0b 0x5d 0xfe
TxHiPriorityByte.... 0x00 0x00 0x00 0x00
TxLateCollision..... 0x00 0x00 0x00 0x00
TxPausePkts......... 0x00 0x00 0x00 0x00
TxBroadcastPkts..... 0x00 0x00 0x00 0x4a
TxMulticastPkts..... 0x00 0x01 0xf3 0x03
TxUnicastPkts....... 0x00 0x06 0x52 0x36
TxDeferred.......... 0x00 0x00 0x00 0x00
TxTotalCollision.... 0x00 0x00 0x00 0x00
TxExcessiveCollision 0x00 0x00 0x00 0x00
TxSingleCollision... 0x00 0x00 0x00 0x00
TxMultipleCollision. 0x00 0x00 0x00 0x00
RxByteCnt........... 0xef 0xa5 0x78 0x31
TxByteCnt........... 0x02 0x67 0xcc 0xa2
RxDropPackets....... 0x00 0x1f 0xd6 0x3d
TxDropPackets....... 0x00 0x00 0x00 0x00
========================================
KSZ9893 port 2 - BF OOB
-------------------------
RxHiPriorityByte.... 0x00 0x00 0x00 0x00
RxUndersizePkt...... 0x00 0x00 0x00 0x00
RxFragments......... 0x00 0x00 0x00 0x00
RxOversize.......... 0x00 0x00 0x00 0x00
RxJabbers........... 0x00 0x00 0x00 0x00
RxSymbolError....... 0x00 0x00 0x00 0x00
RxCRCerror.......... 0x00 0x00 0x00 0x00
RxAlignmentError.... 0x00 0x00 0x00 0x00
RxControl8808Pkts... 0x00 0x00 0x00 0x00
RxPausePkts......... 0x00 0x00 0x12 0x1d
RxBroadcast......... 0x00 0x00 0x00 0x61
RxMulticast......... 0x00 0x00 0x05 0xfb
RxUnicast........... 0x00 0x00 0x31 0x9d
TxHiPriorityByte.... 0x00 0x00 0x00 0x00
TxLateCollision..... 0x00 0x00 0x00 0x00
TxPausePkts......... 0x00 0x00 0x00 0x00
TxBroadcastPkts..... 0x00 0x56 0x20 0x64
TxMulticastPkts..... 0x03 0xa2 0x2f 0xde
TxUnicastPkts....... 0x00 0x0f 0x66 0xc5
TxDeferred.......... 0x00 0x00 0x00 0x00
TxTotalCollision.... 0x00 0x00 0x00 0x00
TxExcessiveCollision 0x00 0x00 0x00 0x00
TxSingleCollision... 0x00 0x00 0x00 0x00
TxMultipleCollision. 0x00 0x00 0x00 0x00
RxByteCnt........... 0x00 0x16 0x81 0xcc
TxByteCnt........... 0xe0 0xe8 0x80 0x34
RxDropPackets....... 0x00 0x00 0x00 0x00
TxDropPackets....... 0x00 0x00 0x1f 0x24
========================================
KSZ9893 port 3 - BMC Eth
-------------------------
RxHiPriorityByte.... 0x00 0x00 0x00 0x00
RxUndersizePkt...... 0x00 0x00 0x00 0x00
RxFragments......... 0x00 0x00 0x00 0x00
RxOversize.......... 0x00 0x00 0x00 0x00
RxJabbers........... 0x00 0x00 0x00 0x00
RxSymbolError....... 0x00 0x00 0x00 0x00
RxCRCerror.......... 0x00 0x00 0x00 0x00
RxAlignmentError.... 0x00 0x00 0x00 0x00
RxControl8808Pkts... 0x00 0x00 0x00 0x00
RxPausePkts......... 0x00 0x00 0x00 0x00
RxBroadcast......... 0x00 0x00 0x00 0xbf
RxMulticast......... 0x00 0x01 0xed 0x31
RxUnicast........... 0x00 0x06 0x20 0xa6
TxHiPriorityByte.... 0x00 0x00 0x00 0x00
TxLateCollision..... 0x00 0x00 0x00 0x00
TxPausePkts......... 0x00 0x00 0x00 0x00
TxBroadcastPkts..... 0x00 0x56 0x21 0x95
TxMulticastPkts..... 0x03 0xa0 0x5b 0x11
TxUnicastPkts....... 0x00 0x0b 0x71 0x17
TxDeferred.......... 0x00 0x00 0x00 0x00
TxTotalCollision.... 0x00 0x00 0x00 0x00
TxExcessiveCollision 0x00 0x00 0x00 0x00
TxSingleCollision... 0x00 0x00 0x00 0x00
TxMultipleCollision. 0x00 0x00 0x00 0x00
RxByteCnt........... 0x02 0x56 0x8c 0xe4
TxByteCnt........... 0xe1 0x4b 0x9e 0xbe
RxDropPackets....... 0x00 0x00 0x00 0x00
TxDropPackets....... 0x00 0x00 0x0b 0x42
========================================
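The MIB counters in the dump are printed as four separate bytes, which makes them hard to compare at a glance. The following Python sketch converts them to integers and reports only the non-zero ones per port. It assumes the first printed byte is the most significant one (big-endian), which is an inference from the sample values and not something stated by the tool; parse_counter_line and nonzero_counters are hypothetical helper names.

```python
import re

# Matches counter lines such as "RxBroadcast......... 0x00 0x58 0xe3 0x17".
_COUNTER_RE = re.compile(r"^(\S+?)\.*\s+((?:0x[0-9a-fA-F]{2}\s*){4})$")


def parse_counter_line(line):
    """Return (name, value) for a 4-byte counter line, or None otherwise."""
    m = _COUNTER_RE.match(line.strip())
    if m is None:
        return None
    raw = bytes(int(tok, 16) for tok in m.group(2).split())
    # Assumption: the first byte printed is the most significant one.
    return m.group(1), int.from_bytes(raw, "big")


def nonzero_counters(dump_text):
    """Group non-zero counters under the most recent 'KSZ9893 port' heading."""
    result = {}
    port = None
    for line in dump_text.splitlines():
        if line.startswith("KSZ9893 port"):
            port = line.strip()
            continue
        parsed = parse_counter_line(line)
        if parsed and port and parsed[1] != 0:
            result.setdefault(port, {})[parsed[0]] = parsed[1]
    return result
```

Under the big-endian assumption, the RxDropPackets value on port 1 in the example above decodes to 2086461, which is the kind of counter worth watching when debugging OOB connectivity.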
Restoring BMC to Default
BMC is restored to default using the following factory reset command:
ipmitool raw 0x32 0x66
After issuing the ipmitool raw command for a factory reset, a BMC reboot is needed for it to take effect. This reset restores the BMC file system to default, erasing user modifications.
DPU BMC File System Full
If the BMC's file system runs out of space, it will enter emergency console mode on the next boot. To recover, the user must log in and clear the Read-Write File System (RWFS) to free up space and restore normal logging.
Change Root test failed! Invoking emergency shell.
Enter password to try to manually fix.
After fixing run exit to continue this script, or reboot -f to retry, or
touch /takeover and exit to become PID 1 allowing editing of this script.
Give root password for system maintenance
(or type Control-D for normal startup):
Enter either your root password or the system default password, 0penBmc, to log into the emergency shell. Then run the following shell command to identify the RWFS partition (mtd6 in this case):
/ # cat /proc/mtd
dev: size erasesize name
mtd0: 04000000 00010000 "bmc"
mtd1: 000e0000 00010000 "u-boot"
mtd2: 00900000 00010000 "kernel"
mtd3: 02500000 00010000 "rofs"
mtd4: 10000000 00010000 "config"
mtd5: 00040000 00010000 "u-boot-env"
mtd6: 01000000 00010000 "rwfs"
mtd7: 00400000 00010000 "rwfs_failover"
mtd8: 02400000 00010000 "images_primary"
mtd9: 02400000 00010000 "images_secondary"
mtd10: 023c0000 00010000 "log"
Run the following commands to erase the flash partition, which frees the space, and then reboot the system:
/ # flash_eraseall /dev/mtd6
Erasing 64 Kibyte @ 1000000 - 100% complete.
/ # reboot -f
After these operations, the BlueField BMC and its default password are restored to factory settings.
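Because the RWFS partition index may differ between images, it can help to look it up by name instead of assuming mtd6. The following Python sketch parses text in the /proc/mtd format shown above; find_mtd_partition is a hypothetical helper, and the emergency shell itself may not provide Python, so this is meant for offline analysis of captured output.

```python
import re

# Matches rows such as: mtd6: 01000000 00010000 "rwfs"
_MTD_RE = re.compile(
    r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"([^"]*)"$')


def find_mtd_partition(proc_mtd_text, name):
    """Return (device_path, size_in_bytes) for the named partition, or None."""
    for line in proc_mtd_text.splitlines():
        m = _MTD_RE.match(line.strip())
        # Exact name match, so "rwfs" does not pick up "rwfs_failover".
        if m and m.group(4) == name:
            return "/dev/" + m.group(1), int(m.group(2), 16)
    return None
```

For the listing above, looking up "rwfs" yields /dev/mtd6 with a size of 0x01000000 bytes (16 MiB), which appears consistent with the offset reported by flash_eraseall when it reaches 100%.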
Performing Arm Reboot While Keeping BMC Up and Capturing Arm Console
The BlueField BMC can be leveraged to debug and track the Arm OS boot sequence.
Log into the BlueField BMC either through the BlueField BMC serial connection or via SSH.
Execute the following commands on the BlueField BMC console to force a reset of the BlueField Arm, followed by opening the BlueField console client, enabling users to monitor the boot sequence:
ipmitool power reset
The following command opens the BlueField console client to monitor the boot sequence:
obmc-console-client -e exit
To return to the BlueField BMC console, type exit.
OS Upgrade
Different Cards Versions
Version | BF2 | BF3 |
2.8.1 | ✓ | - |
2.8.2 | ✓ | - |
22.xx | ✓ | ✓ |
23.xx | ✓ | ✓ |
24.xx | ✓ | ✓ |
Update Methods
Firmware upgrade of BMC and eROT components using BMC can be performed from a remote server using the Redfish interface.
Run the following Redfish command over the 1GbE out-of-band interface on the BlueField BMC to trigger a secure BlueField BMC/eROT firmware update:
curl -k -u root:'<password>' -H "Content-Type: application/octet-stream" -X POST -T <package_path> https://<DPU-BMC-IP>/redfish/v1/UpdateService/update
Where:
<password> – DPU BMC password
<package_path> – BMC/eROT firmware update package path
<DPU-BMC-IP> – BMC IP address
After pushing the image to the BlueField BMC, a new task is created. Example:
{ "@odata.id": "/redfish/v1/TaskService/Tasks/0", "@odata.type": "#Task.v1_4_3.Task", "Id": "0", "TaskState": "Running" }
To track the progress of the update, use the task Id received in the response above (i.e., 0) in your query and monitor the value of the task’s PercentComplete field:
curl -k -u root:'<password>' -X GET https://<DPU-BMC-IP>/redfish/v1/TaskService/Tasks/<task_id> | jq -r ' .PercentComplete'
Where:
<password> – DPU BMC password
<DPU-BMC-IP> – BMC IP address
<task_id> – task ID of the update process as received in the response under the Id value
Output example:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2123  100  2123    0     0  38600      0 --:--:-- --:--:-- --:--:-- 37910
20
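Instead of re-running the curl query by hand, the polling can be scripted. The following Python sketch is one possible approach using only the standard library: fetch_task mirrors the curl command above (including skipping certificate verification, like -k), and wait_for_task polls until the task reaches a terminal TaskState. Both function names, the 10-second interval, and the 30-minute timeout are assumptions chosen for illustration; the terminal state names come from the Redfish Task schema.

```python
import base64
import json
import ssl
import time
import urllib.request

# TaskState values after which a Redfish task no longer changes.
TERMINAL_STATES = {"Completed", "Exception", "Killed", "Cancelled"}


def fetch_task(bmc_ip, task_id, user, password):
    """GET one task resource; certificate checks are skipped, like curl -k."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request(
        f"https://{bmc_ip}/redfish/v1/TaskService/Tasks/{task_id}")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)


def wait_for_task(fetch, interval_s=10.0, timeout_s=1800.0, sleep=time.sleep):
    """Poll fetch() until the task finishes or roughly timeout_s has elapsed."""
    waited = 0.0
    while True:
        task = fetch()
        if task.get("TaskState") in TERMINAL_STATES:
            return task
        if waited >= timeout_s:
            raise TimeoutError(
                f"task still {task.get('TaskState')!r} at "
                f"{task.get('PercentComplete')}%")
        sleep(interval_s)
        waited += interval_s
```

A call such as wait_for_task(lambda: fetch_task("<DPU-BMC-IP>", "0", "root", "<password>")) would return the final task resource, whose PercentComplete and TaskState fields can then be inspected.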
When the process reaches 100%:
Update Type | BlueField-2 | BlueField-3 |
BMC update | BMC reboot is required | BMC reboot is required |
eROT update | System power cycle is required | eROT self-reset command is required (ipmitool raw 0x32 0xD2) |
Possible Error Codes During BMC/eROT Upgrade
Fault | Diagnosis and Possible Solution |
Connection to BMC breaks during firmware package transfer | A new firmware update can be attempted by the Redfish client. |
Connection to BMC breaks during firmware update | A new firmware update can be attempted by the Redfish client. |
Two firmware update requests are initiated | The Redfish server blocks the second firmware update request and returns the following: Check the status of the ongoing firmware update by looking at the TaskCollection resource. |
Redfish task hangs | A new firmware update can be attempted by the Redfish client. |
BMC-EROT communication failure during image transfer | The Redfish task monitoring the firmware update indicates a failure: The Redfish client may retry the firmware update. |
Firmware update fails | The Redfish task monitoring the firmware update indicates a failure: The Redfish client may retry the firmware update. |
ERoT failure (not responding) | The Redfish task monitoring the firmware update indicates a failure: The Redfish client may retry the firmware update. |
Firmware image validation failure | The Redfish task monitoring the firmware update indicates a failure: The Redfish client may retry the firmware update. |
Power loss before activation command is sent | A new firmware update can be attempted by the Redfish client. |
Firmware activation failure | The Redfish task monitoring the firmware update indicates a failure: The Redfish client may retry the firmware update. |
Push to BMC firmware package greater than 200 MB | |
Version Upgrade Limitations
Version | BlueField-2 | BlueField-3 |
2.8.1 | Cannot be upgraded | --- |
2.8.2 | Not limited | --- |
22.xx | Not limited | To upgrade, Glacier version must be above 00.02.0052.0000 |
23.xx | Not limited | Not limited |
24.xx | Not limited | Not limited |
Expected Upgrade Time Durations
Component | BlueField-2 | BlueField-3 | Comments |
BMC | ~17 minutes | ~12 minutes | Without including BMC reboot |
eROT | ~3 minutes | ~11 seconds | Without including system power cycle |
Password Issues
Please refer to the pages Connecting to BMC Interfaces and User Management in NVIDIA BlueField BMC Software documentation for information.
Timeout or Failure of IPMItool Command from Arm (via I2C)
The BlueField BMC and BlueField Arm are interconnected via a dedicated I2C bus, which implements a bi-directional IPMB (Intelligent Platform Management Bus) protocol.
This setup enables both sides to send IPMI (Intelligent Platform Management Interface) commands to retrieve data.

To facilitate this bi-directional IPMB interface, each device must allocate a unique slave address per channel.
This enables the protocol to effectively send and receive data between the sender and the responder.
Running i2cdetect on that dedicated I2C bus (bus = 1) on the BlueField Arm should yield the following I2C slave addresses.
These addresses are pre-allocated and hard-coded in the relevant driver. The existence of these addresses may indicate the health and status of each device.
# i2cdetect -y 1
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         -- -- -- -- -- -- -- --
10: 10 UU -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: 20 -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: UU -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --
In this example, we can observe that the BlueField BMC allocates addresses 0x10 and 0x20, while the BlueField Arm allocates addresses 0x11 and 0x30.
If the BlueField BMC's slave addresses are not detected, it typically indicates that the BlueField BMC has not booted. To recover from this state, try power cycling the system.
If the slave addresses of the BlueField Arm are not detected, it usually means that the service was not loaded correctly. In this case, please reboot the BlueField Arm.
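The address check described above can be automated. The following Python sketch parses i2cdetect output by fixed column position and reports which of the expected IPMB addresses are missing. The expected addresses are taken from the example above; parse_i2cdetect and missing_addresses are hypothetical helper names, and the parser assumes the usual i2cdetect layout of a 4-character row prefix followed by 3-character cells.

```python
import re

# Expected IPMB slave addresses, as described in the example above.
EXPECTED = {
    "BlueField BMC": (0x10, 0x20),
    "BlueField Arm": (0x11, 0x30),
}


def parse_i2cdetect(text):
    """Map each occupied bus address to its cell text (e.g. '10' or 'UU')."""
    found = {}
    for line in text.splitlines():
        m = re.match(r"^([0-7])0:", line)
        if m is None:
            continue  # header line or unrelated output
        base = int(m.group(1), 16) * 16
        cells = line[4:]
        for col in range(16):
            cell = cells[3 * col:3 * col + 2].strip()
            if cell and cell != "--":
                found[base + col] = cell
    return found


def missing_addresses(found):
    """Return {owner: [addresses]} for expected addresses not detected."""
    missing = {}
    for owner, addrs in EXPECTED.items():
        absent = [a for a in addrs if a not in found]
        if absent:
            missing[owner] = absent
    return missing
```

An empty result from missing_addresses suggests both sides of the IPMB link registered their addresses; a non-empty result points to the recovery steps above for the listed owner.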
Extract BMC and System Logs
The BMC and system logs can be collected using the Redfish interface.
Two types of dumps are supported:
BMC dump, which is a collection of logs from BMC
System dump, which is a collection of logs from BlueField. To create a system dump, users must provide the BlueField credentials and IP address of the tmfifo_net0 network interface.
Please refer to BMC and Bluefield Logs for more information.
DPU Console
The DPU BMC features a dedicated UART connection between the DPU BMC and the DPU, primarily used to implement the Serial-over-LAN (SOL) feature.
This interface can be used to:
Connect to the DPU console over the network using SOL
Log in to the BMC and start a local client that connects to the DPU console
Extract the DPU console log file using the dump services provided by the DPU BMC
Access the DPU BMC to examine the local DPU console log file

Serial-over-LAN
Serial-over-LAN (SOL) is a remote management feature that allows users to access a device's serial console over a network connection, providing convenient troubleshooting and management capabilities from remote locations.
Enabling SOL
ipmitool -C 17 -I lanplus -H 10.237.53.58 -U root -P '<password>' sol set enabled true 1
Connecting to DPU Console
ipmitool -C 17 -I lanplus -H 10.237.53.58 -U root -P '<password>' sol activate
BlueField BMC Console Client
The obmc-console-client tool facilitates access to the DPU console interface from within the BMC, enabling users to perform management and troubleshooting tasks.
To utilize the tool, users must first log into the DPU BMC either via the BMC serial interface or through an SSH connection.
Opening Console
obmc-console-client -e <escape-string>
<escape-string> - once entered by the user, triggers an escape from the BlueField console, returning the user to the BlueField BMC console.
BlueField Console Log
The BlueField BMC captures and stores console output in 2MB of volatile memory, rotating logs once they exceed 1MB by overwriting the oldest data. Every 24 hours, the current log is saved as a snapshot, starting a new log cycle. Logs can be extracted using the BlueField BMC’s Redfish log dump service.
Logs are erased if the BMC reboots but remain accessible as long as it stays operational.
Extracting Log Files from BlueField BMC
To extract the log files from the BlueField BMC:
Create the BMC dump:
sudo curl -k -u root:'<password>' -d '{"DiagnosticDataType": "Manager"}' -X POST https://<ip_address>/redfish/v1/Managers/Bluefield_BMC/LogServices/Dump/Actions/LogService.CollectDiagnosticData
Track the dump task state:
sudo curl -k -u root:'<password>' -H 'Content-Type: application/json' -X GET https://<ip_address>/redfish/v1/TaskService/Tasks/<task_id>
Monitor the task until it reaches completion and ensure the following attributes are:
"PercentComplete": 100, "TaskState": "Completed", "TaskStatus": "OK"
Download the log:
sudo curl -k -u root:'<password>' -H 'Content-Type: application/json' -X GET https://<ip_address>/redfish/v1/Managers/Bluefield_BMC/LogServices/Dump/Entries/<entry_id>/attachment --output </path/to/tar/log_dump.tar.xz>
Extract the retrieved tar file, navigate to the BlueField logs directory, and examine the console logs. Each log entry begins with a timestamp showing when it was captured on the BlueField BMC.
The following is a sample of the provided log:
[06:21:49][50484.995566] mlx5_core 0000:03:00.0: poll_health:1037:(pid 0): Fatal error 3 detected
[06:21:50]Nvidia BlueField-3 rev1 BL1 V1.0
[06:21:50]INFO: psc supervisor init.
[06:21:50]INFO: psc_irq_init...
[06:21:50]INFO: force_crs_enable=0 pcr.lock0 = 0, time = 109977
[06:21:50]INFO: enter idle task.
[06:21:50]NOTICE: Running as 9009D3B400ENEA system
[06:21:50]NOTICE: BL2: v2.2(release):4.7.0-12-g4d27fda
[06:21:50]NOTICE: BL2: Built : 00:23:28, Mar 27 2024
[06:21:50]NOTICE: BL2 built for hw (ver 2)
[06:21:50]NOTICE: # Finished initializing DDR MSS1
[06:21:53]NOTICE: DDR POST passed.
[06:21:53]INFO: mailbox rx: channel = 2, code = 0x43544c44
[06:21:53]NOTICE: BL31: v2.2(release):4.7.0-12-g4d27fda
[06:21:53]NOTICE: BL31: Built : 00:23:28, Mar 27 2024
[06:21:53]NOTICE: BL31 built for hw (ver 2), lifecycle Production
[06:21:53]
[06:21:53]PTM:171226:2:0:6~
[06:21:54]I/TC:
[06:21:54]I/TC: OP-TEE version: 4.1.0-25-ga07c623 (gcc version 8.3.0 (GCC)) #1 Wed Mar 27 00:21:33 UTC 2024 aarch64
[06:21:54]I/TC: Primary CPU initializing
[06:21:54]I/TC: GIC redistributor base address not provided
[06:21:54]I/TC: Assuming default GIC group status and modifier
[06:21:54]INFO: mailbox rx: channel = 2, code = 0x41545452
[06:21:54]I/TC: Primary CPU switching to normal world boot
[06:21:54]UEFI firmware (version BlueField:4.7.0-28-g3fe21fd-BId13090 built at 00:37:12 on Mar 27 2024)
[06:22:05] Current Secure Boot State: disabled
[06:22:09]Secure Boot Mode : Setup Mode
[06:22:09]PK is not configured
[06:22:09]Redfish enabled
[06:22:16] DHCP Session Start
[06:22:36]
[06:22:36]Press ESC/F2/DEL twice to enter UEFI Menu.
[06:22:36]Press ENTER to skip countdown.
[06:22:36] ....
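When scanning a long console log, it can be useful to filter entries by keyword, for example to find the fatal error that preceded a reboot. The following Python sketch assumes the extracted log file holds one entry per line, each starting with a [hh:mm:ss] BMC timestamp as described above; parse_console_log and find_messages are hypothetical helper names.

```python
import re

# Each entry starts with the BMC capture time, e.g. "[06:21:49]".
_LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\](.*)$")


def parse_console_log(text):
    """Return a list of (timestamp, message) tuples, skipping other lines."""
    entries = []
    for line in text.splitlines():
        m = _LINE_RE.match(line)
        if m:
            entries.append((m.group(1), m.group(2).strip()))
    return entries


def find_messages(entries, keyword):
    """Return the entries whose message contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [(ts, msg) for ts, msg in entries if needle in msg.lower()]
```

For instance, find_messages(parse_console_log(text), "fatal error") would surface the mlx5_core line from the sample, together with the time at which the BMC captured it.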
BlueField-to-BMC Redfish Communication
The Redfish server enables remote management of BIOS settings via a standard RESTful API. BIOS configurations can be modified through HTTP requests and responses. A private VLAN 4040 interface is set up on the BlueField BMC and switch to streamline communication between the BIOS and the Redfish server.
On the 3-port switch, only ports 2 and 3 are configured to be associated with VLAN 4040 to ensure a complete separation of the VLAN traffic from the management network.

If the Redfish server fails to provide the BIOS configuration, that would typically indicate that the initial communication between the BIOS (UEFI) and the BlueField BMC failed to establish. To troubleshoot the issue, follow the steps detailed in the following subsections.
Reset BlueField Arm While Keeping BMC Active
The BlueField BMC and BlueField boot independently. During a system power cycle, the UEFI may therefore try to connect to the Redfish server before the BlueField BMC is ready and then resume booting without waiting. In that case, the BIOS information is never transmitted to the BlueField BMC, so the BIOS data remains incompletely populated even after the BlueField BMC finishes booting.
To reset the BlueField Arm, run the following command on the BlueField BMC:
# ipmitool power reset
Verify BlueField BMC Network Interface for VLAN 4040
Log into the BlueField BMC and execute the following command to display the available network interfaces:
# ifconfig
The output of the ifconfig command lists the network interfaces configured on the BlueField BMC.
Make sure that the output includes the VLAN 4040 interface, as it carries the communication between the BIOS (BlueField UEFI) and the BlueField BMC Redfish server.
The expected output of the ifconfig command should include the following data for VLAN 4040:
vlan4040 Link encap:Ethernet HWaddr 02:6D:AE:4F:0F:C8
inet addr:192.168.240.1 Bcast:192.168.240.7 Mask:255.255.255.248
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:318 (318.0 B)
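The check above can also be scripted. The following is a minimal sketch under stated assumptions: the helper name vlan4040_addr is hypothetical, and it expects ifconfig output in the "Link encap:" / "inet addr:" layout shown in the sample. It reads that output on standard input, prints the IPv4 address of the vlan4040 interface, and returns a non-zero status when the interface or its address is missing.

```shell
#!/bin/sh
# Hypothetical helper: parse ifconfig output from stdin and print the
# IPv4 address assigned to vlan4040. Exit status is 1 if none is found.
vlan4040_addr() {
    awk '
        # A "Link encap:" line starts a new interface block.
        /Link encap:/ { in_if = ($1 == "vlan4040") }
        # Inside the vlan4040 block, keep only the address itself.
        in_if && /inet addr:/ {
            sub(/.*inet addr:/, "")
            sub(/ .*/, "")
            print
            found = 1
        }
        END { exit !found }
    '
}

# Example, run on the BlueField BMC:
#   ifconfig | vlan4040_addr
```

With the sample output above as input, the helper prints 192.168.240.1.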
VLAN4040 Interface Not Present
Check BMC version. Ensure that the BMC version is at least 23.07 (support for Redfish BIOS was introduced in this version). If the BMC version is lower, upgrade to the latest available version.
Perform factory reset. If the BMC version is 23.07 or above and the VLAN 4040 interface is still not present, perform a factory reset on the BlueField BMC and reboot it to restore default configurations.
# ipmitool raw 0x32 0x66
# reboot
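The version check in the first step can be expressed as a small comparison. The following is a sketch only: the helper name bmc_supports_redfish_bios is hypothetical, and it assumes the caller has already reduced the BMC version to a string that starts with major.minor (for example 23.07 or 23.10-5). How that string is obtained, and any prefix it may carry on a given image, is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: succeed (status 0) when the given version string
# is at least 23.07, the release that introduced Redfish BIOS support.
#   $1 = version starting with "<major>.<minor>", e.g. "23.07" or "23.10-5"
bmc_supports_redfish_bios() {
    awk -v v="$1" 'BEGIN {
        split(v, part, /[.-]/)      # "23.10-5" -> 23, 10, 5
        major = part[1] + 0         # "+ 0" forces a decimal number,
        minor = part[2] + 0         # so "07" is read as 7
        exit !(major > 23 || (major == 23 && minor >= 7))
    }'
}

# Example:
#   bmc_supports_redfish_bios 23.04 || echo "upgrade the BMC first"
```

The comparison is done in awk rather than shell arithmetic because a leading zero, as in 08, would be treated as an invalid octal number by the shell.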
Verify UEFI Acquisition of Redfish Host Bootstrap Credentials
Retrieve the RShim log to confirm whether UEFI obtained the Redfish interface bootstrap account credentials successfully.
Enable the RShim interface on either the BlueField BMC or the host, set log level, and dump the log:
# systemctl start rshim
# echo "DISPLAY_LEVEL 2" > /dev/rshim0/misc
# cat /dev/rshim0/misc
Review the following lines for more information about the Redfish status:
DISPLAY_LEVEL 2 (0:basic, 1:advanced, 2:log)
BOOT_MODE 1 (0:rshim, 1:emmc, 2:emmc-boot-swap)
BOOT_TIMEOUT 150 (seconds)
DROP_MODE 0 (0:normal, 1:drop)
SW_RESET 0 (1: reset)
DEV_NAME pcie-0000:03:00.0
DEV_INFO BlueField-3(Rev 1)
OPN_STR 9009D3B400ENEA
---------------------------------------
Log Messages
---------------------------------------
INFO[MISC]: PSC BL1 START
INFO[BL2]: start
INFO[BL2]: boot mode (emmc)
INFO[BL2]: VDDQ: 1118 mV
INFO[BL2]: DDR POST passed
INFO[BL2]: UEFI loaded
INFO[BL31]: start
INFO[BL31]: lifecycle Production
INFO[BL31]: MB8: VDD adjustment complete
INFO[BL31]: VDD: 881 mV
INFO[BL31]: power capping disabled
INFO[BL31]: runtime
INFO[UEFI]: eMMC init
INFO[UEFI]: eMMC probed
INFO[UEFI]: UPVS valid
INFO[UEFI]: PCIe enum start
INFO[UEFI]: PCIe enum end
INFO[UEFI]: UEFI Secure Boot (disabled)
INFO[UEFI]: Redfish enabled
INFO[UEFI]: DPU-BMC RF credentials found
INFO[UEFI]: exit Boot Service
INFO[MISC]: Linux up
INFO[MISC]: DPU is ready
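To avoid scanning the log by eye, the two Redfish-related messages can be searched for directly. The following is a minimal sketch. The helper name check_rshim_redfish is hypothetical, and the two message strings are copied verbatim from the sample log above; other firmware versions may word them differently. The helper reads the log on standard input and reports each message as OK or MISSING.

```shell
#!/bin/sh
# Hypothetical helper: read the RShim misc log from stdin and report
# whether UEFI enabled Redfish and found the bootstrap credentials.
# Returns 0 if both messages are present, 1 otherwise.
check_rshim_redfish() {
    log=$(cat)
    status=0
    for msg in 'INFO[UEFI]: Redfish enabled' \
               'INFO[UEFI]: DPU-BMC RF credentials found'; do
        # -F matches the message as a fixed string, so "[" is literal.
        if printf '%s\n' "$log" | grep -qF -- "$msg"; then
            echo "OK: $msg"
        else
            echo "MISSING: $msg"
            status=1
        fi
    done
    return "$status"
}

# Example, after setting DISPLAY_LEVEL 2 as shown above:
#   cat /dev/rshim0/misc | check_rshim_redfish
```

A MISSING result for the credentials message suggests continuing with the reset sequence in the next subsection.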
Reset BlueField BMC and Reset BlueField Arm
To reset the BlueField BMC, run the following on the BlueField BMC:
# ipmitool mc reset cold
Wait for the BlueField BMC Linux prompt, then reset the BlueField Arm:
# ipmitool power reset
Check Connectivity Between UEFI and Redfish Server
Access the BlueField console via the RShim, console, or BMC SOL interface and reset the BlueField Arm.
The console shows the system boot sequence. When UEFI connects to the BMC Redfish server, the following lines appear:
Press ESC/F2/DEL twice to enter UEFI Menu.
Press ENTER to skip countdown.
3 seconds remain...
2 seconds remain...
1 seconds remain...
0 seconds remain...
** Redfish GET https://192.168.240.1/redfish/v1/, Success
ProcessVendorIdentification: Vendor: Nvidia
** XAUTH POST https://192.168.240.1/redfish/v1/SessionService/Sessions, Success
** Redfish GET https://192.168.240.1/redfish/v1/Registries/BiosAttributeRegistry/BiosAttributeRegistry, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/Settings, Success
** Redfish PATCH https://192.168.240.1/redfish/v1/Systems/Bluefield, Success
** Redfish PATCH https://192.168.240.1/redfish/v1/Systems/Bluefield, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/Bios/Settings, Success
** Redfish PATCH https://192.168.240.1/redfish/v1/Systems/Bluefield/Bios, Success
** Redfish PATCH https://192.168.240.1/redfish/v1/UpdateService/FirmwareInventory, Access Denied
** Redfish GET https://192.168.240.1/redfish/v1/TaskService/Tasks/, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/Oem/Nvidia/Truststore/Certificates, Success
** Redfish GET https://192.168.240.1/redfish/v1/TaskService/Tasks/, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/PK/Certificates, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/KEK/Certificates, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/db/Certificates, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/db/Certificates/1, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/db/Certificates/2, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/db/Certificates/3, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/db/Certificates/4, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/db/Certificates/5, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/db/Certificates/6, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Certificates, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/1, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/2, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/3, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/4, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/5, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/6, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/7, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/8, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/9, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/10, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/11, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/12, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/13, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/14, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/15, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/16, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/17, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/18, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/19, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/20, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/21, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/22, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/23, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/24, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/25, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/26, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/27, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/28, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/29, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/30, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/31, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/32, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/33, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/34, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/35, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/36, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/37, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/38, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/39, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/40, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/41, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/42, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/43, Success
** Redfish GET https://192.168.240.1/redfish/v1/Systems/Bluefield/SecureBoot/SecureBootDatabases/dbx/Signatures/44, Success
EFI stub: Booting Linux Kernel...
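Because the list of requests is long, a captured console log can be filtered for requests that did not succeed. The following is a minimal sketch. The helper name failed_redfish_requests is hypothetical, and it assumes that every request line begins with "** Redfish" or "** XAUTH" and ends with ", Success" on success, as in the sample above. It prints the lines with any other result and returns 0 if all requests succeeded, 1 if some did not, and 2 if no request lines were found at all (which would suggest that UEFI never reached the Redfish server).

```shell
#!/bin/sh
# Hypothetical helper: read a captured UEFI console log from stdin and
# print each Redfish/XAUTH request whose result is not "Success".
failed_redfish_requests() {
    awk '
        /^\*\* (Redfish|XAUTH) / {
            seen = 1
            # Tolerate trailing blanks or a carriage return from a
            # serial capture.
            if ($0 !~ /, Success[ \t\r]*$/) { print; bad = 1 }
        }
        END {
            if (!seen) exit 2
            exit (bad ? 1 : 0)
        }
    '
}

# Example, assuming the console output was saved to console.log:
#   failed_redfish_requests < console.log
```

Note that the sample output above itself contains one Access Denied result, so a flagged line is a prompt for closer inspection rather than proof of a fault.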
Enable Redfish in the UEFI Menu
Access the UEFI menu during system boot. This is typically done by pressing a specific key (e.g., F2 or Del) as the system powers on.
Navigate to Device Manager > System Configuration > Redfish Configuration and enable Redfish support. If it is already enabled, no action is needed. If it is disabled, enable it and save the changes.
Exit the UEFI menu by selecting Reset for the new configuration to take effect: