System Event Log
The system event log (SEL) is a non-volatile repository for system events and certain system configuration information. Each SEL entry has a unique "record ID" field, which is used to retrieve log entries from the SEL. Record IDs are not required to be sequential or consecutive, so applications should not assume that SEL record IDs follow any particular numeric ordering.
Event logs record chassis events. They are stored by the BMC software and can be read using IPMI commands.
If the SEL is full and a new event is raised, the oldest record is removed and the new one is placed at the end of the SEL.
The SEL can be accessed on the server through IPMI LAN access, even after a BlueField failure.
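The overwrite rule described above (when the log is full, the oldest record is dropped and the new one is appended) behaves like a fixed-capacity ring buffer. The following Python sketch only illustrates that rule and is not the BMC's implementation. The capacity of 3 and the record contents are made up for the example.

```python
from collections import deque

# Hypothetical capacity; the real SEL size is not stated in this section.
SEL_CAPACITY = 3

# A deque with maxlen discards the oldest item when a new item is
# appended to a full buffer, mirroring the "oldest record is removed" rule.
sel = deque(maxlen=SEL_CAPACITY)

for record_id, text in [(1, "event A"), (2, "event B"), (3, "event C"), (4, "event D")]:
    sel.append((record_id, text))

# Record 1 was the oldest, so it was dropped when record 4 arrived.
remaining_ids = [record_id for record_id, _ in sel]
print(remaining_ids)  # [2, 3, 4]
```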
Display SEL Information
curl -k -u root:'<password>' -H 'Content-Type: application/json' -X GET https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/EventLog/
{
  "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog",
  "@odata.type": "#LogService.v1_1_0.LogService",
  "Actions": {
    "#LogService.ClearLog": {
      "target": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Actions/LogService.ClearLog"
    }
  },
  "DateTime": "2023-09-27T14:28:50+00:00",
  "DateTimeLocalOffset": "+00:00",
  "Description": "System Event Log Service",
  "Entries": {
    "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries"
  },
  "Id": "EventLog",
  "Name": "Event Log Service",
  "Oem": {
    "Nvidia": {
      "@odata.type": "#NvidiaLogService.v1_0_0.NvidiaLogService",
      "LatestEntryID": "4",
      "LatestEntryTimeStamp": "2023-09-27T14:19:30+00:00"
    }
  },
  "OverWritePolicy": "WrapsWhenFull"
}
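A script that consumes this response usually needs only a few of its fields. The following Python sketch assumes the response body has already been fetched and is available as text. The embedded sample is a trimmed copy of the output above, and `summarize_log_service` is a hypothetical helper name.

```python
import json

# Trimmed copy of the LogService response shown above.
response_text = """
{
  "Id": "EventLog",
  "Entries": {
    "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries"
  },
  "Oem": {
    "Nvidia": {
      "LatestEntryID": "4",
      "LatestEntryTimeStamp": "2023-09-27T14:19:30+00:00"
    }
  },
  "OverWritePolicy": "WrapsWhenFull"
}
"""

def summarize_log_service(text):
    """Return the fields of interest from a LogService JSON document."""
    service = json.loads(text)
    # Treat the Oem block as optional so a missing block does not raise.
    nvidia = service.get("Oem", {}).get("Nvidia", {})
    return {
        "entries_uri": service["Entries"]["@odata.id"],
        "overwrite_policy": service.get("OverWritePolicy"),
        "latest_entry_id": nvidia.get("LatestEntryID"),
        "latest_entry_time": nvidia.get("LatestEntryTimeStamp"),
    }

summary = summarize_log_service(response_text)
print(summary["latest_entry_id"], summary["overwrite_policy"])
```

Note that `LatestEntryID` is a string in the response, so it is kept as a string here rather than converted to a number.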
Display List of Events
curl -k -u root:'<password>' -H 'Content-Type: application/json' -X GET https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries
{
  "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries",
  "@odata.type": "#LogEntryCollection.LogEntryCollection",
  "Description": "Collection of System Event Log Entries",
  "Members": [
    {
      "@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/1",
      "@odata.type": "#LogEntry.v1_9_0.LogEntry",
      "AdditionalDataURI": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/1/attachment",
      "Created": "2023-09-27T14:18:39+00:00",
      "EntryType": "Event",
      "Id": "1",
      "Message": "12V_ATX sensor crossed a warning low threshold going low. Reading=6.048000 Threshold=10.400000.",
      "MessageArgs": [
        "12V_ATX",
        "6.048000",
        "10.400000"
      ],
      "MessageId": "OpenBMC.0.1.SensorThresholdWarningLowGoingLow",
      "Name": "System Event Log Entry",
      "Resolution": "",
      "Resolved": false,
      "Severity": "OK"
    }
    …
  ],
  "Members@odata.count": 1,
  "Name": "System Event Log Entries"
}
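Because record IDs are not guaranteed to follow any numeric order, a client that wants a chronological view can sort the collection by the `Created` timestamp instead of by `Id`. The following Python sketch shows one way to do this. The first member is a hypothetical entry added only to make the ordering visible, the second is a shortened copy of the entry above, and `entries_by_time` is an illustrative helper name.

```python
import json
from datetime import datetime

# Shortened collection. The entry with Id "7" is hypothetical.
collection_text = """
{
  "Members": [
    {"Id": "7", "Created": "2023-09-27T14:19:30+00:00", "Severity": "OK",
     "Message": "Hypothetical second entry, added for illustration."},
    {"Id": "1", "Created": "2023-09-27T14:18:39+00:00", "Severity": "OK",
     "Message": "12V_ATX sensor crossed a warning low threshold going low."}
  ]
}
"""

def entries_by_time(text):
    """Return the collection members ordered by their Created timestamp."""
    members = json.loads(text)["Members"]
    # Parse the ISO 8601 timestamps so that the comparison is time-based.
    return sorted(members, key=lambda entry: datetime.fromisoformat(entry["Created"]))

ordered = entries_by_time(collection_text)
for entry in ordered:
    print(entry["Created"], entry["Id"], entry["Message"])
```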
Clear SEL
curl -k -u root:'<password>' -H 'Content-Type: application/json' -X POST https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/EventLog/Actions/LogService.ClearLog
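The same request can be prepared from a script. The following Python sketch builds the POST request with the standard library but does not send it. The address `192.0.2.10`, the password, and the helper name `build_clear_log_request` are placeholders. Sending the request with `urllib.request.urlopen` would additionally need an SSL context that matches curl's `-k` option, which is not shown here.

```python
import base64
import urllib.request

def build_clear_log_request(bmc_ip, user, password):
    """Build (but do not send) the POST request for LogService.ClearLog."""
    url = (f"https://{bmc_ip}/redfish/v1/Systems/Bluefield/"
           "LogServices/EventLog/Actions/LogService.ClearLog")
    # HTTP basic authentication, equivalent to curl's -u option.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )

# Placeholder address and credentials, for illustration only.
request = build_clear_log_request("192.0.2.10", "root", "example-password")
print(request.get_method(), request.full_url)
```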
The following table lists the commands used to view event logs:
| Command | Description |
|  | Displays information about SEL |
|  | Displays list of events |
|  | Displays extended info list of events |
|  | Saves SEL events to a file |
|  | Clears SEL |
The following subsections detail the messages which are added to the BMC SEL and the scenarios that trigger them.
UEFI Boot
While the DPU UEFI is booting, messages describing the status of the UEFI boot are added to the BMC SEL.
SEL messages:
SMBus initialization
PCI resource configuration
System boot initiated
Example:
SEL Record ID : 0037
Record Type : 02
Timestamp : 06:36:06 UTC 06:36:06 UTC
Generator ID : 0001
EvM Revision : 04
Sensor Type : System Firmwares
Sensor Number : 06
Event Type : Sensor-specific Discrete
Event Direction : Assertion Event
Event Data : c207ff
Description : PCI resource configuration
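The record in the example consists of plain "name : value" lines, so a script can load it into a dictionary for filtering or reporting. The following Python sketch shows one way to do this. `parse_sel_record` is a hypothetical helper, the sample text is a shortened copy of the example above, and the record ID is assumed to be printed in hexadecimal (as IDs such as 003e in later examples suggest).

```python
def parse_sel_record(text):
    """Split 'name : value' lines into a dict, keeping values as strings."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, because timestamp values
        # contain further colons.
        name, separator, value = line.partition(":")
        if separator and name.strip():
            record[name.strip()] = value.strip()
    return record

# Shortened copy of the UEFI boot example.
sample = """\
SEL Record ID : 0037
Record Type : 02
Sensor Type : System Firmwares
Event Direction : Assertion Event
Event Data : c207ff
Description : PCI resource configuration
"""

record = parse_sel_record(sample)
# Assumption: the record ID is hexadecimal, so 0037 is 55 in decimal.
print(int(record["SEL Record ID"], 16), record["Description"])
```

If a field name repeats within one record, this sketch keeps only the last value, which is enough for short records like this one.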
IPMB Sensors
QSFP Sensors
Messages are added to the SEL when the status of a QSFP cable changes. The messages describe the event and the status of the sensor.
List of QSFP sensors:
p0_link – the QSFP 0 cable status
p1_link – the QSFP 1 cable status
SEL messages:
Config Error – the QSFP cable is down
Connected – the QSFP cable is up
Example:
SEL Record ID : 003e
Record Type : 02
Timestamp : 07:08:28 UTC 07:08:28 UTC
Generator ID : 0020
EvM Revision : 04
Sensor Type : Cable / Interconnect
Sensor Number : 00
Event Type : Sensor-specific Discrete
Event Direction : Assertion Event
Event Data (RAW) : 010f0f
Event Interpretation : Missing
Description : Config Error
Sensor ID : p0_link (0x0)
Entity ID : 31.1
Sensor Type (Discrete): Cable / Interconnect
States Asserted : Cable / Interconnect
[Config Error]
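A monitoring script can translate the sensor name and description of such a record into a short status line. The following Python sketch uses only the sensor names and messages listed above. `describe_link_event` is a hypothetical helper, and the sensor name is lowercased before lookup so that either spelling of the name is accepted.

```python
# Mappings taken from the sensor and message lists above.
LINK_SENSORS = {"p0_link": "QSFP 0", "p1_link": "QSFP 1"}
LINK_STATES = {"Config Error": "down", "Connected": "up"}

def describe_link_event(sensor_id_field, description):
    """Turn a 'Sensor ID' field and a description into a status line."""
    # The Sensor ID field looks like "p0_link (0x0)"; keep only the name.
    name = sensor_id_field.split()[0].lower()
    cable = LINK_SENSORS.get(name, "unknown")
    state = LINK_STATES.get(description, "unknown")
    return f"{cable} cable is {state}"

message = describe_link_event("p0_link (0x0)", "Config Error")
print(message)  # QSFP 0 cable is down
```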
Temperature Sensors
Messages are added to the SEL if a temperature sensor reading crosses one of the sensor's thresholds. The messages include a description of the event, DPU FRU device description, DPU BMC device description, and the status of the sensor.
List of temperature sensors:
bluefield_temp – BlueField temperature
p0_temp – QSFP 0 cable temperature
p1_temp – QSFP 1 cable temperature
SEL messages:
Upper Critical going high – crossing an upper critical threshold
Upper Non-critical going high – crossing an upper non-critical threshold
Lower Critical going low – crossing a lower critical threshold
Lower Non-critical going low – crossing a lower non-critical threshold
Example:
SEL Record ID : 003c
Record Type : 02
Timestamp : 07:01:06 UTC 07:01:06 UTC
Generator ID : 0020
EvM Revision : 04
Sensor Type : Temperature
Sensor Number : 03
Event Type : Threshold
Event Direction : Assertion Event
Event Data (RAW) : 592802
Trigger Reading : 40.000degrees C
Trigger Threshold : 2.000degrees C
Description : Upper Critical going high
Sensor ID : p0_temp (0x3)
Entity ID : 0.1
Sensor Type (Threshold) : Temperature
Sensor Reading : 40 (+/- 0) degrees C
Status : ok
Lower Non-Recoverable : na
Lower Critical : -5.000
Lower Non-Critical : 0.000
Upper Non-Critical : 70.000
Upper Critical : 75.000
Upper Non-Recoverable : na
Positive Hysteresis : Unspecified
Negative Hysteresis : Unspecified
Assertion Events :
Event Enable : Event Messages Disabled
Assertions Enabled : lnc- lcr- unc+ ucr+
Deassertions Enabled : lnc+ lcr+ unc- ucr-
FRU Device Description : Nvidia-BMCMezz (ID 169)
Board Mfg Date : Tue Jan 3 23:16:00 2023 UTC
Board Mfg : Nvidia
Board Product : Nvidia-BMCMezz
Board Serial : MT2251XZ02W5
Board Part Number : 900-9D3B6-00CV-AAA
FRU Device Description : BlueField-3 Smar (ID 250)
Board Mfg Date : Tue Jan 3 23:16:00 2023 UTC
Board Mfg : Nvidia
Board Product : BlueField-3 SmartNIC Main Card
Board Serial : MT2251XZ02W5
Board Part Number : 900-9D3B6-00CV-AAA
Product Manufacturer : Nvidia
Product Name : BlueField-3 SmartNIC Main Card
Product Part Number : 900-9D3B6-00CV-AAA
Product Version : A3
Product Serial : MT2251XZ02W5
Product Asset Tag : 900-9D3B6-00CV-AAA
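The "Event Data (RAW)" field of a threshold event can be decoded by hand. The following Python sketch assumes the record follows the IPMI threshold event layout: the low four bits of the first byte select the event, and the second and third bytes carry the raw trigger reading and raw trigger threshold. Only the four events listed above are included in the lookup table, and `decode_threshold_event` is a hypothetical helper. Raw bytes are not necessarily engineering units, so converting them in general requires the sensor's conversion factors, which are not covered here.

```python
# Event offsets (low nibble of the first event data byte) for the four
# threshold messages listed above, per the IPMI threshold event layout.
THRESHOLD_EVENTS = {
    0x0: "Lower Non-critical going low",
    0x2: "Lower Critical going low",
    0x7: "Upper Non-critical going high",
    0x9: "Upper Critical going high",
}

def decode_threshold_event(raw_hex):
    """Decode a 3-byte threshold 'Event Data (RAW)' string such as '592802'."""
    data = bytes.fromhex(raw_hex)
    if len(data) != 3:
        raise ValueError("expected exactly three event data bytes")
    offset = data[0] & 0x0F
    return {
        "description": THRESHOLD_EVENTS.get(offset, f"offset 0x{offset:x}"),
        # Bits 7:6 == 01b mean that byte 2 holds the raw trigger reading.
        "raw_reading": data[1] if (data[0] >> 6) & 0x3 == 1 else None,
        # Bits 5:4 == 01b mean that byte 3 holds the raw trigger threshold.
        "raw_threshold": data[2] if (data[0] >> 4) & 0x3 == 1 else None,
    }

event = decode_threshold_event("592802")
print(event)
```

For the example record, the decoded description matches the "Description" field, and the raw bytes 0x28 and 0x02 correspond to the trigger reading of 40 and the trigger threshold of 2 shown in the output.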
ADC Sensors
Messages are added to the SEL if the sensor voltage crosses the sensor's thresholds. The messages include a description of the event, DPU FRU device description, DPU BMC device description, and the status of the sensor.
List of ADC sensors:
1V_BMC
1_2V_BMC
1_8V
1_8V_BMC
2_5V
3_3V
3_3V_RGM
5V
12V_ATX
12V_PCIe
DVDD
HVDD
VDD
VDDQ
VDD_CPU_L
VDD_CPU_R
SEL messages:
Upper Non-critical going high – crossing an upper non-critical threshold
Lower Non-critical going low – crossing a lower non-critical threshold
Example:
SEL Record ID : 0042
Record Type : 02
Timestamp : 09:20:50 UTC 09:20:50 UTC
Generator ID : 0020
EvM Revision : 04
Sensor Type : Voltage
Sensor Number : 06
Event Type : Threshold
Event Direction : Assertion Event
Event Data (RAW) : 50a9ff
Trigger Reading : 1.200Volts
Trigger Threshold : 1.810Volts
Description : Lower Non-critical going low
Sensor ID : 1_2V_BMC (0x6)
Entity ID : 0.1
Sensor Type (Threshold) : Voltage
Sensor Reading : 1.200 (+/- 0) Volts
Status : ok
Lower Non-Recoverable : na
Lower Critical : na
Lower Non-Critical : 1.143
Upper Non-Critical : 1.257
Upper Critical : na
Upper Non-Recoverable : na
Positive Hysteresis : Unspecified
Negative Hysteresis : Unspecified
Assertion Events :
Event Enable : Event Messages Disabled
Assertions Enabled : lnc- unc+
Deassertions Enabled : lnc+ unc-
FRU Device Description : Nvidia-BMCMezz (ID 169)
Board Mfg Date : Tue Jan 3 23:16:00 2023 UTC
Board Mfg : Nvidia
Board Product : Nvidia-BMCMezz
Board Serial : MT2251XZ02W5
Board Part Number : 900-9D3B6-00CV-AAA
FRU Device Description : BlueField-3 Smar (ID 250)
Board Mfg Date : Tue Jan 3 23:16:00 2023 UTC
Board Mfg : Nvidia
Board Product : BlueField-3 SmartNIC Main Card
Board Serial : MT2251XZ02W5
Board Part Number : 900-9D3B6-00CV-AAA
Product Manufacturer : Nvidia
Product Name : BlueField-3 SmartNIC Main Card
Product Part Number : 900-9D3B6-00CV-AAA
Product Version : A3
Product Serial : MT2251XZ02W5
Product Asset Tag : 900-9D3B6-00CV-AAA
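The relationship between a voltage reading and the two non-critical thresholds can be expressed as a simple range check. The following Python sketch uses the threshold values from the example record. `classify_reading` is a hypothetical helper, and treating a reading equal to a threshold as a crossing is an assumption, not something stated in this section.

```python
def classify_reading(reading, lower_non_critical, upper_non_critical):
    """Classify a reading against the non-critical thresholds."""
    # Assumption: a reading equal to a threshold counts as a crossing.
    if reading <= lower_non_critical:
        return "below lower non-critical"
    if reading >= upper_non_critical:
        return "above upper non-critical"
    return "ok"

# Thresholds of the 1_2V_BMC sensor from the example record.
LOWER_NON_CRITICAL = 1.143
UPPER_NON_CRITICAL = 1.257

status = classify_reading(1.200, LOWER_NON_CRITICAL, UPPER_NON_CRITICAL)
print(status)  # ok
```

A reading of 1.200 V lies between the two thresholds, which agrees with the "Status : ok" line in the example.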