Storage Health Monitoring
The BMC continuously monitors system resources including CPU utilization, memory usage, and storage space. When these metrics exceed configured thresholds, warning or critical event logs are automatically generated to alert administrators.
Event logs can be viewed through the Redfish API at /redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries.
The following table defines the thresholds for CPU utilization alerts:
Metric | Alert Type | Warning | Critical | Action | Description |
CPU | Upper | >80% | >95% | Log only | Total BMC CPU utilization |
CPU user | Upper | >80% | >95% | Log only | CPU time in user-space applications |
CPU kernel | Upper | >80% | >95% | Log only | CPU time in kernel operations |
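The upper-threshold rule in the table can be sketched in a few lines of Python. This is an illustration only, not the BMC's implementation: the function name `classify_upper` and the sample readings are hypothetical, and the strict greater-than comparison is read off the `>80%` and `>95%` cells.

```python
def classify_upper(reading, warning=80.0, critical=95.0):
    """Classify a percentage against upper thresholds (defaults from the CPU table)."""
    if reading > critical:
        return "Critical"
    if reading > warning:
        return "Warning"
    return "OK"


# Hypothetical readings, in percent, for the three CPU metrics.
samples = {"CPU": 96.88, "CPU user": 82.5, "CPU kernel": 0.61}
for metric, reading in samples.items():
    print(f"{metric}: {reading}% -> {classify_upper(reading)}")
```

Because the comparison is strict, a reading of exactly 80% or 95% stays in the lower band under this reading of the table.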
Metric Log Example
To check the current event log for CPU alerts:
curl -k -u <usr>:<password> -H 'content-type: application/json' -X GET https://<bmc_ip>/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries
Example output:
{
"@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/468",
"@odata.type": "#LogEntry.v1_15_0.LogEntry",
"AdditionalDataURI": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/468/attachment",
"Created": "2026-01-19T15:57:22+00:00",
"EntryType": "Event",
"Id": "468",
"Message": "CPU sensor crossed a critical high threshold going high. Reading=96.881238 Threshold=95.000000.",
"MessageArgs": [
"CPU",
"96.881238",
"95.000000"
],
"MessageId": "OpenBMC.0.4.SensorThresholdCriticalHighGoingHigh",
"Name": "System Event Log Entry",
"Resolution": "None",
"Resolved": false,
"Severity": "Critical"
}
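When the collection holds many entries, it can help to filter for threshold events on one sensor. The following Python sketch assumes the collection response lists its entries under the standard Redfish `Members` array and that `MessageArgs` carries the sensor name, reading, and threshold in that order, as in the example above. The function name and the second sample entry (Id 469) are hypothetical.

```python
import json

# Trimmed collection: the first entry is copied from the example above, the
# second is a hypothetical unrelated entry that the filter should skip.
SAMPLE_COLLECTION = json.loads("""
{
  "Members": [
    {
      "Id": "468",
      "Created": "2026-01-19T15:57:22+00:00",
      "MessageId": "OpenBMC.0.4.SensorThresholdCriticalHighGoingHigh",
      "MessageArgs": ["CPU", "96.881238", "95.000000"],
      "Severity": "Critical"
    },
    {
      "Id": "469",
      "Created": "2026-01-19T16:02:10+00:00",
      "MessageId": "OpenBMC.0.4.BMCSystemResourceInfo",
      "MessageArgs": ["Storage_RW", "91%"],
      "Severity": "OK"
    }
  ]
}
""")


def sensor_threshold_events(collection, sensor):
    """Return the threshold-crossing events recorded for one sensor name."""
    events = []
    for entry in collection.get("Members", []):
        args = entry.get("MessageArgs", [])
        is_threshold = "SensorThreshold" in entry.get("MessageId", "")
        if is_threshold and len(args) >= 3 and args[0] == sensor:
            events.append({
                "id": entry["Id"],
                "severity": entry["Severity"],
                "reading": float(args[1]),
                "threshold": float(args[2]),
            })
    return events


for event in sensor_threshold_events(SAMPLE_COLLECTION, "CPU"):
    print(event)
```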
The following table defines the thresholds for memory alerts:
Metric | Alert Type | Warning | Critical | Action | Description |
Memory available | Lower | <30% | <10% | Log only | Memory available for applications |
Memory shared | Upper | - | >35% | Log only | Shared memory usage |
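As a rough illustration, the two memory percentages can be derived from the `MemoryStatistics` byte counts that the ManagerDiagnosticData resource reports. The sketch below assumes both percentages are taken relative to `TotalBytes`; the source does not state the denominator, so treat that as an assumption. The function name `memory_status` is hypothetical.

```python
def memory_status(stats):
    """Derive the two memory percentages and compare them with the table."""
    total = stats["TotalBytes"]
    available_pct = 100.0 * stats["AvailableBytes"] / total
    shared_pct = 100.0 * stats["SharedBytes"] / total

    # "Memory available" is a lower bound: alert when the value drops below it.
    if available_pct < 10.0:
        available = "Critical"
    elif available_pct < 30.0:
        available = "Warning"
    else:
        available = "OK"

    # "Memory shared" is an upper bound with a critical level only.
    shared = "Critical" if shared_pct > 35.0 else "OK"

    return {
        "available_pct": available_pct,
        "available": available,
        "shared_pct": shared_pct,
        "shared": shared,
    }


# Byte counts copied from the ManagerDiagnosticData example in this section.
STATS = {
    "AvailableBytes": 725983232,
    "SharedBytes": 60747776,
    "TotalBytes": 917188608,
}
result = memory_status(STATS)
print(result)
```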
The BMC system provides real-time monitoring of read-write (RW) flash usage. You can query free storage space, receive notifications when usage crosses defined thresholds, and rely on automatic cleanup when limits are exceeded.
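The following table defines the thresholds for storage alerts. The thresholds are lower bounds on free space (for example, the Storage RW warning at <10% free corresponds to more than 90% used), while the event log messages report used space: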
Metric | Alert Type | Warning | Critical | Path | Action | Description |
Storage RW | Lower | <10% | <5% | | Auto cleanup | Root overlay filesystem; primary writable storage for BMC runtime data and configuration changes |
Storage TMP | Lower | <20% | <5% | | Log only | Temporary files storage; used by services for transient data, cleared on reboot |
Storage LOGGING | Lower | <30% | <20% | | Log only | Event logs and dump storage; contains Redfish logs, SEL entries, and debug dumps |
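The storage table can be captured as a small lookup structure. The Python sketch below assumes that the Lower thresholds are compared against the percentage of free space, which matches the 90% and 95% usage figures given in the cleanup sections. The names `STORAGE_THRESHOLDS` and `storage_status` are hypothetical, and the code is not the BMC's own logic.

```python
# Free-space thresholds (percent) and actions, transcribed from the table.
STORAGE_THRESHOLDS = {
    "Storage RW": {"warning": 10.0, "critical": 5.0, "action": "Auto cleanup"},
    "Storage TMP": {"warning": 20.0, "critical": 5.0, "action": "Log only"},
    "Storage LOGGING": {"warning": 30.0, "critical": 20.0, "action": "Log only"},
}


def storage_status(metric, used_pct):
    """Convert a used-space percentage to free space and classify it."""
    limits = STORAGE_THRESHOLDS[metric]
    free_pct = 100.0 - used_pct
    if free_pct < limits["critical"]:
        severity = "Critical"
    elif free_pct < limits["warning"]:
        severity = "Warning"
    else:
        severity = "OK"
    return {"free_pct": free_pct, "severity": severity}


# Hypothetical usage figures, in percent used.
for metric, used in [("Storage RW", 91.0), ("Storage TMP", 50.0), ("Storage LOGGING", 85.0)]:
    status = storage_status(metric, used)
    print(metric, status, "table action:", STORAGE_THRESHOLDS[metric]["action"])
```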
Retrieving Free Storage Space
To check the current free RW flash space:
curl -k -H "X-Auth-Token: $token" -X GET https://${bmc}/redfish/v1/Managers/Bluefield_BMC/ManagerDiagnosticData
Example output:
{
"@odata.id": "/redfish/v1/Managers/Bluefield_BMC/ManagerDiagnosticData",
"@odata.type": "#ManagerDiagnosticData.v1_2_0.ManagerDiagnosticData",
"FreeStorageSpaceKiB": 1488,
"Id": "ManagerDiagnosticData",
"MemoryStatistics": {
"AvailableBytes": 725983232,
"BuffersAndCacheBytes": 170594304,
"FreeBytes": 605347840,
"SharedBytes": 60747776,
"TotalBytes": 917188608
},
"Name": "Manager Diagnostic Data",
"ProcessorStatistics": {
"KernelPercent": 0.6058,
"UserPercent": 0.5048
},
"ServiceRootUptimeSeconds": 1282378.351
}
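The same query can be scripted. The following is a minimal Python sketch using only the standard library; the function names `fetch_diagnostics` and `summarize_diagnostics` are hypothetical. Certificate verification is disabled to mirror the `-k` flag of the curl command, which is only reasonable on a trusted management network.

```python
import json
import ssl
import urllib.request


def fetch_diagnostics(bmc, token):
    """GET the ManagerDiagnosticData resource using a session token."""
    url = f"https://{bmc}/redfish/v1/Managers/Bluefield_BMC/ManagerDiagnosticData"
    request = urllib.request.Request(url, headers={"X-Auth-Token": token})
    # Equivalent of curl -k: skip certificate and hostname verification.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


def summarize_diagnostics(diag):
    """Pick out the free RW flash space and the CPU percentages."""
    cpu = diag.get("ProcessorStatistics", {})
    free_kib = diag["FreeStorageSpaceKiB"]
    return {
        "free_rw_kib": free_kib,
        "free_rw_mib": free_kib / 1024,
        "kernel_percent": cpu.get("KernelPercent"),
        "user_percent": cpu.get("UserPercent"),
    }


# Fields copied from the example output above, so no network access is needed.
SAMPLE = {
    "FreeStorageSpaceKiB": 1488,
    "ProcessorStatistics": {"KernelPercent": 0.6058, "UserPercent": 0.5048},
}
print(summarize_diagnostics(SAMPLE))
```

Against a live BMC, the call would look like `summarize_diagnostics(fetch_diagnostics(bmc_address, token))`, where both arguments are values you supply.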
Storage Cleanup Notifications
When RW flash usage exceeds 90%, a Redfish event log entry is generated to alert administrators that manual cleanup is required.
Example log:
{
"@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/7",
"@odata.type": "#LogEntry.v1_15_0.LogEntry",
"Created": "2025-09-15T13:30:43+00:00",
"EntryType": "Event",
"Id": "7",
"Message": "Processes consuming HIGH Resource Storage_RW are 91%",
"MessageArgs": [
"Storage_RW",
"91%"
],
"MessageId": "OpenBMC.0.4.BMCSystemResourceInfo",
"Name": "System Event Log Entry",
"Resolution": "None.",
"Resolved": false,
"Severity": "OK"
}
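A monitoring script could pull the resource name and usage figure out of such an entry. This Python sketch assumes, based only on the example above, that `MessageArgs` holds the resource name followed by a percentage string. The function name `parse_resource_info` is hypothetical.

```python
def parse_resource_info(entry):
    """Return (resource, used_percent) for a BMCSystemResourceInfo entry, else None."""
    if not entry.get("MessageId", "").endswith("BMCSystemResourceInfo"):
        return None
    args = entry.get("MessageArgs", [])
    if len(args) < 2:
        return None
    resource, value = args[0], args[1]
    # The example reports usage as a string such as "91%".
    return resource, float(value.rstrip("%"))


# Fields copied from the example log entry above.
ENTRY = {
    "Id": "7",
    "MessageId": "OpenBMC.0.4.BMCSystemResourceInfo",
    "MessageArgs": ["Storage_RW", "91%"],
}
print(parse_resource_info(ENTRY))
```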
Automatic Cleanup
When RW flash usage exceeds 95%, the BMC automatically frees space by deleting:
All dump files
All event logs
Files in home directories
Files in system log directories
Exceeding 99% RW flash usage can make BMC functionality unstable and impede automatic cleanup.
After automatic cleanup, a Redfish event log entry is generated. For example:
{
"@odata.id": "/redfish/v1/Systems/Bluefield/LogServices/EventLog/Entries/3",
"@odata.type": "#LogEntry.v1_15_0.LogEntry",
"Created": "2025-09-24T10:40:34+00:00",
"EntryType": "Event",
"Id": "3",
"Message": "RWFS cleanup completed.",
"Modified": "2025-09-22T10:40:34+00:00",
"Name": "System Event Log Entry",
"Resolved": false,
"Severity": "OK"
}
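To find out when the BMC last ran an automatic cleanup, a script can scan the event log for this entry. The example entry carries no `MessageId`, so the Python sketch below matches on the exact `Message` text shown above. That matching rule and the function name `last_cleanup` are assumptions made for illustration.

```python
from datetime import datetime

# Message text taken from the example entry above.
CLEANUP_MESSAGE = "RWFS cleanup completed."


def last_cleanup(entries):
    """Return the creation time of the most recent cleanup entry, or None."""
    times = [
        datetime.fromisoformat(entry["Created"])
        for entry in entries
        if entry.get("Message") == CLEANUP_MESSAGE
    ]
    return max(times) if times else None


# Trimmed copies of the two example entries in this section.
ENTRIES = [
    {
        "Id": "7",
        "Created": "2025-09-15T13:30:43+00:00",
        "Message": "Processes consuming HIGH Resource Storage_RW are 91%",
    },
    {
        "Id": "3",
        "Created": "2025-09-24T10:40:34+00:00",
        "Message": "RWFS cleanup completed.",
    },
]
print(last_cleanup(ENTRIES))
```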