Out-of-Band Management

Using cmsh to get BMC events

Base Command Manager (BCM) integrates with BMC interfaces via the Redfish API. You can use cmsh to retrieve BMC events.

  1. Get a list of physical nodes for a specific rack, filtering for a status of DOWN.

    root@a03-p1-head-01:~# cmsh
    
    [a03-p1-head-01]% device list -t physicalnode -r a06 | grep DOWN
    
  2. Show events from the BMC interface.

    [a03-p1-head-01]% device use a06-p1-dgx-02-c18
    [a03-p1-head-01->device[a06-p1-dgx-02-c18]]% bmceventlog
    
    Device              id                message                                                           severity          timestamp                   type              Result    Error
    ------------------  ----------------  ----------------------------------------------------------------  ----------------  --------------------------  ----------------  --------  --------------------------------
    a06-p1-dgx-02-c18   5424              The state of resource `Chassis_0_LeakDetector_0_Manifold` has     Warning           2025-05-12T10:28:41+00:00                     good
                                          changed to Degraded.
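
The two steps above can also be scripted. The following is a minimal Python sketch, not a supported tool: it picks the DOWN nodes out of captured `device list` output and builds a non-interactive cmsh invocation for `bmceventlog`. The function names are hypothetical, and the sketch assumes that cmsh is on the `PATH` of the head node and that its `-c` option accepts semicolon-separated commands in your version.

```python
import re
import subprocess
from typing import List

# Matches the status column as printed by "device list", e.g. "[  DOWN  ]".
DOWN_STATUS = re.compile(r"\[\s*DOWN\s*\]")


def down_nodes(device_list_output: str) -> List[str]:
    """Return the hostnames of rows whose status is DOWN.

    Assumes the hostname is the second whitespace-separated field,
    as in the "device list -t physicalnode" output shown in this section.
    """
    nodes = []
    for line in device_list_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and DOWN_STATUS.search(line):
            nodes.append(fields[1])
    return nodes


def bmceventlog_command(node: str) -> List[str]:
    """Build a non-interactive cmsh command line for one node (assumes cmsh -c)."""
    return ["cmsh", "-c", f"device use {node}; bmceventlog"]


def fetch_event_log(node: str) -> str:
    """Run cmsh on the head node and return the raw bmceventlog text."""
    result = subprocess.run(
        bmceventlog_command(node), capture_output=True, text=True, check=True
    )
    return result.stdout


# Example use on the head node (not executed here):
#   listing = subprocess.run(
#       ["cmsh", "-c", "device list -t physicalnode -r a06"],
#       capture_output=True, text=True, check=True).stdout
#   for node in down_nodes(listing):
#       print(fetch_event_log(node))
```

Keeping the parsing separate from the cmsh call makes it possible to check the filter against saved output before running it on a live cluster.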
    

Using cmsh to access BMC SoL

The BMC on DGX nodes provides two types of remote access during POST and boot: Serial over LAN (SoL) and a graphical KVM console. You can start an SoL session from cmsh.

This is covered in Section 14.7 of the BCM 11 Administration Manual.

    root@a03-p1-head-01:~# cmsh

    [a03-p1-head-01]% device
    [a03-p1-head-01->device]% use a07-p1-dgx-03-c08
    [a03-p1-head-01->device[a07-p1-dgx-03-c08]]% rconsole

    ===============================================================================
    ipmiconsole
    To exit IPMI SOL, type <ENTER> "&" "."
    ===============================================================================

Using cmsh to get BMC IPs

You can get the IP address of a device’s out-of-band management interface using cmsh. In this example, we’ll use the node’s BMC web interface to investigate why the node is in a DOWN state.

  1. Get a list of physical nodes for a specific rack, filtering for a status of DOWN.

    root@a03-p1-head-01:~# cmsh
    
    [a03-p1-head-01]% device list -t physicalnode -r a06 | grep DOWN
    PhysicalNode     a06-p1-dgx-02-c18  0E:63:D1:83:48:82  dgx-gb200        7.241.18.46      dgxnet1          [  DOWN  ], health check failed
    
  2. List the BMC interface.

    [a03-p1-head-01]% device use a06-p1-dgx-02-c18
    [a03-p1-head-01->device[a06-p1-dgx-02-c18]]% interfaces
    [a03-p1-head-01->device[a06-p1-dgx-02-c18]->interfaces]% list bmc
    
    Type         Network device name  IP              Network          Start if
    ------------ -------------------- ----------------  ----------------  --------
    bmc          rf0                  7.241.2.118      ipminet2         always
    
  3. With the IP address, we can now use a browser to access the BMC web interface.

Figures: BMC web interface login screen, dashboard, navigation, and detailed view.
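
If you look up BMC addresses often, the `list bmc` output from step 2 can be parsed instead of read by eye. The sketch below is illustrative only: the function names are hypothetical, it assumes the column order shown above (type, device name, IP), and it assumes the BMC web interface is served over HTTPS on the default port, which may not hold on your system.

```python
import ipaddress
from typing import Optional


def bmc_ip(list_bmc_output: str) -> Optional[str]:
    """Return the IP address from the first 'bmc' row of 'list bmc' output.

    Header and separator rows are skipped because their first field
    is not the interface type "bmc".
    """
    for line in list_bmc_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "bmc":
            try:
                # Validate the third column before trusting it as an address.
                return str(ipaddress.ip_address(fields[2]))
            except ValueError:
                continue
    return None


def bmc_url(ip: str) -> str:
    """Build a browser URL for the BMC (HTTPS is an assumption)."""
    return f"https://{ip}/"


sample = """\
Type         Network device name  IP              Network          Start if
------------ -------------------- ----------------  ----------------  --------
bmc          rf0                  7.241.2.118      ipminet2         always
"""

ip = bmc_ip(sample)
if ip is not None:
    print(bmc_url(ip))
```

The `ipaddress` check guards against a shifted column silently producing a wrong URL.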