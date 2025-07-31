NVIDIA high-end modular switch systems support redundant management modules. Chassis HA reduces downtime by ensuring continuity of operation even when a management module fails. Chassis HA management allows the system administrator to associate a single IP address with the appliance. Connecting to that IP address allows the user to review and change the system’s chassis parameters regardless of which management module is active.

Every node in the Chassis HA has one of the following roles/modes:

Master—the node that manages chassis configurations and services the chassis IP addresses

Slave—the node that replaces the Master node and takes over its responsibilities once the Master node is down

Note: The master node is the only node that has access to chassis components such as temperature, inventory and firmware.

The CPU role of the current management node can be identified using one of these methods:

Running the command “show chassis ha”:

    switch (config) # show chassis ha
    2-node HA state:
    Box management IPv4: 10.7.146.44/24
    Box management IPv6: fdfd:fdfd:7:145::1033:47fd/64
    interface: mgmt0
    local role: master
    local slot: 1
    other state: not-present
    reset count: 0

Check the LEDs in the management modules as displayed in the figure below

Go to the WebUI → System → Modules page and see the information on the LEDs
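When the role has to be checked from a script rather than by eye, the “show chassis ha” output can be split into key/value pairs. The following is a minimal Python sketch, assuming the output has the same "key: value" layout as the sample in this section. The function name `parse_chassis_ha` is hypothetical and is not part of any NVIDIA tool.

```python
# Sample text copied from the "show chassis ha" output in this section.
SAMPLE = """switch (config) # show chassis ha
2-node HA state:
Box management IPv4: 10.7.146.44/24
Box management IPv6: fdfd:fdfd:7:145::1033:47fd/64
interface: mgmt0
local role: master
local slot: 1
other state: not-present
reset count: 0
"""


def parse_chassis_ha(output: str) -> dict:
    """Parse 'key: value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the echoed prompt line, and the section header.
        if not line or "#" in line or line.endswith(":"):
            continue
        # Split on the first colon only, so IPv6 addresses stay intact.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    info = parse_chassis_ha(SAMPLE)
    print(info["local role"], info["reset count"])
```

Splitting on the first colon only is what keeps the IPv6 address in one piece. All values are returned as strings.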

When a CPU is not responding to internal communication with the other CPU, the non-responding CPU is reset by the other CPU. Each time a CPU resets, a counter is incremented. After 5 resets, a CPU is considered to have malfunctioned and is shut down.
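The reset-and-shutdown rule above can be illustrated with a small model. This is only a sketch of the described behavior, not the switch firmware logic. The class and method names are hypothetical, and it assumes the CPU is shut down as soon as the counter reaches 5.

```python
MAX_RESETS = 5  # from the text: after 5 resets the CPU is shut down


class PeerCpu:
    """Toy model of a management CPU as seen by the other CPU."""

    def __init__(self) -> None:
        self.reset_count = 0
        self.powered_on = True

    def report_unresponsive(self) -> None:
        """Called when this CPU fails to answer internal communication."""
        if not self.powered_on:
            return  # already shut down; nothing left to reset
        self.reset_count += 1  # each reset increments the counter
        if self.reset_count >= MAX_RESETS:
            # Assumption: shutdown happens once the counter reaches the limit.
            self.powered_on = False


if __name__ == "__main__":
    cpu = PeerCpu()
    for _ in range(MAX_RESETS):
        cpu.report_unresponsive()
    print(cpu.reset_count, cpu.powered_on)
```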

To verify how many times a CPU is reset, run:

To verify if a CPU has been shut down, either run:

Or check the System page in the WebUI, where the management module figure is grayed out.

To re-enable a CPU that was shut down, first replace it and then run “chassis ha reset other”.

The Box IP (BIP) centralized management infrastructure enables you to configure and monitor the system. The BIP continues to function even if one of the management blades fails. The Box IP is defined by running the command “chassis ha bip <board IP address>”. The created BIP is used as an alias for the master’s IP address. For example:

    switch [standalone: master] (config) # chassis ha bip 192.168.10.100 255.255.255.0
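Before typing the BIP command, it can help to sanity-check the address and netmask offline. The sketch below uses Python's standard `ipaddress` module and builds the command string in the form shown in the example above. The helper name `build_bip_command` is hypothetical. Rejecting the network and broadcast addresses is an extra precaution assumed here, not a rule stated in this document.

```python
import ipaddress


def build_bip_command(address: str, netmask: str) -> str:
    """Validate an IPv4 address/netmask pair and return the CLI command text."""
    # Raises ValueError (or a subclass) on a malformed address or netmask.
    iface = ipaddress.IPv4Interface(f"{address}/{netmask}")
    net = iface.network
    # Assumption: a BIP should be a usable host address in its subnet.
    if net.prefixlen < 31 and iface.ip in (net.network_address,
                                           net.broadcast_address):
        raise ValueError(f"{address} is not a host address in {net}")
    return f"chassis ha bip {str(iface.ip)} {str(iface.netmask)}"


if __name__ == "__main__":
    print(build_bip_command("192.168.10.100", "255.255.255.0"))
```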





System configuration changes should be performed on the master through the BIP; otherwise, they are overridden by the master configuration.

Chassis HA is based on database replication, which replicates the entire master configuration to the slave. Data such as the chassis configuration is replicated. However, runtime information such as time, logs, and active user lists is not copied. Additionally, node-specific configuration such as the host name and IP address is not copied.
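The split between replicated and non-replicated data can be pictured as a filter over the master's state. The following sketch is purely illustrative. The key names and the function `replicated_view` are hypothetical and do not reflect the actual database schema.

```python
# Hypothetical key names for the categories named in the text.
RUNTIME_KEYS = {"time", "logs", "active_users"}    # runtime info: not copied
NODE_SPECIFIC_KEYS = {"hostname", "ip_address"}    # per-node config: not copied


def replicated_view(master_state: dict) -> dict:
    """Return the part of the master state that would reach the slave."""
    excluded = RUNTIME_KEYS | NODE_SPECIFIC_KEYS
    return {k: v for k, v in master_state.items() if k not in excluded}


if __name__ == "__main__":
    master = {
        "chassis_config": {"bip": "192.168.10.100"},
        "hostname": "mgmt-a",
        "ip_address": "10.7.146.44",
        "logs": ["..."],
    }
    print(replicated_view(master))
```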

Note: Chassis HA requires connectivity of both management modules (mgmt0, mgmt1) in the same broadcast domain.

The SM commands are only visible to the SM HA master in a modular system. This node displays "master" in its CLI prompt:

    switch [standalone: master] (config) #

If the node shows "slave" or "unknown", the node is not the master and therefore cannot use the IB SM commands. "unknown" indicates that mgmt0 is not LinkUp and is not assigned a valid IPv4 address.

On modular systems, the mgmt0 interface on all installed management modules must be:

LinkUp

With a valid IPv4 address

In the same L2 broadcast domain

Even if only one module is installed, it must have a mgmt0 interface that is LinkUp and has a valid IPv4 address.
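An automation script can read the node's role from the CLI prompt before attempting any SM command. The sketch below assumes the prompt keeps the `[<mode>: <role>]` form shown in this section. The function names are hypothetical.

```python
import re
from typing import Optional

# Matches the bracketed part of a prompt such as "[standalone: master]".
PROMPT_ROLE_RE = re.compile(r"\[[^\]:]*:\s*(\w+)\]")


def role_from_prompt(prompt: str) -> Optional[str]:
    """Return the role shown in the prompt, or None if no role is present."""
    match = PROMPT_ROLE_RE.search(prompt)
    return match.group(1) if match else None


def can_use_sm_commands(prompt: str) -> bool:
    """Only the node whose prompt shows "master" can use the SM commands."""
    return role_from_prompt(prompt) == "master"


if __name__ == "__main__":
    print(can_use_sm_commands("switch [standalone: master] (config) #"))
    print(can_use_sm_commands("switch [standalone: unknown] (config) #"))
```

A result of "slave" or "unknown" means the script should stop. In the "unknown" case, check the mgmt0 link state and IPv4 address first.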





Management CPU functional takeover takes approximately 20-30 seconds. However, after plugging in a module, wait approximately 3 minutes before making any other hardware change. During the takeover process, the Master LED status is differentiated by a color scheme. To verify the system’s status, run the “show chassis ha” command on both management modules.

If the CPU malfunctions, the system resets it 5 times in an attempt to resolve the issue. If the CPU is still not active after these resets, the system powers it off along with its attached spine. Once the CPU is powered off, the user should replace the malfunctioning CPU module. To power on the CPU and the attached spine, plug the module in, log into the Master CPU and run the “chassis ha power enable other” command.

Note: Although the LEDs are functional during the takeover, wait for approximately 3 minutes before making any other hardware change.

Master example:

Slave example:

