MLAG Upgrade Procedure
To upgrade the MLAG cluster, upgrade the standby (slave) switch first. After it reboots with the upgraded software, it rejoins the MLAG cluster.
After that, the master can be upgraded.
When the master reboots with the upgraded software, the standby node, which is still running, becomes the master. After the old master finishes rebooting, it rejoins the cluster and the configuration is then set.
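The order of operations above can be sketched as a small helper. This is only an illustration of the sequence described in this section, assuming a two-node cluster and no additional failover during the upgrade; the function name and the switch names are hypothetical and nothing here talks to a switch.

```python
def upgrade_sequence(master, standby):
    """Return (steps, final_roles) for a two-node MLAG upgrade.

    Hypothetical helper: it only encodes the order described in the text.
    """
    steps = [
        f"upgrade and reboot {standby} (standby); {master} stays master",
        f"wait for {standby} to rejoin the MLAG cluster",
        f"upgrade and reboot {master}; {standby} takes over as master",
        f"wait for {master} to rejoin the MLAG cluster",
    ]
    # After the procedure the roles are swapped (assumption: no extra failover).
    final_roles = {standby: "master", master: "standby"}
    return steps, final_roles


steps, roles = upgrade_sequence("sw-1", "sw-2")
for number, step in enumerate(steps, start=1):
    print(number, step)
```

Note that the roles end up swapped: the switch that was the standby at the start is the master at the end.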
For a more detailed description of the NVIDIA Onyx upgrade procedure, refer to the following posts:
- HowTo Upgrade MLNX-OS Software on NVIDIA switch systems
- HowTo Upgrade MLNX-OS Software on an MLAG Switch Pair
Monitoring and Troubleshooting
This section provides information and tools to monitor and debug the deployed fabric.
Make sure that the following conditions are met:
- Both switches are part of the same management subnet (connected to the same management switch, or to several switches on the same subnet).
- The management network is connected to the mgmt0 port.
- The mlag-port-channel number is identical on both switches (recommended but not obligatory).
- The same switch version is installed on both switches.
- The IPL link is in the UP state. Try to ping the other switch over the IPL.
- Align the MLAG interface mode on both the server and the switch.
For example, if you select LACP mode (active) on the MLAG interface, mode 4 (802.3ad) should be configured on the server's bond interface.
Below are failure scenarios followed by monitoring and debug instructions.
The following scenarios are discussed:
- IPL link Down
- 'Inactive Ports' and 'Active-Partial' Status on the “show mlag” command
- Management Port is Down but IPL port is UP
- MLAG Cluster issues
- IPL issues
- MLAG port issues
IPL link Down
The IPL link should be configured as a port-channel with two or more ports, but in some scenarios all of its ports may be in the “Down” state. In this case only the master switch will pass traffic.
If we run the “show mlag” command when only one “mlag-port-channel” port is configured, we will get the following:
When shutting down the IPL port on the master switch:
'Inactive Ports' and 'Active-Partial' Status on the “show mlag” command
By default, all Ethernet ports are administratively UP, while the mlag-port-channels are down, because in most cases the full network configuration is done first and the mlag-port-channel is enabled afterwards. Make sure to enable the ports when creating an mlag-port-channel and adding Ethernet interfaces to it (either static or LACP).
Note: When one port is down, it doesn't mean that the whole mlag-port-channel is down.
MLAG Ports Status Summary:
- Inactive - all ports in the mlag-port-channel are down (on both switches).
- Active-partial - some ports are down (example below, on one switch)
- Active-full - normal condition, all is good.
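The three states in the summary can be expressed as a small classification function. This is a sketch that mirrors the summary above, not the logic the switch itself uses; the function name and the boolean input format are assumptions made for illustration.

```python
def mlag_port_status(local_ports_up, peer_ports_up):
    """Classify an mlag-port-channel from the up/down state of its member ports.

    local_ports_up / peer_ports_up: lists of booleans (True = port is up),
    one list per switch. Hypothetical input format.
    """
    all_ports = list(local_ports_up) + list(peer_ports_up)
    if not all_ports:
        raise ValueError("the mlag-port-channel has no member ports")
    if not any(all_ports):
        return "Inactive"        # every port is down on both switches
    if all(all_ports):
        return "Active-full"     # normal condition
    return "Active-partial"      # some ports are down


print(mlag_port_status([True], [False]))
```

With one member port per switch, a port that is up locally and down on the peer is classified as Active-partial.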
When one mlag-port-channel is down, we will see the following output:
To enable it:
Management Port is Down but IPL port is UP
When there is no ping between the two switches on mgmt0 (e.g. the mgmt0 port is down, or a problem on the management switch blocks traffic between the switches on mgmt0), both switches will pass traffic.
There is no mention of the second switch in the cluster.
The “show mlag” and “show mlag-vip” output will look like this:
MLAG Cluster Issues
After adding the two switches to the cluster, wait a few seconds. One switch will become the master, while the other will become the slave. When performing remove/add/cluster change operations, always wait for the switch to go to “standalone master” before continuing.
Run "show mlag-vip"
Verify that the two switches are in the cluster. The other MLAG switch must reflect the same information.
If one switch does not see this MLAG-Domain, do the following:
Run "show ip route":
The management subnet must only point out of the MGMT port. Inband management is acceptable. If there is a conflict, the MGMT keepalive is sent out on the wrong port and is not advertised to the other switch.
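The route check can also be done mechanically once the routing table has been copied off the switch. The sketch below assumes the routes have already been reduced to (prefix, interface) pairs by hand or by some other tool; the function name, the input format, and the choice to ignore the default route are all assumptions, not part of the switch software.

```python
import ipaddress


def routes_leaving_wrong_port(routes, mgmt_subnet, mgmt_port="mgmt0"):
    """Return routes that overlap the management subnet but do not use mgmt_port.

    routes: iterable of (prefix, interface) pairs, e.g. ("10.10.10.0/24", "mgmt0").
    """
    mgmt_net = ipaddress.ip_network(mgmt_subnet)
    conflicts = []
    for prefix, interface in routes:
        net = ipaddress.ip_network(prefix)
        if net.prefixlen == 0:
            continue  # assumption: the default route is not treated as a conflict
        if net.overlaps(mgmt_net) and interface != mgmt_port:
            conflicts.append((prefix, interface))
    return conflicts


example_routes = [
    ("10.10.10.0/24", "mgmt0"),
    ("10.10.10.0/24", "vlan 10"),   # competing route for the management subnet
    ("192.168.1.0/24", "vlan 20"),
]
print(routes_leaving_wrong_port(example_routes, "10.10.10.0/24"))
```

Any entry returned is a candidate for the conflict described above and is worth reviewing by hand.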
If the switch still does not see the cluster: the MGMT keepalive is sent to the well-known multicast DNS group, 224.0.0.251. Check whether both switches are advertising to this group. The mgmt port is likely to see a lot of traffic, so the output needs to be captured and analyzed.
This is a transmission from master to the multicast group. Before we have a master, both switches will see this frame, and both will transmit it. After the cluster is formed, only the master will transmit this. If this frame is not seen, the cluster will not form.
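When the capture is large, the check for "who is advertising to the group" can be scripted. The sketch below assumes the group is the standard IPv4 multicast DNS address 224.0.0.251 and that the capture has already been reduced to (source IP, destination IP) pairs; the function name and input format are hypothetical.

```python
import ipaddress

MDNS_GROUP = ipaddress.ip_address("224.0.0.251")  # standard IPv4 mDNS group


def switches_missing_keepalive(packets, switch_ips):
    """Return the switch IPs that were never seen sending to the mDNS group.

    packets: iterable of (src_ip, dst_ip) strings extracted from a capture.
    """
    seen = {
        src for src, dst in packets
        if ipaddress.ip_address(dst) == MDNS_GROUP
    }
    return sorted(ip for ip in switch_ips if ip not in seen)


capture = [
    ("10.10.10.1", "224.0.0.251"),
    ("10.10.10.2", "10.10.10.1"),
]
print(switches_missing_keepalive(capture, ["10.10.10.1", "10.10.10.2"]))
```

Interpret the result with the cluster state in mind: before a master is elected both switches should appear as senders, while after the cluster is formed only the master is expected to transmit.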
IPL Issues
The IPL link needs to be up for MLAG peer ports and sync data to be available. The IPL VLAN is local to the MLAG switches and can be any number. VLAN 4000 or higher is typically used for control VLANs and is recommended.
The “show mlag” command shows IPL link state and other valuable information.
The IPL link needs to be Up. Both switches must be in the Up state in the “Member” summary. “Peering” and “Down” are not healthy states. “Peering” can be a transient state, but it should eventually move to Up.
If the IPL is up and member ports are still not visible, try to ping the remote IPL interface. Ping the local switch first and then the MLAG peer switch IPL IP address. If the ping doesn’t go through, use tcpdump to debug. If the link is up and the ping is lossy, check for traffic on the IPL interface. During normal operation, IPL traffic is a few frames per second at most. If you see a lot of traffic, it is likely an indication of a loop in the setup.
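The decision steps above can be summarized in a short triage function. This is a sketch only: the inputs are values you would collect by hand (ping loss toward the peer IPL address and the observed frame rate on the IPL), and the 10 frames-per-second limit is an assumed stand-in for "a few frames per second", not a documented threshold.

```python
def triage_ipl(peer_ping_loss_pct, ipl_frames_per_second, normal_fps_limit=10.0):
    """Suggest the next debugging step for an IPL that is up.

    normal_fps_limit is an assumed threshold, adjust it to your setup.
    """
    if peer_ping_loss_pct >= 100:
        return "no reply from peer: capture traffic on the IPL with tcpdump"
    if peer_ping_loss_pct > 0:
        if ipl_frames_per_second > normal_fps_limit:
            return "lossy ping with heavy IPL traffic: suspect a loop"
        return "lossy ping with normal IPL traffic: check mode and cabling"
    return "ping is clean: IPL connectivity looks healthy"


print(triage_ipl(peer_ping_loss_pct=20, ipl_frames_per_second=5000))
```

A lossy ping combined with a frame rate far above the limit points to a loop, which matches the guidance in the text.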
Other usual suspects: check that both sides are set to the same mode (static or LACP), and check the interface transceivers for matching serial numbers to identify cabling issues.
MLAG Port Issues
A healthy MLAG should show all ports as UP (P) and MLAG must be (U).
“Partial” means that all ports are down on the MLAG-peer switch side. This could be a result of interface MLAG being shut on the remote side or mlag protocol shut on remote side.
Peer ports not being visible means that ports in the MLAG-Peer switch are either not added in the MLAG or there are cluster issues.
If the physical port shows (S), this could result either from receiving no PDUs from the remote side or from receiving a PDU that doesn’t match what is being received on other members of the MLAG port-channel.
Check that the LACP counters increment continuously. Both the sent and the received counters must increment: one every second for fast retransmit and one every 30 seconds for slow retransmit.
If the LACP counters are incrementing and the port is still down, check the SID received on the different ports of the MLAG. They should match across all MLAG ports.
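Both checks can be scripted once the values have been read from the switch. The sketch below assumes two counter readings taken some time apart and a mapping of port name to received SID, all collected by hand; the function names and data shapes are hypothetical and do not correspond to any switch API.

```python
from collections import Counter


def lacp_counters_advancing(first, second):
    """True if both sent and received LACP PDU counters grew between readings.

    first / second: dicts with 'sent' and 'received' counts.
    """
    return (second["sent"] > first["sent"]
            and second["received"] > first["received"])


def mismatched_sids(received_sids):
    """Return the ports whose received SID differs from the most common one.

    received_sids: {port_name: sid_string}
    """
    if not received_sids:
        return []
    majority_sid = Counter(received_sids.values()).most_common(1)[0][0]
    return sorted(port for port, sid in received_sids.items()
                  if sid != majority_sid)


print(lacp_counters_advancing({"sent": 10, "received": 10},
                              {"sent": 12, "received": 10}))
print(mismatched_sids({"1/1": "aa", "1/2": "aa", "1/3": "bb"}))
```

Take the two readings far enough apart for the configured rate (more than 30 seconds for slow retransmit), otherwise a healthy port can look stalled.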
To check the SID used by the NVIDIA switch use this command:
Check the lacp property across all ports in an MLAG: