Troubleshooting

As each PSU is plugged in, make sure that the green power LEDs on the PSU come on.

Warning

If the power supplies cannot supply enough power, the management module may shut down some of the leafs.

Issue 1. The AC power LED is off

  1. Check that the power cable is the correct power cable for your country.

  2. Check that the power cable is plugged into a working outlet.

  3. Check that the voltage supplied through the power cable is within the range of 180-240 volts AC.

  4. Remove and reinstall the power cable.

  5. Check the circuit breakers to be sure that the breaker has not tripped.

  6. Check that the power cable is good. If it is damaged or suspect, replace the power cable.

  7. If the AC power LED is green but the DC power LED is amber, replace the PSU.

  8. Make sure the power supply is inserted.
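
The checklist above can be summarized as a small decision helper. This is a minimal sketch only: the names `Led`, `voltage_in_range`, and `psu_next_action` are hypothetical, and treating "both LEDs green" as healthy is an assumption rather than something stated in the steps.

```python
from enum import Enum


class Led(Enum):
    OFF = "off"
    GREEN = "green"
    AMBER = "amber"


# Input voltage range quoted in step 3 of the checklist (volts AC).
MIN_VAC = 180
MAX_VAC = 240


def voltage_in_range(measured_vac: float) -> bool:
    """Return True if a measured supply voltage is within 180-240 VAC."""
    return MIN_VAC <= measured_vac <= MAX_VAC


def psu_next_action(ac_led: Led, dc_led: Led) -> str:
    """Map the two PSU LEDs to the next troubleshooting action."""
    if ac_led is Led.OFF:
        # Steps 1-6 and 8: everything on the AC input side.
        return "check cable, outlet, voltage, breaker, and PSU seating"
    if ac_led is Led.GREEN and dc_led is Led.AMBER:
        # Step 7.
        return "replace the PSU"
    if ac_led is Led.GREEN and dc_led is Led.GREEN:
        # Assumption: both LEDs green means the PSU is healthy.
        return "no action"
    return "unlisted LED combination; consult the hardware manual"
```

For example, `psu_next_action(Led.GREEN, Led.AMBER)` returns the step 7 action, and `voltage_in_range(120)` is false because 120 VAC is below the quoted range.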

Issue 1. The power LED for the leaf module is off

  1. Make sure that all of the PSUs are showing DC OK.

  2. Uninstall and reinstall the leaf module.

  3. An amber LED indicates a fault in the module. If the amber LED is on, uninstall and reinstall the leaf module.

  4. If uninstalling and reinstalling the leaf module does not work, burn the latest FW on the leaf module, and then uninstall and reinstall the leaf module again.

  5. Replace the leaf module with a new one.

Warning

Should any of the modules shut down due to over temperature, wait 5 minutes and then follow the procedure starting with Step 2.

Issue 2. The physical link LED for the InfiniBand connector does not come on

  1. Check that both ends of the cable are connected.

  2. Check that the locks on the ends are secured.

  3. Make sure that the latest FW version is installed on both the HCA card and the switch.

  4. If media adapters are used, check that all connections are good, tight, and secure.

  5. Replace the cable.

Issue 3. The activity indication does not come on

Check that the Subnet Manager has been started.
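
If a host with the `infiniband-diags` package is attached to the port in question, the two LED checks above can be cross-checked from the host side. The sketch below parses text in the layout printed by `ibstat` and looks at two fields: `Physical state` and `State`. Mapping the physical link LED to `Physical state` and the activity indication to `State` is an assumption made for illustration, as are the function names and the sample text.

```python
import re


def port_states(ibstat_text: str) -> list[dict]:
    """Collect 'State' and 'Physical state' for each port in ibstat-style text."""
    ports = []
    current = None
    for raw in ibstat_text.splitlines():
        line = raw.strip()
        m = re.match(r"Port (\d+):$", line)
        if m:
            current = {"port": int(m.group(1))}
            ports.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            if key in ("State", "Physical state"):
                current[key] = value.strip()
    return ports


def diagnose(port: dict) -> str:
    """Suggest which checklist applies to a port."""
    if port.get("Physical state") != "LinkUp":
        return "physical link down: check cable ends, locks, FW, adapters"
    if port.get("State") != "Active":
        return "link up but not active: check that the Subnet Manager has been started"
    return "ok"


# Illustrative sample only, not captured from a real system.
SAMPLE = """CA 'mlx5_0'
    Port 1:
        State: Initializing
        Physical state: LinkUp
    Port 2:
        State: Down
        Physical state: Polling
"""

for p in port_states(SAMPLE):
    print(p["port"], diagnose(p))
```

A port that reports `LinkUp` but is not `Active` usually points at the Subnet Manager rather than the cable, which is why the two cases are reported separately.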

Issue 1. Amber Status LED for the chassis on the management module is lit

  1. Check the MLNX-OS management for confirmation and possible explanation of the alert.

  2. Reset the master management module by pushing the reset button. If you have two management modules installed, this will convert the master management module to the slave and convert the slave to the master.

    Warning

    If there is only one management module in the chassis, all of the leafs and ports are reset by bringing them down and powering them up when the management module is removed.

  3. Make sure the S.Fans and L.Fans LEDs are green.

  4. Make sure that the spine and the leafs both have the same version of FW.

  5. Reburn the FW and remove and reinstall the management module.

  6. If you are running the chassis with only one management module, remove and reinstall the management module. Make sure the mating connectors of the unit are free of any dirt and/or obstacles. See Management Module.

  7. If you are running the chassis with only one management module, replace the management module.

Issue 2. Amber LED for the leaf fan on the management module is lit

  1. Check the MLNX-OS management for confirmation and possible explanation of the alert.

  2. Make sure that there is nothing blocking the front or rear of the chassis and that the fan modules and ventilation holes are not blocked (especially dust over the holes).

  3. If you find dust blocking the holes, it is recommended that you clean the fan unit and remove the dust from the front and rear panels of the switch using a vacuum cleaner.

  4. Determine which fan module is problematic by checking the status LED on each fan module.

  5. Remove and reinstall the problematic fan unit. Make sure the mating connector of the unit is free of any dirt and/or obstacles. See Fan Modules.

  6. Replace the leaf fan module.

Important

Replace defective leaf fan modules as soon as they are identified.

Important

Should any of the modules shut down due to over temperature, follow the procedure starting in Step 2.

Issue 3. Amber LED for the spine fan on the management module is lit

  1. Check the MLNX-OS management for confirmation and possible explanation of the alert.

  2. Determine which spine has a defective fan by checking the Fan LEDs on all of the spines.

  3. Make sure that there is nothing blocking the front or rear of the chassis and that the fan modules and ventilation holes are not blocked (especially dust over the holes).

  4. If you find dust blocking the holes, it is recommended that you clean the fan unit and remove the dust from the front and rear panels of the switch using a vacuum cleaner.

  5. Remove and reinstall the fan unit of the spine. Make sure the mating connector of the unit is free of any dirt and/or obstacles. See Fan Modules.

  6. Replace the spine fan module.

Important

Replace defective spine fan modules as soon as they are identified.

Issue 1. Amber LED on the spine module is lit

  1. Check the MLNX-OS management for confirmation and possible explanation of the alert.

  2. Make sure that there is nothing blocking the front or rear of the chassis and that the fan modules and ventilation holes are not blocked (especially dust over the holes).

  3. If you find dust blocking the holes, it is recommended that you clean the fan unit and remove the dust from the front and rear panels of the switch using a vacuum cleaner.

  4. Remove and reinstall the spine module. Make sure the mating connectors of the unit are free of any dirt and/or obstacles. See Spine Modules.

  5. Make sure that the spine and the leafs both have the same version of FW.

  6. Reinstall the FW and remove and reinstall the spine.

  7. Replace the spine module.
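
The FW version check in step 5 is easy to get wrong by eye when many modules are installed. As a rough sketch, assuming you have already collected a version string per module by whatever means your management software provides, the comparison could look like the following. The function name, module names, and version strings are all hypothetical.

```python
from collections import Counter


def fw_mismatches(versions: dict[str, str]) -> dict[str, str]:
    """Return the modules whose FW version differs from the most common one."""
    if not versions:
        return {}
    majority, _ = Counter(versions.values()).most_common(1)[0]
    return {name: v for name, v in versions.items() if v != majority}


# Hypothetical inventory; names and versions are placeholders.
inventory = {
    "spine1": "9.2.8000",
    "leaf1": "9.2.8000",
    "leaf2": "9.2.6000",
}

print(fw_mismatches(inventory))
```

An empty result means every module reports the same version. The sketch treats the most common version as the reference, which may not be the version you actually want installed.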

Issue 1. The last software update did not succeed

  1. Connect the RS232 (console) connector to a laptop.

  2. Push the reset button on the switch or management module.

  3. You have approximately 5 seconds to stop the U-Boot autoboot by pressing Ctrl+B.

  4. Choose the image to upload. Only use image 1 or image 2.

    U-Boot 2009.01-mlnx1.4 (May 12 2010 - 14:08:15)

    CPU:   AMCC PowerPC 460EX Rev. A at 1000 MHz (PLB=200, OPB=100, EBC=100 MHz)
           Security/Kasumi support
           Bootstrap Option H - Boot ROM Location I2C (Addr 0x52)
           Internal PCI arbiter disabled
           32 kB I-Cache 32 kB D-Cache
    Board: Mellanox PPC460EX Board
    FDEF:  No
    I2C:   ready
    DRAM:  2 GB (ECC enabled, 400 MHz, CL3)
    FLASH: 16 MB
    NAND:  1024 MiB
    PCI:   Bus Dev VenId DevId Class Int
    PCIE0: link is not up.
    PCIE1: successfully set as root-complex
            01  00  15b3  bd34  0c06  00
    Net:   ppc_4xx_eth0, ppc_4xx_eth1

    Hit Ctrl+B to stop autoboot: 0

    Mellanox FabricIT

    Boot Menu:
    1. EFM_PPC_M460EX EFM_1.1.1000 2010-06-24 16:32:03 ppc
    2. EFM_PPC_M460EX EFM_1.1.1200 2010-06-25 18:00:03 ppc
    3. U-Boot prompt
    Choice:

  5. Select the image to boot.
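
If the console session is captured to a log, the boot menu can be parsed to list only the entries that step 4 allows (image 1 or image 2). The sketch below is illustrative: the function name `bootable_images` is hypothetical, and the sample text is the boot menu from the console output shown above.

```python
import re

# A numbered menu line such as "1. EFM_PPC_M460EX EFM_1.1.1000 ...".
MENU_ENTRY = re.compile(r"^\s*(\d+)\.\s+(.+?)\s*$")


def bootable_images(console_text: str) -> dict[int, str]:
    """Return menu entries 1 and 2 that appear after the 'Boot Menu:' line."""
    entries = {}
    in_menu = False
    for line in console_text.splitlines():
        if line.strip() == "Boot Menu:":
            in_menu = True
            continue
        if not in_menu:
            continue
        m = MENU_ENTRY.match(line)
        if m:
            entries[int(m.group(1))] = m.group(2)
    # Only image 1 or image 2 may be chosen; entry 3 is the U-Boot prompt.
    return {n: label for n, label in entries.items() if n in (1, 2)}


SAMPLE = """Mellanox FabricIT

Boot Menu:
1. EFM_PPC_M460EX EFM_1.1.1000 2010-06-24 16:32:03 ppc
2. EFM_PPC_M460EX EFM_1.1.1200 2010-06-25 18:00:03 ppc
3. U-Boot prompt
Choice:
"""

for number, label in bootable_images(SAMPLE).items():
    print(number, label)
```

Listing the labels this way makes it easier to see which image carries the older software version before choosing one to boot.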

For more detailed instructions concerning MLNX-OS® software see the Mellanox MLNX-OS® Switch-IB™ and Switch-IB™ 2 Software WebUI User’s Manual or the Mellanox MLNX-OS® Switch-IB™ Software User Manual.

© Copyright 2023, NVIDIA. Last updated on May 22, 2023.