Power Supply Carrier Replacement

This chapter describes how to replace a failed DGX-2 System power supply carrier.

The power supply carrier can fail because of a power distribution board failure or a faulty carrier fan.

Power Supply Carrier Replacement Overview

This is a high-level overview of the steps needed to replace a power supply carrier.
  1. Identify the failed power supply carrier using the BMC and submit a service ticket.
  2. Get a replacement power supply carrier from NVIDIA Enterprise Support (NVES).
  3. Locate the power supply carrier in the system, using the diagram as a reference.
  4. Power off the system.
  5. Remove the power cords from the three power supplies in the carrier to be replaced.
  6. Remove the failed power supply carrier and place it on a solid, stable work surface.
  7. Move the power supplies to the new carrier.
  8. Insert the new power supply carrier and secure it in place with the thumbscrew.
  9. Reconnect the power cords and make sure the power supply LEDs (IN/OUT) light up green.
  10. Use the BMC to confirm that the power supply carrier, power supplies, and fans are working correctly.
  11. Power on the system.

Identifying the Failed Power Supply Carrier

  1. Log on to the BMC.
  2. Click Sensor in the left navigation menu and review the PDB entries.

    The following diagram shows the PSU carrier location corresponding to each PSU carrier fan (FANx_PDB).

    If necessary, work with NVES to identify the failed power supply carrier. The failure could be due to a power distribution board failure.
  3. Request a new power supply carrier from NVES.
  4. When the replacement arrives, unpack the item and save the packaging.
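For sites that can also reach the BMC over IPMI, the review of the PDB entries in step 2 can be approximated with a short script. The following is a minimal sketch, not an NVIDIA tool. It assumes the sensor readings have been saved as text in the pipe-delimited layout printed by `ipmitool sensor`. The sample rows are hypothetical: only the FANx_PDB naming pattern comes from this chapter, while the other sensor name, the readings, and the thresholds are made up for illustration. Treating every status other than `ok` as suspect is also an assumption.

```python
def parse_sensor_table(text):
    """Parse pipe-delimited sensor rows into dictionaries.

    Assumes the column order name | reading | unit | status | thresholds...,
    which is the layout printed by `ipmitool sensor`.
    """
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4 or not fields[0]:
            continue  # skip blank or malformed lines
        rows.append({
            "name": fields[0],
            "reading": fields[1],
            "unit": fields[2],
            "status": fields[3],
        })
    return rows


def suspect_pdb_sensors(rows):
    """Return PDB entries whose status is anything other than 'ok'.

    Assumption: a status such as 'na' (no reading) or 'cr' (critical)
    on a PDB entry is worth raising with support.
    """
    return [r for r in rows if "PDB" in r["name"] and r["status"] != "ok"]


# Hypothetical sample data, for illustration only.
SAMPLE = """\
FAN1_PDB     | 7200.000 | RPM       | ok | na | 500.000 | na | na | na     | na
FAN2_PDB     | na       | RPM       | na | na | 500.000 | na | na | na     | na
FAN3_PDB     | 0.000    | RPM       | cr | na | 500.000 | na | na | na     | na
TEMP_AMBIENT | 24.000   | degrees C | ok | na | na      | na | na | 45.000 | na
"""

for row in suspect_pdb_sensors(parse_sensor_table(SAMPLE)):
    print(f"{row['name']}: status={row['status']} reading={row['reading']}")
```

The script only narrows down which FANx_PDB entry looks unhealthy. Mapping that entry to a physical carrier still relies on the diagram, and a power distribution board failure may not show up as a fan reading at all.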

Replacing the Power Supply Carrier

  1. Identify a solid work surface where the components can be rested for the procedure.
  2. Power off the system.
  3. Unplug the three power cords connected to the power supply carrier cage.
  4. Remove the power supply carrier.
    1. Release the power supply carrier by turning the indicated thumbscrew.

      The following diagram shows the right power supply carrier as an example.

    2. Use the chrome handle to pull out the power supply carrier.
      Important: The module is heavy because it holds three power supplies.

  5. Move the power supply units to the new carrier.
    1. Pull the power supplies out of the old carrier.

    2. Insert the power supplies into the new carrier.

  6. Replace the power supply carrier.
    1. Insert the power supply carrier into the chassis.

    2. Tighten the thumbscrew to secure the power supply carrier.

Verifying the PSU Carrier is Working

This section describes the steps needed to verify that the PSU carrier has been replaced correctly.

  1. Plug in the three power cords that were previously unplugged.
  2. Confirm that the power supply LEDs (IN and OUT) light up green.
  3. Log on to the BMC.
  4. Go to the sensor information and confirm that the new power supply carrier is operational.

    The power distribution board, PDB fans, and power supplies should all be active and working.

  5. Power on the system.
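The BMC check in step 4 could also be scripted when IPMI over LAN is enabled on the BMC. The sketch below is one possible approach under stated assumptions, not a documented NVIDIA procedure. It builds an `ipmitool` command that uses the `lanplus` interface and reads the password from the `IPMI_PASSWORD` environment variable (the `-E` option), then passes the result only if every sensor whose name contains `PDB` reports `ok`. The host name and user name in the usage comment are placeholders. Because this chapter does not list the power supply sensor names, matching on `PDB` alone is an assumption and does not cover the power supplies themselves.

```python
import subprocess


def ipmitool_sensor_command(host, user):
    """Build the argument list for a remote `ipmitool sensor` call.

    -I lanplus selects IPMI v2.0 over LAN; -E makes ipmitool read the
    password from the IPMI_PASSWORD environment variable.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", "sensor"]


def read_sensor_table(host, user):
    """Run ipmitool and return its text output (requires ipmitool installed)."""
    result = subprocess.run(
        ipmitool_sensor_command(host, user),
        capture_output=True, text=True, check=True, timeout=60,
    )
    return result.stdout


def pdb_statuses(text):
    """Map each PDB sensor name to its status column."""
    statuses = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) >= 4 and "PDB" in fields[0]:
            statuses[fields[0]] = fields[3]
    return statuses


def carrier_sensors_ok(text):
    """True only if at least one PDB sensor exists and all report 'ok'."""
    statuses = pdb_statuses(text)
    return bool(statuses) and all(s == "ok" for s in statuses.values())


# Example use (placeholder host and user; set IPMI_PASSWORD first):
#   table = read_sensor_table("bmc.example.com", "admin")
#   print("PDB sensors OK" if carrier_sensors_ok(table) else "Check the BMC")
```

An empty sensor list is deliberately treated as a failure, so a missing or unreadable carrier is not mistaken for a healthy one. The LED check and the BMC sensor page remain the reference for this procedure.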

Ship the failed power supply carrier back in the packaging that the replacement arrived in.