Power Supply Replacement#

This topic describes how to replace the power supplies (PSUs) of the NVIDIA DGX™ GB200 system.

Power Supply Replacement Overview#

The PSU replacement process consists of the following high-level steps:

  1. Identify the failed PSU

  2. Remove the failed PSU

  3. Install the new PSU

  4. Update the new PSU firmware, if required

  5. Verify the new PSU is operational

  6. If requested, return the failed unit to NVIDIA Enterprise Support using the provided packaging

Identify the Failed Power Supply#

Identify the failed PSU by visually inspecting the LEDs on the power supply modules. The failed module shows an amber LED.

Close-up view of the LED indicators on a PSU

The power supplies are N+N redundant, so any one power supply can be replaced as long as at least four power shelves are fully active and healthy.

Close-up view of a DGX GB200 rack showing the amber LED indicator for a failed unit

Replace the Power Supply#

  1. Once you’ve identified the failed power supply, push the release tab to the left and pull on the handle to eject it from the power shelf.

    Image showing the release tab on a PSU
  2. Pull the power supply straight out of the power shelf and set it aside.

    Image showing the PSU being removed from the power shelf
  3. Insert the new PSU until it locks into place. You’ll hear an audible click when it’s fully seated.

  4. Run the sudo nvsm show psus command and confirm that every PSU, including the new one, reports Status_Health=OK.

  5. If requested, return the failed unit to NVIDIA Enterprise Support using the provided packaging.
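
The verification step above can also be scripted. The following is a minimal sketch, not an NVIDIA-provided tool. It assumes that the output of sudo nvsm show psus contains one line of the form Status_Health = <value> per PSU (with or without spaces around the equals sign). The function names psu_health_values, all_psus_ok, and check_psus are hypothetical and chosen only for illustration.

```python
import re
import subprocess

# Assumption: each PSU entry in the command output includes a line such as
# "Status_Health = OK" or "Status_Health=OK". The exact layout of the
# surrounding output is not relied upon.
HEALTH_RE = re.compile(r"Status_Health\s*=\s*(\S+)")


def psu_health_values(nvsm_output):
    """Return every Status_Health value found in the text, in order."""
    return HEALTH_RE.findall(nvsm_output)


def all_psus_ok(nvsm_output):
    """Return True only if at least one PSU is listed and all report OK."""
    values = psu_health_values(nvsm_output)
    return bool(values) and all(value == "OK" for value in values)


def check_psus():
    """Run the verification command from the procedure and evaluate it.

    Requires sudo rights on the system; raises CalledProcessError if the
    command exits with a non-zero status.
    """
    result = subprocess.run(
        ["sudo", "nvsm", "show", "psus"],
        capture_output=True,
        text=True,
        check=True,
    )
    return all_psus_ok(result.stdout)
```

Calling check_psus() on the system returns True when every PSU found in the output reports OK. An empty result is treated as a failure so that missing or unparseable output is not mistaken for a healthy system.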