Power Shelf Management Module Replacement

This topic describes how to replace a power shelf management module (PSMM) in an NVIDIA DGX™ GB200 system.

Power Shelf Management Module Replacement Overview

The PSMM replacement process consists of the following high-level steps:

  1. Identify the failed PSMM

  2. Remove the failed PSMM

  3. Install the new PSMM

  4. Update the new PSMM firmware, if required

  5. Verify the new PSMM is operational

  6. If requested, return the failed unit to NVIDIA Enterprise Support using the provided packaging

Identify the Failed Power Shelf Management Module

  1. Identify the failed PSMM by visually inspecting the LEDs on each power shelf management module. The failed module shows an amber LED.

    Close-up view of a DGX GB200 rack showing the amber LED indicator for a failed unit

  2. Unpack the new PSMM, record the MAC address printed on its label, and provide it to your system administrator.

    Side view of a PSMM showing the MAC address label
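
MAC addresses copied by hand from a label are easy to mistype, and labels may print them with hyphens, colons, dots, or no separators. The following is a minimal Python sketch that validates a transcribed address and normalizes it to one format before you pass it on. The function name `normalize_mac` and the sample address are illustrative assumptions; they are not part of any NVIDIA tool, and the format your system administrator expects may differ.

```python
import re


def normalize_mac(raw: str) -> str:
    """Return a MAC address as lowercase, colon-separated octets.

    Hypothetical helper for illustration only; not part of any NVIDIA tool.
    """
    # Drop common separators (colon, hyphen, dot, whitespace) from the label text.
    digits = re.sub(r"[\s:.\-]", "", raw)
    # A 48-bit MAC address is exactly 12 hexadecimal digits.
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError(f"not a valid MAC address: {raw!r}")
    digits = digits.lower()
    # Regroup the digits into six two-character octets.
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    # Made-up sample value from the range reserved for documentation.
    print(normalize_mac("00-00-5E-00-53-01"))
```

A transcription error such as a missing digit or a non-hexadecimal character raises `ValueError`, which prompts a second look at the label before the address is recorded.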

Replace the Power Shelf Management Module

Note

The power supplies continue to operate while the power shelf management module is being replaced.

  1. Disconnect the network management cable from the failed PSMM.

    Image showing the network management cable location on the PSMM

  2. Push the release tab up and pull the handle to release the PSMM from the power shelf.

  3. Pull the PSMM straight out of the power shelf and set it aside.

    Image showing the PSMM being removed from the power shelf

  4. Insert the new PSMM into the power shelf until it locks into place. A click indicates that it is fully seated.

  5. If requested, return the failed module to NVIDIA Enterprise Support using the provided packaging.
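
After the network management cable is reconnected to the new PSMM, a basic reachability check can serve as a first sanity test before the verification step listed in the overview. The following is a minimal Python sketch under stated assumptions: the host name `psmm.example.internal` and port 443 are hypothetical placeholders, and the source does not state which address or services the PSMM exposes. A successful TCP connection only shows that something answers at that address; it does not confirm that the PSMM is healthy.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; not part of any NVIDIA tool.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder values; ask your system administrator for the real address and port.
    host, port = "psmm.example.internal", 443
    state = "reachable" if is_reachable(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

If the check fails, confirm that the cable is seated and that your system administrator has registered the new MAC address before investigating further.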