M.2 Boot Drive Assembly Replacement

This section applies when you must replace both M.2 operating system drives. In this case, order a replacement assembly, which includes both M.2 NVMe drives.

M.2 Boot Drive Riser Assembly Replacement Overview

This is a high-level overview of the procedure to replace the boot drive riser assembly.

Note

If your organization purchased a media retention policy, you might be able to keep failed drives for destruction. Check with NVIDIA Enterprise Support on the status of the policy for specifics.

  1. Get a replacement M.2 boot drive assembly from NVIDIA Enterprise Support

  2. Make sure the system is shut down

  3. If cables don’t reach, label all cables and unplug them from the motherboard tray

  4. Slide the motherboard tray out until it locks in place

  5. Open the rear compartment

  6. Pull out the M.2 riser card with both M.2 disks attached

  7. Install the replacement M.2 riser card with both M.2 disks

  8. Close the rear motherboard compartment

  9. Slide the motherboard back into the system

  10. Plug in all cables using the labels as a reference

  11. Power on the system

  12. Reinstall the system using the latest DGX Operating System

  13. Ship the failed unit back to NVIDIA Enterprise Support using the packaging provided

Preparing the System for Replacement

This failure is hard to diagnose because, with both boot drives unavailable, the system cannot boot.

After the replacement part arrives from NVIDIA, shut down the system using the front power button or the BMC user interface, then open the IO door of the motherboard tray. Refer to Motherboard Tray - Opening and Closing the IO door for instructions on gaining access to the M.2 boot drive carrier.
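If the BMC is reachable over the network, the power state can also be checked, and a soft shutdown requested, from a management host over IPMI. The following is a minimal sketch, assuming `ipmitool` is installed and IPMI-over-LAN is enabled on the BMC. The address, user name, and the helper name `bmc_power_cmd` are placeholders for illustration and do not come from this guide. By default the script only prints the command it would run.

```shell
#!/usr/bin/env bash
# Sketch only: BMC_HOST and BMC_USER are placeholder values (assumptions).
BMC_HOST="${BMC_HOST:-192.0.2.10}"
BMC_USER="${BMC_USER:-admin}"

# Build the ipmitool command line for a chassis power action
# ("status" to query, "soft" to request an ACPI soft shutdown).
# -E makes ipmitool read the password from the IPMI_PASSWORD
# environment variable instead of the command line.
bmc_power_cmd() {
    echo "ipmitool -I lanplus -H ${BMC_HOST} -U ${BMC_USER} -E chassis power $1"
}

# The action defaults to a harmless status query.
action="${1:-status}"
cmd="$(bmc_power_cmd "$action")"

# Dry run unless RUN=1 is set, so nothing is powered off by accident.
if [ "${RUN:-0}" = "1" ]; then
    $cmd
else
    echo "$cmd"
fi
```

Running the script with `soft` as its argument and `RUN=1` set would request the shutdown. Running it again with `status` should eventually report that chassis power is off before any power cords are unplugged.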

Remove the M.2 Boot Drive Carrier

Before attempting to remove the M.2 boot drive carrier, make sure that you have completed the following prerequisites:

  • Label all network, monitor, and USB cables connected to the motherboard tray for easy identification when reconnecting.

  • Unplug all power cords, and all network, monitor, and USB cables.

Refer to Motherboard Tray - Opening and Closing the IO door for more information.

  1. After the IO section of the motherboard is open, unlock the M.2 drive carrier by loosening the black captive thumbscrew of the PCI card locking mechanism on the right side of the motherboard:

    _images/dgx-h100-unlock-m2-carrier.png
  2. Rotate the locking mechanism for the PCI carrier out of the way:

    _images/dgx-h100-lock-remove.png
  3. Loosen the captive screw on the support bracket of the M.2 riser card:

    _images/dgx-h100-pci-riser-loosen.png
  4. Pull the M.2 riser card from the slot:

    _images/card-remove-pci.png
  5. Lift the M.2 riser card to remove it from the system:

    _images/dgx-h100-pci-riser-lift.png

Install the M.2 Boot Drive Carrier and Close the System

  1. Position the M.2 riser card into the system:

    _images/card-m2-riser.png
  2. Install the M.2 carrier card into the PCI riser by aligning it with the slot and then pressing it against the riser:

    _images/card-m2-riser-2.png
  3. Tighten the captive screw on the support bracket of the M.2 riser card:

    _images/dgx-h100-pci-riser-tighten.png
  4. Close the latch to secure the M.2 carrier in place:

    _images/pci-carrier-lock.png
  5. Tighten the thumbscrew to make sure the locking mechanism stays in place:

    _images/rear-captive-lock.png

Re-Install the System and Complete the Procedure

  1. Close the lid and insert the motherboard tray. Refer to Motherboard Tray - Opening and Closing the IO door for more information.

  2. Reinstall the system following the instructions in the DGX OS User Guide.

  3. Confirm the system is in working order by running:

    sudo nvsm show health
    
  4. Use the packaging from the new component to ship the failed one back to NVIDIA Enterprise Support.
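In addition to the NVSM health check, it can be useful to confirm that the reinstalled operating system sees both replacement M.2 drives. The following is a minimal sketch, assuming the OS drives form a Linux software RAID (md) mirror reported in `/proc/mdstat`, as on a default DGX OS installation. The helper name `md_degraded` is hypothetical, and the `lsblk` listing also includes any NVMe data drives in the system.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the OS drives are mirrored with Linux md RAID.

# Read /proc/mdstat-style text on stdin and succeed if any array has a
# missing member, which the kernel marks with "_" in the status field
# (for example [U_] instead of [UU]).
md_degraded() {
    grep -Eq '\[[U_]*_[U_]*\]'
}

# List whole disks so both M.2 drives can be identified by model and size.
lsblk -d -o NAME,MODEL,SIZE,TRAN 2>/dev/null || true

# Report the mirror state if the kernel exposes md status.
if [ -r /proc/mdstat ]; then
    if md_degraded < /proc/mdstat; then
        echo "WARNING: at least one md array is degraded"
    else
        echo "No degraded md arrays reported"
    fi
fi
```

A mirror that is still synchronizing after installation can show progress lines in `/proc/mdstat`. This sketch only flags arrays with a missing member and does not interpret resync progress.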