M.2 Boot Drive Assembly Replacement

When you must replace both M.2 operating system drives, order a replacement assembly, which includes both M.2 NVMe drives.

M.2 Boot Drive Riser Assembly Replacement Overview

This is a high-level overview of the procedure to replace the boot drive riser assembly.

Note

If your organization purchased a media retention policy, you might be able to keep the failed drives for destruction. Check with NVIDIA Enterprise Support for the status and specifics of the policy.

  1. Get a replacement M.2 boot drive assembly from NVIDIA Enterprise Support.

  2. Make sure the system is shut down.

  3. If the cables are too short to let the motherboard tray slide out, label all cables and unplug them from the motherboard tray.

  4. Slide the motherboard tray out until it locks in place.

  5. Open the rear compartment.

  6. Pull out the existing M.2 riser card with both M.2 drives attached.

  7. Install the replacement M.2 riser card with both M.2 drives attached.

  8. Close the rear motherboard compartment.

  9. Slide the motherboard tray back into the system.

  10. Plug in all cables using the labels as a reference.

  11. Power on the system.

  12. Reinstall the latest DGX operating system.

  13. Ship the failed unit to NVIDIA Enterprise Support using the packaging provided.

Preparing the System for Replacement

This failure can be hard to diagnose: with both boot drives unavailable, the system does not boot.

After the replacement part arrives from NVIDIA, shut down the system and open the I/O door of the motherboard tray. Refer to Motherboard Tray - Opening and Closing the I/O Door to access the M.2 boot drive carrier.

Remove the M.2 Boot Drive Carrier

Before you remove the M.2 boot drive carrier, complete the following prerequisites:

  • Label all network, monitor, and USB cables connected to the motherboard tray for easy identification when reconnecting.

  • Unplug all power cords and all network, monitor, and USB cables.

For more information, refer to Motherboard Tray - Opening and Closing the I/O Door.

  1. After the I/O section of the motherboard is open, loosen the black captive thumbscrew on the right side of the motherboard for the PCI card locking mechanism:

    _images/dgx-b200-unlock-m2-carrier.png
  2. Rotate the locking mechanism for the PCI carrier out of the way:

    _images/dgx-h100-lock-remove.png
  3. Loosen the captive screw on the support bracket of the M.2 riser card:

    _images/dgx-h100-pci-riser-loosen.png
  4. Pull the M.2 riser card from the slot:

    _images/card-remove-pci.png
  5. Lift the M.2 riser card to remove it from the system:

    _images/dgx-h100-pci-riser-lift.png

Install the M.2 Boot Drive Carrier and Close the System

  1. Lower the M.2 riser card into the slot:

    _images/card-m2-riser.png
  2. Install the M.2 carrier card into the PCI riser by aligning it with the slot and then pressing it against the PCI slot riser:

    _images/card-m2-riser-2.png
  3. Tighten the captive screw on the support bracket of the M.2 PCI riser card:

    _images/dgx-h100-pci-riser-tighten.png
  4. Close the latch to secure the M.2 carrier card in place:

    _images/pci-carrier-lock.png
  5. Tighten the thumbscrew to ensure the locking mechanism stays in place:

    _images/dgx-b200-rear-captive-lock.png

Reinstall the System and Complete the Procedure

  1. Close the lid and insert the motherboard tray. Refer to Motherboard Tray - Opening and Closing the I/O Door for more information.

  2. Reinstall the system following the instructions in the DGX OS User Guide.

  3. Confirm the system is in working order by running:

    sudo nvsm show health
    
  4. Use the packaging from the new component to send the failed unit to NVIDIA Enterprise Support.
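
As an optional cross-check alongside the health command in step 3, you can confirm that the operating system detects the NVMe drives after the reinstall. The following is a minimal sketch, not part of the official procedure. The helper name count_nvme_disks is hypothetical, and it assumes the drives appear as nvme* block devices of type "disk" in lsblk output. The expected count depends on your configuration, because it includes any NVMe data drives in addition to the two M.2 boot drives.

```shell
#!/usr/bin/env bash
# Sketch only: count NVMe whole disks in lsblk-style output.
# Assumption: input lines have the form "NAME TYPE", as produced by
#   lsblk -d -n -o NAME,TYPE
# count_nvme_disks is a hypothetical helper, not an NVIDIA tool.

count_nvme_disks() {
    # Read "NAME TYPE" lines on stdin; print how many are NVMe disks.
    # "n + 0" forces a numeric 0 when no line matches.
    awk '$1 ~ /^nvme/ && $2 == "disk" { n++ } END { print n + 0 }'
}

# Usage on a running system (commented out so the file is safe to source):
#   lsblk -d -n -o NAME,TYPE | count_nvme_disks
```

Compare the printed number with the number of NVMe drives installed in your system. A lower number than expected suggests that a drive is not seated or not detected, in which case recheck the riser card installation before contacting NVIDIA Enterprise Support.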