Installing NVIDIA DOCA OFED

The NVIDIA DGX™ Software Stack for Red Hat Enterprise Linux does not include the NVIDIA DOCA™ OFED (OpenFabrics Enterprise Distribution) software for Linux. It is installed separately so that the DOCA OFED software, a subset of the full DOCA package, stays in sync with the Red Hat distribution kernel. This topic describes how to download, install, and upgrade the DOCA OFED software on systems that are running Red Hat Enterprise Linux.

DOCA-Host Installation Profiles

The DOCA software package contains several subsets called the DOCA-Host installation profiles, which are fully validated and tested installation packages. The following table lists the available DOCA-Host profiles:

DOCA-Host Profile    Description
-----------------    -----------
doca-ofed            Installs the same drivers and tools as MLNX_OFED through the DOCA-Host package, without other DOCA functionality.
doca-network         Intended for users who want only the networking functionality of the DOCA-Host package.
doca-all             Intended for users who want the full DOCA-Host installation, with all DOCA drivers and libraries.

For more information, refer to NVIDIA DOCA Profiles.
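
As a small illustration of how the profile choice feeds into the installation, the sketch below validates a profile name against the three profiles listed above and prints the matching yum command. The function name `doca_install_command` is hypothetical. This topic only shows `doca-ofed` being installed by that name; treating `doca-network` and `doca-all` as package names in the same way is an assumption to confirm against NVIDIA DOCA Profiles.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the yum command for a DOCA-Host profile.
# Only doca-ofed is confirmed by this topic as an installable package name;
# the other two names are assumed to work the same way.
doca_install_command() {
    case "$1" in
        doca-ofed|doca-network|doca-all)
            printf 'sudo yum install -y %s\n' "$1"
            ;;
        *)
            printf 'unknown DOCA-Host profile: %s\n' "$1" >&2
            return 1
            ;;
    esac
}

# Prints the command only; nothing is installed.
doca_install_command doca-ofed
```

Printing the command instead of running it keeps the sketch safe to try on any host.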

Prerequisites

  1. Before installing a different version of the DOCA OFED software, you must remove any DOCA OFED or MLNX_OFED software that is already installed on your system.

    • Debian-based Linux

      # Remove the installed DOCA OFED software.
      for f in $( dpkg --list | grep doca | awk '{print $2}' ); do echo "$f" ; sudo apt remove --purge "$f" -y ; done
      
      # Remove the installed MLNX_OFED software.
      sudo /usr/sbin/ofed_uninstall.sh --force
      
      # Remove dependencies that are no longer needed.
      sudo apt-get autoremove
      
    • RPM-based Linux

      # Remove the installed DOCA OFED software from the host.
      for f in $(rpm -qa | grep -i doca ) ; do sudo yum -y remove $f; done
      
      # Remove the installed MLNX_OFED software.
      sudo /usr/sbin/ofed_uninstall.sh --force
      
      sudo yum autoremove
      sudo yum makecache
      
  2. Download and install the NVIDIA RPM GPG key.

    1. Download the NVIDIA RPM-GPG-KEY-Mellanox-SHA256 key.

      wget http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox-SHA256
      
    2. Install the key.

      sudo rpm --import RPM-GPG-KEY-Mellanox-SHA256
      
    3. Verify that the key was successfully imported.

      sudo rpm -q gpg-pubkey --qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n' | grep Mellanox
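
Before running the removal loop from step 1, it can help to preview what it would touch, and to turn the key check from step 2 into a pass/fail test. The following is a minimal sketch for RPM-based hosts, not part of the official procedure. The function names `filter_doca_packages` and `has_mellanox_key` are hypothetical; the `rpm` invocations are the same ones used in the steps above, minus `sudo`, since read-only queries do not normally need it.

```shell
#!/usr/bin/env bash
# Dry-run helpers for the prerequisites: nothing here removes or imports anything.

# Keep only DOCA-related names from a package list on stdin, mirroring the
# case-insensitive `grep -i doca` filter used by the removal loop.
filter_doca_packages() {
    grep -i 'doca' || true   # grep exits 1 on no match; treat that as "nothing found"
}

# Succeed if the gpg-pubkey listing on stdin mentions a Mellanox key.
has_mellanox_key() {
    grep -qi 'mellanox'
}

if command -v rpm >/dev/null 2>&1; then
    echo "Packages the removal loop would target:"
    rpm -qa | filter_doca_packages

    if rpm -q gpg-pubkey --qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n' | has_mellanox_key; then
        echo "Mellanox GPG key is imported"
    else
        echo "Mellanox GPG key not found; import it before installing" >&2
    fi
else
    echo "rpm not found; this sketch only applies to RPM-based hosts" >&2
fi
```

An empty package list means there is nothing to remove, and a missing key means the import step still needs to be run.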
      

Installation Steps

The DOCA-Host package includes drivers, libraries, and tools to support the NVIDIA® ConnectX®-7 adapter cards and the NVIDIA® BlueField®-3 DPUs.

DGX Systems with NVIDIA ConnectX-7 Adapter Cards

To install the DOCA-Host package for the doca-ofed profile on the host:

  1. Open the Installation Files page and download the DOCA-Host installation file that matches your OS and architecture.

    Alternatively, you can download the installation file using the DOCA Downloads page.

  2. Install the downloaded repository RPM package.

    sudo rpm -Uvh <repo_file>.rpm
    
  3. Refresh the yum metadata cache so that the new repository is picked up.

    sudo yum makecache
    
  4. Check whether the kernel version on your host is supported, as listed in Supported Host OS per DOCA-Host Installation Profile.

    If the kernel version is not supported, follow the instructions described in DOCA Extra Package.

  5. Install the doca-ofed profile using the yum install command.

    sudo yum install -y doca-ofed
    
  6. Re-create the initramfs image.

    sudo dracut -f
    
  7. Reboot the system.

    sudo systemctl reboot
    
  8. Register your new Red Hat Enterprise Linux system to the Customer Portal using Red Hat Subscription-Manager.

    For more information, refer to How to register and subscribe a RHEL system to the Red Hat Customer Portal using Red Hat Subscription-Manager.

For more information about the doca-ofed profile installation on the host, refer to Installing Software on Host.
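
After the reboot, a few read-only checks can confirm that the installation took effect. The sketch below is an illustration, not an official verification procedure. It assumes that `ofed_info`, the version tool shipped with MLNX_OFED, is also installed by the doca-ofed profile, and that `mlx5_core` is the kernel driver in use for the ConnectX-7 adapters. The helper name `module_listed` is hypothetical.

```shell
#!/usr/bin/env bash
# Post-reboot sanity checks; every command here is read-only.

# Succeed if the lsmod-style listing on stdin contains the named module
# as an exact match in the first column.
module_listed() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# The running kernel is the one to compare against the supported-kernel list.
echo "Running kernel: $(uname -r)"

# Assumption: doca-ofed provides ofed_info, as MLNX_OFED does.
if command -v ofed_info >/dev/null 2>&1; then
    echo "Installed OFED version: $(ofed_info -s)"
else
    echo "ofed_info not found; doca-ofed may not be installed" >&2
fi

# Assumption: mlx5_core is the driver expected for the adapters.
if command -v lsmod >/dev/null 2>&1 && lsmod | module_listed mlx5_core; then
    echo "mlx5_core is loaded"
else
    echo "mlx5_core is not loaded" >&2
fi
```

Matching on the first column avoids false positives from modules such as `mlx5_ib` that merely list `mlx5_core` as a dependency.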

DGX Systems with BlueField-3 DPU in NIC Mode (Optional)

If your system is equipped with the NVIDIA BlueField-3 DPU, ensure that the DPU is set to NIC mode (refer to NIC Mode for BlueField-3), and then proceed with the following instructions.

  1. Install the RShim driver to manage and flash the BlueField-3 DPU.

    Follow the procedure described in Installing Prerequisites on Host for Target BlueField.

    • Choose the procedure for RPM-based Linux.

  2. Determine the BlueField-3 device ID.

    Follow the instructions described in Determining BlueField Device ID.

  3. Install the DOCA-Host software on the host.

    Follow the instructions for the selected DOCA-Host profile to install the DOCA drivers and tools as described in Installing Software on Host.
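
As a quick preliminary check before following the referenced procedures, the host's PCI listing can show whether a BlueField device is visible at all. This is a minimal sketch that assumes `lspci` from pciutils is installed and that the device description contains the string "BlueField". It prints PCI addresses only and is not a substitute for Determining BlueField Device ID. The helper name `bluefield_addresses` is hypothetical.

```shell
#!/usr/bin/env bash
# List PCI addresses of devices whose lspci description mentions BlueField.

# Print the first field (the PCI address) of each matching line on stdin.
bluefield_addresses() {
    awk 'tolower($0) ~ /bluefield/ { print $1 }'
}

if command -v lspci >/dev/null 2>&1; then
    # 15b3 is the PCI vendor ID used by Mellanox/NVIDIA networking devices.
    lspci -d 15b3: | bluefield_addresses
else
    echo "lspci not found; install pciutils to run this check" >&2
fi
```

No output means that no BlueField device was matched, which may also indicate that the description string differs on your system.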

Additional Information