Appendix A: Installing Software
RHEL uses the dnf package manager to install, update, and remove packages. The utility can also be used to manage repositories. For more information about using dnf, refer to the Red Hat Managing Software with the DNF tool article.
A.1 NVIDIA DOCA-OFED
NVIDIA MLNX_OFED has transitioned to DOCA-OFED. Refer to the following resources for more information:
To install NVIDIA DOCA-OFED on RHEL:
Add the NVIDIA DOCA-OFED repository to your system.
RHEL 9:
sudo dd status=none of=/etc/yum.repos.d/doca.repo << EOF
[doca]
name=DOCA latest
baseurl=https://linux.mellanox.com/public/repo/doca/latest/rhel9/arm64-sbsa/
enabled=1
gpgcheck=0
EOF
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm -y
sudo dnf makecache
RHEL 10:
sudo dd status=none of=/etc/yum.repos.d/doca.repo << EOF
[doca]
name=DOCA latest
baseurl=https://linux.mellanox.com/public/repo/doca/latest/rhel10/arm64-sbsa/
enabled=1
gpgcheck=0
EOF
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-10.noarch.rpm -y
sudo dnf makecache
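The two repository definitions above differ only in the RHEL major version embedded in the baseurl. As a minimal sketch, the stanza could be generated by a single helper. The function name doca_repo_stanza is hypothetical and not part of DOCA; the fields are copied from the commands above.

```shell
# Hypothetical helper (not part of DOCA): print the doca.repo stanza for a
# given RHEL major version, using the same fields as the commands above.
doca_repo_stanza() {
    local major="$1"
    case "$major" in
        9|10) ;;
        *) echo "unsupported RHEL major version: $major" >&2; return 1 ;;
    esac
    cat << EOF
[doca]
name=DOCA latest
baseurl=https://linux.mellanox.com/public/repo/doca/latest/rhel${major}/arm64-sbsa/
enabled=1
gpgcheck=0
EOF
}

# Example: preview the RHEL 9 stanza without writing anything to disk.
doca_repo_stanza 9
```

To write the file, the output can be piped to sudo tee /etc/yum.repos.d/doca.repo > /dev/null. On a running system, the major version can usually be read with rpm -E %rhel.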
To install the dependent software, run the following commands.
sudo dnf install kernel-headers-$(uname -r | sed 's/+64k//g') -y
sudo dnf install kernel-64k-devel-matched-$(uname -r | sed 's/+64k//g') -y
sudo dnf install kernel-devel-matched-$(uname -r | sed 's/+64k//g') -y
sudo dnf install kernel-modules-extra-$(uname -r | sed 's/+64k//g') -y
sudo dnf install kernel-64k-modules-extra-$(uname -r | sed 's/+64k//g') -y
sudo dnf install bison flex gcc make openssl-devel perl lsof rpm-build automake libtool patch kernel-rpm-macros autoconf gcc-gfortran tcl tk createrepo python3-devel checkpolicy policycoreutils dkms patch -y
Note
These steps will install the 4k kernel because NVIDIA DOCA-OFED currently has a dependency on the 4k kernel on RHEL, even when it is not being used.
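Each command above strips the +64k suffix from the running kernel release, so the same version string can be used for both the 4k and 64k package names. A small sketch of that substitution follows; the helper name base_kernel_release is hypothetical and the release strings are illustrative.

```shell
# Hypothetical helper: remove the "+64k" suffix that the 64k-page kernel
# appends to its release string, mirroring the sed expression used above.
base_kernel_release() {
    printf '%s\n' "$1" | sed 's/+64k//g'
}

# Illustrative release strings (not taken from a real system).
base_kernel_release "5.14.0-427.el9.aarch64+64k"   # prints 5.14.0-427.el9.aarch64
base_kernel_release "5.14.0-427.el9.aarch64"       # printed unchanged
```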
Install DOCA-OFED.
sudo dnf install doca-ofed -y
For GPUDirect Storage (GDS) functionality, NVIDIA recommends that you install mlnx-nfsrdma-dkms and mlnx-nvme-dkms.
When using the NVIDIA BlueField-3 SoC Management Interface, enable the rshim service.
sudo systemctl daemon-reload
sudo systemctl enable rshim
sudo systemctl start rshim
To confirm that the NVIDIA BlueField-3 SoC Management Interface is on the system, run the following commands.
# Print the PCI BDF for the BlueField-3 SoC Management Interface
sudo lspci | grep "BlueField-3 SoC Management Interface" | awk '{print $1}'
0006:03:00.2
0016:03:00.2
Update the device firmware.
sudo dnf install mlnx-fw-updater -y
If the BlueField-3 SoC Management Interface is on the system, install the BlueField bundle.
sudo bfb-install --rshim rshim<N> --bfb <image_path.bfb>
Where <N> is the rshim device identifier.
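The value of <N> comes from the rshim device name. As a rough sketch, assuming the rshim driver exposes each device as an entry named rshim<N> under /dev (an assumption about the driver's layout, not something stated above), the candidates could be listed as follows. The function name list_rshim_devices and its optional directory argument are hypothetical.

```shell
# Hypothetical helper: print rshim device names (rshim0, rshim1, ...) found
# in a directory. Defaults to /dev, which assumes the rshim driver creates
# its entries there.
list_rshim_devices() {
    dir="${1:-/dev}"
    for entry in "$dir"/rshim[0-9]*; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -e "$entry" ] || continue
        basename "$entry"
    done
}

# Prints nothing when no rshim device is present.
list_rshim_devices
```

Each name printed can be passed as the --rshim argument of bfb-install.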
Update the boot image to include NVIDIA DOCA-OFED support, then reboot the system.
sudo dracut -f
sudo reboot now
A.2 NVIDIA GPU Driver and CUDA Toolkit
Refer to the NVIDIA CUDA Installation Guide for Linux for information about installing the NVIDIA GPU driver and CUDA support for RHEL. The R535.129.03 driver is the minimum level required for the Hopper GPU.
The following commands can be used to install the latest levels required for the Hopper and Blackwell GPUs:
RHEL 9:
sudo dnf install kernel-headers-$(uname -r | sed 's/+64k//g') -y
sudo dnf install kernel-devel-matched-$(uname -r | sed 's/+64k//g') -y
sudo dnf install kernel-64k-devel-matched-$(uname -r | sed 's/+64k//g') -y
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm -y
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/sbsa/cuda-rhel9.repo
sudo dnf clean expire-cache
sudo dnf install cuda-toolkit-13-0 -y
sudo dnf module install nvidia-driver:580-open -y
sudo systemctl enable nvidia-persistenced
sudo reboot now
RHEL 10:
sudo dnf install kernel-headers-$(uname -r | sed 's/+64k//g') -y
sudo dnf install kernel-devel-matched-$(uname -r | sed 's/+64k//g') -y
sudo dnf install kernel-64k-devel-matched-$(uname -r | sed 's/+64k//g') -y
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-10.noarch.rpm -y
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel10/sbsa/cuda-rhel10.repo
sudo dnf clean expire-cache
sudo dnf install cuda-toolkit-13-0 -y
sudo dnf install nvidia-open-580 -y
sudo systemctl enable nvidia-persistenced
sudo reboot now
Note
The open-source GPU driver is required for Hopper and Blackwell GPUs.
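After the reboot, the installed driver level can be compared against the R535.129.03 minimum stated for Hopper. The following is a minimal sketch, not an official check: the helper name version_at_least is hypothetical, it relies on the version sort (sort -V) from GNU coreutils, and it queries the driver only when nvidia-smi is present.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to
# version $2, using the version sort from GNU coreutils.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum Hopper driver level stated in this section.
MIN_DRIVER="535.129.03"

# Query the installed driver only when nvidia-smi is available.
if command -v nvidia-smi > /dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_at_least "$installed" "$MIN_DRIVER"; then
        echo "driver $installed meets the minimum $MIN_DRIVER"
    else
        echo "driver $installed is older than $MIN_DRIVER"
    fi
fi
```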