MLNX_OFED Installation
Prerequisite Packages for Installing MLNX_OFED
MLNX_OFED installation requires some prerequisite packages to be installed on the system.
Currently, CentOS installed on the DPU has a private network to the host via the RShim connection, which can be used to copy all the required packages over with Secure Copy Protocol (SCP). However, it is recommended that the DPU have direct access to the network so that "yum install" can be used to install all the required packages. For direct access to the network, set up the routing on the host:
$ iptables -t nat -o em1 -A POSTROUTING -j MASQUERADE
$ echo 1 > /proc/sys/net/ipv4/ip_forward
$ systemctl restart dhcpd
Note: "em1" is the outgoing network interface on the host. Change this according to your system.
Note: These commands are not saved in a Linux startup script and may need to be run again after the host machine reboots.
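Because the routing setup does not survive a reboot, it can help to check whether forwarding is still enabled before re-running the commands. The following is a minimal sketch, not part of the official procedure. The helper name `forwarding_enabled` is hypothetical, and the optional path argument exists only so the check can be tried against a test file.

```shell
# Hypothetical helper: succeeds when IPv4 forwarding is enabled.
# $1 optionally overrides the kernel control file that is read.
forwarding_enabled() {
    ctl_file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ -r "$ctl_file" ] && [ "$(cat "$ctl_file")" = "1" ]
}

# Example use on the host after a reboot:
# forwarding_enabled || echo "IP forwarding is off; re-run the routing setup"
```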
Reset the DPU network interface to obtain Internet access (available as long as the host is connected):
[root@localhost ~]# ifdown eth0; ifup eth0
[root@localhost ~]# ping google.com
PING google.com (172.217.10.142) 56(84) bytes of data.
64 bytes from lga34s16-in-f14.1e100.net (172.217.10.142): icmp_seq=1 ttl=53 time=19.2 ms
64 bytes from lga34s16-in-f14.1e100.net (172.217.10.142): icmp_seq=2 ttl=53 time=17.7 ms
64 bytes from lga34s16-in-f14.1e100.net (172.217.10.142): icmp_seq=3 ttl=53 time=15.8 ms
Run "yum install" to install all the required MLNX_OFED packages:
$ yum install rpm-build
$ yum group install "Development Tools"
$ yum install kernel-devel-`uname -r`
$ yum install valgrind-devel libnl3-devel python-devel
$ yum install tcl tk
Note that this is not needed if you installed CentOS 7 with the kickstart ("-k") option.
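To see which of the prerequisite packages are still missing before running "yum install", a small check along the following lines may be useful. This is a sketch under stated assumptions: the helper name `missing_packages` and the `PKG_QUERY` override variable are hypothetical, and the default query relies on `rpm -q` returning a non-zero status for packages that are not installed.

```shell
# Hypothetical helper: print each named package that the query command
# reports as not installed. PKG_QUERY defaults to "rpm -q" and can be
# overridden (for example, with a stub when trying the function out).
missing_packages() {
    for pkg in "$@"; do
        ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Example use on the DPU:
# missing_packages rpm-build valgrind-devel libnl3-devel python-devel tcl tk
```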
Removing Pre-installed Kernel Module
In some cases, the kernel is shipped with an earlier version of the mlx5_core driver taken from the upstream Linux code. This version does not support the BlueField® Arm but is loaded before the MLNX_OFED driver, and therefore needs to be removed.
To remove the kernel module from the initramfs, run the following commands:
$ mkdir /boot/tmp
$ cd /boot/tmp
$ gunzip < ../initramfs-4*64.img | cpio -i
$ rm -f lib/modules/4*/updates/mlx5_core.ko
$ rm -f lib/modules/4*/updates/tmfifo*.ko
$ cp ../initramfs-4*64.img ../initramfs-4.11.0-22.el7a.aarch64.img-bak
$ find | cpio -H newc -o | gzip -9 > ../initramfs-4*64.img
$ rpm -e mlx5_core
$ depmod -a
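The commands above rely on the glob initramfs-4*64.img matching exactly one file. As an alternative sketch, the image path can be built explicitly from the kernel release, assuming the usual CentOS/RHEL naming of /boot/initramfs-<kernel release>.img. The helper name `initramfs_path` is hypothetical.

```shell
# Hypothetical helper: build the initramfs image path for a kernel release,
# assuming the /boot/initramfs-<release>.img naming convention.
initramfs_path() {
    echo "/boot/initramfs-$1.img"
}

# Example use for the running kernel:
# img=$(initramfs_path "$(uname -r)")
# cp "$img" "$img-bak"
```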
To burn the firmware which comes with OFED after OFED is installed, run:
For CentOS and Ubuntu:
$ /opt/mellanox/mlnx-fw-updater/firmware/mlxfwmanager_sriov_dis_aarch64_<device_id> --force
For Yocto:
$ /lib/firmware/mellanox/mlxfwmanager_sriov_dis_aarch64_<device_id>
These instructions provide an example of MLNX_OFED_LINUX installation on RHEL 7.4 ALT or CentOS 7.4 ALT, where the in-box kernel is 4.11.0-22.el7a.aarch64. For a different CentOS or RHEL version, replace the kernel version 4.11.0-22.el7a.aarch64 with the corresponding in-box kernel version.
Transfer the MLNX_OFED image over to the BlueField. This can be done over the 1G OOB interface or RShim. The latter is used in this procedure. The MLNX_OFED images should be provided in the software drop:
$ scp MLNX_OFED_LINUX-4.2-X.X.X.X-rhel7.4alternate-aarch64.tgz root@192.168.100.2:/tmp
Install MLNX_OFED.
Note: If the date is not set correctly when installing MLNX_OFED, set the date first (e.g., date -s 'Mon Feb 5 15:02:10 EST 2018'), then run the installation.
If the kernel on the BlueField is 4.11.0-22.el7a.aarch64, run:
$ cd /tmp
$ tar xzf MLNX_OFED_LINUX-4.2-X.X.X.X-rhel7.4alternate-aarch64.tgz
$ cd MLNX_OFED_LINUX-4.2-X.X.X.X-rhel7.4alternate-aarch64
$ ./mlnxofedinstall --bluefield
If the kernel is different from 4.11.0-22.el7a.aarch64, run:
$ cd /tmp/MLNX_OFED_LINUX-4.2-X.X.X.X-rhel7.4alternate-aarch64
$ ./mlnxofedinstall --add-kernel-support --skip-repo --bluefield
Alternatively, the following commands may be run regardless of whether an in-box or a customized kernel is used:
$ cd /mnt
$ ./mlnxofedinstall --bluefield --auto-add-kernel-support
This step might take longer than expected to complete. If you are using a different package than the required one, run "yum install".
Note: To install MLNX_OFED_LINUX with the upstream rdma-core package (required for DPDK, SPDK, nvme-snap, etc.), add the parameters "--upstream-libs" and "--dpdk" to the mlnxofedinstall command.
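The choice between the two explicit installer invocations above depends only on whether the running kernel matches the in-box kernel the package was built for. The following sketch captures that decision. The helper name `ofed_install_args` is hypothetical, and the flags it prints are the ones listed in this procedure.

```shell
# Hypothetical helper: print the mlnxofedinstall arguments for a kernel.
# $1 = running kernel release
# $2 = in-box kernel release the MLNX_OFED package was built for
ofed_install_args() {
    if [ "$1" = "$2" ]; then
        echo "--bluefield"
    else
        echo "--add-kernel-support --skip-repo --bluefield"
    fi
}

# Example use from the extracted MLNX_OFED directory:
# ./mlnxofedinstall $(ofed_install_args "$(uname -r)" 4.11.0-22.el7a.aarch64)
```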
Disable rshim-getty service. Run:
$ systemctl disable rshim-getty
Disable NetworkManager. Run:
$ systemctl disable NetworkManager.service
$ systemctl disable NetworkManager-wait-online.service
To bring up network interfaces when the NetworkManager is disabled, run:
$ sed -i -e "\$s@\$@\, RUN+=\"/sbin/ip link set dev '%k' up\"\, RUN+=\"/sbin/ethtool -L '%k' combined 4\"@" /etc/udev/rules.d/82-net-setup-link.rules
$ echo "SUBSYSTEM==\"net\", ACTION==\"add\", RUN+=\"/sbin/ip link set dev '%k' up\"" >> /etc/udev/rules.d/82-net-setup-link.rules
Make sure that mlnx_snap.service is down. Run:
$ systemctl stop mlnx_snap.service
Restart openibd:
$ /etc/init.d/openibd restart
MLNX_OFED should be installed on any host using the DPU. This includes the host used to provision the DPU as well as the final system to which the DPU is attached.
To install MLNX_OFED on the host:
$ mount MLNX_OFED_LINUX-4.2-X.X.X.X-rhel7.4-x86_64.iso /mnt
$ cd /mnt
$ ./mlnxofedinstall
The last step of installing MLNX_OFED is to check and update the firmware. If it is possible to flash the firmware, flash it back according to the instructions in Installing MLNX_OFED on the DPU.
Manually load the mlx5_core driver on the BlueField Arm before loading it on the host, as the BlueField Arm is responsible for managing the memory. Manually blacklist the mlx5_core driver on the host and load it only after the BlueField Arm loading process is complete. To blacklist the driver, run:
$ echo "blacklist mlx5_core" > /etc/modprobe.d/blacklist-mlx5_core.conf
To prevent the Linux kernel from loading the mlx5_core driver included in the initramfs, open /boot/grub/grub.conf and append the following parameter to the vmlinux line:
rdblacklist=mlx5_core
Also, change to "ONBOOT=no" in /etc/infiniband/openib.conf.
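The ONBOOT change can also be scripted. The sketch below assumes the file already contains a line beginning with "ONBOOT=" and that GNU sed is available (for the -i option). The helper name `set_onboot_no` is hypothetical.

```shell
# Hypothetical helper: rewrite the ONBOOT setting in the given file to "no".
# Assumes an existing line that starts with "ONBOOT=" and GNU sed (-i).
set_onboot_no() {
    sed -i 's/^ONBOOT=.*/ONBOOT=no/' "$1"
}

# Example use on the host:
# set_onboot_no /etc/infiniband/openib.conf
```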
Once the BlueField Arm driver is loaded, manually load the driver via:
$ modprobe mlx5_core
When rebooting CentOS on the Arm side, unload the host-side driver first by running "rmmod mlx5_ib mlx5_core ib_core mlx_compat mlxfw". Reload the host driver after the Arm driver is loaded.
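Not every module in that list is necessarily loaded on a given host. As a sketch, the following lists only the modules that are currently loaded, in the same unload order as above, so they can be passed to rmmod. The helper name `modules_to_unload` is hypothetical, and the optional argument overrides /proc/modules so the function can be tried against a test file.

```shell
# Hypothetical helper: print, in the unload order used in this procedure,
# the modules that are currently loaded.
# $1 optionally overrides the module list file (default: /proc/modules).
modules_to_unload() {
    list="${1:-/proc/modules}"
    for mod in mlx5_ib mlx5_core ib_core mlx_compat mlxfw; do
        # Each /proc/modules line starts with the module name and a space.
        grep -q "^$mod " "$list" && echo "$mod"
    done
    return 0
}

# Example use on the host before rebooting the Arm side:
# for m in $(modules_to_unload); do rmmod "$m"; done
```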