The installation script, mlnxofedinstall, performs the following:

Discovers the currently installed kernel

Uninstalls any software stacks that are part of the standard operating system distribution or another vendor's commercial stack

Installs the MLNX_OFED_LINUX binary RPMs (if they are available for the current kernel)

Identifies the currently installed InfiniBand and Ethernet network adapters and automatically upgrades the firmware

Note: To perform a firmware upgrade using customized firmware binaries, provide the path to the folder that contains the firmware binary files with the --fw-image-dir option. When this option is used, the firmware version embedded in the MLNX_OFED package is ignored.

Example:

    ./mlnxofedinstall --fw-image-dir /tmp/my_fw_bin_files

Note: If the driver detects unsupported cards on the system, it will abort the installation procedure. To avoid this, add the --skip-unsupported-devices-check flag during installation.

Usage

    /mnt/mlnxofedinstall [OPTIONS]

The installation script removes all previously installed OFED packages and re-installs from scratch. You will be prompted to acknowledge the deletion of the old packages.

Note: Pre-existing configuration files will be saved with the extension “.conf.rpmsave”.
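To review what the installer preserved, the saved files can be located by their suffix. The following is a minimal sketch; the function name list_saved_configs and the default search root /etc are assumptions for illustration, not part of MLNX_OFED.

```shell
#!/usr/bin/env bash
# List configuration files preserved with the ".conf.rpmsave" suffix.
# $1: directory to search (hypothetical default: /etc)
list_saved_configs() {
    local root="${1:-/etc}"
    # Print matching regular files in a stable, sorted order.
    find "$root" -type f -name '*.conf.rpmsave' 2>/dev/null | sort
}
```

Calling list_saved_configs with no argument searches /etc; pass another directory if the configuration files live elsewhere.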

If you need to install OFED on an entire (homogeneous) cluster, a common strategy is to mount the ISO image on one of the cluster nodes and then copy it to a shared file system such as NFS. To install on all the cluster nodes, use cluster-aware tools (such as pdsh).
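As an illustration of the cluster strategy above, the sketch below only prints a pdsh command line rather than running it, so it can be reviewed first. The function name cluster_install_cmd, the host list, and the shared directory are hypothetical; -w is the pdsh option for selecting target hosts, and --force is the unattended-installation option of the installer.

```shell
#!/usr/bin/env bash
# Print (do not run) a pdsh command that starts the installer on every node.
# $1: pdsh host list, for example "node[01-04]" (hypothetical)
# $2: directory on the shared file system that holds the copied ISO contents (hypothetical)
cluster_install_cmd() {
    local hosts="$1" shared_dir="$2"
    printf 'pdsh -w %s %s/mlnxofedinstall --force\n' "$hosts" "$shared_dir"
}
```

For example, cluster_install_cmd 'node[01-04]' /shared/ofed prints the command for four nodes that all see the copy under /shared/ofed.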

If your kernel version does not match any of the offered pre-built RPMs, you can add your kernel version by using the “mlnx_add_kernel_support.sh” script located inside the MLNX_OFED package.

Note: On Redhat and SLES distributions with an errata kernel installed, there is no need to use the mlnx_add_kernel_support.sh script. The regular installation can be performed, and the weak-updates mechanism will create symbolic links to the MLNX_OFED kernel modules.

Note: If you regenerate kernel modules for a custom kernel (using --add-kernel-support), the package installation will not automatically regenerate the initramfs. In some cases, such as a system with a root filesystem mounted over a ConnectX card, not regenerating the initramfs may even cause the system to fail to reboot. In such cases, the installer will recommend running the following command to update the initramfs:

    dracut -f

On some OSs, dracut -f might result in the following error message, which can be safely ignored:

    libkmod: kmod_module_new_from_path: kmod_module 'mdev' already exists with different path

The “mlnx_add_kernel_support.sh” script can be executed directly from the mlnxofedinstall script. For further information, see the '--add-kernel-support' option below.

Note: On Ubuntu and Debian distributions, driver installation uses the Dynamic Kernel Module Support (DKMS) framework. The drivers are therefore compiled on the host during MLNX_OFED installation, and "mlnx_add_kernel_support.sh" is irrelevant on these distributions.

Example: The following command creates a MLNX_OFED_LINUX TGZ archive for RedHat 7.3 under the /tmp directory.

    # ./MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64/mlnx_add_kernel_support.sh -m /tmp/MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64/ --make-tgz
    Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.3 under /tmp directory.
    All Mellanox, OEM, OFED, or Distribution IB packages will be removed.
    Do you want to continue?[y/N]:y
    See log file /tmp/mlnx_ofed_iso.21642.log

    Building OFED RPMs. Please wait...
    Removing OFED RPMs...
    Created /tmp/MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64-ext.tgz
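The distribution-specific notes above can be summarized as a small decision helper. This is a simplified sketch: the function name and the returned labels are hypothetical, the input is assumed to be the ID field from /etc/os-release, and the rhel/sles branch applies only to errata kernels, which the code does not verify.

```shell
#!/usr/bin/env bash
# Map a distribution ID to the kernel-module strategy described in the notes.
# $1: distribution ID, assumed to come from /etc/os-release (for example "ubuntu")
# Labels printed here are illustrative only, not installer options.
kernel_module_strategy() {
    case "$1" in
        ubuntu|debian) echo "dkms" ;;               # modules are compiled on the host
        rhel|sles)     echo "weak-updates" ;;       # errata kernels only (not checked here)
        *)             echo "add-kernel-support" ;; # no pre-built RPMs assumed
    esac
}
```

A caller might source /etc/os-release and pass "$ID" to the function; a custom kernel on RHEL or SLES would still need the mlnx_add_kernel_support.sh path.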

The script adds the following lines to /etc/security/limits.conf for the userspace components such as MPI:

    * soft memlock unlimited
    * hard memlock unlimited

These settings set the amount of memory that can be pinned by a userspace application to unlimited. If desired, replace the value unlimited with a specific amount of RAM.
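To confirm that both memlock entries are present after installation, a check along the following lines may help. It is a sketch only; the function name has_unlimited_memlock is hypothetical, and the check matches only the wildcard (*) entries shown above.

```shell
#!/usr/bin/env bash
# Succeed only if both the soft and hard "memlock unlimited" entries exist.
# $1: limits file to inspect (defaults to /etc/security/limits.conf)
has_unlimited_memlock() {
    local file="${1:-/etc/security/limits.conf}"
    grep -Eq '^[[:space:]]*\*[[:space:]]+soft[[:space:]]+memlock[[:space:]]+unlimited' "$file" &&
    grep -Eq '^[[:space:]]*\*[[:space:]]+hard[[:space:]]+memlock[[:space:]]+unlimited' "$file"
}
```

The limit in effect for the current shell can also be inspected with ulimit -l, which reflects the value only after a new login session.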



For your machine to be part of the InfiniBand/VPI fabric, a Subnet Manager must be running on one of the fabric nodes. At this point, OFED for Linux has already installed the OpenSM Subnet Manager on your machine.

For the list of installation options, run:

    ./mlnxofedinstall --h





This section describes the installation procedure of MLNX_OFED on NVIDIA adapter cards.

1. Log in to the installation machine as root.

2. Mount the ISO image on your machine.

    host1# mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt

3. Run the installation script.

    /mnt/mlnxofedinstall
    Logs dir: /tmp/MLNX_OFED_LINUX-x.x-x.logs
    This program will install the MLNX_OFED_LINUX package on your machine.
    Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
    Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.

    Starting MLNX_OFED_LINUX-x.x.x installation ...
    ........
    ........
    Installation finished successfully.

    Attempting to perform Firmware update...
    Querying Mellanox devices firmware ...

Note: For unattended installation, use the --force installation option while running the MLNX_OFED installation script:

    /mnt/mlnxofedinstall --force

Note: MLNX_OFED for Ubuntu should be installed with the following flags in a chroot environment:

    ./mlnxofedinstall --without-dkms --add-kernel-support --kernel --without-fw-update --force

For example:

    ./mlnxofedinstall --without-dkms --add-kernel-support --kernel 3.13.0-85-generic --without-fw-update --force

Note that the path to kernel sources (--kernel-sources) should be added if the sources are not in their default location.

    ./mlnxofedinstall --without-dkms --add-kernel-support --kernel

Note: If your machine has the latest firmware, no firmware update will occur, and the installation script will print a message similar to the following at the end of the installation:

    Device #1:
    ----------
    Device Type: ConnectX-X
    Part Number: MCXXXX-XXX
    PSID: MT_
    PCI Device Name: 0b:00.0
    Base MAC: 0000e41d2d5cf810
    Versions: Current Available
    FW XX.XX.XXXX
    Status: Up to date
    PSID: MT_

Note: If your machine has an unsupported network adapter device, no firmware update will occur, and one of the error messages below will be printed. Please contact your hardware vendor for help with firmware updates.

Error message #1:

    Device #1:
    ----------
    Device Type: ConnectX-X
    Part Number: MCXXXX-XXX
    PSID: MT_
    PCI Device Name: 0b:00.0
    Base MAC: 0000e41d2d5cf810
    Versions: Current Available
    FW XX.XX.XXXX
    Status: No matching image found

Error message #2:

    The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033)
    To obtain firmware for this device, please contact your HW vendor.
    PSID: MT_

4. Case A: If the installation script has performed a firmware update on your network adapter, you need to either restart the driver or reboot your system before the firmware update can take effect. Refer to the table below to find the appropriate action for your specific card.

    Adapter                                       Driver Restart   Standard Reboot (Soft Reset)   Cold Reboot (Hard Reset)
    Standard ConnectX-4/ConnectX-4 Lx or higher   -                +                              -
    Adapters with Multi-Host Support              -                -                              +
    Socket Direct Cards                           -                -                              +

Case B: If the installation script has not performed a firmware upgrade on your network adapter, restart the driver by running: “/etc/init.d/openibd restart”.

5. (InfiniBand only) Run the hca_self_test.ofed utility to verify whether or not the InfiniBand link is up. The utility also checks for and displays additional information such as:

HCA firmware version

Kernel architecture

Driver version

Number of active HCA ports along with their states

Node GUID

For more details on hca_self_test.ofed, see the file docs/readme_and_user_manual/hca_self_test.readme.
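The Case A reset table from the installation procedure above can be encoded as a simple lookup, for example in a post-install script that reminds the operator which action activates new firmware. This is a sketch; the function name and the adapter class keys (standard, multi-host, socket-direct) are hypothetical labels for the three table rows.

```shell
#!/usr/bin/env bash
# Print the action required for a firmware update to take effect.
# $1: adapter class, one of the hypothetical keys below
fw_activation_action() {
    case "$1" in
        standard)      echo "standard reboot (soft reset)" ;;  # ConnectX-4/ConnectX-4 Lx or higher
        multi-host)    echo "cold reboot (hard reset)" ;;      # adapters with Multi-Host support
        socket-direct) echo "cold reboot (hard reset)" ;;      # Socket Direct cards
        *)             echo "unknown adapter class: $1" >&2; return 1 ;;
    esac
}
```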

After installation completion, information about the OFED installation, such as prefix, kernel version, and installation parameters can be retrieved by running the command /etc/infiniband/info. Most of the OFED components can be configured or reconfigured after the installation, by modifying the relevant configuration files. See the relevant chapters in this manual for details.

The list of the modules that will be loaded automatically upon boot can be found in the /etc/infiniband/openib.conf file.
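A quick way to see which modules are enabled is to filter that file. The sketch below assumes that openib.conf marks each module with a line of the form NAME_LOAD=yes (for example IPOIB_LOAD=yes); that format and the function name list_boot_modules are assumptions, so check them against the file on your system.

```shell
#!/usr/bin/env bash
# Print the NAME part of every uncommented "NAME_LOAD=yes" line.
# $1: configuration file (defaults to /etc/infiniband/openib.conf)
list_boot_modules() {
    local conf="${1:-/etc/infiniband/openib.conf}"
    # Assumed format: one NAME_LOAD=yes|no setting per line.
    sed -n 's/^\([A-Z0-9_]*\)_LOAD=yes[[:space:]]*$/\1/p' "$conf"
}
```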

Note: Installing OFED will replace the RDMA stack and remove existing 3rd party RDMA connectors.





Software

Most of the MLNX_OFED packages are installed under the “/usr” directory, except for the following packages, which are installed under the “/opt” directory:

fca and ibutils

iproute2 (rdma tool) - installed under /opt/Mellanox/iproute2/sbin/rdma

The kernel modules are installed under:

/lib/modules/`uname -r`/updates on SLES and Fedora distributions

/lib/modules/`uname -r`/extra/mlnx-ofa_kernel on RHEL and other RedHat-like distributions

/lib/modules/`uname -r`/updates/dkms/ on Ubuntu
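The per-distribution module locations above can be resolved with a small helper, which is handy when checking that the modules landed where expected. This is a sketch; the function name and the distribution keys are assumptions, and treating centos as a RedHat-like distribution is likewise an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Print the MLNX_OFED kernel module directory for a distribution.
# $1: distribution key (hypothetical: sles, fedora, rhel, centos, ubuntu)
# $2: kernel release (defaults to the running kernel)
ofed_module_dir() {
    local distro="$1" kernel="${2:-$(uname -r)}"
    case "$distro" in
        sles|fedora) echo "/lib/modules/$kernel/updates" ;;
        rhel|centos) echo "/lib/modules/$kernel/extra/mlnx-ofa_kernel" ;;
        ubuntu)      echo "/lib/modules/$kernel/updates/dkms" ;;
        *)           echo "unsupported distribution key: $distro" >&2; return 1 ;;
    esac
}
```

For example, ls "$(ofed_module_dir ubuntu)" lists the DKMS-built modules for the running kernel.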

Firmware

The firmware of existing network adapter devices will be updated if the following two conditions are fulfilled:

The installation script is run in default mode; that is, without the option ‘--without-fw-update’

The firmware version of the adapter device is older than the firmware version included with the OFED ISO image

Note: If an adapter’s Flash was originally programmed with an Expansion ROM image, the automatic firmware update will also burn an Expansion ROM image.
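The two firmware update conditions can be expressed as a predicate. The sketch below is not the installer's own logic: the function name is hypothetical, the version strings are supplied by the caller, and the comparison relies on the version sort (sort -V) of GNU coreutils.

```shell
#!/usr/bin/env bash
# Succeed if a firmware update would be expected under the two conditions.
# $1: firmware version currently on the adapter
# $2: firmware version bundled with the OFED image
# $3: optional installer flag; "--without-fw-update" disables the update
fw_update_expected() {
    local current="$1" bundled="$2" flag="${3:-}"
    # Condition 1: the installer runs in default mode.
    if [ "$flag" = "--without-fw-update" ]; then
        return 1
    fi
    # Condition 2: the adapter firmware is strictly older than the bundled one.
    if [ "$current" = "$bundled" ]; then
        return 1
    fi
    [ "$(printf '%s\n%s\n' "$current" "$bundled" | sort -V | head -n 1)" = "$current" ]
}
```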

In case your machine has an unsupported network adapter device, no firmware update will occur and the error message below will be printed. "The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033) To obtain firmware for this device, please contact your HW vendor."

While installing MLNX_OFED, the install log for each selected package will be saved in a separate log file.

The path to the directory containing the log files will be displayed after running the installation script in the following format:

Example:

    Logs dir: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0.IBMM2150110033.logs
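When the installer output is captured to a file or a pipe, the log directory can be pulled out of the "Logs dir:" line. This is a minimal sketch; the function name extract_logs_dir is hypothetical, and it assumes the line starts exactly with "Logs dir: " as in the example.

```shell
#!/usr/bin/env bash
# Read installer output on stdin and print the first reported log directory.
extract_logs_dir() {
    sed -n 's/^Logs dir: //p' | head -n 1
}
```

For instance, piping saved installer output through extract_logs_dir yields a path that can then be passed to ls to list the per-package logs.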



