Installation Related Issues
| Issue | Cause | Solution |
| --- | --- | --- |
| Driver installation fails. | The install script may fail for the following reasons: | |
| After driver installation, the openibd service fails to start. This message is logged by the driver: Unknown symbol | The driver was installed on top of an existing inbox driver. | |
This section is relevant for RedHat and SLES distributions only.
Overview
The MLNX_OFED package for RedHat comes with RPMs that support KMP (weak-modules): when a new errata kernel is installed, compatibility links are created under the weak-updates directory for the new kernel. These links allow the existing MLNX_OFED kernel modules to be used without recompilation. However, the ABI of the new kernel is sometimes incompatible with the MLNX_OFED modules, which prevents them from loading. In this case, the MLNX_OFED modules must be rebuilt against the new kernel.
Detecting ABI Incompatibility with MLNX_OFED Modules
When MLNX_OFED modules are not compatible with a new kernel (from a new OS release or an errata kernel), no links are created under the weak-updates directory for the new kernel, and the driver fails to load. To check whether the needed module links exist under the weak-updates directory, reload the MLNX_OFED modules. If one or more modules are missing, the driver reload fails with an error message.
Example:
********************************************************************************
# /etc/init.d/openibd restart
Unloading HCA driver:                                      [  OK  ]
Loading HCA driver and Access Layer:                       [  OK  ]
Module rdma_cm belong to kernel which is not a part of MLNX[FAILED]kipping...
Loading rdma_ucm                                           [FAILED]
********************************************************************************
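The links can also be inspected directly, without reloading the driver. The following is a minimal sketch, assuming the usual /lib/modules/<kernel>/weak-updates layout; the function name missing_weak_links and the module list are illustrative and not part of MLNX_OFED.

```shell
#!/bin/sh
# Report which of the given modules have no entry under weak-updates
# for a given kernel. Prints one missing module name per line.
#   $1 = modules root (normally /lib/modules; an assumption of this sketch)
#   $2 = kernel release (normally "$(uname -r)")
#   remaining arguments = module names to look for (illustrative list)
missing_weak_links() {
    root=$1
    kernel=$2
    shift 2
    for mod in "$@"; do
        # Match rdma_cm.ko as well as compressed variants such as rdma_cm.ko.xz.
        found=$(find "$root/$kernel/weak-updates" -name "$mod.ko*" 2>/dev/null | head -n 1)
        [ -n "$found" ] || echo "$mod"
    done
}

# Example call for the running kernel:
# missing_weak_links /lib/modules "$(uname -r)" rdma_cm rdma_ucm
```

An empty result suggests that links exist for every listed module. Any name that is printed is a candidate for the rebuild described in the next section.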
    
    
    
        
Resolving ABI Incompatibility with MLNX_OFED Modules
To fix ABI incompatibility with MLNX_OFED modules, recompile the modules against the new kernel using the mlnx_add_kernel_support.sh script, available in the MLNX_OFED installation image.
There are two ways to recompile the MLNX_OFED modules:
- Local recompilation and installation on one server. 
 Run the mlnxofedinstall command to recompile the kernel modules and reinstall the whole MLNX_OFED on the server. Mount the MLNX_OFED ISO image or extract the TGZ file:
 # cd <MLNX_OFED dir>
 # ./mlnxofedinstall --skip-distro-check --add-kernel-support --kmp --force
 Notes:
 - The --kmp flag enables rebuilding RPMs with KMP (weak-updates) support for the new kernel. Therefore, in the next OS/kernel update, the same modules can be used with the new kernel (assuming that the ABI compatibility was not broken again).
 - The command above rebuilds only the kernel RPMs (using mlnx_add_kernel_support.sh), saves the resulting MLNX_OFED package under /tmp, and starts installing it automatically. This package can be used for installation on other servers using the regular mlnxofedinstall command or yum.
- Preparing a new image on one server and deploying it on the cluster.
 Use the mlnx_add_kernel_support.sh script directly to rebuild only the kernel RPMs (without running any installation) on one server. Mount the MLNX_OFED ISO image or extract the TGZ file:
 # cd <MLNX_OFED dir>
 # ./mlnx_add_kernel_support.sh -m $PWD --kmp -y
 Note: This command will save the resulting MLNX_OFED package under /tmp.
 Example:
********************************************************************************
# cd /tmp/MLNX_OFED_LINUX-3.3-1.0.0.0-DB-rhel7.0-x86_64
# ./mlnx_add_kernel_support.sh -m $PWD --kmp -y
Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.1 under /tmp directory.
See log file /tmp/mlnx_ofed_iso.23852.log
Building OFED RPMS . Please wait...
Creating metadata-rpms for 3.10.0-229.14.1.el7.x86_64 ...
WARNING: Please note that this MLNX_OFED repository contains an unsigned rpms,
WARNING: therefore, you should set 'gpgcheck=0' in the repo conf file.
Created /tmp/MLNX_OFED_LINUX-3.3-1.0.0.0-rhel7.1-x86_64-ext.tgz
********************************************************************************
- Install the newly created MLNX_OFED package on the cluster:
 - Option 1: Copy the package to the servers and install it using the mlnxofedinstall script.
 - Option 2: Deploy the MLNX_OFED package using YUM (for YUM installation instructions, refer to the Installing MLNX_OFED Using YUM section):
  i. Extract the resulting MLNX_OFED image and copy it to a shared NFS location.
  ii. Create a YUM repository configuration.
  iii. Install the new MLNX_OFED kernel RPMs on the servers:
  # yum update
  Example:
********************************************************************************
...
...
========================================================================================================================
 Package               Arch    Version                                                           Repository  Size
========================================================================================================================
Updating:
 epel-release          noarch  7-7                                                               epel         14 k
 kmod-iser             x86_64  1.8.0-OFED.3.3.1.0.0.1.gf583963.201606210906.rhel7u1              mlnx_ofed    35 k
 kmod-isert            x86_64  1.0-OFED.3.3.1.0.0.1.gf583963.201606210906.rhel7u1                mlnx_ofed    32 k
 kmod-kernel-mft-mlnx  x86_64  4.4.0-1.201606210906.rhel7u1                                      mlnx_ofed    10 k
 kmod-knem-mlnx        x86_64  1.1.2.90mlnx1-OFED.3.3.0.0.1.0.3.1.ga04469b.201606210906.rhel7u1  mlnx_ofed    22 k
 kmod-mlnx-ofa_kernel  x86_64  3.3-OFED.3.3.1.0.0.1.gf583963.201606210906.rhel7u1                mlnx_ofed   1.4 M
 kmod-srp              x86_64  1.6.0-OFED.3.3.1.0.0.1.gf583963.201606210906.rhel7u1              mlnx_ofed    39 k

Transaction Summary
========================================================================================================================
Upgrade  7 Packages
...
...
********************************************************************************
  Note: The MLNX_OFED user-space packages will not change; only the kernel RPMs will be updated. However, "yum update" can also update other inbox packages (not related to OFED). To install the MLNX_OFED kernel RPMs only, run:
  # yum install mlnx-ofed-kernel-only
  Note: mlnx-ofed-kernel-only is a metadata RPM that requires the MLNX_OFED kernel RPMs only.
- Verify that the driver can be reloaded:
 # /etc/init.d/openibd restart
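The YUM repository configuration needed for Option 2 could look roughly like the file written by the sketch below. The repository ID mlnx_ofed mirrors the Repository column in the yum output above, and gpgcheck=0 follows the warning printed by mlnx_add_kernel_support.sh. The function name write_mlnx_repo, the repository display name, the RPMS subdirectory in the baseurl, and the NFS path are assumptions made for illustration; the Installing MLNX_OFED Using YUM section remains the authoritative source.

```shell
#!/bin/sh
# Write a minimal YUM repository definition for a rebuilt MLNX_OFED package.
#   $1 = destination .repo file (for example /etc/yum.repos.d/mlnx_ofed.repo)
#   $2 = baseurl pointing at the extracted package's RPM directory
#        (hypothetical; adjust to the shared NFS location actually used)
write_mlnx_repo() {
    dest=$1
    baseurl=$2
    cat > "$dest" <<EOF
[mlnx_ofed]
name=MLNX_OFED Repository
baseurl=$baseurl
enabled=1
gpgcheck=0
EOF
}

# Example call (the path is a placeholder, not taken from the text):
# write_mlnx_repo /etc/yum.repos.d/mlnx_ofed.repo "file:///<NFS mount>/<extracted MLNX_OFED dir>/RPMS"
```

With such a file in place on each server, the yum commands shown above would resolve the kernel RPMs from the shared location.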