Known Issues
The following is a list of general limitations and known issues of the current version of the release. For the list of old known issues, please refer to the NVIDIA EN Archived Known Issues file at: http://www.mellanox.com/pdf/prod_software/MLNX_EN_Archived_Known_Issues.pdf
Internal Ref. Number |
Issue |
3253255 |
Description: RHEL 7 does not include built-in support for Python3. There are two potential ways to install it, and each installs a package with a different name. Python3 support is needed for using Pyverbs and the Python support of Open vSwitch. On RHEL 7.x, MLNX_OFED assumes that, if Python3 is used, it is python36 from EPEL; otherwise, the optional Python3 support cannot be used. |
Workaround: To use Python3 support on RHEL7, install python36 from the RHEL7 EPEL repository. |
|
Keywords: RHEL7, Python3 |
|
Discovered in Release: 5.4-3.6.8.1 |
|
3201193 |
Description: Installing version 5.4-3.5.8.0 using the yum installation method produces the following error on some systems: |
Workaround: Uninstall the pcp-pmda-infiniband package and rerun installation. |
|
Keywords: Installation |
|
Discovered in Release: 5.4-3.5.8.0 |
|
3200967 |
Description: kernel-macros package does not support building with KMP enabled. KMP needs to be disabled. |
Workaround: Build and install the package with KMP disabled (without --kmp flag). |
|
Keywords: Installation |
|
Discovered in Release: 5.4-3.5.8.0 |
|
3175833 |
Description: Using RHEL 7.9 with an Errata kernel may require the --add-kernel-support flag when installing the package. |
Workaround: N/A |
|
Keywords: Installation |
|
Discovered in Release: 5.4-3.5.8.0 |
|
3179313 |
Description: On RHEL 9.0, the unbound-devel package, which is required to build OVS-DPDK, is missing. OVS-DPDK will be disabled for this release until RHEL provides the missing package. |
Workaround: Manually install unbound-devel prior to the installation. |
|
Keywords: OVS DPDK |
|
Discovered in Release: 5.4-3.5.8.0 |
|
2657392 |
Description: OFED installation caused CIFS to break in RHEL 8.4 and above. A dummy module was added so that CIFS will be disabled after OFED installation in RHEL 8.4 and above. |
Workaround: N/A |
|
Keywords: Installation, RHEL, CIFS |
|
Discovered in Release: 5.4-0.5.1.1 |
|
2782406 |
Description: Running yum update will upgrade kylin-release to a higher version. The version of this package is used for kylin10sp2 detection so the script will detect kylin 10 instead of kylin10sp2 and use its repository by mistake. |
Workaround: Because there are no special cases for kylin10sp2, the detected repository can be used by adding --add-kernel-support to the installation command. |
|
Keywords: Upgrade, kylin |
|
Discovered in Release: 5.4-3.0.3.0 |
|
2752622 |
Description: On SLES 15, the inbox modules in the directory mlxsw (such as mlxsw_spectrum) are not supported. If they are installed when installing MLNX_OFED, they will no longer work (as they depend on a different version of the mlx* modules) and may cause an error at time of installation. |
Workaround: Either remove the package kernel-default-extra or manually remove them: rm /lib/modules/`uname -r`/kernel/drivers/net/ethernet/mellanox/mlxsw/*.ko |
|
Keywords: Installation |
|
Discovered in Release: 5.4-3.0.3.0 |
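Before removing anything by hand, it can help to list which mlxsw module files are actually present for the running kernel. The following is a minimal sketch, not part of MLNX_OFED: the function name find_mlxsw_modules and its parameters are illustrative, and it assumes the modules are stored uncompressed with a .ko suffix, as in the workaround path above.

```python
import glob
import os
import platform


def find_mlxsw_modules(modules_root="/lib/modules", kernel_release=None):
    """Return the sorted mlxsw *.ko files for the given kernel release."""
    # platform.release() reports the same string as `uname -r`.
    release = kernel_release or platform.release()
    pattern = os.path.join(
        modules_root,
        release,
        "kernel/drivers/net/ethernet/mellanox/mlxsw",
        "*.ko",
    )
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Only prints candidates; nothing is deleted.
    for path in find_mlxsw_modules():
        print(path)
```

An empty result suggests that the inbox mlxsw modules are not installed, or that the distribution ships them compressed under a different suffix.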
|
2755632 |
Description: On dual port cards with SR-IOV, when one port link is configured to InfiniBand and the other port link is configured to Ethernet, the Ethernet port will not be able to support VST and QinQ. |
Workaround: N/A |
|
Keywords: SR-IOV, VST, QinQ |
|
Discovered in Release: 5.4-3.0.3.0 |
|
2780436 |
Description: Non-default MTU (>1500) is not supported with IPsec crypto offload and may cause packet drops. |
Workaround: N/A |
|
Keywords: IPsec, Crypto Offload, MTU |
|
Discovered in Release: 5.4-3.0.3.0 |
|
2726021 |
Description: Building packages on openEuler with KMP enabled requires the kernel-rpm-macros package to be installed. kernel-rpm-macros-30-13.oe1 does not support the -p option, so kernel-rpm-macros-30-18.oe1 should be installed instead. |
Workaround: N/A |
|
Keywords: Installation, openEuler |
|
Discovered in Release: 5.4-3.0.3.0 |
2750653 |
Description: Running fragmented traffic in RHEL 8.3 (4.18.0-240.el8.x86_64) may cause call trace in build_skb. |
Workaround: Update to RHEL 8.3 z-stream 4.18.0-240.22.1.el8_3.x86_64. |
|
Keywords: RHEL 8.3, Kernel Panic, Call Trace, fr |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2629375 |
Description: Matching on CT label is only supported when matching on lower 32 bits. Full match on all 128 bits of CT label is not supported. |
Workaround: N/A |
|
Keywords: ASAP2, Connection Tracking, Label |
|
Discovered in Release: 5.4-1.0.3.0 |
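The restriction above can be expressed as a check on the CT label mask. The sketch below is only an illustration: it assumes the 128-bit label mask is held as a Python integer whose least significant 32 bits are the "lower 32 bits" referred to in the description, and the names are hypothetical rather than part of any driver or OVS API.

```python
CT_LABEL_BITS = 128
LOWER_32_BITS = (1 << 32) - 1


def label_mask_offloadable(mask):
    """Return True if the mask only selects bits within the lower 32 bits."""
    if not 0 <= mask < (1 << CT_LABEL_BITS):
        raise ValueError("CT label mask must fit in 128 bits")
    # Any bit set above bit 31 means the match needs more than 32 bits.
    return (mask & ~LOWER_32_BITS) == 0
```

For example, a mask of 0xffffffff passes the check, while a full 128-bit mask does not.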
|
2707997 |
Description: Installation in the package manager mode under SLES 15.x may require user intervention if the original libibverbs is installed. |
Workaround: zypper install --force-resolution mlnx-ofed-all |
|
Keywords: Installation, libibverbs |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2708531 |
Description: Installation in the package manager mode under SLES 15.x may require user intervention if the original libopenvswitch is installed. |
Workaround: zypper install --force-resolution mlnx-ofed-all |
|
Keywords: Installation |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2703043 |
Description: Congested TCP lock for kTLS TX device offload traffic compromises the performance. |
Workaround: Disable TCP selective acknowledgement: echo 0 > /proc/sys/net/ipv4/tcp_sack |
|
Keywords: kTLS TX |
|
Discovered in Release: 5.4-1.0.3.0 |
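To confirm whether the workaround above is in effect, the current setting can be read back from the same procfs file. This is a minimal sketch; the function name tcp_sack_enabled is illustrative, and the path parameter exists only so the check can be pointed at a different file.

```python
def tcp_sack_enabled(path="/proc/sys/net/ipv4/tcp_sack"):
    """Return True if TCP selective acknowledgement is enabled."""
    with open(path) as sysctl_file:
        # The kernel exposes the setting as "0" (disabled) or "1" (enabled).
        return sysctl_file.read().strip() != "0"
```

Calling tcp_sack_enabled() with no arguments on a Linux host reads the live setting. Note that a value written with echo does not persist across reboots.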
|
2676405 |
Description: If the package interface-rename is active (on XenServer, for example), the interface renaming by the OFED will not be done to eliminate conflicts. |
Workaround: N/A |
|
Keywords: Interface Renaming |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2687943 |
Description: Offload of rules which redirect from VF on one PF to VF on second PF is not supported on socket-direct devices. |
Workaround: N/A |
|
Keywords: ASAP2, Socket-Direct |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2677225 |
Description: Conducting a driver restart while in VF LAG mode may cause unwanted behaviour such as kernel crashes. |
Workaround: Set link down for both PFs. |
|
Keywords: ASAP2, Bonding, Driver Restart, VF LAG |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2678672 |
Description: When disabling switchdev mode, the qdisc in tunnel device cannot be destroyed and mlx5e_stats_flower() is still called by OVS resulting in NULL pointer panic and memory leak. |
Workaround: N/A |
|
Keywords: SwitchDev, mlx5, Tunnel Traffic |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2566548 |
Description: On PPC systems with EEH enabled, when running a firmware sync reset (either by mlxfwreset with the flag --sync 1 or by devlink dev reload action fw_activate), the EEH may catch the PCI reset and take ownership of the flow. When the reset is run a few times in sequence, the EEH may also decide to disable the device. |
Workaround: Administrator may disable EEH before running firmware sync reset on the device. |
|
Keywords: PPC, EEH |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2617950 |
Description: TX port timestamp feature is supported for kernel versions 3.15 and greater. On older kernel versions, the feature will not be supported and ptp_tx |
Workaround: N/A |
|
Keywords: Ethtool |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2390731 |
Description: Ethtool does not display Port Speed advertised/capability above 100Gb/s on kernels 5.0 and below, even when supported. |
Workaround: N/A |
|
Keywords: Ethtool, Port Speed |
|
Discovered in Release: 5.4-1.0.3.0 |
2585575 |
Description: After disabling sync reset by setting enable_remote_dev_reset to false, running firmware sync reset a few times may lead to general protection fault and system may get stuck. |
Workaround: N/A |
|
Keywords: Firmware Upgrade |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2582565 |
Description: Conducting a firmware reset or unbinding the PF while in switchdev mode may cause a kernel crash. |
Workaround: N/A |
|
Keywords: SwitchDev, ASAP2, Unbind, Firmware Reset |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2587802 |
Description: PTP synchronization may be lost while using tx_port_ts private flag. |
Workaround: Toggle private flag: |
|
Keywords: PTP Synchronization |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2574943 |
Description: When running kernel 5.8 and below or RHEL 8.2 and below, sampled packets do not support tunnel information. |
Workaround: N/A |
|
Keywords: ASAP2, sFLOW |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2438392 |
Description: VXLAN with IPsec crypto offload does not work. |
Workaround: N/A |
|
Keywords: VXLAN; IPsec crypto |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2568417 |
Description: Upon upgrade to version 5.3, the package manager tool installs the new packages and then removes the old packages. During this process, a depmod WARNING on "mlx5_fpga_tools" will appear. This warning can be safely ignored: mlx5_fpga_tools is a module that existed in version 5.2 and was removed in 5.3. |
Workaround: N/A |
|
Keywords: Upgrade; mlx5_fpga_tools |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2506425 |
Description: When installing kmod packages on EulerOS 2.0SP9 or OpenEuler 20.03, the following error appears: "modprobe: FATAL: could not get modversions of |
Workaround: N/A |
|
Keywords: Installation; modules; kmod |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2492509 |
Description: When installing the driver on OpenEuler or on EulerOS 2.0SP9, rebuilding the drivers (--add-kernel-support) with the --kmp option (to create kmod packages) generates packages that are uninstallable because they have a dependency on "/sbin/depmod" that the system does not provide. This dependency is created by a buggy kmod package building tool included with the distribution. |
Workaround: N/A |
|
Keywords: add-kernel-support |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2479327 |
Description: On SLES 12 SP5, if the kernel was upgraded to 4.12.14-122.46, it is not possible to rebuild kernel modules (--add-kernel-support) without upgrading gcc as well to at least 4.8.5-31.23.2. |
Workaround: N/A |
|
Keywords: Upgrade; SLES 12; add-kernel-support |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2584441 |
Description: On SLES 12 SP5, if the kernel was upgraded to 4.12.14-122.46, it is not possible to rebuild kernel modules (--add-kernel-support) without upgrading gcc as well to at least 4.8.5-31.23.2. |
Workaround: N/A |
|
Keywords: Upgrade; SLES 12; add-kernel-support |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2460865 |
Description: When setting MTU to low values, such as 68 bytes, packets may fail on oversize. |
Workaround: N/A |
|
Keywords: MTU |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2383318 |
Description: On kernels based on RedHat 7.2, the "tx_port_ts" feature, as set by ethtool --set-priv-flags, is disabled. |
Workaround: N/A |
|
Keywords: RedHat; tx_port_ts |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2575647 |
Description: An OvS-DPDK crash might occur while doing live-migration for VMs that use virtio-interfaces that are accelerated using OvS-DPDK vDPA ports. |
Workaround: N/A |
|
Keywords: OvS-DPDK vDPA, Live-migration |
|
Discovered in Release: 5.3-1.0.0.1 |
2395082 |
Description: A call trace may take place when moving from SwitchDev mode back to Legacy mode in Kernel v5.9 due to a kernel issue in tcf_block_unbind. |
Workaround: N/A |
|
Keywords: ASAP2; SwitchDev; call trace; kernel; tcf_block_unbind |
|
Discovered in Release: 5.2-1.0.4.0 |
2209987 |
Description: aRFS feature (activated using "ethtool ntuple on") is disabled for kernel 4.1 or below. |
Workaround: N/A |
|
Keywords: aRFS |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2248996 |
Description: Downgrading the firmware version for ConnectX-6 cards using "install --fw-update-only --force-fw-update" fails. |
Workaround: Manually downgrade the firmware version - please see Firmware Update Instructions. |
|
Keywords: Firmware, ConnectX-6 |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2175930 |
Description: When using MLNX_EN v5.1 on PPC architectures with kernels v5.5 or v5.6 and an old ethtool utility, a harmless warning call trace may appear in the dmesg due to mismatch between user space and kernel. The warning call trace mentions ethtool_notify. |
Workaround: Update the ethtool utility to version 5.6 on such systems in order to avoid the call trace. |
|
Keywords: PPC, ethtool_notify, kernel |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2198764 |
Description: If MLNX_EN is installed on a Debian or Ubuntu system that is run in chroot environment, the openibd service will not be enabled. If the chroot files are being used as a base of a full system, the openibd service is left disabled. |
Workaround: Currently, openibd is a sysv-init script that you can enable manually by running: update-rc.d openibd defaults |
|
Keywords: chroot, Debian, Ubuntu, openibd |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2237134 |
Description: Running connection tracking (CT) with FW steering may cause CREATE_FLOW_TABLE command to fail with syndrome. |
Workaround: Configure OVS to use a single handler-thread: #ovs-vsctl set Open_vSwitch . other_config:n-handler-threads=1 |
|
Keywords: Connection tracking, ASAP, OVS, FW steering |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2239894 |
Description: Running OpenVSwitch offload with high traffic throughput can cause low insertion rate due to high CPU usage. |
Workaround: Reduce the number of combined channels of the uplink using "ethtool -L". |
|
Keywords: Insertion rate, ASAP2 |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2240671 |
Description: Header rewrite action is not supported over RHEL/CentOS 7.4. |
Workaround: N/A |
|
Keywords: ASAP, header rewrite, RHEL, RedHat, CentOS, OS |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2242546 |
Description: Tunnel offload (encap/decap) may cause kernel panic if nf_tables module is not probed. |
Workaround: Make sure to probe the nf_tables module before inserting any rule. |
|
Keywords: Kernel v5.7, ASAP, kernel panic |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2143007 |
Description: IPsec packets are dropped during heavy traffic due to a bug in net/xfrm Linux Kernel. |
Workaround: Make sure the Kernel is modified to apply the following patch: "xfrm: Fix double ESP trailer insertion in IPsec crypto offload". |
|
Keywords: IPsec, xfrm |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2225952 |
Description: VF mirroring with TC policy skip_sw is not supported on RHEL/CentOS 7.4, 7.5 and 7.6 OSs. |
Workaround: N/A |
|
Keywords: ASAP2, Mirroring, RHEL, RedHat, OS |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2216521 |
Description: After upgrading MLNX_EN from v5.0 or earlier, the installation prefix of the ibdev2netdev utility changes to /usr/sbin. Therefore, the utility cannot be found while still in the same shell environment. |
Workaround: After installing MLNX_EN, log out and log in again to refresh the SHELL environment. |
|
Keywords: ibdev2netdev |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2202520 |
Description: Rules with VLAN push/pop, encap/decap and header rewrite actions together are not supported. |
Workaround: N/A |
|
Keywords: ASAP2, SwitchDev, VLAN push/pop, encap/decap, header rewrite |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2210752 |
Description: Switching from Legacy mode to SwitchDev mode and vice-versa while TC rules exist on the NIC will result in failure. |
Workaround: Before attempting to switch mode, make sure to delete all TC rules on the NIC or stop OpenvSwitch. |
|
Keywords: ASAP2, Devlink, Legacy SR-IOV |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2125036/2125031 |
Description: Upgrading the MLNX_EN from an UPSTREAM_LIBS based version to an MLNX_LIBS based version fails unless the driver is uninstalled and then re-installed. |
Workaround: Make sure to uninstall and re-install MLNX_EN to complete the upgrade. |
|
Keywords: Installation, UPSTREAM_LIBS, MLNX_LIBS |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2105447 |
Description: hns_roce warning messages will appear in the dmesg after reboot on Euler2 SP3 OSs. |
Workaround: N/A |
|
Keywords: hns_roce, dmesg, Euler |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2112251 |
Description: On kernels 4.10-4.14, when Geneve tunnel's remote endpoint is defined using IPv6, packets larger than MTU are not fragmented, resulting in no traffic sent. |
Workaround: Define geneve tunnel's remote endpoint using IPv4. |
|
Keywords: Kernel, Geneve, IPv4, IPv6, MTU, fragmentation |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2102902 |
Description: A kernel panic may occur over RH8.0-4.18.0-80.el8.x86_64 OS when opening kTLS offload connection due to a bug in kernel TLS stack. |
Workaround: N/A |
|
Keywords: TLS offload, mlx5e |
|
Discovered in Release: 5.1-1.0.4.0 |
|
2111534 |
Description: A Kernel panic may occur over Ubuntu19.04-5.0.0-38-generic OS when opening kTLS offload connection due to a bug in the Kernel TLS stack. |
Workaround: N/A |
|
Keywords: TLS offload, mlx5e |
|
Discovered in Release: 5.1-1.0.4.0 |
2094176 |
Description: When running in a large scale in VF-LAG mode, bandwidth may be unstable. |
Workaround: N/A |
|
Keywords: VF LAG |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2044544 |
Description: When working with OSs with Kernel v4.10, bonding module does not allow setting MTUs larger than 1500 on a bonding interface. |
Workaround: Upgrade your Kernel version to v4.11 or above. |
|
Keywords: Bonding, MTU, Kernel |
|
Discovered in Release: 5.0-1.0.0.0 |
|
1882932 |
Description: Libibverbs dependencies are removed during OFED installation, requiring manual installation of libraries that OFED does not reinstall. |
Workaround: Manually install missing packages. |
|
Keywords: libibverbs, installation |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2058535 |
Description: ibdev2netdev command returns duplicate devices with different ports in SwitchDev mode. |
Workaround: Use /opt/mellanox/iproute2/sbin/rdma link show command instead. |
|
Keywords: ibdev2netdev |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2072568 |
Description: In RHEL/CentOS 7.2 OSs, adding drop rules when act_gact is not loaded may cause a kernel crash. |
Workaround: Preload all needed modules to avoid such a scenario (cls_flower, act_mirred, act_gact, act_tunnel_key and act_vlan). |
|
Keywords: RHEL/CentOS 7.2, Kernel 4.9, call trace, ASAP |
|
Discovered in Release: 5.0-1.0.0.0 |
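One way to see which of the modules named in the workaround above still need to be loaded is to compare them against the contents of /proc/modules. The following is a minimal sketch under that assumption; the helper names are illustrative, and the function takes the file text as an argument so it can be tried on any sample.

```python
REQUIRED_MODULES = (
    "cls_flower",
    "act_mirred",
    "act_gact",
    "act_tunnel_key",
    "act_vlan",
)


def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    # The module name is the first whitespace-separated field on each line.
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def missing_modules(proc_modules_text, required=REQUIRED_MODULES):
    """Return the required modules that are not currently loaded, in order."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in required if name not in loaded]
```

On a live system the text would come from reading /proc/modules, and each returned name could then be passed to modprobe. Modules built directly into the kernel do not appear in /proc/modules, so they would be reported as missing by this check even though they need no loading.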
|
2093698 |
Description: VF LAG configuration is not supported when the NUM_OF_VFS configured in mlxconfig is higher than 64. |
Workaround: N/A |
|
Keywords: VF LAG, SwitchDev mode, ASAP |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2093746 |
Description: Devlink health dumps are not supported on kernels lower than v5.3. |
Workaround: N/A |
|
Keywords: Devlink, health report, dump |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2083427 |
Description: For kernels with connection tracking support, neigh update events are not supported, requiring users to have static ARPs to work with OVS and VxLAN. |
Workaround: N/A |
|
Keywords: VxLAN, VF LAG, neigh, ARP |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2067012 |
Description: MLNX_EN cannot be installed on Debian 9.11 OS in SwitchDev mode. |
Workaround: Install OFED with the flag --add-kernel-support. |
|
Keywords: ASAP, SwitchDev, Debian, Kernel |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2036572 |
Description: When using a thread domain and the lockless rdma-core ibv_post_send path, there is an additional CPU penalty due to required barriers around the device MMIO buffer that were omitted in MLNX_EN. |
Workaround: N/A |
|
Keywords: rdma-core, write-combining, MMIO buffer |
|
Discovered in Release: 5.0-1.0.0.0 |
- |
Description: The argparse module is installed by default in Python versions >=2.7 and >=3.2. If an older Python version is used, the argparse module is not installed by default. |
Workaround: Install the argparse module manually. |
|
Keywords: Python, MFT, argparse, installation |
|
Discovered in Release: 4.7-3.2.9.0 |
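Whether the workaround above is needed can be checked from the interpreter itself. This is a minimal sketch with illustrative function names; it deliberately avoids newer language features so that it also runs on the old interpreters the entry is about.

```python
import sys


def argparse_bundled(version_info):
    """Return True if this Python version ships argparse by default."""
    version = tuple(version_info[:2])
    # Bundled in 2.7 (the last 2.x release line) and in 3.2 and later.
    return (2, 7) <= version < (3, 0) or version >= (3, 2)


def has_argparse():
    """Return True if argparse can be imported by the running interpreter."""
    try:
        import argparse  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("bundled with this version: %s" % argparse_bundled(sys.version_info))
    print("importable right now: %s" % has_argparse())
```

If has_argparse() returns True on an older interpreter, the module has already been installed manually.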
|
1997230 |
Description: Running mlxfwreset or unloading the mlx5_core module while conntrack flows are offloaded may cause a call trace in the kernel. |
Workaround: Stop OVS service before calling mlxfwreset or unloading mlx5_core module. |
|
Keywords: Conntrack, ASAP, OVS, mlxfwreset, unload |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1955352 |
Description: Moving 2 ports to SwitchDev mode in parallel is not supported. |
Workaround: N/A |
|
Keywords: ASAP, SwitchDev |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1979958 |
Description: VxLAN IPv6 offload is not supported over CentOS/RHEL v7.2 OSs. |
Workaround: N/A |
|
Keywords: Tunnel, VXLAN, ASAP, IPv6 |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1991710 |
Description: PRIO_TAG_REQUIRED_EN configuration is not supported and may cause call trace. |
Workaround: N/A |
|
Keywords: ASAP, PRIO_TAG, mstconfig |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1967866 |
Description: Enabling ECMP offload requires the VFs to be unbound and VMs to be shut down. |
Workaround: N/A |
|
Keywords: ECMP, Multipath, ASAP2 |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1821235 |
Description: When using mlx5dv_dr API for flow creation, for flows which execute the "encapsulation" action or "push vlan" action, metadata C registers will be reset to zero. |
Workaround: Use both actions at the end of the flow process. |
|
Keywords: Flow steering |
|
Discovered in Release: 4.7-1.0.0.1 |
|
1921981 |
Description: On Ubuntu, Debian and RedHat 8 and above OSs, parsing the mfa2 file using mstarchive might result in a segmentation fault. |
Workaround: Use mlxarchive to parse the mfa2 file instead. |
|
Keywords: MFT, mfa2, mstarchive, mlxarchive, Ubuntu, Debian, RedHat, operating system |
|
Discovered in Release: 4.7-1.0.0.1 |
|
1840288 |
Description: MLNX_EN does not support XDP features on RedHat 7 OS, despite the declared support by RedHat. |
Workaround: N/A |
|
Keywords: XDP, RedHat |
|
Discovered in Release: 4.7-1.0.0.1 |
|
1892663 |
Description: mlnx_tune script does not support python3 interpreter. |
Workaround: Run mlnx_tune with python2 interpreter only. |
|
Keywords: mlnx_tune, python3, python2 |
|
Discovered in Release: 4.7-1.0.0.1 |
1753629 |
Description: A bonding bug found in Kernels 4.12 and 4.13 may cause a slave to become permanently stuck in BOND_LINK_FAIL state. As a result, the following message may appear in dmesg: bond: link status down for interface eth1, disabling it in 100 ms |
Workaround: N/A |
|
Keywords: Bonding, slave |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1712068 |
Description: Uninstalling MLNX_EN automatically results in the uninstallation of several libraries that are included in the MLNX_EN package, such as InfiniBand-related libraries. |
Workaround: If these libraries are required, reinstall them using the local package manager (yum/dnf). |
|
Keywords: MLNX_EN libraries |
|
Discovered in Release: 4.6-1.0.1.1 |
|
- |
Description: Due to changes in libraries, MFT v4.11.0 and below are not forward compatible with MLNX_EN v4.6-1.0.0.0 and above. Therefore, with MLNX_EN v4.6-1.0.0.0 and above, it is recommended to use MFT v4.12.0 and above. |
Workaround: N/A |
|
Keywords: MFT compatible |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1730840 |
Description: On ConnectX-4 HCAs, GID index for RoCE v2 is inconsistent when toggling between enabled and disabled interface modes. |
Workaround: N/A |
|
Keywords: RoCE v2, GID |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1717428 |
Description: On kernels 4.10-4.14, MTUs larger than 1500 cannot be set for a GRE interface with any driver (IPv4 or IPv6). |
Workaround: Upgrade your kernel to any version higher than v4.14. |
|
Keywords: Fedora 27, gretap, ip_gre, ip_tunnel, ip6_gre, ip6_tunnel |
|
Discovered in Release: 4.6-1.0.1.1 |
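The affected range above (kernels 4.10 through 4.14) can be tested against a kernel release string such as the one printed by uname -r. This is a minimal sketch with illustrative function names. It looks only at the major and minor version numbers, so a distribution kernel with backported fixes may be reported as affected when it is not.

```python
import re


def kernel_version(release):
    """Extract (major, minor) from a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % (release,))
    return int(match.group(1)), int(match.group(2))


def gre_mtu_affected(release):
    """Return True if the release falls in the 4.10-4.14 range."""
    return (4, 10) <= kernel_version(release) <= (4, 14)
```

For instance, gre_mtu_affected("4.12.14-122.46-default") is True, while a 4.15 or 5.x release string gives False.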
|
1748343 |
Description: Driver reload takes several minutes when a large number of VFs exists. |
Workaround: N/A |
|
Keywords: VF, SR-IOV |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1733974 |
Description: Running heavy traffic (such as 'ping flood') while bringing up and down other mlx5 interfaces may result in “INFO: rcu_preempt detected stalls on CPUs/tasks:” call traces. |
Workaround: N/A |
|
Keywords: mlx5 |
|
Discovered in Release: 4.6-1.0.1.1 |
|
- |
Description: On ConnectX-6 HCAs and above, an attempt to configure advertisement (any bitmap) will result in advertising the whole capabilities. |
Workaround: N/A |
|
Keywords: 200GbE, advertisement, Ethtool |
|
Discovered in Release: 4.6-1.0.1.1 |
581631 |
Description: GID entries referenced to by a certain user application cannot be deleted while that user application is running. |
Workaround: N/A |
|
Keywords: RoCE, GID |
|
Discovered in Release: 4.5-1.0.1.0 |
|
1403313 |
Description: Attempting to allocate an excessive number of VFs per PF in operating systems with kernel versions below v4.15 might fail due to a known issue in the Kernel. |
Workaround: Make sure to update the Kernel version to v4.15 or above. |
|
Keywords: VF, PF, IOMMU, Kernel, OS |
|
Discovered in Release: 4.5-1.0.1.0 |
|
1521877 |
Description: On SLES 12 SP1 OSs, a kernel tracepoint issue may cause undefined behavior when inserting a kernel module with a wrong parameter. |
Workaround: N/A |
|
Keywords: mlx5 driver, SLES 12 SP1 |
|
Discovered in Release: 4.5-1.0.1.0 |
504073 |
Description: When using ConnectX-5 with LRO over PPC systems, the HCA might experience back pressure due to delayed PCI Write operations. In this case, bandwidth might drop from line-rate to ~35Gb/s. Packet loss or pause frames might also be observed. |
Workaround: Look for an indication of PCI back pressure (the “outbound_pci_stalled_wr” counter in ethtool advancing). Disabling LRO helps reduce the back pressure and its effects. |
|
Keywords: Flow Control, LRO |
|
Discovered in Release: 4.4-1.0.0.0 |
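A counter is "advancing" when its value grows between two samples. The sketch below compares two captures of ethtool -S <interface> output taken some time apart; the function names are illustrative, and it assumes the usual "name: value" layout of that output.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skips the "NIC statistics:" banner and any non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def counter_advanced(before_text, after_text, counter="outbound_pci_stalled_wr"):
    """Return True/False if the counter grew, or None if it is not reported."""
    before = parse_ethtool_stats(before_text).get(counter)
    after = parse_ethtool_stats(after_text).get(counter)
    if before is None or after is None:
        return None
    return after > before
```

A None result means the counter was absent from at least one sample, which is different from a counter that is present but not advancing.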
|
1424233 |
Description: On RHEL v7.3, 7.4 and 7.5 OSs, setting IPv4-IP-forwarding will turn off LRO on existing interfaces. Turning LRO back on manually using ethtool and adding a VLAN interface may cause a warning call trace. |
Workaround: Make sure IPv4-IP-forwarding and LRO are not turned on at the same time. |
|
Keywords: IPv4 forwarding, LRO |
|
Discovered in Release: 4.4-1.0.1.0 |
|
1442507 |
Description: Retpoline support in GCC causes an increase in CPU utilization, which results in a 15% performance drop in IP forwarding. |
Workaround: N/A |
|
Keywords: Retpoline, GCC, CPU, IP forwarding, Spectre attack |
|
Discovered in Release: 4.4-1.0.1.0 |
|
1425129 |
Description: MLNX_EN cannot be installed on SLES 15 OSs using Zypper repository. |
Workaround: Install MLNX_EN using the standard installation script instead of Zypper repository. |
|
Keywords: Installation, SLES, Zypper |
|
Discovered in Release: 4.4-1.0.1.0 |
|
1241056 |
Description: When working with ConnectX-4/ConnectX-5 HCAs on PPC systems with Hardware LRO and Adaptive Rx support, bandwidth drops from full wire speed (FWS) to ~60Gb/s. |
Workaround: Make sure to disable Adaptive Rx when enabling Hardware LRO by running: ethtool -C <interface> adaptive-rx off, followed by: ethtool -C <interface> rx-usecs 8 rx-frames 128 |
|
Keywords: Hardware LRO, Adaptive Rx, PPC |
|
Discovered in Release: 4.3-1.0.1.0 |
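When the workaround above has to be applied to several interfaces, the two ethtool -C invocations can be generated per interface. This is a minimal sketch: the function name is illustrative, the coalescing values are the ones given in the workaround, and the commands are only built as argument lists, not executed.

```python
def lro_coalesce_commands(interface):
    """Build the two ethtool -C command lines from the workaround."""
    if not interface:
        raise ValueError("interface name must not be empty")
    return [
        # Step 1: turn Adaptive Rx off.
        ["ethtool", "-C", interface, "adaptive-rx", "off"],
        # Step 2: set fixed Rx coalescing parameters.
        ["ethtool", "-C", interface, "rx-usecs", "8", "rx-frames", "128"],
    ]
```

Each returned list can be passed to subprocess.run(..., check=True) with root privileges; the order matters, since Adaptive Rx is disabled before the fixed values are set.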
|
1090612 |
Description: NVMEoF protocol does not support LBA format with non-zero metadata size. Therefore, NVMe namespace configured to LBA format with metadata size bigger than 0 will cause Enhanced Error Handling (EEH) in PowerPC systems. |
Workaround: Configure the NVMe namespace to use LBA format with zero sized metadata. |
|
Keywords: NVMEoF, PowerPC, EEH |
|
Discovered in Release: 4.3-1.0.1.0 |
|
1309621 |
Description: In switchdev mode default configuration, stateless offloads/steering based on inner headers is not supported. |
Workaround: To enable stateless offloads/steering based on inner headers, disable encap by running: devlink dev eswitch set pci/0000:83:00.1 encap disable Or, in case devlink is not supported by the kernel, run: echo none > /sys/kernel/debug/mlx5/<BDF>/compat/encap Note: This is a hardware-related limitation. |
|
Keywords: switchdev, stateless offload, steering |
|
Discovered in Release: 4.3-1.0.1.0 |
|
1275082 |
Description: When setting a non-default IPv6 link local address or an address that is not based on the device MAC, connection establishments over RoCEv2 might fail. |
Workaround: N/A |
|
Keywords: IPV6, RoCE, link local address |
|
Discovered in Release: 4.3-1.0.1.0 |
|
1307336 |
Description: In RoCE LAG mode, when running ibdev2netdev -v, the port state of the second port of the mlx4_0 IB device will read “NA” since this IB device does not have a second port. |
Workaround: N/A |
|
Keywords: mlx4, RoCE LAG, ibdev2netdev, bonding |
|
Discovered in Release: 4.3-1.0.1.0 |
|
1296355 |
Description: Number of MSI-X that can be allocated for VFs and PFs in total is limited to 2300 on Power9 platforms. |
Workaround: N/A |
|
Keywords: MSI-X, VF, PF, PPC, SR-IOV |
|
Discovered in Release: 4.3-1.0.1.0 |
|
1259293 |
Description: On Fedora 20 operating systems, driver load fails with an error message such as: “ [185.262460] kmem_cache_sanity_check (fs_ftes_0000:00:06.0): Cache name already exists. ” This is caused by SLUB allocators grouping multiple slab kmem_cache_create into one slab cache alias to save memory and increase cache hotness. As a result, the slab name is considered stale. |
Workaround: Upgrade the kernel version to kernel-3.19.8-100.fc20.x86_64. Note that after rebooting to the new kernel, you will need to rebuild |
|
Keywords: Fedora, driver load |
|
Discovered in Release: 4.3-1.0.1.0 |
|
1264359 |
Description: When running perftest (ib_send_bw, ib_write_bw, etc.) in rdma-cm mode, the resp_cqe_error counter under /sys/class/infiniband/mlx5_0/ports/1/hw_counters/resp_cqe_error might increase. This behavior is expected and it is a result of receive WQEs that were not consumed. |
Workaround: N/A |
|
Keywords: perftest, RDMA CM, mlx5 |
|
Discovered in Release: 4.3-1.0.1.0 |
|
1264956 |
Description: Configuring SR-IOV after disabling RoCE LAG using sysfs (/sys/bus/pci/drivers/mlx5_core/ |
Workaround: Make sure to disable RoCE LAG once again. |
|
Keywords: RoCE LAG, SR-IOV |
|
Discovered in Release: 4.3-1.0.1.0 |
1263043 |
Description: On RHEL7.4, due to an OS issue introduced in kmod package version 20-15.el7_4.6, parsing the depmod configuration files will fail, resulting in either of the following issues:
|
Workaround: Go to RedHat webpage to upgrade the kmod package version. |
|
Keywords: driver restart, kmod, kmp, nvmf, nvmet_rdma |
|
Discovered in Release: 4.2-1.2.0.0 |
|
- |
Description: Packet Size (Actual Packet MTU) limitation for IPsec offload on Innova IPsec adapter cards: The current offload implementation does not support IP fragmentation. The original packet size should be such that it does not exceed the interface's MTU size after the ESP transformation (encryption of the original IP packet which increases its length) and the headers (outer IP header) are added:
This mostly affects forwarded traffic into smaller MTU, as well as UDP traffic. TCP does PMTU discovery by default and clamps the MSS accordingly. |
Workaround: N/A |
|
Keywords: Innova IPsec, MTU |
|
Discovered in Release: 4.2-1.0.1.0 |
|
- |
Description: No LLC/SNAP support on Innova IPsec adapter cards. |
Workaround: N/A |
|
Keywords: Innova IPsec, LLC/SNAP |
|
Discovered in Release: 4.2-1.0.1.0 |
|
- |
Description: No support for FEC on Innova IPsec adapter cards. When using switches, there may be a need to change their configuration. |
Workaround: N/A |
|
Keywords: Innova IPsec, FEC |
|
Discovered in Release: 4.2-1.0.1.0 |
|
955929 |
Description: Heavy traffic may cause SYN flooding when using Innova IPsec adapter cards. |
Workaround: N/A |
|
Keywords: Innova IPsec, SYN flooding |
|
Discovered in Release: 4.2-1.0.1.0 |
|
- |
Description: Priority Based Flow Control is not supported on Innova IPsec adapter cards. |
Workaround: N/A |
|
Keywords: Innova IPsec, Priority Based Flow Control |
|
Discovered in Release: 4.2-1.0.1.0 |
|
- |
Description: Pause configuration is not supported when using Innova IPsec adapter cards. Default pause is global pause (enabled). |
Workaround: N/A |
|
Keywords: Innova IPsec, Global pause |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1045097 |
Description: Connecting and disconnecting a cable several times may cause a link up failure when using Innova IPsec adapter cards. |
Workaround: N/A |
|
Keywords: Innova IPsec, Cable, link up |
|
Discovered in Release: 4.2-1.0.1.0 |
|
- |
Description: On Innova IPsec adapter cards, supported MTU is between 512 and 2012 bytes. Setting MTU values outside this range might fail or might cause traffic loss. |
Workaround: Set MTU between 512 and 2012 bytes. |
|
Keywords: Innova IPsec, MTU |
|
Discovered in Release: 4.2-1.0.1.0 |
|
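A minimal sketch of a pre-check for the MTU range stated above. The helper name and constants are hypothetical and not part of MLNX_EN; treating both bounds as inclusive is an assumption.

```python
# Hypothetical helper; the bounds are taken from the known-issue description.
# Assumption: both 512 and 2012 are themselves valid (inclusive range).
INNOVA_IPSEC_MIN_MTU = 512
INNOVA_IPSEC_MAX_MTU = 2012


def is_supported_innova_ipsec_mtu(mtu: int) -> bool:
    """Return True if mtu lies within the supported range for Innova IPsec cards."""
    return INNOVA_IPSEC_MIN_MTU <= mtu <= INNOVA_IPSEC_MAX_MTU
```

Running such a check before applying an MTU change avoids the failure or traffic loss described above.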
1125184 |
Description: In older kernels, such as those shipped with Ubuntu 14.04 and RedHat 7.1, the VXLAN interface does not reply to ARP requests for a MAC address that exists in its own ARP table. This issue is fixed in the newer kernels shipped with Ubuntu 16.04 and RedHat 7.3. |
Workaround: N/A |
|
Keywords: ARP, VXLAN |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1134323 |
Description: When using kernel versions older than 4.7 with IOMMU enabled, performance degradation and logical issues (such as soft lockups) might occur under high traffic load. This happens because IOMMU IOVA allocations are centralized, requiring many synchronization operations and high locking overhead among CPUs. |
Workaround: Use kernel v4.7 or above, or a backported kernel that includes the following patches:
|
|
Keywords: IOMMU, soft lockup |
|
Discovered in Release: 4.2-1.0.1.0 |
|
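Whether a host falls under the kernel-version condition above can be estimated from its release string. The sketch below is illustrative only: the function name is hypothetical, and it compares just the major and minor numbers, so it cannot tell whether a backported kernel already carries the relevant patches.

```python
import re


def kernel_older_than(release: str, minimum: tuple) -> bool:
    """Return True if the major.minor version in release is below minimum.

    release is a string such as "3.10.0-693.el7.x86_64";
    minimum is a (major, minor) tuple such as (4, 7).
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release string: {release!r}")
    return (int(match.group(1)), int(match.group(2))) < minimum
```

On a live system the release string can be obtained with `platform.release()` from the Python standard library, for example `kernel_older_than(platform.release(), (4, 7))`.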
1135738 |
Description: On 64k page size setups, DMA memory might run out when trying to increase the ring size/number of channels. |
Workaround: Reduce the ring size/number of channels. |
|
Keywords: DMA, 64K page |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1159650 |
Description: When configuring VF VST, VLAN-tagged outgoing packets are dropped on ConnectX-4 HCAs. On ConnectX-5 HCAs, VLAN-tagged outgoing packets have another VLAN tag inserted. |
Workaround: N/A |
|
Keywords: VST |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1157770 |
Description: On Passthrough/VM machines with relatively old QEMU and libvirtd, CMD timeout might occur upon driver load. After timeout, no other commands will be completed and all driver operations will be stuck. |
Workaround: Upgrade QEMU and libvirtd on the KVM server. The following versions were tested (on Ubuntu 16.10):
|
|
Keywords: QEMU |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1147703 |
Description: Using dm-multipath for High Availability on top of NVMEoF block devices must be done with “directio” path checker. |
Workaround: N/A |
|
Keywords: NVMEoF |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1152408 |
Description: RedHat v7.3 PPCLE and v7.4 PPCLE operating systems do not support KVM qemu out of the box. The following error message will appear when attempting to run virt-install to create new VMs: Cant find qemu-kvm packge to install |
Workaround: Acquire the following rpms from the beta version of 7.4ALT to 7.3/7.4 PPCLE (in the same order):
|
|
Keywords: Virtualization, PPC, Power8, KVM, RedHat, PPC64LE |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1012719 |
Description: A soft lockup in the CQ polling flow might occur when running very high stress on the GSI QP (RDMA-CM applications). This is a transient situation from which the driver will later recover. |
Workaround: N/A |
|
Keywords: RDMA-CM, GSI QP, CQ |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1078630 |
Description: When working in RoCE LAG over kernel v3.10, a kernel crash might occur when unloading the driver while NetworkManager is running. |
Workaround: Stop NetworkManager before unloading the driver, and start it again once the driver unload is complete. |
|
Keywords: RoCE LAG, network manager |
|
Discovered in Release: 4.2-1.0.1.0 |
|
1149557 |
Description: When setting VGT+, the maximal number of allowed VLAN IDs presented in the sysfs is 813 (up to the first 813). |
Workaround: N/A |
|
Keywords: VGT+ |
|
Discovered in Release: 4.2-1.0.1.0 |
995665/1165919 |
Description: In kernels below v4.13, a connection between an NVMEoF host and target cannot be established on a hyper-threaded system with more than one socket. |
Workaround: On the host side, connect to the NVMEoF subsystem using the --nr-io-queues <num_queues> flag. Note that num_queues must be less than or equal to num_sockets multiplied by num_cores_per_socket. |
|
Keywords: NVMEoF |
|
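The bound on num_queues in the workaround above can be sketched as follows. The function names are hypothetical; the inputs are the number of sockets and the number of physical cores per socket (not hyper-threads), which the caller is assumed to know.

```python
def max_nr_io_queues(num_sockets: int, num_cores_per_socket: int) -> int:
    """Upper bound for --nr-io-queues: num_sockets * num_cores_per_socket."""
    if num_sockets < 1 or num_cores_per_socket < 1:
        raise ValueError("socket and core counts must be positive")
    return num_sockets * num_cores_per_socket


def clamp_nr_io_queues(requested: int, num_sockets: int, num_cores_per_socket: int) -> int:
    """Reduce a requested queue count so that it does not exceed the bound."""
    return min(requested, max_nr_io_queues(num_sockets, num_cores_per_socket))
```

For example, on a host with 2 sockets and 8 cores per socket, hyper-threading exposes 32 logical CPUs, but the bound is 16, so a request for 32 queues would be clamped to 16.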
1039346 |
Description: Enabling multiple namespaces per subsystem while using NVMEoF target offload is not supported. |
Workaround: To enable more than one namespace, create a subsystem for each one. |
|
Keywords: NVMEoF Target Offload, namespace |
|
1030301 |
Description: Creating virtual functions on a device that is in LAG mode will destroy the LAG configuration. The bonding device over the Ethernet NICs will continue to work as expected. |
Workaround: N/A |
|
Keywords: LAG, SR-IOV |
|
1047616 |
Description: When the node GUID of a device is set to zero (0000:0000:0000:0000), RDMA_CM user-space applications may crash. |
Workaround: Set node GUID to a nonzero value. |
|
Keywords: RDMA_CM |
|
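A minimal sketch for detecting the all-zero node GUID described above, assuming the GUID is given in the colon-separated form shown in the description. The function name is hypothetical.

```python
def is_zero_node_guid(guid: str) -> bool:
    """Return True if a GUID such as "0000:0000:0000:0000" is all zeros."""
    hex_digits = guid.replace(":", "")
    if len(hex_digits) != 16:
        raise ValueError(f"expected 16 hex digits, got {len(hex_digits)}")
    # int() raises ValueError if any character is not a hex digit.
    return int(hex_digits, 16) == 0
```

A True result means the workaround applies and a nonzero node GUID should be set.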
1051701 |
Description: New versions of iproute which support new kernel features may misbehave on old kernels that do not support these new features. |
Workaround: N/A |
|
Keywords: iproute |
|
1007830 |
Description: When working on Xenserver hypervisor with SR-IOV enabled on it, make sure the following instructions are applied:
|
Workaround: N/A |
|
Keywords: SR-IOV |
|
1005786 |
Description: When using ConnectX-5 adapter cards, the following error might be printed to dmesg, indicating temporary lack of DMA pages:
“mlx5_core ... give_pages:289:(pid x): Y pages alloc time exceeded the max permitted duration
mlx5_core ... page_notify_fail:263:(pid x): Page allocation failure notification on func_id(z) sent to fw
mlx5_core ... pages_work_handler:471:(pid x): give fail -12”
Example: This might happen when trying to open more than 64 VFs per port. |
Workaround: N/A |
|
Keywords: mlx5_core, DMA |
|
1008066/1009004 |
Description: Performing some operations on the user end during reboot might cause a call trace or panic, due to bugs in the Linux kernel. For example: running get_vf_stats (via iptool) during reboot. |
Workaround: N/A |
|
Keywords: mlx5_core, reboot |
|
1009488 |
Description: Mounting MLNX_EN to a path that contains special characters, such as parentheses or spaces, is not supported. For example, when mounting MLNX_EN to “/media/CDROM(vcd)/”, installation will fail and the following error message will be displayed:
# cd /media/CDROM\(vcd\)/
# ./install
sh: 1: Syntax error: "(" unexpected |
Workaround: N/A |
|
Keywords: Installation |
|
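A mount path can be screened before installation with a check like the one below. This is a sketch only: the helper name is hypothetical, and it covers just the characters named in the description (parentheses and spaces), while other special characters may also be affected.

```python
# Only the characters named in the known issue; not an exhaustive list.
UNSUPPORTED_CHARS = frozenset("() ")


def find_unsupported_chars(path: str) -> list:
    """Return the sorted, de-duplicated unsupported characters found in path."""
    return sorted({ch for ch in path if ch in UNSUPPORTED_CHARS})
```

An empty result means none of the named characters are present; otherwise, remounting to a path without them avoids the installation failure.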
982144 |
Description: When offload traffic sniffer is on, the bandwidth could decrease up to 50%. |
Workaround: N/A |
|
Keywords: Offload Traffic Sniffer |
|
981362 |
Description: On several OSs, setting the number of TCs via the tc tool is not supported. |
Workaround: Set the number of TC via the /sys/class/net/ |
|
Keywords: Ethernet, TC |
|
979457 |
Description: When setting IOMMU=ON, a severe performance degradation may occur due to a bug in IOMMU. |
Workaround: Make sure the following patches are found in your kernel:
Note: These patches are already available in Ubuntu 16.04.02 and 17.04 OSs. |
|
Keywords: Performance, IOMMU |