Known Issues
The following table lists general limitations and known issues in the current release.
Internal Ref. Number |
Issue |
3916656 |
Description: If a multicast join request is received during a Heavy Sweep, the InfiniBand Subnet Manager 5.19 may hang. |
Keywords: Multicast, Subnet Manager |
|
Workaround: Restart the Subnet Manager. For a comprehensive solution, upgrade to Subnet Manager version 5.19.1, which includes the fix and is available from the NVIDIA website, or contact NVIDIA Support. |
|
Discovered in Release: 24.04-0.6.6.0 |
|
3856101 |
Description: In Debian 12, using dhcpcd instead of dhclient to configure the network interface (using NetworkManager) results in an incorrect network interface configuration. |
Keywords: dhcpcd, dhclient, Debian 12, NetworkManager |
|
Workaround: Use dhclient to configure the network interface. |
|
Discovered in Release: 24.04-0.6.6.0 |
|
3964215 |
Description: The driver might try to access privileged registers, resulting in an error with syndrome. |
Keywords: N/A |
|
Workaround: Unbind and bind the function or restart the driver. |
|
Discovered in Release: 24.04-0.6.6.0 |
|
3640907 |
Description: When using a kernel version lower than v5.5, application termination on PCIe Gen5 servers could lead to kernel problems, such as IOMMU call traces, because of a lack of support in the AMD IOMMU kernel component. |
Keywords: PCIe Gen5, IOMMU, Call Trace |
|
Workaround: To resolve the issue either:
or
|
|
Discovered in Release: 24.04-0.6.6.0 |
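Whether a host falls under the "kernel version lower than v5.5" condition of issue 3640907 can be checked with a short script. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes GNU `sort -V` is available for version comparison.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of MLNX_OFED.

# Return success (0) if version $1 is strictly lower than version $2.
# Relies on GNU `sort -V` (version sort), which is assumed to be available.
kernel_lower_than() {
    local current="$1" threshold="$2"
    [ "$current" != "$threshold" ] &&
        [ "$(printf '%s\n%s\n' "$current" "$threshold" | sort -V | head -n 1)" = "$current" ]
}

# Strip the distribution suffix from a `uname -r` string,
# e.g. "5.4.0-150-generic" -> "5.4.0".
kernel_base_version() {
    printf '%s\n' "${1%%-*}"
}

# Example (commented out so that sourcing this file has no side effects):
# if kernel_lower_than "$(kernel_base_version "$(uname -r)")" "5.5"; then
#     echo "Kernel is older than v5.5: issue 3640907 may apply on PCIe Gen5 servers"
# fi
```

The check only reports the kernel version. It does not detect PCIe Gen5 hardware or the AMD IOMMU component.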
|
3004304 |
Description: Setting the NVMe num_p2p_queues module parameter to a value greater than 0 may cause a harmless "irq #XXX: nobody cared" warning followed by a Call Trace. |
Keywords: NVMe, Call Trace, num_p2p_queues |
|
Workaround: N/A |
|
Discovered in Release: 24.01-0.3.3.1 |
|
3735400 |
Description: The NVMF connect command does not work on IB setups when Adaptive Routing (AR) is enabled, because Protection Information (PI), which NVMF uses, and AR are not supported simultaneously. |
Keywords: NVMF connect, PI, Adaptive Routing |
|
Workaround: Disable AR in OpenSM or, alternatively, disable PI in nvme_rdma using a new module parameter. |
|
Discovered in Release: 24.01-0.3.3.1 |
|
3774149 |
Description: In some cases, there could be a race condition between RDMA_WRITE and shared memory write, leading to the MPI receiving invalid data with large messages or collective operations between ranks on the same node. |
Keywords: Race condition, RDMA_WRITE, shared memory write |
|
Workaround: Set UCX_RNDV_SCHEME=get_zcopy to force the use of the RDMA_READ protocol. |
|
Discovered in Release: 24.01-0.3.3.1 |
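As an illustration of the workaround for issue 3774149, the variable can be exported in the shell that launches the MPI job. This is a sketch only: the commented launch line assumes Open MPI's `mpirun` (whose `-x` option forwards an environment variable to the ranks), and `./my_mpi_app` is a hypothetical application name.

```shell
# Force the rendezvous protocol to use RDMA_READ (value taken from the workaround).
export UCX_RNDV_SCHEME=get_zcopy

# Example launch, assuming Open MPI; ./my_mpi_app is a placeholder:
# mpirun -x UCX_RNDV_SCHEME -np 2 ./my_mpi_app
```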
|
3565433 |
Description: An error may occur when creating a DCI due to oversized WQEs. This is caused by a loose enforcement of the allowed max quantity of SGEs. |
Keywords: DCI, SGEs |
|
Workaround: N/A |
|
Discovered in Release: 24.01-0.3.3.1 |
|
3732632 |
Description: Geneve offload does not operate together with FLEX_PARSER. |
Keywords: Geneve offload, FLEX_PARSER |
|
Workaround: Make sure that the firmware is appropriately configured by verifying that the FLEX_PARSER_PROFILE_ENABLE mlxconfig flag is set to 0. |
|
Discovered in Release: 24.01-0.3.3.1 |
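The FLEX_PARSER_PROFILE_ENABLE check from issue 3732632 could be scripted roughly as follows. This is a sketch under assumptions: the helper name is hypothetical, the device path is a placeholder, and the parser assumes that the `mlxconfig` query output lists each parameter on its own line as a name followed by its value.

```shell
# Hypothetical helper: read `mlxconfig ... query` output on stdin and print
# the value of FLEX_PARSER_PROFILE_ENABLE.
# Assumption: each parameter appears as "<NAME> <VALUE>" separated by whitespace.
flex_parser_profile() {
    awk '$1 == "FLEX_PARSER_PROFILE_ENABLE" { print $2; exit }'
}

# Example (commented out; <device> is a placeholder to be filled in by the user):
# mlxconfig -d /dev/mst/<device> query | flex_parser_profile
# mlxconfig -d /dev/mst/<device> set FLEX_PARSER_PROFILE_ENABLE=0
```

If the helper prints anything other than 0, the flag would need to be set to 0 as described in the workaround.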
|
3644590 |
Description: When working in switchdev mode, the number of XFRM IN rules that can be added is limited to 2047. |
Keywords: switchdev mode, XFRM IN rules |
|
Workaround: N/A |
|
Discovered in Release: 24.01-0.3.3.1 |
|
3563584 |
Description: In case of a steering loop, the packet would loop indefinitely, causing a device hang. |
Keywords: Steering loop |
|
Workaround: Enable firmware infinite loop protection. |
|
Discovered in Release: 24.01-0.3.3.1 |
3678715 |
Description: When attempting to restart the drivers using the openibd service while the nvme_rdma module is loaded, the process may fail. This behavior is intentional, as unloading nvme_rdma during the driver restart can lead to connectivity issues in other applications within the setup. |
Keywords: openibd service, nvme_rdma module |
|
Workaround: Manually unload the nvme_rdma module before performing the driver restart. This can be achieved using the modprobe -r nvme_rdma command. |
|
Discovered in Release: 23.10-1.1.9.0 |
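The manual unload described for issue 3678715 can be guarded so that `modprobe -r` runs only when the module is actually loaded. A minimal sketch, with a hypothetical helper name; the second argument exists only so the check can be exercised against a sample file.

```shell
# Hypothetical helper: return success if module $1 is listed in the modules
# file $2 (defaults to /proc/modules, where each line starts with the module
# name followed by a space).
module_loaded() {
    local name="$1" modules_file="${2:-/proc/modules}"
    grep -q "^${name} " "$modules_file"
}

# Example (run as root; commented out so sourcing has no side effects):
# if module_loaded nvme_rdma; then
#     modprobe -r nvme_rdma
# fi
# /etc/init.d/openibd restart
```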
|
3676223 |
Description: When using kernel version 4.12 or above, it is advised to run "echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_drivers_autoprobe" to avoid VF probing. |
Keywords: VF probing |
|
Workaround: N/A |
|
Discovered in Release: 23.10-1.1.9.0 |
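The command in issue 3676223 hard-codes one PCI address. A small wrapper, sketched below with a hypothetical function name, takes the PF address as an argument and checks that the sysfs attribute is writable before touching it. The optional second argument (an alternative sysfs root) is an assumption added purely for testing.

```shell
# Hypothetical helper: disable VF driver autoprobe for the PF at PCI address $1.
# $2 is the sysfs root (defaults to /sys; overridable for testing).
disable_vf_autoprobe() {
    local pci_addr="$1" sysfs_root="${2:-/sys}"
    local knob="${sysfs_root}/bus/pci/devices/${pci_addr}/sriov_drivers_autoprobe"
    [ -w "$knob" ] || { echo "not writable: $knob" >&2; return 1; }
    echo 0 > "$knob"
}

# Example (as root, before enabling VFs):
# disable_vf_autoprobe 0000:08:00.0
```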
|
3682658 |
Description: While using the RDMA-CM user application and the AF_IB parameter, the kernel uses only the first byte of the private data to set the CMA version. In such a scenario, any user data written to this byte will be overwritten. |
Keywords: RDMA-CM user application, AF_IB, private data |
|
Workaround: Do not use AF_IB for the application's private data. |
|
Discovered in Release: 23.10-0.5.5.0 |
|
3640082 |
Description: A potential null pointer dereference might occur due to a missing update in the PCI subsystem code when creating the maximum number of VFs. All kernel versions lacking the following fix are impacted: "PCI: Avoid enabling PCI atomics on VFs." |
Keywords: Maximal VF number |
|
Workaround: N/A |
|
Discovered in Release: 23.10-0.5.5.0 |
|
3653417 |
Description: When offloading IPsec policy rules while in legacy mode there are two options:
2. Changing the steering mode to firmware steering will return unsupported. |
Keywords: IPsec, legacy mode |
|
Workaround: Perform a devlink reload after changing the steering mode. |
|
Discovered in Release: 23.10-0.5.5.0 |
|
3612274 |
Description: Currently, either IPsec offload or TC offload is allowed for a specific interface, but not both. Offloading a TC rule to an interface fails if an IPsec rule is already offloaded on it, and vice versa. |
Keywords: IPsec offload, TC offload |
|
Workaround: N/A |
|
Discovered in Release: 23.10-0.5.5.0 |
|
3596126 |
Description: OVS mirroring of both egress and ingress together with modified TTL is not supported by ConnectX-5 cards, and may cause packet checksum issues and errors in the dmesg output. |
Keywords: OVS mirroring, ConnectX-5 |
|
Workaround: N/A |
|
Discovered in Release: 23.10-0.5.5.0 |
|
3538463 |
Description: A kernel ABI problem in SLES15 SP4 may lead to issues during driver start. This impacts kernels starting from version 5.14.21-150400.24.11.1 up to version 5.14.21-150400.24.63.1 (July 2022 to May 2023), inclusive. For more information, see https://www.suse.com/support/kb/doc/?id=000021137. |
Keywords: Kernel ABI, SLES15 SP4, driver start |
|
Workaround: Upgrade to a kernel version newer than 5.14.21-150400.24.63.1 (May 2023). |
|
Discovered in Release: 23.10-0.5.5.0 |
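The affected kernel range of issue 3538463 can be expressed as an inclusive version-range test. The sketch below uses hypothetical function names and assumes GNU `sort -V`. It expects the kernel package version in the same form as the bounds quoted in the description, which may differ from the string printed by `uname -r`.

```shell
# version_le A B: success if A <= B under GNU version sort (assumed available).
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Success if kernel package version $1 lies within the affected range, inclusive.
sles15sp4_kernel_affected() {
    local v="$1"
    version_le "5.14.21-150400.24.11.1" "$v" &&
        version_le "$v" "5.14.21-150400.24.63.1"
}

# Example (commented out; assumes the kernel-default package is the one in use):
# v="$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' kernel-default | sort -V | tail -n 1)"
# sles15sp4_kernel_affected "$v" && echo "Kernel $v is in the affected range"
```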
|
3637252 |
Description: When running over RHEL7.6 with excessive RDMA/RoCE workload, kernel warnings may be triggered. |
Keywords: RHEL7.6, RDMA, RoCE |
|
Workaround: N/A |
|
Discovered in Release: 23.10-0.5.5.0 |
3046655 |
Description: A package manager upgrade with zypper (on an SLES system) may prompt a question about vendor change from "Mellanox Technologies" to "OpenFabrics". |
Keywords: Installation, SLES |
|
Workaround: Either accept the prompted change, or add the /etc/zypp/vendors.d/mlnx_ofed file with the following content: [main] vendors = Mellanox,OpenFabrics |
|
Discovered in Release: 23.07-0.5.0.0 |
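The vendor file mentioned in the workaround for issue 3046655 has two lines: a `[main]` section header and the `vendors` entry. One possible way to create it is sketched below. The function name is hypothetical, and the optional directory argument is an assumption added so the helper can be tried outside /etc.

```shell
# Hypothetical helper: write the vendor-equivalence file for zypper.
# $1 is the target directory (defaults to /etc/zypp/vendors.d).
write_zypp_vendor_file() {
    local dir="${1:-/etc/zypp/vendors.d}"
    mkdir -p "$dir" || return 1
    printf '[main]\nvendors = Mellanox,OpenFabrics\n' > "${dir}/mlnx_ofed"
}

# Example (as root, before running the zypper upgrade):
# write_zypp_vendor_file
```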
|
3392477 |
Description: The ConnectX-7 firmware embedded in this MLNX_OFED version cannot be burnt using the MLNX_OFED installer script. |
Keywords: ConnectX-7, MLNX_OFED installer script |
|
Workaround: Download and install the dedicated firmware from https://network.nvidia.com/support/firmware/connectx7ib/ |
|
Discovered in Release: 23.07-0.5.0.0 |
|
3532756 |
Description: The kernel may crash when restarting the driver while IPsec rules are configured. |
Keywords: IPsec |
|
Workaround: Flush the IPsec configuration before reloading the driver by running "ip xfrm state flush" followed by "ip xfrm policy flush". |
|
Discovered in Release: 23.07-0.5.0.0 |
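The flush-then-restart sequence for issue 3532756 can be wrapped so that the driver is restarted only if both flush commands succeed. This is a sketch with hypothetical function names. The DRY_RUN switch is an assumption added so the sequence can be previewed without root privileges, and the restart command is the openibd init script referenced elsewhere in this document.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Flush IPsec state and policy, then restart the driver.
# Each step runs only if the previous one succeeded.
flush_ipsec_and_restart() {
    run ip xfrm state flush &&
        run ip xfrm policy flush &&
        run /etc/init.d/openibd restart
}

# Example (as root):
# flush_ipsec_and_restart
```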
|
3472979 |
Description: When a large number of virtual functions are present, the output of the "ip link show" command may be truncated. |
Keywords: virtual functions, ip link show |
|
Workaround: N/A |
|
Discovered in Release: 23.07-0.5.0.0 |
|
3413938 |
Description: When using the mlnx-sf script, creating and deleting an SF with the same ID number in a stressful manner may cause the setup to hang due to a race between the create and delete commands. |
Keywords: Hang; mlnx-sf |
|
Workaround: N/A |
|
Discovered in Release: 23.07-0.5.0.0 |
|
3461572 |
Description: Configuring Multiport Eswitch LAG mode can be performed only via devlink from this release onwards. The compat sysfs should not be used to configure mpesw LAG. |
Keywords: Multiport Eswitch, compat sysfs, mpesw LAG |
|
Workaround: N/A |
|
Discovered in Release: 23.07-0.5.0.0 |
|
3464337 |
Description: Simultaneously adding or removing TC rules while operating on kernel version 6.3 could potentially result in stability issues. |
Keywords: ASAP, rules, TC |
|
Workaround: Make sure the following fix is part of the kernel: https://lore.kernel.org/netdev/20230504181616.2834983-3-vladbu@nvidia.com/T/ |
|
Discovered in Release: 23.07-0.5.0.0 |
|
3469484 |
Description: Mirror and connection tracking (CT) offload actions are not supported simultaneously if the kernel version does not support hardware miss to TC actions. Thus, when performing a CT offload test, the actual number of offloaded connections may be lower than expected. |
Keywords: ASAP, CT offload |
|
Workaround: The offending commit is "net/sched: act_ct: offload UDP NEW connections". Make sure that https://www.spinics.net/lists/stable-commits/msg303536.html is in the kernel tree to fix this issue. |
|
Discovered in Release: 23.07-0.5.0.0 |
|
3473331 |
Description: When performing a CT offload test, the actual number of offloaded connections may be lower than expected. |
Keywords: ASAP, CT offload |
|
Workaround: The fix is external to the driver. The offending commit is "net/sched: act_ct: offload UDP NEW connections". Make sure that https://www.spinics.net/lists/stable-commits/msg303536.html is in the kernel tree to fix this issue. |
|
Discovered in Release: 23.07-0.5.0.0 |
3360710 |
Description: Configuring PFC in parallel with buffer size and prio2buffer commands may lead to misalignment between firmware and software with regard to receiving buffer ownership. |
Keywords: NetDev, PFC, Buffer Size, prio2buffer |
|
Workaround: First, configure PFC on all ports, and then perform other needed QoS (i.e., buffer_size or prio2buffer) configurations accordingly. |
|
Discovered in Release: 23.04-0.5.3.3 |
|
3413879 |
Description: OpenSM may not be started automatically if chkconfig was not installed before OpenSM is installed. Note, however, that chkconfig will fail to install if the directory (rather than symbolic link to directory) /etc/init.d already exists (e.g., from a previous installation of MLNX_OFED). |
Keywords: Installation, OpenSM, chkconfig |
|
Workaround: Install chkconfig before installing MLNX_OFED. If installing it fails, make sure /etc/init.d does not exist at the time of installing it. |
|
Discovered in Release: 23.04-0.5.3.3 |
|
3424596 |
Description: On SLES 15.4, installing MLNX_OFED using a package repository (with zypper) may trigger an error message about a missing dependency for 'librte_eal.so.20.0()(64bit)'. This is because the inbox package libdpdk-20_0 is being uninstalled as it is incompatible with the MLNX_OFED rdma-core packages. |
Keywords: Installation, SLES 15.4 |
|
Workaround: Uninstall the relevant packages: 'zypper uninstall libdpdk-20_0' before installing MLNX_OFED. This will also remove the inbox openvswitch package. |
|
Discovered in Release: 23.04-0.5.3.3 |
|
3433416 |
Description: On systems that were installed with MLNX_OFED 5.9 or older and include a CUDA package (ucx-cuda / hcoll-cuda), an upgrade to MLNX_OFED 23.04 using the package manager ("yum") method will fail. This is because MLNX_OFED up to 5.9 is built with CUDA 11. MLNX_OFED 23.04 is built with CUDA 12 and those CUDA versions are incompatible. |
Keywords: Installation, CUDA, yum |
|
Workaround: Remove the CUDA packages included with OFED (ucx-cuda, hcoll-cuda) before upgrading. This allows MLNX_OFED to be upgraded regardless of the installed CUDA version. To install them later, CUDA 12 must be installed on the system. |
|
Discovered in Release: 23.04-0.5.3.3 |
|
3420831 |
Description: mlx-steering-dump is not supported on systems in which Python3 is not the default. |
Keywords: mlx-steering-dump, Python3 |
|
Workaround: N/A |
|
Discovered in Release: 23.04-0.5.3.3 |
|
3351989 |
Description: If the underlying persistent device name exceeds 15 characters in length, the operating system will not be able to perform renaming (i.e., the device name will remain "eth |
Keywords: Persistent Interface Names |
|
Workaround: Add the --copy-ifnames-udev flag to the OFED installation command. Note that this flag is only applicable if the persistent name provided by the kernel, without the 'np |
|
Discovered in Release: 23.04-0.5.3.3 |
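The 15-character limit behind issue 3351989 follows from the kernel's interface name buffer (IFNAMSIZ), which is 16 bytes including the terminating NUL. A candidate persistent name can be checked before installation with a one-line test. The function name below is hypothetical.

```shell
# Hypothetical helper: success if interface name $1 fits within the
# 15 visible characters that the kernel allows.
ifname_fits() {
    [ "${#1}" -le 15 ]
}

# Example:
# ifname_fits "enp59s0f0np0" || echo "name too long; renaming will not take effect"
```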
3324094 |
Description: When working in legacy rq (striding rq off), with large MTU > 3712, a 10-20% degradation in performance might be seen when running UDP stream with 64 bytes message size. |
Keywords: NetDev, MTU, UDP Stream |
|
Workaround: N/A |
|
Discovered in Release: 5.9-0.5.6.0 |
|
3313137 |
Description: Virtual Functions depend on Physical Functions for device access (e.g., firmware host PAGE management). In addition, a VF may need to safely access the PF 'driver data' to use the command interface, as in the VFIO usage to support live migration. While the PF is missing its driver, the VFs are completely unusable. As such, upon PF unload, SR-IOV is disabled by the PF itself. This is the standard, widely seen behavior in Linux drivers today. |
Keywords: Core, SR-IOV, VF, PF |
|
Workaround: N/A |
|
Discovered in Release: 5.9-0.5.6.0 |
|
3320947 |
Description: When the system is overloaded, it is possible for one hour to pass between the creation of a DevLink port and its usage/assignment, due to some locking. This will trigger a trace starting with: "Type was not set for devlink port." |
Keywords: Core, DevLink, System Overload |
|
Workaround: N/A |
|
Discovered in Release: 5.9-0.5.6.0 |
|
3046222 |
Description: Installing OFED with Open vSwitch packages fails on Ubuntu22 when inbox Open vSwitch is installed. Inbox Open vSwitch packages should be removed first. |
Keywords: Installation, Ubuntu22 |
|
Workaround: Use --with-openvswitch flag along with the installation command. |
|
Discovered in Release: 5.9-0.5.6.0 |
|
3262725 |
Description: Devlink reload while deleting a namespace may cause a deadlock on kernels older than Linux 6.0. |
Keywords: Devlink, Namespace |
|
Workaround: N/A |
|
Discovered in Release: 5.9-0.5.6.0 |
|
3253255 |
Description: RHEL 7 does not include built-in support for Python3. There are two potential ways to install it, and each installs a package with a different name: 1. EPEL for RHEL7: python36. 2. RHEL extra repository. Python3 support is needed for using Pyverbs and the Python support of Open vSwitch. MLNX_OFED assumes that on RHEL7.x, if Python3 is used, it is python36 from EPEL (otherwise the optional Python3 support cannot be used). |
Keywords: RHEL7, Python3 |
|
Workaround: To use Python3 support on RHEL7, install python36 from the RHEL7 EPEL repository. |
|
Discovered in Release: 5.9-0.5.6.0 |
3215514 |
Description: On EulerOS 2.0SP11, installation with the yum method may fail with an error that mlnx-iproute2 is missing a dependency on libdb-5.3.so()(64bit). |
Keywords: Installation, EulerOS 2.0SP11, yum |
|
Workaround: Install in advance the mlnx-iproute2 package with rpm and with the --nodeps option. For example: rpm -Uv --nodeps RPMS/mlnx-iproute2-5.19.0-1.58101.x86_64.rpm |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3191223 |
Description: In old kernels, /etc/init.d/openibd stop will fail because of an existing TC rule. Because mlx5_ib is already unloaded, mlx5_core and mlx5_ib will be in an inconsistent state. |
Keywords: ASAP2, eSwitch, TC Rules |
|
Workaround: Set eSwitch mode to legacy before enabling SR-IOV or reload mlx5_core to change eSwitch mode to legacy. |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3199628 |
Description: ping -6 -i <interface name> is broken in v5.18. |
Keywords: NetDev, -i flag |
|
Workaround: In all operating systems that are running Kernel 5.18 and below, remove the -i flag. |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3002932 |
Description: Jumbo MTU must be set on all uplinks (i.e., uplinks of *_sf and *_sf_r) at all times. |
Keywords: NetDev, MTU, Uplink |
|
Workaround: Configure jumbo MTU (9216) on all uplink-related interfaces. |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3130859 |
Description: The yum install method might be broken on an installer regenerated with --add-kernel-support-build-only. |
Keywords: Installation, yum |
|
Workaround: Delete the original mlnx-ofed-all-5.* package and recreate the repository with: createrepo RPMS/ |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3149387 |
Description: The package neohost-backend (included in MLNX_OFED) has a strict dependency on Python 2.7 and on the existence of /usr/bin/python. The dependency arises because a pre-installation test for /usr/bin/python (a rather non-standard method) fails the installation if Python 2.7 is absent. As a result, default installation of this package has been disabled on newer systems that do not default to Python 2. If the installation is explicitly requested using the command-line option --with-neohost-backend, this sanity check is overridden and installation is attempted regardless. On newer systems, /usr/bin/python is likely to be missing even if Python 2 is installed, in which case the installation fails. |
Keywords: Installation, Python 2 |
|
Workaround: If neohost-backend is needed on a newer system, install Python 2 in advance and create the symbolic link /usr/bin/python -> python2. |
|
Discovered in Release: 5.8-1.0.1.1 |
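The symbolic link from the workaround for issue 3149387 can be created defensively, so that an existing /usr/bin/python is never overwritten. The following is a sketch with a hypothetical function name. The optional directory argument is an assumption added for testing, and Python 2 itself must already be installed.

```shell
# Hypothetical helper: create the "python -> python2" symlink in directory $1
# (defaults to /usr/bin) unless a "python" entry already exists there.
link_python2() {
    local bindir="${1:-/usr/bin}"
    if [ -e "${bindir}/python" ] || [ -L "${bindir}/python" ]; then
        echo "${bindir}/python already exists; leaving it untouched" >&2
        return 0
    fi
    ln -s python2 "${bindir}/python"
}

# Example (as root, after installing Python 2):
# link_python2
```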
|
3213777 |
Description: Oracle Enterprise Linux version 9.0 generates kernel module packages that have dependencies that are not provided by their own kernel RPM packages and thus are not installable. |
Keywords: Installation, Oracle Enterprise Linux v9.0 |
|
Workaround: N/A |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3229904 |
Description: Restarting the driver fails to load the OFED modules after installing OFED on SLES15sp4 with errata kernel 5.14.21-150400.24.21-default. |
Keywords: Installation |
|
Workaround: Install OFED with --add-kernel-support flag. |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3189424 |
Description: VLAN naming is limited to 16 characters (like all other interface names). For names longer than 16 characters, the kernel generates its own interface name VLAN (VID). |
Keywords: Core, VLAN, Interface Name |
|
Workaround: Select a name that complies with the 16-character limit. |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3220855 |
Description: Creating external SFs on BF ARM when the host (x86) operating system does not support SFs may cause the host to crash. |
Keywords: Core, Scalable Functions |
|
Workaround: N/A |
|
Discovered in Release: 5.8-1.0.1.1 |
|
3239291 |
Description: In some topologies, like logical partitions, mlxfwreset is not supported. |
Keywords: Core, mlxfwreset |
|
Workaround: N/A |
|
Discovered in Release: 5.8-1.0.1.1 |
3114823 |
Description: The first attempt to create a new iSER connection fails with the following message in dmesg: "iSCSI Login timeout on Network Portal <iSER_Target_IP_ADDR>:3260". After the error, the iSER Initiator connects to the Target successfully, but the memory allocated for the first connection is not freed correctly. As a result, the failed attempt also causes a memory leak.
The error happens due to a bug in the scsi_transport_iscsi module, which is not a part of MLNX_OFED. As such, the issue cannot be fixed in MLNX_OFED. The bug is already fixed in kernel 5.19 by the commit f6eed15f3ea7 ("scsi: iscsi: Exclude zero from the endpoint ID range"). |
Workaround: Update the kernel if the above errors are experienced. If the issue is still reproduced after the kernel update, ask your distro support to apply the bug fix from the upstream kernel. |
|
Keywords: iSER Initiator |
|
Discovered in Release: 5.7-1.0.2.0 |
|
3096911 |
Description: Installing chkconfig on RHEL9.0 with OFED using yum fails (chkconfig creates the /etc/init.d symlink and OFED creates files in this directory, causing a conflict). |
Workaround: Install chkconfig before OFED. |
|
Keywords: Installation |
|
Discovered in Release: 5.7-1.0.2.0 |
|
3100544 |
Description: On a RHEL9.x system, in some cases where inbox modules do not match the drivers being built, rebuilding the drivers (--add-kernel-support) works, but installing the built package fails with many errors such as: "kernel(__rdma_block_iter_next) = 0x8e7528da is needed by mlnx-ofa_kernel-modules-5.6-OFED.5.6.2.0.9.1.kver.5.14.0_70.13.1.el9_0.aarch64.aarch64". This is caused by a bug in the script that creates the Requires and Provides headers, which is confused by dependencies between different modules of the same external package. |
Workaround: dnf install kernel-modules- |
|
Keywords: Installation, RHEL9.x |
|
Discovered in Release: 5.7-1.0.2.0 |
|
3132158 |
Description: Building the rdma-core package on Rocky 8.6 causes the OFED build to fail. |
Workaround: N/A |
|
Keywords: Installation |
|
Discovered in Release: 5.7-1.0.2.0 |
|
3137440 |
Description: The Python package is missing and must be installed manually. |
Workaround: Install Python before starting the build. |
|
Keywords: Installation, Python |
|
Discovered in Release: 5.7-1.0.2.0 |
|
3141506 |
Description: kernel-macros package does not support building with KMP enabled. KMP needs to be disabled. |
Workaround: Build and install MOFED with KMP disabled (without --kmp flag). |
|
Keywords: Installation |
|
Discovered in Release: 5.7-1.0.2.0 |
|
3129627 |
Description: Kernel module packaging is not supported in CtyunOS. |
Workaround: N/A |
|
Keywords: Installation |
|
Discovered in Release: 5.7-1.0.2.0 |
|
2971708 |
Description: For OSs in which Devlink supports setting roce-enable/disable, both sysfs roce_enable show and sysfs roce_enable set are disabled, and the RoCE state must be managed exclusively via Devlink. The sysfs interface for roce-enable/disable will be removed entirely for these OSs in a future release. To determine if Devlink can be used to enable or disable RoCE, execute the following console command after starting OFED:
Devlink supports roce enable/disable if the following line is reflected in the output:
For OSs which do not allow enabling/disabling RoCE via Devlink, the sysfs interface behaves as in the previous 2 releases:
|
Workaround: N/A |
|
Keywords: Enabling/Disabling RoCE |
|
Discovered in Release: 5.7-1.0.2.0 |
2998194 |
Description: On some systems with many (e.g., 64) virtual functions (VFs) attached to a ConnectX interface, 'ip link' may give an error message: "Error: Buffer too small for object." This applies to both IP commands: the inbox iproute package in RHEL8.x and the mlnx-iproute2 package from MLNX_OFED. This is known to work well and not give an error in RHEL7.x kernel regardless of what user-space package is used (including user-space from RHEL8.x). |
Workaround: N/A |
|
Keywords: NetDev, RHEL, Virtual Functions |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3045436 |
Description: Rebooting the host while the Arm is down may block the shutdown flow till the Arm is up. |
Workaround: Restart the driver on the host side before reboot. |
|
Keywords: Reboot, Arm |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3040350 |
Description:
|
Workaround:
|
|
Keywords: OVS-DPDK, Bridge, Offload |
|
Discovered in Release: 5.6-1.0.3.3 |
|
2973726 |
Description: dec_ttl works only with ConnectX-6. It does not work with ConnectX-5. |
Workaround: N/A |
|
Keywords: OVS-DPDK, dec_ttl |
|
Discovered in Release: 5.6-1.0.3.3 |
|
2946873 |
Description: Moving to switchdev mode while deleting namespace may cause a deadlock. |
Workaround: Unload mlx5_ib module before moving to Switchdev mode. |
|
Keywords: ASAP2, Switchdev, Namespace |
|
Discovered in Release: 5.6-1.0.3.3 |
|
2811957 |
Description: If a system is run from a network boot and is connected to the network storage through an NVIDIA ConnectX card, unloading the mlx5_core driver (such as running '/etc/init.d/openibd restart') will render the system unusable and should therefore be avoided. |
Workaround: N/A |
|
Keywords: Installation, mlx5_core |
|
Discovered in Release: 5.6-1.0.3.3 |
|
2979243 |
Description: The kernel in CentOS 7.6alt (for non-x86 architectures) is different from that of RHEL 7.6alt. Some of the MLNX_OFED kernel modules that were built for the RHEL7.6alt kernel will not load on a system with the CentOS7.6alt kernel. If you want to install MLNX_OFED on such a system, you should use ./mlnxofedinstall --add-kernel-support to rebuild the kernel modules for the CentOS kernel. |
Workaround: Use --add-kernel-support. |
|
Keywords: Installation, CentOS |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3011440 |
Description: In Debian 11.2, Ubuntu 21.10, and Ubuntu 22.04, attempting to install an "exact" type of metapackage (such as mlnx-ofed-all-exact or mlnx-ofed-basic-exact) may fail with an error regarding the version of mstflint. |
Workaround: Install also mstflint of the exact same version (e.g., apt install mlnx-ofed-all-exact mstflint=4.16.0-1.56xxxx). |
|
Keywords: Installation, Debian, Ubuntu, MST |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3024520 |
Description: The option --copy-ifnames-udev copies some files under /etc (/etc/udev/rules.d/82-net-setup-link.rules and /etc/infiniband/vf-net-link-name.sh) that are never removed, neither when this option is not given nor upon uninstallation. Those scripts are merely examples. They are files under /etc to be maintained by the user. |
Workaround: Remove the files, if needed. |
|
Keywords: Installation |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3046601 |
Description: When rebuilding the kernel modules (--add-kernel-support), some kernel versions (specifically mainline 4.14) do not unset LDFLAGS properly. Rebuilding xpmem in such a case may fail with an error such as "unrecognized option '-Wl,-z,relro'" in the xpmem build log. |
Workaround: Either disable building xpmem by adding --without-xpmem to the command line, or edit the kernel Makefile to make it unset LDFLAGS:
Note: The Makefile may be located elsewhere, such as the top-level directory of the kernel source directory. |
|
Keywords: Installation, SLES |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3048411 |
Description: After installing OFED with rebuilt kernel modules, error messages indicating that the kernel module mlx5_ib failed to load (e.g., "mlx5_ib: Unknown symbol . . .") appear. These messages can be safely ignored because the module eventually loads. |
Workaround: Run the command 'dracut -f' to update the initramfs. |
|
Keywords: Installation |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3048444 |
Description: OFED installation using yum with the --add-kernel-support option (building packages without KMP enabled) fails if the libfabric package is installed. |
Workaround: Remove libfabric package before OFED installation or use installation script. |
|
Keywords: Installation, RHEL 8.5 |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3015210 |
Description: OVS topology where the tunnel device is over a VF and the VF representor is connected to a bond is not supported. |
Workaround: N/A |
|
Keywords: ASAP2, Tunnel Over VF, LAG, Connection Tracking |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3028300 |
Description: OVS metering is not supported on kernel 5.17. |
Workaround: N/A |
|
Keywords: ASAP2, OVS, Meter, Kernel 5.17 |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3044255 |
Description: Destroying mlxdevm group while SF is attached to it is not supported. |
Workaround: N/A |
|
Keywords: ASAP2, mlxdevm, QoS, Group, Scalable Functions, ConnectX-6 Dx |
|
Discovered in Release: 5.6-1.0.3.3 |
|
2900346 |
Description: On Ubuntu OS, configuring different IP addresses with different subnets on both ports 0 and 1 is currently not supported. When trying to ping from port 0 on one BlueField-2 card to port 0 on the other BlueField-2 card, both port 0 and port 1 on the receiving side send a reply to the ARP request (a.k.a. ARP flux). |
Workaround: N/A |
|
Keywords: BlueField-2, Ubuntu, ARP Flux |
|
Discovered in Release: 5.6-1.0.3.3 |
|
3046456 |
Description: Switching between SwitchDev mode and legacy mode quickly on BlueField-2 can prevent the driver from loading successfully and breaks its health recovery. |
Workaround: Pause 60 seconds between state-altering commands to guarantee the driver health recovery is completed successfully. |
|
Keywords: ASAP2, Health Recovery |
|
Discovered in Release: 5.6-1.0.3.3 |
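The 60-second pause recommended for issue 3046456 can be built into a wrapper around the eswitch mode change. The sketch below makes several assumptions: the function name is hypothetical, the mode is changed with `devlink dev eswitch set`, the PCI address in the example is a placeholder, and the DRY_RUN and PAUSE_SECONDS variables exist only so the sequence can be previewed quickly.

```shell
# Hypothetical helper: apply eswitch modes in order on devlink device $1,
# pausing after each change so that driver health recovery can complete.
# PAUSE_SECONDS defaults to 60; DRY_RUN=1 prints the commands instead of running them.
set_eswitch_modes() {
    local dev="$1" mode
    shift
    for mode in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "devlink dev eswitch set $dev mode $mode"
        else
            devlink dev eswitch set "$dev" mode "$mode" || return 1
        fi
        sleep "${PAUSE_SECONDS:-60}"
    done
}

# Example (as root; the PCI address is a placeholder):
# set_eswitch_modes pci/0000:03:00.0 switchdev legacy
```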
|
2934149 |
Description: Adding vDPA ports over ConnectX-5 devices in ovs-dpdk is not supported and will cause a crash. |
Workaround: N/A |
|
Keywords: OVS-DPDK, ConnectX-5 |
|
Discovered in Release: 5.6-1.0.3.3 |
|
2934833 |
Description: Running I/O traffic and toggling both physical ports status (UP/DOWN) in a stressful manner on the receiving-end machine may cause traffic loss. |
Workaround: N/A |
|
Keywords: RDMA, Port Toggle |
|
Discovered in Release: 5.6-1.0.3.3 |
|
2901514 |
Description: Relaxed Ordering is not working properly on Virtual Functions. |
Workaround: N/A |
|
Keywords: Relaxed Ordering, VF |
|
Discovered in Release: 5.6-1.0.3.3 |
2870299 |
Description: Managing SFs is possible using the iproute2 with mlxdevm tool only. |
Workaround: N/A |
|
Keywords: Scalable Functions |
|
Discovered in Release: 5.5-1.0.3.2 |
|
2869722 |
Description: OFED packages were built with DKMS disabled since building OFED with DKMS failed due to a problem in the DKMS package on UOS. --dkms flag should not be used. |
Workaround: N/A |
|
Keywords: Installation, DKMS |
|
Discovered in Release: 5.5-1.0.3.2 |
|
2870367 |
Description: On UOS, IPoIB PKEY may require manual bring up after driver restart. |
Workaround: N/A |
|
Keywords: Installation, IPoIB, PKEY |
|
Discovered in Release: 5.5-1.0.3.2 |
|
2836032 |
Description: When using the Software Steering mlx5dv_dr API to create rules containing encapsulation actions in MLNX_OFED v5.5-1.x.x.x, upgrade the firmware to the latest version. Otherwise, the maximum number of encapsulation actions that can be created is limited to 16K, and a degradation in the rule insertion rate is expected compared to MLNX_OFED v5.4-x.x.x.x. |
Workaround: N/A |
|
Keywords: Software Steering |
|
Discovered in Release: 5.5-1.0.3.2 |
|
2851639 |
Description: Enabling ARFS in legacy mode and then moving to switchdev mode is not supported and may cause unwanted behavior. |
Workaround: N/A |
|
Keywords: NetDev, ARFS |
|
Discovered in Release: 5.5-1.0.3.2 |
|
2851639 |
Description: nvme and iser are not enabled on UOS ARM, because of missing UOS kernel support. |
Workaround: N/A |
|
Keywords: nvme, iser, UOS ARM |
|
Discovered in Release: 5.5-1.0.3.2 |
|
2860855 |
Description: Building OFED on RHEL 8.4 with kmp disabled and then installing with yum fails due to some conflicting packages. |
Workaround: Remove the libfabric and librpmem packages before OFED installation, or add the --allowerasing option to the installation command. |
|
Keywords: Installation, RHEL 8.4, kmp, yum |
|
Discovered in Release: 5.5-1.0.3.2 |
|
2865983 |
Description: OFED packages were built with kmp disabled. Building with kmp enabled fails due to missing packages. |
Workaround: N/A |
|
Keywords: Installation, kmp |
|
Discovered in Release: 5.5-1.0.3.2 |
Internal Ref. Number |
Issue |
2658644 |
Description: Matching on ct_label is supported only for the lower 32 bits. |
Workaround: N/A |
|
Keywords: ASAP2, Connection Tracking |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2706345 |
Description: The number of RQs and TIRs allocated by the driver depends on the total number of MSI-X vectors allocated. The device supports a total of about 16K TIRs. Each representor needs as many TIRs as there are CPUs, up to a maximum of 128. |
Workaround: To use a large number of VFs, set PF_NUM_PF_MSIX to a smaller value, around 32. |
|
Keywords: ASAP2, VF, PF_NUM_PF_MSIX |
|
Discovered in Release: 5.4-1.0.3.0 |
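As a rough illustration of the TIR budget described in issue 2706345, the sketch below estimates how many representors fit within the device limit. It is not taken from the driver: it assumes that "16K" means exactly 16384 TIRs, that representors are the only TIR consumers, and that each representor uses one TIR per CPU, capped at 128. The function names are hypothetical.

```shell
#!/bin/sh
# Rough TIR budget estimate for issue 2706345 (illustrative only).
# Assumptions: the "16K" limit is exactly 16384 TIRs, and representors
# are the only TIR consumers (the real driver also uses TIRs elsewhere).
TIR_LIMIT=16384
MAX_TIRS_PER_REP=128

# TIRs one representor needs: one per CPU, capped at 128.
tirs_per_representor() {
    if [ "$1" -gt "$MAX_TIRS_PER_REP" ]; then
        echo "$MAX_TIRS_PER_REP"
    else
        echo "$1"
    fi
}

# Upper bound on the number of representors that fit in the TIR budget.
max_representors() {
    echo $(( TIR_LIMIT / $(tirs_per_representor "$1") ))
}

for n in 32 64 128 256; do
    echo "$n TIRs requested per representor -> at most $(max_representors "$n") representors"
done
```

Under these assumptions, lowering the per-representor TIR count from 128 to 32 raises the upper bound from 128 to 512 representors, which is the direction in which the workaround is intended to help.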
|
2836997 |
Description: An automatic test that checks whether a flow meter's rate fluctuation stays within a fixed threshold (e.g., 10%) may fail, because meter precision depends on multiple factors (rate and burst values, and the shape of the traffic). To pick the best configuration parameters for a flow meter, perform several test measurements using different burst sizes against the expected traffic workload, and average the results over an extended period of time (tens of minutes). |
Workaround: N/A |
|
Keywords: ASAP2, Meter Threshold |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2863456 |
Description: SA limits by packet count (hard and soft) are supported only on traffic originating from the ECPF. Configuring them on VF traffic will remove the SA when the hard limit is hit; however, traffic could still pass as plain text due to the tunnel offload that is used in such a configuration. |
Workaround: N/A |
|
Keywords: ASAP2, IPsec Full Offload |
|
Discovered in Release: 5.4-0.5.1.1 |
|
2657392 |
Description: OFED installation caused CIFS to break in RHEL 8.4 and above. A dummy module was added so that CIFS will be disabled after OFED installation in RHEL 8.4 and above. |
Workaround: N/A |
|
Keywords: Installation, RHEL, CIFS |
|
Discovered in Release: 5.4-0.5.1.1 |
|
2800993 |
Description: OpenMPI does not support running across different operating systems and/or CPU architectures. |
Workaround: N/A |
|
Keywords: OpenMPI |
|
2399503 |
Description: Open vSwitch is not supported on the latest operating systems containing only Python3 support. |
Workaround: N/A |
|
Keywords: Python, Open vSwitch |
|
2657392 |
Description: OFED installation caused CIFS to break in RHEL8.4. A dummy module was added so that CIFS will be disabled after OFED installation in RHEL8.4. |
Workaround: N/A |
|
Keywords: Installation, RHEL8.4, CIFS |
|
Discovered in Release: 5.4-0.5.1.1 |
|
2782406 |
Description: Running yum update will upgrade kylin-release to a higher version. The version of this package is used for kylin10sp2 detection so the script will detect kylin 10 instead of kylin10sp2 and use its repository by mistake. |
Workaround: Because there are no special cases for kylin10sp2, the detected repository can be used by adding --add-kernel-support to the installation command. |
|
Keywords: Upgrade, kylin |
|
Discovered in Release: 5.4-3.0.3.0 |
|
2755632 |
Description: On dual port cards with SR-IOV, when one port link is configured to InfiniBand and the other port link is configured to Ethernet, the Ethernet port will not be able to support VST and QinQ. |
Workaround: N/A |
|
Keywords: SR-IOV, VST, QinQ |
|
Discovered in Release: 5.4-3.0.3.0 |
|
2780436 |
Description: Non-default MTU (>1500) is not supported with IPsec crypto offload and may cause packet drops. |
Workaround: N/A |
|
Keywords: IPsec, Crypto Offload, MTU |
|
Discovered in Release: 5.4-3.0.3.0 |
|
2726021 |
Description: Building packages on openEuler with kmp enabled requires the kernel-rpm-macros package to be installed. kernel-rpm-macros-30-13.oe1 does not support the -p option; install kernel-rpm-macros-30-18.oe1 instead. On kylin OS, the available version of the kernel-rpm-macros package does not support the -p option needed for kmp, so kmp stays disabled. |
Workaround: N/A |
|
Keywords: Installation, openEuler |
|
Discovered in Release: 5.4-3.0.3.0 |
Internal Ref. Number |
Issue |
2750653 |
Description: Running fragmented traffic in RHEL 8.3 (4.18.0-240.el8.x86_64) may cause a call trace in build_skb. |
Workaround: Update to RHEL 8.3 z-stream 4.18.0-240.22.1.el8_3.x86_64. |
|
Keywords: RHEL 8.3, Kernel Panic, Call Trace, fr |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2629375 |
Description: Matching on CT label is only supported when matching on lower 32 bits. Full match on all 128 bits of CT label is not supported. |
Workaround: N/A |
|
Keywords: ASAP2, Connection Tracking, Label |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2707997 |
Description: Installation in the package manager mode under SLES 15.x may require user intervention if the original libibverbs is installed. |
Workaround: zypper install --force-resolution mlnx-ofed-all |
|
Keywords: Installation, libibverbs |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2708531 |
Description: Installation in the package manager mode under SLES 15.x may require user intervention if the original libopenvswitch is installed. |
Workaround: zypper install --force-resolution mlnx-ofed-all |
|
Keywords: Installation |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2703043 |
Description: Congested TCP lock for kTLS TX device offload traffic compromises the performance. |
Workaround: Disable TCP selective acknowledgement: echo 0 > /proc/sys/net/ipv4/tcp_sack |
|
Keywords: kTLS TX |
|
Discovered in Release: 5.4-1.0.3.0 |
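The workaround for issue 2703043 changes a setting that applies to every TCP connection in the network namespace, so it may be preferable to disable SACK only for the duration of the kTLS workload. The following is a minimal sketch of that idea. The function names and the `SACK_FILE` override are hypothetical, and `run_ktls_benchmark.sh` stands in for whatever workload is being run.

```shell
#!/bin/sh
# Sketch for issue 2703043: disable TCP SACK temporarily and restore it later.
# SACK_FILE is overridable so the functions can be exercised on a scratch file.
SACK_FILE="${SACK_FILE:-/proc/sys/net/ipv4/tcp_sack}"

# Print the current setting (1 = enabled, 0 = disabled).
sack_get() {
    cat "$SACK_FILE"
}

# Write a new setting; requires root when SACK_FILE is the real /proc entry.
sack_set() {
    echo "$1" > "$SACK_FILE"
}

# Run a command with SACK disabled, then restore the previous value.
with_sack_disabled() {
    saved=$(sack_get)
    sack_set 0
    "$@"
    rc=$?
    sack_set "$saved"
    return "$rc"
}

# Example (as root, hypothetical workload script):
#   with_sack_disabled ./run_ktls_benchmark.sh
```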
|
2676405 |
Description: If the interface-rename package is active (on XenServer, for example), OFED does not rename interfaces, in order to avoid conflicts. |
Workaround: N/A |
|
Keywords: Interface Renaming |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2687943 |
Description: Offload of rules which redirect from VF on one PF to VF on second PF is not supported on socket-direct devices. |
Workaround: N/A |
|
Keywords: ASAP2, Socket-Direct |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2678672 |
Description: When disabling switchdev mode, the qdisc in tunnel device cannot be destroyed and mlx5e_stats_flower() is still called by OVS resulting in NULL pointer panic and memory leak. |
Workaround: N/A |
|
Keywords: SwitchDev, mlx5, Tunnel Traffic |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2566548 |
Description: On PPC systems with EEH enabled, when running a firmware sync reset (either by mlxfwreset with the flag --sync 1 or by devlink dev reload action fw_activate), the EEH may catch the PCI reset and take ownership of the flow. When run a few times in sequence, the EEH may also decide to disable the device. |
Workaround: Administrator may disable EEH before running firmware sync reset on the device. |
|
Keywords: PPC, EEH |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2617950 |
Description: TX port timestamp feature is supported for kernel versions 3.15 and greater. On older kernel versions, the feature will not be supported and ptp_tx |
Workaround: N/A |
|
Keywords: Ethtool |
|
Discovered in Release: 5.4-1.0.3.0 |
|
2390731 |
Description: Ethtool does not display Port Speed advertised/capability above 100Gb/s over kernels 5.0 and below, even when supported. |
Workaround: N/A |
|
Keywords: Ethtool, Port Speed |
|
Discovered in Release: 5.4-1.0.3.0 |
Internal Ref. Number |
Issue |
2687198 |
Description: Activating VF/SF LAG when at least one VF/SF is still bound may lead to an internal error in the firmware. |
Workaround: Make sure all VFs/SFs are unbound prior to VF/SF LAG activation/deactivation. |
|
Keywords: VF, SF, Firmware, Binding |
|
Discovered in Release: 5.4-1.0.3.0 |
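The unbind step in the workaround for issue 2687198 can be scripted through the standard PCI sysfs interface. The sketch below covers VFs only (SFs are not handled) and assumes the VFs are bound to mlx5_core on the host. The function name, the `SYSFS` override, and the interface name in the example are hypothetical.

```shell
#!/bin/sh
# Sketch for issue 2687198: unbind every VF of a PF from mlx5_core before
# changing the LAG state. SFs are not handled here.
# SYSFS is overridable so the logic can be tried against a scratch tree.
SYSFS="${SYSFS:-/sys}"

# unbind_vfs PF_NETDEV
unbind_vfs() {
    drv="$SYSFS/bus/pci/drivers/mlx5_core"
    for link in "$SYSFS/class/net/$1/device"/virtfn*; do
        [ -L "$link" ] || continue           # PF has no VFs
        addr=$(basename "$(readlink "$link")")
        [ -e "$drv/$addr" ] || continue      # VF is not bound to mlx5_core
        echo "$addr" > "$drv/unbind"
        echo "unbound $addr"
    done
}

# Example (as root, hypothetical interface name):
#   unbind_vfs enp3s0f0
```

The VFs can be bound again afterwards by writing the same PCI addresses to the driver's `bind` file.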
Internal Ref. Number |
Issue |
2585575 |
Description: After disabling sync reset by setting enable_remote_dev_reset to false, running a firmware sync reset a few times may lead to a general protection fault, and the system may hang. |
Workaround: N/A |
|
Keywords: Firmware Upgrade |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2582565 |
Description: Conducting a firmware reset or unbinding the PF while in switchdev mode may cause a kernel crash. |
Workaround: N/A |
|
Keywords: SwitchDev, ASAP2, Unbind, Firmware Reset |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2587802 |
Description: PTP synchronization may be lost while using tx_port_ts private flag. |
Workaround: Toggle private flag: ethtool --set-priv-flags |
|
Keywords: PTP Synchronization |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2574943 |
Description: When running kernel 5.8 and below or RHEL 8.2 and below, sampled packets do not support tunnel information. |
Workaround: N/A |
|
Keywords: ASAP2, sFLOW |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2568417 |
Description: Upon upgrade to version 5.3, the package manager tool installs the new packages and then removes the old packages. During this process, a depmod WARNING on "mlx5_fpga_tools" will appear. This warning can be safely ignored: mlx5_fpga_tools is a module that existed in version 5.2 and was removed in 5.3. |
Workaround: N/A |
|
Keywords: Upgrade; mlx5_fpga_tools |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2506425 |
Description: When installing kmod packages on EulerOS 2.0SP9 or OpenEuler 20.03, the following error appears: "modprobe: FATAL: could not get modversions of |
Workaround: N/A |
|
Keywords: Installation; modules; kmod |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2492509 |
Description: When installing the driver on OpenEuler or on EulerOS 2.0SP9, rebuilding the drivers (--add-kernel-support) with the --kmp option (to create kmod packages) generates packages that are uninstallable because they have a dependency on "/sbin/depmod" that the system does not provide. This dependency is created by a buggy kmod package building tool included with the distribution. |
Workaround: N/A |
|
Keywords: add-kernel-support |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2479327 |
Description: On SLES 12 SP5, if the kernel was upgraded to 4.12.14-122.46, it is not possible to rebuild kernel modules (--add-kernel-support) without upgrading gcc as well to at least 4.8.5-31.23.2. |
Workaround: N/A |
|
Keywords: Upgrade; SLES 12; add-kernel-support |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2584441 |
Description: On SLES 12 SP5, if the kernel was upgraded to 4.12.14-122.46, it is not possible to rebuild kernel modules (--add-kernel-support) without upgrading gcc as well to at least 4.8.5-31.23.2. |
Workaround: N/A |
|
Keywords: Upgrade; SLES 12; add-kernel-support |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2460865 |
Description: When setting MTU to low values, such as 68 bytes, packets may fail on oversize. |
Workaround: N/A |
|
Keywords: MTU |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2383318 |
Description: On kernels based on RedHat 7.2, the "tx_port_ts" feature, as set by ethtool --set-priv-flags, is disabled. |
Workaround: N/A |
|
Keywords: RedHat; tx_port_ts |
|
Discovered in Release: 5.3-1.0.0.1 |
|
2575647 |
Description: An OvS-DPDK crash might occur while doing live-migration for VMs that use virtio-interfaces that are accelerated using OvS-DPDK vDPA ports. |
Workaround: N/A |
|
Keywords: OvS-DPDK vDPA, Live-migration |
|
Discovered in Release: 5.3-1.0.0.1 |
Internal Ref. Number |
Issue |
2430071 |
Description: After reloading devlink in an IPoIB setup, the IB link may remain in the initialization state, and OpenSM must be run to bring the IB link to the active state. |
Workaround: N/A |
|
Keywords: IPoIB devlink reload |
|
Discovered in Release: 5.2-2.2.0.0 |
|
2302786 |
Description: On EulerOS 2.0 SP9 systems, the kernel ABI (kABI) between the base vhulk2006 kernel and the errata vhulk2008 kernel has been changed. It is now not possible to install MLNX_OFED compiled with KMP on vhulk2006 kernel on a vhulk2008 system. |
Workaround: Install MLNX_OFED with --add-kernel-support. |
|
Keywords: EulerOS; kABI; installation; --add-kernel-support |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2398281 |
Description: A crash in the TLS Rx socket cleanup flow may occur due to a kernel issue where a wrong extra call to tls_dev_del is made. |
Workaround: N/A |
|
Keywords: TLS RX device offload |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2407415 |
Description: OpenEuler 20.03 Aarch64 with errata kernels 4.19.90-2011.6.0.0049.oe1.aarch64 and 4.19.90-2012.5.0.0054.oe1.aarch64 are incompatible with MLNX_OFED kmod-mlnx-ofa_kernel. |
Workaround: Install MLNX_OFED with --add-kernel-support. |
|
Keywords: OpenEuler; Aarch64; installation; --add-kernel-support |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2348077 |
Description: RDMA device name for VFs may change after resetting all VFs at once. |
Workaround: Either reset interfaces one by one with a delay in between, or use a network interface naming scheme with predictable interface names, such as NAME_PCI or NAME_GUID. Copy /lib/udev/rules.d/60-rdma-persistent-naming.rules to /etc/udev/rules.d/ and edit the last line accordingly. Note that this will change interface names. |
|
Keywords: RDMA; VF |
|
Discovered in Release: 5.2-1.0.4.0 |
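The copy-and-edit step in the workaround for issue 2348077 could be scripted roughly as follows. This is a sketch under an assumption that should be checked against the installed file: that the active (non-comment) rule line calls rdma_rename with a single NAME_* policy argument. The function name and the `SRC_RULES`/`DST_DIR` overrides are hypothetical.

```shell
#!/bin/sh
# Sketch for issue 2348077: install a local copy of the RDMA naming rules
# that uses a predictable scheme. Assumes the active (non-comment) rule line
# invokes rdma_rename with a NAME_* policy argument.
SRC_RULES="${SRC_RULES:-/lib/udev/rules.d/60-rdma-persistent-naming.rules}"
DST_DIR="${DST_DIR:-/etc/udev/rules.d}"

# set_rdma_naming SCHEME, where SCHEME is for example NAME_PCI or NAME_GUID.
set_rdma_naming() {
    # Rewrite the policy only on non-comment lines that call rdma_rename.
    sed "/^[^#].*rdma_rename/ s/NAME_[A-Z_]*/$1/" "$SRC_RULES" \
        > "$DST_DIR/$(basename "$SRC_RULES")"
}

# Example (as root):
#   set_rdma_naming NAME_PCI
```

Comment lines in the rules file are left untouched, so the documentation of the available naming schemes inside the file is preserved in the copy.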
|
2381713 |
Description: esp4_offload and esp6_offload modules are expected to be loaded according to the list determined by the default kernel. However, these modules cannot be loaded when working over Debian 10 with non-default custom kernel as they are not included in it. |
Workaround: Either install MLNX_OFED using --add-kernel-support, or rebuild the non-default custom kernel to include these modules. |
|
Keywords: esp4_offload; esp6_offload; kernel, Debian |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2382898 |
Description: On kernel 4.14, there is no traffic for UDP or TCP with payload size larger than 1398 on GENEVE IPv6 over VLAN tag interface. |
Workaround: N/A |
|
Keywords: GENEVE; stag; VLAN; UDP |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2326155 |
Description: When toggling the link state while running RoCE traffic, the below warning may appear in the dmesg: __ib_cache_gid_add: unable to add gid <gid> error=-28 |
Workaround: N/A |
|
Keywords: RoCE; __ib_cache_gid_add |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2329654 |
Description: Running XDP over an IP tunnel may fail when working with kernels as old as version 4.14. |
Workaround: N/A |
|
Keywords: XDP, Kernel |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2249156 |
Description: Installing MLNX_OFED removes the qperf package if qperf was installed beforehand. |
Workaround: Install the qperf package after installing MLNX_OFED, or re-install qperf once the MLNX_OFED installation is complete. |
|
Keywords: Installation; qperf |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2355956 |
Description: OFED installation requires kernel config CONFIG_DEBUG_INFO to be set. |
Workaround: N/A |
|
Keywords: Installation; CONFIG_DEBUG_INFO |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2362781 |
Description: Openibd may fail to unload the Inbox driver mlx5_ib on Ubuntu 18.04 PPC Boston server due to a bug in the Inbox drivers. |
Workaround: N/A |
|
Keywords: Openibd; Inbox; Ubuntu; mlx5_ib |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2367659 |
Description: Upgrading the MLNX_OFED version that is configured as a YUM repository may yield warning messages from depmod about unknown symbols, such as:
depmod: WARNING: /lib/modules/4.18.0-240.el8.x86_64/extra/iser/ib_iser.ko needs unknown symbol ib_fmr_pool_unmap
depmod: WARNING: /lib/modules/4.18.0-240.el8.x86_64/extra/srp/ib_srp.ko needs unknown symbol ib_create_qp_user
These warnings appear because the RPM packages are upgraded sequentially and some of the modules depend on each other, which creates a temporary state of upgrade inconsistency. The warnings can be ignored: once all modules are upgraded, they no longer appear. |
Workaround: N/A |
|
Keywords: YUM; RPM; symbol; depmod; ISER; SRP |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2385269 |
Description: The number of connections offloaded is limited to 100K when working with Kernel v5.9. |
Workaround: N/A |
|
Keywords: ASAP2; Connection Tracking; Kernel |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2393169 |
Description: Mirroring is not supported with Connection Tracking when the source port is a VxLAN device. |
Workaround: N/A |
|
Keywords: ASAP2; Connection Tracking; Mirroring |
|
Discovered in Release: 5.2-1.0.4.0 |
|
2395082 |
Description: A call trace may take place when moving from SwitchDev mode back to Legacy mode in Kernel v5.9 due to a kernel issue in tcf_block_unbind. |
Workaround: N/A |
|
Keywords: ASAP2; SwitchDev; call trace; kernel; tcf_block_unbind |
|
Discovered in Release: 5.2-1.0.4.0 |
Internal Ref. Number |
Issue |
2354899 |
Description: ODP is not supported on RHEL7.x systems when running over an ETH link layer with RoCE disabled. |
Workaround: N/A |
|
Keywords: ODP, RHEL, RoCE |
|
Discovered in Release: 5.1-2.5.8.0 |
|
2338150 |
Description: Scatter to CQE feature should be disabled for the GPUDirect tests to work. |
Workaround: Set the MLX5_SCATTER_TO_CQE environment variable to 0 before the ib_send_bw command. For example: MLX5_SCATTER_TO_CQE=0 ib_send_bw -d <...> |
|
Keywords: CQE, GPUDirect |
|
Discovered in Release: 5.1-2.5.8.0 |
|
2295732 |
Description: Upgrading from legacy (mlnx-libs) to the current rdma-core based build using YUM (package manager) fails. |
Workaround: To perform this upgrade, either use the installer script or uninstall the old packages and install the new packages. |
|
Keywords: Legacy, mlnx-libs, rdma-core, installation |
|
Discovered in Release: 5.1-2.5.8.0 |
|
2295735 |
Description: Upgrading from legacy (mlnx-libs) to the current rdma-core based build using the apt-get (package manager) fails. |
Workaround: To perform this upgrade, either use the installer script or uninstall the old packages and install the new packages. |
|
Keywords: Legacy, mlnx-libs, rdma-core, apt, apt-get, installation |
|
Discovered in Release: 5.1-2.5.8.0 |
|
2248996 |
Description: Downgrading the firmware version for ConnectX-6 cards using "mlnx_ofed_install --fw-update-only --force-fw-update" fails. |
Workaround: Manually downgrade the firmware version - please see Firmware Update Instructions. |
|
Keywords: Firmware, ConnectX-6 |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2175930 |
Description: When using OFED 5.1 on PPC architectures with kernels v5.5 or v5.6 and an old ethtool utility, a harmless warning call trace may appear in the dmesg due to mismatch between user space and kernel. The warning call trace mentions ethtool_notify. |
Workaround: Update the ethtool utility to version 5.6 on such systems in order to avoid the call trace. |
|
Keywords: PPC, ethtool_notify, kernel |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2198764 |
Description: If MLNX_OFED is installed on a Debian or Ubuntu system that is run in chroot environment, the openibd service will not be enabled. If the chroot files are being used as a base of a full system, the openibd service is left disabled. |
Workaround: Currently, openibd is a sysv-init script that you can enable manually by running: update-rc.d openibd defaults |
|
Keywords: chroot, Debian, Ubuntu, openibd |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2237134 |
Description: Running connection tracking (CT) with FW steering may cause CREATE_FLOW_TABLE command to fail with syndrome. |
Workaround: Configure OVS to use a single handler-thread: ovs-vsctl set Open_vSwitch . other_config:n-handler-threads=1 |
|
Keywords: Connection tracking, ASAP, OVS, FW steering |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2239894 |
Description: Running OpenVSwitch offload with high traffic throughput can cause low insertion rate due to high CPU usage. |
Workaround: Reduce the number of combined channels of the uplink using "ethtool -L". |
|
Keywords: Insertion rate, ASAP2 |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2240671 |
Description: Header rewrite action is not supported over RHEL/CentOS 7.4. |
Workaround: N/A |
|
Keywords: ASAP, header rewrite, RHEL, RedHat, CentOS, OS |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2242546 |
Description: Tunnel offload (encap/decap) may cause kernel panic if nf_tables module is not probed. |
Workaround: Make sure to probe the nf_tables module before inserting any rule. |
|
Keywords: Kernel v5.7, ASAP, kernel panic |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2143007 |
Description: IPsec packets are dropped during heavy traffic due to a bug in the Linux kernel net/xfrm code. |
Workaround: Make sure the Kernel is modified to apply the following patch: "xfrm: Fix double ESP trailer insertion in IPsec crypto offload". |
|
Keywords: IPsec, xfrm |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2225952 |
Description: VF mirroring with TC policy skip_sw is not supported on RHEL/CentOS 7.4, 7.5 and 7.6 OSs. |
Workaround: N/A |
|
Keywords: ASAP2, Mirroring, RHEL, RedHat, OS |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2216521 |
Description: After upgrading MLNX_OFED from v5.0 or earlier, the installation prefix of the ibdev2netdev utility changes to /usr/sbin. Therefore, the utility cannot be found while working in the same shell environment. |
Workaround: After installing MLNX_OFED, log out and log in again to refresh the SHELL environment. |
|
Keywords: ibdev2netdev |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2202520 |
Description: Rules with VLAN push/pop, encap/decap and header rewrite actions together are not supported. |
Workaround: N/A |
|
Keywords: ASAP2, SwitchDev, VLAN push/pop, encap/decap, header rewrite |
|
Discovered in Release: 5.1-0.6.6.0 |
|
2210752 |
Description: Switching from Legacy mode to SwitchDev mode and vice-versa while TC rules exist on the NIC will result in failure. |
Workaround: Before attempting to switch mode, make sure to delete all TC rules on the NIC or stop OpenvSwitch. |
|
Keywords: ASAP2, Devlink, Legacy SR-IOV |
|
Discovered in Release: 5.1-0.6.6.0 |
Internal Ref. Number |
Issue |
2125036/2125031 |
Description: Upgrading the MLNX_OFED from an UPSTREAM_LIBS based version to an MLNX_LIBS based version fails unless the driver is uninstalled and then re-installed. |
Workaround: Make sure to uninstall and re-install MLNX_OFED to complete the upgrade. |
|
Keywords: Installation, UPSTREAM_LIBS, MLNX_LIBS |
|
Discovered in Release: 5.0-2.1.8.0 |
|
2105447 |
Description: hns_roce warning messages will appear in the dmesg after reboot on Euler2 SP3 OSs. |
Workaround: N/A |
|
Keywords: hns_roce, dmesg, Euler |
|
Discovered in Release: 5.0-2.1.8.0 |
|
2110321 |
Description: Multiple driver restarts may cause IPoIB soft lockup. |
Workaround: N/A |
|
Keywords: Driver restart, IPoIB |
|
Discovered in Release: 5.0-2.1.8.0 |
|
2112251 |
Description: On kernels 4.10-4.14, when Geneve tunnel's remote endpoint is defined using IPv6, packets larger than MTU are not fragmented, resulting in no traffic sent. |
Workaround: Define geneve tunnel's remote endpoint using IPv4. |
|
Keywords: Kernel, Geneve, IPv4, IPv6, MTU, fragmentation |
|
Discovered in Release: 5.0-2.1.8.0 |
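Several issues in this list, including issue 2112251, apply only to a range of kernel versions (here 4.10 to 4.14). A host can be checked against such a range with a small helper like the sketch below. The function names are hypothetical, and only the major and minor numbers of the release string are compared.

```shell
#!/bin/sh
# Sketch: test whether a kernel release string falls in a major.minor range.

# ver_num RELEASE: "4.12.14-122.46-default" -> 4012 (major * 1000 + minor).
ver_num() {
    v=${1%%-*}            # drop everything after the first "-"
    major=${v%%.*}
    rest=${v#*.}
    minor=${rest%%.*}
    echo $(( major * 1000 + minor ))
}

# kernel_in_range RELEASE MIN MAX: succeeds if MIN <= RELEASE <= MAX.
kernel_in_range() {
    k=$(ver_num "$1")
    lo=$(ver_num "$2")
    hi=$(ver_num "$3")
    [ "$k" -ge "$lo" ] && [ "$k" -le "$hi" ]
}

if kernel_in_range "$(uname -r)" 4.10 4.14; then
    echo "kernel $(uname -r) is in the affected range 4.10-4.14"
fi
```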
|
2102902 |
Description: A kernel panic may occur over RH8.0-4.18.0-80.el8.x86_64 OS when opening kTLS offload connection due to a bug in kernel TLS stack. |
Workaround: N/A |
|
Keywords: TLS offload, mlx5e |
|
Discovered in Release: 5.0-2.1.8.0 |
|
2111534 |
Description: A Kernel panic may occur over Ubuntu19.04-5.0.0-38-generic OS when opening kTLS offload connection due to a bug in the Kernel TLS stack. |
Workaround: N/A |
|
Keywords: TLS offload, mlx5e |
|
Discovered in Release: 5.0-2.1.8.0 |
|
2035950 |
Description: An internal error might take place in the firmware when performing any of the following in VF LAG mode, when at least one VF of either PF is still bound/attached to a VM.
|
Workaround: N/A |
|
Keywords: VF LAG, binding, firmware, FW, PF, SR-IOV |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2044544 |
Description: When working with OSs with Kernel v4.10, bonding module does not allow setting MTUs larger than 1500 on a bonding interface. |
Workaround: Upgrade your Kernel version to v4.11 or above. |
|
Keywords: Bonding, MTU, Kernel |
|
Discovered in Release: 5.0-1.0.0.0 |
|
1882932 |
Description: Libibverbs dependencies are removed during OFED installation, requiring manual installation of libraries that OFED does not reinstall. |
Workaround: Manually install missing packages. |
|
Keywords: libibverbs, installation |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2058535 |
Description: ibdev2netdev command returns duplicate devices with different ports in SwitchDev mode. |
Workaround: Use /opt/mellanox/iproute2/sbin/rdma link show command instead. |
|
Keywords: ibdev2netdev |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2072568 |
Description: In RHEL/CentOS 7.2 OSs, adding drop rules when act_gact is not loaded may cause a kernel crash. |
Workaround: Preload all needed modules to avoid such a scenario (cls_flower, act_mirred, act_gact, act_tunnel_key and act_vlan). |
|
Keywords: RHEL/CentOS 7.2, Kernel 4.9, call trace, ASAP |
|
Discovered in Release: 5.0-1.0.0.0 |
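The preload step in the workaround for issue 2072568 can be wrapped in a small helper that stops at the first module that fails to load. This is a minimal sketch: the function name and the `MODPROBE` override are hypothetical, and the module list is the one given in the workaround.

```shell
#!/bin/sh
# Sketch for issue 2072568: load the TC modules before adding any rule.
# MODPROBE is overridable (e.g. MODPROBE=echo) to preview the commands.
MODPROBE="${MODPROBE:-modprobe}"
TC_MODULES="cls_flower act_mirred act_gact act_tunnel_key act_vlan"

preload_tc_modules() {
    for mod in $TC_MODULES; do
        if ! $MODPROBE "$mod"; then
            echo "failed to load $mod" >&2
            return 1
        fi
    done
}

# Example (as root):
#   preload_tc_modules && tc filter add ...
```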
|
2093698 |
Description: VF LAG configuration is not supported when the NUM_OF_VFS configured in mlxconfig is higher than 64. |
Workaround: N/A |
|
Keywords: VF LAG, SwitchDev mode, ASAP |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2093746 |
Description: Devlink health dumps are not supported on kernels lower than v5.3. |
Workaround: N/A |
|
Keywords: Devlink, health report, dump |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2000590 |
Description: Sending packets larger than MTU is not supported when working with OVS-DPDK. |
Workaround: N/A |
|
Keywords: MTU, OVS-DPDK |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2062900 |
Description: Moving VF from SwitchDev mode to Legacy mode while the representor is being used by OVS-DPDK results in a segmentation fault. |
Workaround: To move VF to Legacy mode with no error, make sure to delete the ports from the OVS. |
|
Keywords: SwitchDev, Legacy, representor, OVS-DPDK |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2075942 |
Description: Huge pages configuration is lost each time the server is rebooted. |
Workaround: Re-configure the huge pages after each reboot, or configure them as a kernel parameter. |
|
Keywords: Huge pages, reboot, OVS-DPDK |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2067012 |
Description: MLNX_OFED cannot be installed on Debian 9.11 OS in SwitchDev mode. |
Workaround: Install OFED with the flag --add-kernel-support. |
|
Keywords: ASAP, SwitchDev, Debian, Kernel |
|
Discovered in Release: 5.0-1.0.0.0 |
|
2036572 |
Description: When using a thread domain and the lockless rdma-core ibv_post_send path, there is an additional CPU penalty due to required barriers around the device MMIO buffer that were omitted in MLNX_OFED. |
Workaround: N/A |
|
Keywords: rdma-core, write-combining, MMIO buffer |
|
Discovered in Release: 5.0-1.0.0.0 |
Internal Ref. Number |
Issue |
- |
Description: The argparse module is installed by default in Python versions >=2.7 and >=3.2. If an older Python version is used, the argparse module is not installed by default. |
Workaround: Install the argparse module manually. |
|
Keywords: Python, MFT, argparse, installation |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1997230 |
Description: Running mlxfwreset or unloading the mlx5_core module while conntrack flows are offloaded may cause a call trace in the kernel. |
Workaround: Stop the OVS service before calling mlxfwreset or unloading the mlx5_core module. |
|
Keywords: Conntrack, ASAP, OVS, mlxfwreset, unload |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1955352 |
Description: Moving 2 ports to SwitchDev mode in parallel is not supported. |
Workaround: N/A |
|
Keywords: ASAP, SwitchDev |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1979958 |
Description: VxLAN IPv6 offload is not supported over CentOS/RHEL v7.2 OSs. |
Workaround: N/A |
|
Keywords: Tunnel, VXLAN, ASAP, IPv6 |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1991710 |
Description: PRIO_TAG_REQUIRED_EN configuration is not supported and may cause a call trace. |
Workaround: N/A |
|
Keywords: ASAP, PRIO_TAG, mstconfig |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1967866 |
Description: Enabling ECMP offload requires the VFs to be unbound and VMs to be shut down. |
Workaround: N/A |
|
Keywords: ECMP, Multipath, ASAP2 |
|
Discovered in Release: 4.7-3.2.9.0 |
|
1921981 |
Description: On Ubuntu, Debian, and RedHat 8 and above OSs, parsing the mfa2 file using mstarchive might result in a segmentation fault. |
Workaround: Use mlxarchive to parse the mfa2 file instead. |
|
Keywords: MFT, mfa2, mstarchive, mlxarchive, Ubuntu, Debian, RedHat, operating system |
|
Discovered in Release: 4.7-1.0.0.1 |
|
1840288 |
Description: MLNX_OFED does not support XDP features on RedHat 7 OS, despite the declared support by RedHat. |
Workaround: N/A |
|
Keywords: XDP, RedHat |
|
Discovered in Release: 4.7-1.0.0.1 |
|
1821235 |
Description: When using mlx5dv_dr API for flow creation, for flows which execute the "encapsulation" action or "push vlan" action, metadata C registers will be reset to zero. |
Workaround: Use both actions at the end of the flow process. |
|
Keywords: Flow steering |
|
Discovered in Release: 4.7-1.0.0.1 |
Internal Ref. Number |
Issue |
1504785 |
Description: A lost interrupt issue in pass-through virtual machines may prevent the driver from loading, followed by printing managed pages errors to the dmesg. |
Workaround: Restart the driver. |
|
Keywords: VM, virtual machine |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1764415 |
Description: Unbinding PFs on LAG devices results in a "Failed to modify QP to RESET" error message. |
Workaround: N/A |
|
Keywords: RoCE LAG, unbind, PF, RDMA |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1806565 |
Description: RoCE default GIDs v1 and v2 are derived from the MAC address of the corresponding netdevice's PCI function, and they resemble the IPv6 address. However, in systems where the IPv6 link local address generated does not depend on the MAC address, RoCEv2 default GID should not be used. |
Workaround: Use a RoCEv2 non-default GID. |
|
Keywords: RoCE |
|
Discovered in Release: 4.6-1.0.1.1 |
|
- |
Description: Aging is not functional on bond device in RHEL 7.6. |
Workaround: N/A |
|
Keywords: VF LAG, ASAP2 |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1747774 |
Description: In VF LAG mode, outgoing traffic in load balanced mode is according to the origin ring, thus, half of the rings will be coupled with port 1 and half with port 2. All the traffic on the same ring will be sent from the same port. |
Workaround: N/A |
|
Keywords: VF LAG, ASAP2 |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1753629 |
Description: A bonding bug found in Kernels 4.12 and 4.13 may cause a slave to become permanently stuck in BOND_LINK_FAIL state. As a result, the following message may appear in dmesg: bond: link status down for interface eth1, disabling it in 100 ms |
Workaround: N/A |
|
Keywords: Bonding, slave |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1712068 |
Description: Uninstalling MLNX_OFED automatically results in the uninstallation of several libraries that are included in the MLNX_OFED package, such as InfiniBand-related libraries. |
Workaround: If these libraries are required, reinstall them using the local package manager (yum/dnf). |
|
Keywords: MLNX_OFED libraries |
|
Discovered in Release: 4.6-1.0.1.1 |
|
- |
Description: Due to changes in libraries, MFT v4.11.0 and below are not forward compatible with MLNX_OFED v4.6-1.0.0.0 and above. Therefore, with MLNX_OFED v4.6-1.0.0.0 and above, it is recommended to use MFT v4.12.0 and above. |
Workaround: N/A |
|
Keywords: MFT compatible |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1730840 |
Description: On ConnectX-4 HCAs, GID index for RoCE v2 is inconsistent when toggling between enabled and disabled interface modes. |
Workaround: N/A |
|
Keywords: RoCE v2, GID |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1717428 |
Description: On kernels 4.10-4.14, MTUs larger than 1500 cannot be set for a GRE interface with any driver (IPv4 or IPv6). |
Workaround: Upgrade your kernel to any version higher than v4.14. |
|
Keywords: Fedora 27, gretap, ip_gre, ip_tunnel, ip6_gre, ip6_tunnel |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1748343 |
Description: Driver reload takes several minutes when a large number of VFs exists. |
Workaround: N/A |
|
Keywords: VF, SR-IOV |
|
Discovered in Release: 4.6-1.0.1.1 |
|
1733974 |
Description: Running heavy traffic (such as a ping flood) while bringing other mlx5 interfaces up and down may result in "INFO: rcu_preempt detected stalls on CPUs/tasks:" call traces. |
Workaround: N/A |
|
Keywords: mlx5 |
|
Discovered in Release: 4.6-1.0.1.1 |
|
- |
Description: On ConnectX-6 HCAs and above, an attempt to configure advertisement (with any bitmap) results in all capabilities being advertised. |
Workaround: N/A |
|
Keywords: 200Gb/s, advertisement, Ethtool |
|
Discovered in Release: 4.6-1.0.1.1 |
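Several entries above depend on the running kernel version. Issue 1717428, for example, applies only to kernels 4.10 through 4.14. The following is a minimal Python sketch of such a check and is not part of MLNX_OFED. The helper names `kernel_version` and `gre_mtu_limited` are hypothetical, and the parsing assumes that the release string begins with `major.minor`.

```python
import platform
import re
from typing import Tuple


def kernel_version(release: str) -> Tuple[int, int]:
    """Return (major, minor) parsed from a release string such as '4.13.0-45-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def gre_mtu_limited(release: str) -> bool:
    """Return True if the kernel is in the 4.10-4.14 range named in issue 1717428."""
    return (4, 10) <= kernel_version(release) <= (4, 14)


if __name__ == "__main__":
    # platform.release() reports the running kernel release on Linux.
    release = platform.release()
    if gre_mtu_limited(release):
        print(f"{release}: GRE MTU above 1500 cannot be set; upgrade the kernel")
    else:
        print(f"{release}: outside the 4.10-4.14 range")
```

The same tuple comparison can be adapted to the other kernel ranges mentioned in this list by changing the bounds.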
Internal Ref. Number |
Issue |
1699289 |
Description: The HW LRO feature is disabled out of the box (OOB), which results in increased CPU utilization on the receive side. On ConnectX-5 adapter cards and above, this causes a bandwidth drop for a few streams. |
Workaround: Enable HW LRO in the driver by running the following two commands:
ethtool -K <intf> lro on
ethtool --set-priv-flags <intf> hw_lro on |
|
Keywords: HW LRO, ConnectX-5 and above |
|
Discovered in Release: 4.5-1.0.1.0 |
|
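The HW LRO workaround for issue 1699289 can be scripted when it has to be applied to many interfaces. The following is a minimal Python sketch, not an official tool. The function names and the `dry_run` parameter are hypothetical, and the sketch assumes the standard `ethtool` options `-K` (change offload settings) and `--set-priv-flags`, together with the `hw_lro` private flag named in the workaround.

```python
import subprocess
from typing import List


def hw_lro_commands(intf: str) -> List[List[str]]:
    """Build the two ethtool invocations that enable LRO and the hw_lro private flag."""
    return [
        ["ethtool", "-K", intf, "lro", "on"],
        ["ethtool", "--set-priv-flags", intf, "hw_lro", "on"],
    ]


def enable_hw_lro(intf: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them; execution requires root."""
    for cmd in hw_lro_commands(intf):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if ethtool exits non-zero.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "eth0" is a placeholder interface name.
    enable_hw_lro("eth0", dry_run=True)
```

Running with `dry_run=True` first makes it possible to review the exact commands before applying them to a production interface.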
1403313 |
Description: Attempting to allocate an excessive number of VFs per PF in operating systems with kernel versions below v4.15 might fail due to a known kernel issue. |
Workaround: Update the kernel to v4.15 or above. |
|
Keywords: VF, PF, IOMMU, Kernel, OS |
|
Discovered in Release: 4.5-1.0.1.0 |
|
- |
Description: NEO-Host is not supported on the following OSs:
|
Workaround: N/A |
|
Keywords: NEO-Host, operating systems |
|
Discovered in Release: 4.5-1.0.1.0 |
|
1521877 |
Description: On SLES 12 SP1 OSs, a kernel tracepoint issue may cause undefined behavior when inserting a kernel module with a wrong parameter. |
Workaround: N/A |
|
Keywords: mlx5 driver, SLES 12 SP1 |
|
Discovered in Release: 4.5-1.0.1.0 |