NVIDIA MLNX_OFED Documentation v23.10-6.1.6.1 LTS

Known Issues

The following is a list of general limitations and known issues of the current version of the release.

Internal Ref. Number

Issue

3546668

Description: On 64k page size systems, applications that open a large number of RDMA resources (UARs/QPs/CQs etc.) might face errors creating those resources due to a PCI BAR size limitation.

Keywords: PCI BAR size limitation

Workaround: It is recommended to increase the BAR size via mlxconfig to allow enough space for the allocation of all the needed RDMA resources.

Discovered in Release: 23.10-1.1.9.0

3678715

Description: When attempting to restart drivers using the openibd service while the nvme_rdma module is loaded, the process may fail. This behavior is intentional, as unloading nvme_rdma during the driver restart can lead to connectivity issues in other applications within the setup.

Keywords: openibd service, nvme_rdma module

Workaround: Manually unload the nvme_rdma module before performing the driver restart. This can be achieved using the modprobe -r nvme_rdma command.

Discovered in Release: 23.10-1.1.9.0
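
The workaround can be scripted roughly as follows. This is a minimal sketch, not an official procedure: the helper names module_loaded and restart_drivers are hypothetical, and the /etc/init.d/openibd restart invocation is an assumption about how the service is restarted on the system.

```shell
# Sketch only: helper names are hypothetical; adapt to the local setup.

# Return 0 if the named module appears in a /proc/modules-style listing.
# The listing path is a parameter so the check can be tried on a sample file.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Unload nvme_rdma if it is loaded, then restart the driver stack (requires root).
restart_drivers() {
    if module_loaded nvme_rdma; then
        modprobe -r nvme_rdma || return 1
    fi
    # Assumed restart command; use whatever invokes openibd on your system.
    /etc/init.d/openibd restart
}
```

Calling restart_drivers as root performs the unload only when the module is present, so the same script can be used on hosts without NVMe over RDMA.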

3676223

Description: When using kernel version 4.12 or above, it is advised to run echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_drivers_autoprobe to avoid VF probing.

Keywords: VF probing

Workaround: N/A

Discovered in Release: 23.10-1.1.9.0
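
A minimal sketch of applying this setting to an arbitrary PF is shown below. The helper names are hypothetical, and the PCI address 0000:08:00.0 is only the example address from the description; substitute the address of the PF in question.

```shell
# Sketch only: helper names are hypothetical.

# Build the sysfs path of the autoprobe control for a PF, given its PCI address.
sriov_autoprobe_path() {
    printf '/sys/bus/pci/devices/%s/sriov_drivers_autoprobe\n' "$1"
}

# Return 0 if a kernel release string (as printed by uname -r) is 4.12 or newer.
kernel_at_least_4_12() (
    major=${1%%.*}
    minor=${1#*.}
    minor=${minor%%[!0-9]*}   # keep only the leading digits of the minor field
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "${minor:-0}" -ge 12 ]; }
)

# Disable VF auto-probing on the given PF (requires root).
disable_vf_autoprobe() (
    path=$(sriov_autoprobe_path "$1")
    [ -w "$path" ] || { echo "not writable: $path" >&2; exit 1; }
    echo 0 > "$path"
)
```

For example, kernel_at_least_4_12 "$(uname -r)" && disable_vf_autoprobe 0000:08:00.0 applies the setting only on kernels that expose the control.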

3682658

Description: When using the RDMA-CM user application with the AF_IB parameter, the kernel uses only the first byte of the private data to set the CMA version. In such a scenario, any user data written to this byte will be overwritten.

Keywords: RDMA-CM user application, AF_IB, private data

Workaround: Do not use AF_IB for the application's private data.

Discovered in Release: 23.10-0.5.5.0

3640082

Description: A potential null pointer dereference might occur due to a missing update in the PCI subsystem code when creating the maximum number of VFs.

All kernel versions lacking the following fix are impacted: "PCI: Avoid enabling PCI atomics on VFs."

Keywords: Maximal VF number

Workaround: N/A

Discovered in Release: 23.10-0.5.5.0

3653417

Description: When offloading IPsec policy rules while in legacy mode, there are two options:

1. Software steering - The software stack will handle the task, and no device offload will take place.

2. Changing the steering mode to firmware steering will return unsupported.

Keywords: IPsec, legacy mode

Workaround: Perform a devlink reload after changing the steering mode.

Discovered in Release: 23.10-0.5.5.0

3612274

Description: Currently, only one of IPsec offload or TC offload is allowed on a specific interface. Offloading a TC rule to an interface will fail if an IPsec rule is already offloaded on it, and vice versa.

Keywords: IPsec offload, TC offload

Workaround: N/A

Discovered in Release: 23.10-0.5.5.0

3596126

Description: OVS mirroring of both egress and ingress together with modified TTL is not supported by ConnectX-5 cards, and may cause packet checksum issues and errors in the dmesg output.

Keywords: OVS mirroring, ConnectX-5

Workaround: N/A

Discovered in Release: 23.10-0.5.5.0

3538463

Description: A kernel ABI problem in SLES15 SP4 may lead to issues during driver start. This impacts kernels from version 5.14.21-150400.24.11.1 up to version 5.14.21-150400.24.63.1 (July 2022 to May 2023), inclusive. For more information, see https://www.suse.com/support/kb/doc/?id=000021137.

Keywords: Kernel ABI, SLES15 SP4, driver start

Workaround: Upgrade to a kernel version newer than 5.14.21-150400.24.63.1 (May 2023).

Discovered in Release: 23.10-0.5.5.0
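
To check whether a given kernel falls inside the affected range, a comparison along the following lines can be used. This is a sketch under stated assumptions: the helper names are hypothetical, versions are compared field by field as plain numbers, and the version is passed in explicitly because the string printed by uname -r may differ in form from the kernel package version (for example, it may carry a flavor suffix that has to be stripped first).

```shell
# Sketch only: helper names are hypothetical.

# Return 0 if version $1 <= version $2, comparing numeric fields split on '.' and '-'.
# Assumes every field is numeric, e.g. 5.14.21-150400.24.63.1.
ver_le() (
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Leading field of each version (text up to the first '.' or '-').
        x=${a%%[.-]*}
        y=${b%%[.-]*}
        # Drop that field; with no separator left, the remainder becomes empty.
        case $a in *[.-]*) a=${a#*[.-]} ;; *) a= ;; esac
        case $b in *[.-]*) b=${b#*[.-]} ;; *) b= ;; esac
        # A missing field counts as 0.
        x=${x:-0}
        y=${y:-0}
        if [ "$x" -lt "$y" ]; then exit 0; fi
        if [ "$x" -gt "$y" ]; then exit 1; fi
    done
    exit 0
)

# Return 0 if the version lies in the affected range quoted above (inclusive).
kernel_in_affected_range() {
    ver_le 5.14.21-150400.24.11.1 "$1" && ver_le "$1" 5.14.21-150400.24.63.1
}
```

For example, kernel_in_affected_range 5.14.21-150400.24.46.1 succeeds, indicating that an upgrade is advisable.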

3637252

Description: When running over RHEL7.6 with excessive RDMA/RoCE workload, kernel warnings may be triggered.

Keywords: RHEL7.6, RDMA, RoCE

Workaround: N/A

Discovered in Release: 23.10-0.5.5.0

3046655

Description: A package manager upgrade with zypper (on an SLES system) may prompt a question about vendor change from "Mellanox Technologies" to "OpenFabrics".

Keywords: Installation, SLES

Workaround: Either accept the prompted change, or add the /etc/zypp/vendors.d/mlnx_ofed file with the following content:

[main]

vendors = Mellanox,OpenFabrics

Discovered in Release: 23.07-0.5.0.0
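
Creating the vendor file can be scripted as in the sketch below. The function name is hypothetical; the file name and content are the ones given in the workaround. The target directory is a parameter so that the result can be inspected in a scratch location before writing to /etc/zypp/vendors.d as root.

```shell
# Sketch only: the function name is hypothetical.

# Write the vendor-equivalence file into a vendors.d directory.
# The directory defaults to /etc/zypp/vendors.d; pass another path to try it out.
write_mlnx_vendor_file() (
    dir=${1:-/etc/zypp/vendors.d}
    mkdir -p "$dir" || exit 1
    printf '[main]\nvendors = Mellanox,OpenFabrics\n' > "$dir/mlnx_ofed"
)
```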

3392477

Description: The ConnectX-7 firmware embedded in this MLNX_OFED version cannot be burnt using the MLNX_OFED installer script.

Keywords: ConnectX-7, MLNX_OFED installer script

Workaround: Download and install the dedicated firmware from https://network.nvidia.com/support/firmware/connectx7ib/

Discovered in Release: 23.07-0.5.0.0

3532756

Description: The kernel may crash when restarting the driver while IPsec rules are configured.

Keywords: IPsec

Workaround: Flush the IPsec configuration before reloading the driver:

ip xfrm state flush

ip xfrm policy flush

Discovered in Release: 23.07-0.5.0.0

3472979

Description: When a large number of virtual functions are present, the output of the "ip link show" command may be truncated.

Keywords: virtual functions, ip link show

Workaround: N/A

Discovered in Release: 23.07-0.5.0.0

3413938

Description: When using the mlnx-sf script, repeatedly creating and deleting an SF with the same ID number under stress may cause the setup to hang due to a race between the create and delete commands.

Keywords: Hang, mlnx-sf

Workaround: N/A

Discovered in Release: 23.07-0.5.0.0

3461572

Description: Configuring Multiport Eswitch LAG mode can be performed only via devlink from this release onwards. The compat sysfs should not be used to configure mpesw LAG.

Keywords: Multiport Eswitch, compat sysfs, mpesw LAG

Workaround: N/A

Discovered in Release: 23.07-0.5.0.0

3464337

Description: Simultaneously adding or removing TC rules while operating on kernel version 6.3 may result in stability issues.

Keywords: ASAP, rules, TC

Workaround: Make sure the following fix is part of the kernel: https://lore.kernel.org/netdev/20230504181616.2834983-3-vladbu@nvidia.com/T/

Discovered in Release: 23.07-0.5.0.0

3469484

Description: Mirror and connection tracking (CT) offload actions are not supported simultaneously if the kernel version does not support hardware miss to TC actions. Thus, when performing a CT offload test, the actual number of offloaded connections may be lower than expected.

Keywords: ASAP, CT offload

Workaround: Make sure to have the following offending commit in the tree:

"net/sched: act_ct: offload UDP NEW connections". Also make sure to have https://www.spinics.net/lists/stable-commits/msg303536.html in the kernel tree to fix this issue.

Discovered in Release: 23.07-0.5.0.0

3473331

Description: When performing a CT offload test, the actual number of offloaded connections may be lower than expected.

Keywords: ASAP, CT offload

Workaround: N/A

Discovered in Release: 23.07-0.5.0.0

3499413

Description: Due to the following kernel issue, under heavy load, some connections may not be offloaded, leading to performance issues:

"net/sched: act_ct: offload UDP NEW connections."

Keywords: ASAP, CT offload

Workaround: N/A

Discovered in Release: 23.07-0.5.0.0

3360710

Description: Configuring PFC in parallel with buffer size and prio2buffer commands may lead to misalignment between firmware and software with regard to receive buffer ownership.

Keywords: NetDev, PFC, Buffer Size, prio2buffer

Workaround: First, configure PFC on all ports, and then perform other needed QoS (i.e., buffer_size or prio2buffer) configurations accordingly.

Discovered in Release: 23.04-0.5.3.3

3413879

Description: OpenSM may not be started automatically if chkconfig was not installed before OpenSM was installed. Note, however, that chkconfig will fail to install if /etc/init.d already exists as a directory rather than as a symbolic link to a directory (e.g., from a previous installation of MLNX_OFED).

Keywords: Installation, OpenSM, chkconfig

Workaround: Install chkconfig before installing MLNX_OFED. If installing it fails, make sure /etc/init.d does not exist at the time of installing it.

Discovered in Release: 23.04-0.5.3.3
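
Whether /etc/init.d is a real directory or a symbolic link can be checked before installing chkconfig, for example as sketched below. The helper names and the messages they print are hypothetical.

```shell
# Sketch only: helper names and messages are hypothetical.

# Return 0 if the path is a real directory (not a symbolic link to one).
is_real_directory() {
    [ -d "$1" ] && [ ! -L "$1" ]
}

# Report whether the init.d path is likely to block a chkconfig installation.
check_initd() (
    path=${1:-/etc/init.d}
    if is_real_directory "$path"; then
        echo "blocking: $path is a real directory"
    else
        echo "ok: $path is absent or a symbolic link"
    fi
)
```

Running check_initd with no argument inspects /etc/init.d; the sketch only reports the state and leaves any removal or renaming to the administrator.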

3424596

Description: On SLES 15.4, installing MLNX_OFED using a package repository (with zypper) may trigger an error message about a missing dependency for 'librte_eal.so.20.0()(64bit)'. This is because the inbox package libdpdk-20_0 is being uninstalled as it is incompatible with the MLNX_OFED rdma-core packages.

Keywords: Installation, SLES 15.4

Workaround: Uninstall the relevant packages: 'zypper uninstall libdpdk-20_0' before installing MLNX_OFED. This will also remove the inbox openvswitch package.

Discovered in Release: 23.04-0.5.3.3

3433416

Description: On systems that were installed with MLNX_OFED 5.9 or older and include a CUDA package (ucx-cuda / hcoll-cuda), an upgrade to MLNX_OFED 23.04 using the package manager ("yum") method will fail. This is because MLNX_OFED up to 5.9 is built with CUDA 11. MLNX_OFED 23.04 is built with CUDA 12 and those CUDA versions are incompatible.

Keywords: Installation, CUDA, yum

Workaround: Remove the CUDA packages included with OFED (ucx-cuda, hcoll-cuda) before upgrading. This allows MLNX_OFED to be upgraded regardless of the CUDA version installed. To install them later, CUDA 12 must be installed on the system.

Discovered in Release: 23.04-0.5.3.3

3420831

Description: mlx-steering-dump is not supported on systems in which Python3 is not the default.

Keywords: mlx-steering-dump, Python3

Workaround: N/A

Discovered in Release: 23.04-0.5.3.3

3351989

Description: If the underlying persistent device name exceeds 15 characters in length, the operating system will not be able to perform renaming (i.e., the device name will remain "eth").

Keywords: Persistent Interface Names

Workaround: Add the --copy-ifnames-udev flag to the OFED installation command. Note that this flag is only applicable if the persistent name provided by the kernel, without the 'np' suffix, is 15 characters or fewer.

Discovered in Release: 23.04-0.5.3.3
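
The length condition in the workaround can be checked ahead of installation with a sketch like the one below. The helper names are hypothetical, and the sketch assumes that the 'np' suffix has the form "np" followed by one or more digits at the end of the name (as in enp129s0f0np0); adjust the pattern if the names on the system differ.

```shell
# Sketch only: helper names are hypothetical, and the "np<digits>" suffix form is an assumption.

# Print the name with a trailing "np<digits>" suffix removed, if present.
strip_np_suffix() (
    name=$1
    digits=${name##*[!0-9]}    # trailing run of digits, if any
    stem=${name%"$digits"}     # name without those digits
    case $stem in
        *np)
            if [ -n "$digits" ]; then name=${stem%np}; fi
            ;;
    esac
    printf '%s\n' "$name"
)

# Return 0 if the name without the 'np' suffix is 15 characters or fewer.
copy_ifnames_applicable() (
    base=$(strip_np_suffix "$1")
    [ "${#base}" -le 15 ]
)
```

For example, copy_ifnames_applicable enp129s0f0np0 succeeds because the stripped name enp129s0f0 is 10 characters long.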

© Copyright 2026, NVIDIA. Last updated on Jan 6, 2026