NVIDIA MLNX_EN Documentation v23.10-6.1.6.1 LTS

Bug Fixes History

This table lists the bugs fixed in the last three major GA releases. For a list of old bug fixes, please refer to the release notes of the desired version.

Below are the bugs fixed in this version. For a list of fixes from previous versions, see Bug Fixes History.

Internal Reference Number

Description

4410029

Description: Fixed an issue where installing mlnx-ofa_kernel drivers on SLES 15 SP5 with kernel version 5.14.21-150500.55.68-default (and newer) failed due to weak-modules falling back to the original inbox modules. The failure was caused by a mismatch: the original build kernel (5.14.21-150500.53-default) did not include the mana_ib driver, so no dummy module was provided, while the newer kernel did include it. This mismatch led to weak-modules sanity check errors due to the presence of the inbox mana_ib driver.

Keywords: mlnx-ofa_kernel, SLES 15 SP5

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4471811

Description: Resolved an NVMe driver compilation issue on Linux kernel version 6.6.87.

Keywords: NVMe driver

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4253229

Description: Fixed a race condition between the firmware syndrome report and driver initialization during boot.

Keywords: Race condition

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4442965

Description: Fixed performance degradation on older kernel versions using RX cache, particularly on slower ARM CPUs with larger RX buffers. The issue was caused by the driver attempting to allocate new RX pages too quickly, leading to head-of-line blocking in the RX cache.

The fix improves RX cache usage by triggering page allocation for a bulk of at least 2 WQEs, allowing the application more time to process packets and return buffers to the RX cache, thereby reducing blocking and enhancing performance.

Keywords: Performance, kernel, Rx cache, page allocation

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0
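
The bulk-refill idea described for bug 4442965 can be illustrated with a toy model. The Python sketch below is not driver code: the `RxRing` class, its `min_bulk` parameter, and its counters are hypothetical names chosen for illustration. The only detail taken from the description is that allocation is triggered for a bulk of at least 2 WQEs.

```python
class RxRing:
    """Toy RX ring: consumed WQEs are reposted only in bulks (illustrative)."""

    def __init__(self, size, min_bulk=2):
        self.size = size          # total WQEs in the ring
        self.min_bulk = min_bulk  # smallest refill worth triggering
        self.posted = size        # WQEs currently posted to hardware
        self.refill_calls = 0     # how many times allocation was triggered

    def consume(self):
        """Hardware used one posted WQE for an incoming packet."""
        if self.posted == 0:
            raise RuntimeError("ring is empty")
        self.posted -= 1

    def maybe_refill(self):
        """Repost missing WQEs, but only once a full bulk is missing."""
        missing = self.size - self.posted
        if missing < self.min_bulk:
            return 0  # defer: gives buffers more time to return to the cache
        self.posted += missing
        self.refill_calls += 1
        return missing


ring = RxRing(size=8, min_bulk=2)
ring.consume()
first = ring.maybe_refill()   # only 1 WQE missing, so allocation is deferred
ring.consume()
second = ring.maybe_refill()  # 2 WQEs missing, so both are reposted together
print(first, second, ring.refill_calls)  # prints "0 2 1"
```

Deferring the refill until a bulk is missing is what gives the application time to return buffers to the cache before the next allocation attempt.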

4243800

Description: Resolved an issue with improper page deallocation handling on some kernels.

Keywords: Page deallocation

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4466255

Description: Fixed an issue where a kernel crash could occur if a device event arrived during the event subscription process.

Keywords: DevX, event_fd

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4441119

Description: Fixed a crash caused by handling multiple CMA net events occurring in quick succession on the same CMA ID.

Keywords: CMA

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4405723

Description: Fixed a potential deadlock that could occur during the handling of peer memory registration failures.

Keywords: Deadlock, peer memory registration

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4340109

Description: Fixed a sysfs issue that occurred when accessing hardware counters from within a namespace.

Keywords: sysfs

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4248125

Description: Fixed the UMR QP recovery flow to ensure proper functionality and prevent tasks from getting stuck in the kernel. Additionally, resolved a race condition in the ODP MR area that could lead to a CQE error in the UMR QP.

Keywords: UMR QP recovery flow

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4235682

Description: Resolved corruption of the SA MAD Congestion Control FIFO queue that occurred when all elements were canceled and a dequeue operation was then attempted.

Keywords: SA legacy congestion control mechanism

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0
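
The failure mode in bug 4235682, dequeuing from a FIFO whose elements have all been canceled, is a common edge case in queue code. The Python sketch below shows one defensive pattern for it. It is a generic illustration and not the driver's implementation; `CancelableFifo` and its methods are hypothetical names.

```python
from collections import deque


class CancelableFifo:
    """FIFO whose entries can be canceled before they are dequeued."""

    def __init__(self):
        self._queue = deque()
        self._canceled = set()

    def enqueue(self, item):
        self._queue.append(item)

    def cancel(self, item):
        """Mark a queued item as canceled; it is dropped lazily on dequeue."""
        if item in self._queue:
            self._canceled.add(item)

    def dequeue(self):
        """Return the oldest live item, or None if nothing live remains."""
        while self._queue:
            item = self._queue.popleft()
            if item in self._canceled:
                self._canceled.discard(item)
                continue  # skip canceled entries instead of returning them
            return item
        return None  # empty or fully canceled: queue state stays consistent


fifo = CancelableFifo()
for request in ("a", "b", "c"):
    fifo.enqueue(request)
for request in ("a", "b", "c"):
    fifo.cancel(request)
print(fifo.dequeue())  # prints "None"
fifo.enqueue("d")
print(fifo.dequeue())  # prints "d"
```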

4409282

Description: Increased the size of the slow FDB table to prevent the following error when switching to SwitchDev mode:

mlx5_core 0000:03:00.0: mlx5_cmd_out_err:835:(pid 24362): CREATE_FLOW_GROUP(0x933) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x4065f0), err(-22)

mlx5_core 0000:03:00.0: E-Switch: Failed to create peer miss flow group err(-22)

Keywords: Slow FDB table

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0
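
When checking logs for the error quoted in bug 4409282, the fields of the `mlx5_cmd_out_err` line can be extracted mechanically. The following Python sketch is not part of MLNX_EN. The regular expression and the `parse_cmd_err` helper are assumptions modeled only on the single log line quoted above, and other driver versions may format the message differently.

```python
import re

# Pattern modeled on the one log line quoted in this entry (an assumption:
# other driver versions may print the message in a different layout).
CMD_ERR_RE = re.compile(
    r"mlx5_core (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"mlx5_cmd_out_err:\d+:\(pid (?P<pid>\d+)\): "
    r"(?P<cmd>\w+)\((?P<opcode>0x[0-9a-f]+)\) "
    r"op_mod\((?P<op_mod>0x[0-9a-f]+)\) failed, "
    r"status (?P<status>.+?)\((?P<status_code>0x[0-9a-f]+)\), "
    r"syndrome \((?P<syndrome>0x[0-9a-f]+)\), "
    r"err\((?P<err>-?\d+)\)"
)

SAMPLE = (
    "mlx5_core 0000:03:00.0: mlx5_cmd_out_err:835:(pid 24362): "
    "CREATE_FLOW_GROUP(0x933) op_mod(0x0) failed, "
    "status bad parameter(0x3), syndrome (0x4065f0), err(-22)"
)


def parse_cmd_err(line):
    """Return the fields of an mlx5_cmd_out_err line as a dict, or None."""
    match = CMD_ERR_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["err"] = int(fields["err"])  # -22 is -EINVAL on Linux
    return fields


fields = parse_cmd_err(SAMPLE)
print(fields["cmd"], fields["status"], fields["syndrome"], fields["err"])
```

A line that does not match the pattern yields `None`, so the helper can be applied to every line of a log and the non-matching lines skipped.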

3992667

Description: Fixed an issue that caused the MLNX_EN uninstall script to keep some leftover packages from the last installation process.

Keywords: Installation, MLNX_EN

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4192798

Description: Fixed the issue where the device failed to initialize after a write-combine test failure. The device now loads with Blueflame capabilities disabled instead.

Keywords: write-combine; Blueflame; device initialization

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4111625

Description: MLNX_OFED can now be built successfully with the --add-kernel-support flag on SLES 15 SP5 with kernel 5.14.21-150500.55.73.

Keywords: SLES; Kernel; operating system; OS

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4128400

Description: Removed the udev rule error message "/usr/lib/udev/rules.d/90-ib.rules:4 Only network interfaces can be renamed" from the log files. The udev rule included a line whose syntax is no longer valid, which triggered this error.

Keywords: udev rule

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4001035 / 4001038

Description: Fixed an issue that resulted in a corrupt SMP MAD requests list when the sent list was accessed while the unregistered flow was running.

Keywords: SMP MAD requests

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

3991835

Description: Fixed a stack overrun warning by reducing the size of the local on-stack array used for optimization by 192 bytes.

Keywords: Kernel Stack

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4004300

Description: Fixed an issue that prevented the netdev queue value from being updated in the mqprio parameters when switchdev mode was enabled and the netdev queue number was reset to 1.

Keywords: PF TXQ mapping

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4176804

Description: Fixed the receive queue cache size calculation to take into account the host page size.

Keywords: Memory allocation

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4027218

Description: Fixed the packet inspection parsing to avoid data corruption when GRE offload was turned on by parsing the outer header as UDP and not as TCP.

Keywords: UDP, TCP, GRE offload

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4001551

Description: Fixed a CT entry update failure caused by a firmware limitation, where the old modify header context was not freed and leaked.

Keywords: CT entry update

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4012710

Description: Increased the MLX5E_TC_MAX_INT_PORT_NUM value to 32 to avoid cases of rules not being offloaded.

Keywords: Rules offloaded

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

4014362

Description: Fixed an issue that prevented the "hash" lag_port_select_mode from working properly with ConnectX-7 adapter cards on some older operating systems.

Keywords: LAG

Discovered in Release: 23.10-3.2.2.0

Fixed in Release: 23.10-4.0.9.1

3932946

Description: Fixed the setting of ATS for DMABUF MRs that caused some MRs to miss the ATS enablement. Lack of ATS enablement on DMABUF MRs results in slower performance when using these MRs.

Keywords: ATS, DMABUF, MRs

Discovered in Release: 23.10-2.1.3.1

Fixed in Release: 23.10-3.2.2.0

3894403

Description: On rare occasions, rdmacm applications could not find the device upon creating new RDMA devices, as the CMA driver lost some of the devices due to an overflow issue.

Keywords: rdmacm

Discovered in Release: 23.10-2.1.3.1

Fixed in Release: 23.10-3.2.2.0

3807155

Description: Fixed an issue that could have caused memory corruption when running XDP traffic.

Keywords: XDP, memory corruption

Discovered in Release: 23.10-2.1.3.1

Fixed in Release: 23.10-3.2.2.0

3848999

Description: Fixed an issue that prevented the tc_wrap tool from working properly when VLAN was configured, because the tool mishandled a library function's return value.

Keywords: tc_wrap tool, VLAN

Discovered in Release: 23.10-2.1.3.1

Fixed in Release: 23.10-3.2.2.0

3822520

Description: Fixed a deadlock in the LAG disable flow that occurred when changing the eswitch mode from switchdev to legacy while a LAG bond existed on the machine.

Keywords: LAG, eswitch, switchdev, deadlock

Discovered in Release: 23.10-2.1.3.1

Fixed in Release: 23.10-3.2.2.0

3822520

Description: Fixed an issue in the comparison process between the SW steering and FW steering modes to avoid kernel crashes.

Keywords: SW steering, FW steering

Discovered in Release: 23.10-2.1.3.1

Fixed in Release: 23.10-3.2.2.0

3947195

Description: Fixed an issue related to the driver's internal MR cache cleanup that caused high memory consumption on the host.

Keywords: MR cache cleanup

Discovered in Release: 23.10-2.1.3.1

Fixed in Release: 23.10-3.2.2.0

3729466

Description: Resolved a miscalculation issue where more Q-counters were freed than allocated when moving to switchdev mode.

Keywords: Q-counters, switchdev

Discovered in Release: 23.10-1.1.9.0

Fixed in Release: 23.10-2.1.3.1

3727822

Description: Fixed an issue that allowed concurrent creation of encap entries, which could potentially cause double-free vulnerabilities.

Keywords: encap entries, double free

Discovered in Release: 23.10-1.1.9.0

Fixed in Release: 23.10-2.1.3.1
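
A common remedy for the class of bug described in 3727822 is to serialize lookup-or-create and reference counting under a single lock, so that an entry is created once and freed once. The Python sketch below illustrates that pattern only. It is not the mlx5 implementation, and `EncapTable`, `get`, and `put` are hypothetical names.

```python
import threading


class EncapTable:
    """Refcounted entry table; one lock serializes create and release."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}   # key -> reference count
        self.created = 0     # number of entries actually allocated
        self.freed = 0       # number of entries actually released

    def get(self, key):
        """Take a reference, creating the entry only if it is absent."""
        with self._lock:
            if key not in self._entries:
                self._entries[key] = 0
                self.created += 1
            self._entries[key] += 1

    def put(self, key):
        """Drop a reference; free the entry exactly once, at count zero."""
        with self._lock:
            self._entries[key] -= 1
            if self._entries[key] == 0:
                del self._entries[key]
                self.freed += 1


table = EncapTable()
workers = [
    threading.Thread(target=table.get, args=("tunnel0",)) for _ in range(8)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
for _ in range(8):
    table.put("tunnel0")
print(table.created, table.freed)  # prints "1 1"
```

In this sketch, a `put` on a key that has already been freed raises `KeyError`, which surfaces an extra release instead of silently freeing twice.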

3728381

Description: Fixed an issue that exposed debugfs entries for unsupported RoCE general parameters, such as rtt_resp_dscp.

Keywords: debugfs, RoCE

Discovered in Release: 23.10-1.1.9.0

Fixed in Release: 23.10-2.1.3.1

3710957

Description: Fixed an issue that triggered an error message by updating the rule actions STE apply flow. Following the update, the flow checks if the rule domain is different from the ASO CT action domain when applying the ASO CT action.

Keywords: Software Steering

Discovered in Release: 23.10-1.1.9.0

Fixed in Release: 23.10-2.1.3.1

3663363

Description: Fixed an issue where an error was triggered when devlink reload was attempted while subfunctions were allocated.

Keywords: devlink reload, allocated subfunctions

Discovered in Release: 23.10-0.5.5.0

Fixed in Release: 23.10-1.1.9.0

3660998

Description: Resolved an issue on ConnectX-4 Lx, where the VF state was not configured correctly following the activation of SR-IOV.

Keywords: ConnectX-4 Lx, VF state

Discovered in Release: 23.10-0.5.5.0

Fixed in Release: 23.10-1.1.9.0

3653417

Description: Fixed an issue where changing the steering mode to firmware steering was unsupported for policy IPsec rules.

Keywords: Firmware steering

Discovered in Release: 23.10-0.5.5.0

Fixed in Release: 23.10-1.1.9.0

3602955

Description: Fixed an issue that occurred when a VF was set to get allmulti traffic. The issue caused the steering rules to send the multicast traffic received by the NIC back to the uplink.

Keywords: VF, allmulti traffic

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3553766

Description: Fixed an issue where the enable_remote_dev_reset Devlink parameter was not supported on kernel versions below v5.10.

Keywords: Devlink parameter

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3546694

Description: Fixed an issue where MAC address configuration for PFs could fail if SR-IOV was enabled at the same time.

Keywords: PF, MAC address, SR-IOV

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3538018

Description: Fixed an issue where firmware sync reset (with the 'mlxfwreset -d <device> -l 3 r --sync 1' command) could fail on a system configured for hotplug on the PCIe slot on which the mlx5 card was mounted.

Keywords: Firmware sync reset, mlx5 card

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3587834

Description: Fixed an issue where the enable_remote_dev_reset Devlink parameter was not supported on kernel versions below v5.10.

Keywords: Devlink parameter

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3576351

Description: Resolved a warning that was triggered when starting the openibd service, which pertained to an unidentified 'ExecRestart' value within the 'Service' section.

Keywords: openibd, warning

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3557482

Description: Fixed an issue where the 'mlnx_tune -l' list of supported operating systems did not include several operating systems that were actually supported, such as RHEL8.6 and Ubuntu 22.04.

Keywords: mlnx_tune -l

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3549684

Description: Fixed a signature-related issue that occurred when installing DOCA on SLES15SP4 using the repository.

Keywords: DOCA, SLES15SP4

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3380263

Description: Fixed an issue where users who attempted to use OFED with Device ID NVD0000000033 had to install the firmware manually.

Keywords: Device ID NVD0000000033

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3228788

Description: Fixed an issue where rx-tls-offload could not run over Korg6.0 because that kernel's TLS module did not work properly.

Keywords: NetDev, TLS

Discovered in Release: 23.07-0.5.0.0

Fixed in Release: 23.10-0.5.5.0

3546304

Description: Resolved the kernel crash resulting from sysfs calls to profiles lacking TC (Traffic Control) support.

Keywords: sysfs calls, Traffic Control

Discovered in Release: 23.04-0.5.3.3

Fixed in Release: 23.07-0.5.0.0

3531986

Description: Fixed an issue that prevented OS booting following an installation of the EN and RoCE drivers.

Keywords: OS booting, EN, RoCE

Discovered in Release: 23.04-0.5.3.3

Fixed in Release: 23.07-0.5.0.0

3489233

Description: Fixed an issue in SLES 15 SP4 where the openibd service failed to start automatically after system boot.

Keywords: SLES 15 SP4, openibd, system boot

Discovered in Release: 23.04-0.5.3.3

Fixed in Release: 23.07-0.5.0.0

3431430

Description: Fixed an issue that prevented the installation of OFED on RHEL systems using a non-default Python version.

Keywords: Installation, RHEL, Python

Discovered in Release: 5.9-0.5.6

Fixed in Release: 23.07-0.5.0.0

3422823

Description: Fixed an OFED installation issue on BCLinux 21.10 that occurred when using the "--add-kernel-support" installation flag.

Keywords: Installation, BCLinux 21.10, "--add-kernel-support"

Discovered in Release: 5.9-0.5.6

Fixed in Release: 23.07-0.5.0.0

3264588

Description: Resolved a problem where the system boot process would hang when more than two Network Interface Cards were installed.

Keywords: System boot, Network Interface Cards

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 23.07-0.5.0.0

3499136

Description: Fixed an issue where the sysfs PHY counters displayed outdated information.

Keywords: sysfs PHY counters

Discovered in Release: 23.04

Fixed in Release: 23.07-0.5.0.0

2883451

Description: Installing mlnx_tune on Python3 did not work properly. mlnx_tune now supports Python3 in addition to Python2.

Keywords: Installation, mlnx_tune, Python3

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 23.04-0.5.3.3

3219842

Description: When creating a bond interface for all ports on a ConnectX-7 4-port HCA, the wrong bond name appeared in ibdev2netdev.

Keywords: RDMA, Bond Name, ibdev2netdev, ConnectX-7

Discovered in Release: 5.8-1.0.1.1

Fixed in Release: 23.04-0.5.3.3

3333919

Description: Changing traffic class via the sysfs while modifying QPs in parallel causes a deadlock.

Keywords: RDMA, TC, Sysfs, QP

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 23.04-0.5.3.3

3406019

Description: Due to a bug in the emulation layer, performance degradation might be experienced when running GPUDirect over Virtual Functions.

Keywords: RDMA, GPUDirect, performance, VF

Discovered in Release: 5.9-0.5.6.0

Fixed in Release: 23.04-0.5.3.3

3233799

Description: debugfs directories could not be created for representors and sub-functions, so the log might show an error warning in either scenario.

Keywords: NetDev, debugfs, SF, logging

Discovered in Release: 5.8-1.0.1.1

Fixed in Release: 23.04-0.5.3.3

1892663/1800633/2883451

Description: The mlnx_tune script did not support the Python3 interpreter.

Keywords: mlnx_tune, Python3

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 23.04-0.5.3.3

3340542

Description: The version number for perftest was non-standard, resulting in some distribution packages receiving a higher version number than the OFED version for no good reason. Changed the naming of perftest to MAJOR.MINOR.PATCH.

Keywords: Installation, perftest

Fixed in Release: 23.04-0.5.3.3
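
A strict MAJOR.MINOR.PATCH scheme, as adopted for perftest in 3340542, makes version comparison unambiguous. The Python sketch below parses such strings and compares them as integer tuples. The `parse_version` helper and the sample version strings are made up for illustration; they are not actual perftest or OFED versions, and package managers apply their own comparison rules.

```python
import re

_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Parse a strict MAJOR.MINOR.PATCH string into a tuple of ints."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return tuple(int(part) for part in match.groups())


# Integer tuples compare numerically, so 10 sorts after 9.
print(parse_version("1.10.0") > parse_version("1.9.3"))  # prints "True"
# A plain string comparison gets the same pair wrong.
print("1.10.0" > "1.9.3")  # prints "False"
```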

3428775

Description: knem did not fully support RHEL8.7 and newer releases.

Keywords: Installation, knem, RHEL

Discovered in Release: 5.8-1.0.1.1

Fixed in Release: 23.04-0.5.3.3

3431430

Description: Installing MLNX_OFED on a RHEL system that uses a non-default version of Python (e.g., Python3.9 on RHEL8.6, where the default is 3.6) may fail with an error that mlnx-tools is missing a dependency on 'python(abi)'. mlnx-tools includes a single script, mlnx_qos, that depends on a specific version of Python. After the fix, mlnx_qos may still fail to run with such a non-default version of Python.

Keywords: Installation, Python, RHEL, mlnx-tools

Discovered in Release: 5.9-0.5.6.0

Fixed in Release: 23.04-0.5.3.3

© Copyright 2026, NVIDIA. Last updated on Jan 6, 2026