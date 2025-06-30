
NVIDIA MLNX_EN Documentation v23.10-5.1.4.0 LTS
Bug Fixes in This Version

Below are the bugs fixed in this version. For a list of fixes in previous versions, see Bug Fixes History.

Internal Reference Number

Description

4410029

Description: Fixed an issue where installing mlnx-ofa_kernel drivers on SLES 15 SP5 with kernel version 5.14.21-150500.55.68-default (and newer) failed due to weak-modules falling back to the original inbox modules. The failure was caused by a mismatch: the original build kernel (5.14.21-150500.53-default) did not include the mana_ib driver, so no dummy module was provided, while the newer kernel did include it. This mismatch led to weak-modules sanity check errors due to the presence of the inbox mana_ib driver.

Keywords: mlnx-ofa_kernel, SLES 15 SP5

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0
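
The mismatch described in this entry can be read as a set difference: inbox modules present on the target kernel that the installed package has no replacement or dummy module for. The sketch below only illustrates that reasoning. `uncovered_inbox_modules` is a hypothetical helper, not part of weak-modules or the MLNX_EN installer, and the module lists are illustrative assumptions rather than the real package contents.

```python
def uncovered_inbox_modules(target_inbox, package_modules):
    """Return inbox modules on the target kernel that the package does not cover.

    Both arguments are iterables of module names. This is a hypothetical
    helper for illustration; it does not reproduce the weak-modules checks.
    """
    return sorted(set(target_inbox) - set(package_modules))


# Illustrative lists only (assumptions, not real package contents).
package_built_on_older_kernel = ["ib_core", "mlx5_core", "mlx5_ib"]
inbox_on_newer_kernel = ["ib_core", "mana_ib", "mlx5_core", "mlx5_ib"]

leftover = uncovered_inbox_modules(inbox_on_newer_kernel,
                                   package_built_on_older_kernel)
```

With these sample lists, `leftover` contains only `mana_ib`, the module that appears in the newer kernel but had no counterpart in the package built against the older one.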

4471811

Description: Resolved an NVMe driver compilation issue on Linux kernel version 6.6.87.

Keywords: NVMe driver

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4253229

Description: Fixed a race condition between the firmware syndrome report and driver initialization during boot.

Keywords: Race condition

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4442965

Description: Fixed performance degradation on older kernel versions using RX cache, particularly on slower ARM CPUs with larger RX buffers. The issue was caused by the driver attempting to allocate new RX pages too quickly, leading to head-of-line blocking in the RX cache.

The fix improves RX cache usage by triggering page allocation for a bulk of at least 2 WQEs, allowing the application more time to process packets and return buffers to the RX cache, thereby reducing blocking and enhancing performance.

Keywords: Performance, kernel, RX cache, page allocation

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0
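
The batching idea in this entry can be illustrated with a toy model. This is a minimal sketch, not the mlx5 driver's code: `RxRing`, `MIN_BULK_WQES`, and every method name are hypothetical, and only the threshold of 2 WQEs comes from the description above.

```python
MIN_BULK_WQES = 2  # from the fix description: refill in bulks of at least 2 WQEs


class RxRing:
    """Toy receive ring; all names are hypothetical, not driver symbols."""

    def __init__(self, size, min_bulk=MIN_BULK_WQES):
        self.size = size
        self.min_bulk = min_bulk
        self.posted = size   # WQEs currently posted to the hardware
        self.refills = 0     # number of refill (page allocation) passes

    def complete(self, n=1):
        """Mark n WQEs as consumed; they now need fresh pages."""
        self.posted -= min(n, self.posted)

    def missing(self):
        """Number of WQEs waiting to be re-posted."""
        return self.size - self.posted

    def maybe_refill(self):
        """Refill only when at least min_bulk WQEs are missing.

        Deferring the refill gives the application time to return
        buffers to the cache before new pages are requested.
        Returns the number of WQEs re-posted (0 if deferred).
        """
        need = self.missing()
        if need < self.min_bulk:
            return 0
        self.posted += need
        self.refills += 1
        return need
```

In this model, a single completed WQE does not trigger allocation; the refill pass runs once two or more WQEs are outstanding.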

4243800

Description: Resolved an improper page deallocation handling issue present in some kernels.

Keywords: Page deallocation

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4466255

Description: Fixed an issue where a kernel crash could occur if a device event arrived during the event subscription process.

Keywords: DevX, event_fd

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4441119

Description: Fixed a crash caused by handling multiple CMA net events occurring in quick succession on the same CMA ID.

Keywords: CMA

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4405723

Description: Fixed a potential deadlock that could occur during the handling of peer memory registration failures.

Keywords: Deadlock, peer memory registration

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4340109

Description: Fixed a sysfs issue that occurred when accessing hardware counters from within a namespace.

Keywords: sysfs

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4248125

Description: Fixed the UMR QP recovery flow to ensure proper functionality and prevent tasks from getting stuck in the kernel. Additionally, resolved a race condition in the ODP MR area that could lead to a CQE error in the UMR QP.

Keywords: UMR QP recovery flow

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0

4235682

Description: Resolved corruption of the SA MAD Congestion Control FIFO queue that occurred when all elements were canceled and a dequeue operation was then attempted.

Keywords: SA legacy congestion control mechanism

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0
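
The failure mode in this entry, a dequeue attempted after every queued element has been canceled, can be shown with a small model. This is a sketch under stated assumptions: `CancelableFifo` is a hypothetical class, not the kernel's SA MAD implementation, and it shows only the general guard of checking for an empty queue before removing an element.

```python
from collections import deque


class CancelableFifo:
    """Toy FIFO with cancellation; hypothetical, not the kernel code."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)

    def cancel(self, item):
        """Remove a pending item; return True if it was queued."""
        try:
            self._items.remove(item)
            return True
        except ValueError:
            return False

    def dequeue(self):
        """Return the oldest pending item, or None when nothing is left."""
        if not self._items:
            return None  # empty after cancellations: leave state untouched
        return self._items.popleft()

    def __len__(self):
        return len(self._items)
```

After all elements are canceled, `dequeue()` returns `None` and the queue remains usable for later `enqueue()` calls.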

4409282

Description: Increased the size of the slow FDB table to prevent the following error when switching to SwitchDev mode:

mlx5_core 0000:03:00.0: mlx5_cmd_out_err:835:(pid 24362): CREATE_FLOW_GROUP(0x933) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x4065f0), err(-22)

mlx5_core 0000:03:00.0: E-Switch: Failed to create peer miss flow group err(-22)

Keywords: Slow FDB table

Discovered in Release: 23.10-4.0.9.1

Fixed in Release: 23.10-5.1.4.0
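
To check whether a system log contains the command failure quoted in this entry, the first log line can be parsed with a regular expression. The pattern below is a sketch derived only from the single sample line above; other driver versions may format the message differently. `CMD_ERR_RE` and `parse_cmd_error` are hypothetical names, not part of any NVIDIA tool.

```python
import re

# Pattern derived from the one sample line in this entry (an assumption,
# not a documented log format).
CMD_ERR_RE = re.compile(
    r"mlx5_core (?P<pci>[0-9a-fA-F:.]+): mlx5_cmd_out_err:\d+:\(pid \d+\): "
    r"(?P<cmd>\w+)\((?P<opcode>0x[0-9a-fA-F]+)\) "
    r"op_mod\(0x[0-9a-fA-F]+\) failed, "
    r"status (?P<status>[^()]+)\((?P<status_code>0x[0-9a-fA-F]+)\), "
    r"syndrome \((?P<syndrome>0x[0-9a-fA-F]+)\), err\((?P<err>-?\d+)\)"
)


def parse_cmd_error(line):
    """Return the parsed fields of an mlx5 command error line, or None."""
    match = CMD_ERR_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["err"] = int(fields["err"])  # e.g. "-22" becomes -22
    return fields


SAMPLE = (
    "mlx5_core 0000:03:00.0: mlx5_cmd_out_err:835:(pid 24362): "
    "CREATE_FLOW_GROUP(0x933) op_mod(0x0) failed, status bad parameter(0x3), "
    "syndrome (0x4065f0), err(-22)"
)
parsed = parse_cmd_error(SAMPLE)
```

For the sample line, `parsed["cmd"]` is `CREATE_FLOW_GROUP` and `parsed["syndrome"]` is `0x4065f0`. Lines that do not match the pattern return `None`.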
