DOCA Documentation v2.9.3 LTS

Bug Fixes in This Version

Ref #

Issue

4469496

Description: On some environments, when an application deletes all flow rules from a template table and then attempts to re-add flow rules to the same table, a "fail to create rte flow" error is raised.

Keyword: Flow rules

Detected in version: 2.9.2

4391384

Description: The UCX package is built without GDR copy support due to an unintended change in the build system that excluded CUDA. As a result, applications relying on GDR functionality in UCX are unable to use it.

Keyword: GDR support; CUDA missing; build environment

Detected in version: 2.9.2

4403063

Description: When the package mlnx-fw-updater is installed, it runs its firmware loading script. That script will automatically try to start the MFT kernel support as part of its hardware scan loop. This may cause an issue on some devices.

Keyword: MFT; firmware

Detected in version: 2.9.2

4259675

Description: In rare cases, systems using shared receive queues (shared_rxq) may experience incorrect packet handling during high-throughput traffic.

Keyword: Shared RXQ; packet corruption; routing error

Detected in version: 2.9.2

4410028

Description: On SLES 15 SP5 with kernel version 5.14.21-150500.55.68-default or later, installation of mlnx-ofa_kernel drivers fails to use weak-modules, causing the system to fall back to inbox OFED modules. This occurs because the kernel used to build the drivers (5.14.21-150500.53-default) did not include the mana_ib driver, while newer kernels do—triggering a weak-modules sanity check failure due to the missing replacement.

Keyword: Weak modules; Kernel version mismatch; inbox driver conflict

Detected in version: 2.9.0
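To check whether a SLES 15 SP5 host running one of the earlier releases falls in the kernel range described for 4410028, the kernel release strings can be compared numerically. The following is a minimal sketch, not the logic weak-modules itself uses. The function names `release_key` and `is_affected` are hypothetical, and the parsing assumes the SLES release format `<version>-<build>-<flavor>` shown in the description.

```python
# First affected kernel release, as quoted in the description of 4410028.
FIRST_AFFECTED = "5.14.21-150500.55.68-default"


def release_key(release: str) -> tuple[int, ...]:
    """Convert '5.14.21-150500.55.68-default' to (5, 14, 21, 150500, 55, 68).

    Assumes the SLES layout <version>-<build>-<flavor>; the flavor suffix
    (for example 'default') is ignored.
    """
    version, _, rest = release.partition("-")
    build = rest.split("-")[0] if rest else ""
    parts = version.split(".") + (build.split(".") if build else [])
    return tuple(int(part) for part in parts)


def is_affected(release: str) -> bool:
    """Return True if the release is at or after the first affected kernel."""
    return release_key(release) >= release_key(FIRST_AFFECTED)
```

On a live system the running release can be passed in with `is_affected(platform.release())` after `import platform`. Release strings that do not follow the assumed SLES layout raise `ValueError` in this sketch.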

Ref #

Issue Description

4403055

Description: Repeated power cycles cause corruption in the EXT4 file system.

Keywords: Power cycle; FS corruption

Reported in version:

Ref #

Issue Details

4398082

Description: On BlueField-3, repeatedly modifying the BMC user password can eventually fill the flash storage due to excessive log generation. This may cause the BMC logging service to enter an abnormal state.

Discovered in version: 24.10

4508657

Description: After upgrading the BMC to version 24.10 or later, SEL entries generated by earlier versions may no longer appear in ipmitool sel output.

Discovered in version: 24.10

4413916

Description: On certain BlueField-3 systems, newly created BMC users may fail to authenticate via ipmitool lanplus, returning an "unauthorized name" error.

Discovered in version: 24.10

4481392

Description: A BMC reboot in the middle of a configuration update can result in unexpected behavior.

Discovered in version: 24.10

4440343

Description: The BMC fails to boot due to the device running out of space, as the dump file completely filled the read-write flash.

Discovered in version: 24.10

4461233

Description: In BlueField-3 BMC, the USB interface between Arm and BMC could drop to full speed and fail to recover, causing sustained performance degradation.

Discovered in version: 24.10

4482812

Description: In BlueField-3 BMC software, the RTC battery sensor generates false SEL entries due to an incorrect or overly sensitive voltage threshold that misclassifies normal readings as out of range.

Discovered in version: 24.10-LTSU2

Internal Ref.

Issue

4444987

Description: Removed the incorrect INI configuration that skipped receiver detection from the relevant PRS.

Keywords: PRS

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4427796

Description: Enabled MCTP communication with the DPU BMC on SKUs: 900-9D3C6-00SV-DA0 and 900-9D3C6-B9SV-DA0.

Keywords: MCTP communication, DPU BMC

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4438736

Description: Fixed a race condition between firmware and hardware flows during QP closure and a potential endless loop.

Keywords: Race; endless loop

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4470567

Description: Modified the VQoS parameter configuration to improve latency for large messages.

Keywords: VQoS, latency improvement

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4443919

Description: Fixed a race condition between firmware and hardware flows during QP closure.

Keywords: Race condition

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4355566

Description: Fixed high latency observed in IB_READ_LATENCY when eswitch scheduling is enabled and a rate limit is set.

Keywords: Data latency

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4444874

Description: Fixed an issue where the firmware failed to de-assert the PERST signal of the DSP on pcore1. The fix involved correctly checking the output of the default GPIO mapping against 0xFFF (NO_GPIO_FUNCTION) instead of 0xFF (INVALID_READ).

Keywords: PERST signal

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4234972

Description: Fixed an issue where the isr_distributer, responsible for distributing tokens to SQs, was not being triggered reliably every 100 µs. Its priority has been elevated to HIGH, and it is now marked as 'busy' upon completion to ensure consistent and timely execution.

Keywords: VQoS

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4384412

Description: Fixed an issue where the firmware could send an incorrect object_id in the device emulation object change event, causing the virtio-net controller to fail in handling operations on the host's virtio device. This typically occurred after a software live upgrade when many events were triggered simultaneously—such as unbinding drivers on VFs in parallel—and could result in a host hang.

Keywords: Device emulation object change event

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4344710

Description: The MSB of pkg_id, which was enabled by default, has been removed from the strap. pkg_id now correctly supports values in the range 0 to 3.

Keywords: NC-SI package ID

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4330201

Description: Fixed an issue that prevented the OS from booting due to UEFI PCI enumeration.

Keywords: Booting

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4283167

Description: Fixed an issue in the VQoS algorithm related to learning when an element is active and when it begins sending traffic.

Keywords: VQoS algorithm

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4283168

Description: Resolved higher latency issue when enabling VF group rate limiter (ESW scheduling).

Keywords: Rate limiter

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4361277

Description: Fixed an issue in the ZTR_RTTCC algorithm when using SOURCE_QP (ROCE_CC_SHAPER_COALESCE in mlxconfig) in LAG mode, which caused low bandwidth in many-to-one traffic scenarios.

Keywords: LAG, PCC, ZTR_RTTCC

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4403151

Description: Fixed an issue that caused reduced bandwidth during the initial traffic phase when the lossy ADP retransmission feature was enabled alongside the DCQCN congestion control algorithm, due to a low ACK timeout making ADP retransmissions overly aggressive.

Keywords: Lossy ADP retransmission, Congestion Control

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4444306

Description: Fixed an issue where transitioning a QP attached to an XRQ to the error state using the 2ERR command could lead to request conflicts. The firmware now properly waits for all in-flight requests to complete before issuing a new event, ensuring the software can safely proceed with initializing a new QP.

Keywords: NVMe-oF Target Offload

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4336970

Description: Reduced the bandwidth fluctuation induced by VQoS rate limiting in systems with fewer than 350 QPs. This change is enabled by default.

Keywords: VQoS

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4336965

Description: Adjusted the RX lossless buffer default parameters to delay transmission of Pause/PFC frames when the NIC is congested. Rx lossless buffer parameters will now be enabled by default.

Keywords: RX lossless buffer size

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608

4361179

Description: Fixed an issue that caused bandwidth to drop when unbinding multiple VFs with VQoS enabled.

Keywords: VQoS

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.3608
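The PERST fix listed above for 4444874 turns on comparing a GPIO mapping value against the right sentinel. The sketch below illustrates only that point and does not reproduce the firmware's actual control flow. The two constants and their values come from the description; the function names and the idea of a boolean "unmapped" check are assumptions made for illustration.

```python
# Sentinel values as named in the description of 4444874.
NO_GPIO_FUNCTION = 0xFFF  # default mapping reports that no GPIO is assigned
INVALID_READ = 0xFF       # value that indicates a failed read


def is_unmapped_before_fix(mapping_value: int) -> bool:
    """Hypothetical pre-fix check: compares against the wrong sentinel."""
    return mapping_value == INVALID_READ


def is_unmapped_after_fix(mapping_value: int) -> bool:
    """Hypothetical post-fix check: compares against NO_GPIO_FUNCTION."""
    return mapping_value == NO_GPIO_FUNCTION


# A default mapping that returns NO_GPIO_FUNCTION is only recognized
# by the corrected comparison.
value = NO_GPIO_FUNCTION
print(is_unmapped_before_fix(value), is_unmapped_after_fix(value))
```

Because 0xFFF and 0xFF are distinct values, a check against `INVALID_READ` never matches a mapping that reports `NO_GPIO_FUNCTION`, which is consistent with the misclassification the description attributes to the old comparison.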

Internal Ref.

Issue

4342749

Description: Fixed an issue where, if the summary queue size on initiators exceeds the SRQ size on the NVMe-oF target, RNR NACKs are triggered. The Congestion Control (CC) mechanism significantly reduces the rate in response to the presence of RNR, leading to a substantial drop in bandwidth during NVMe WRITE operations and mixed tests.

Keywords: NVMe-oF target, RNR NACKs, Congestion Control (CC)

Discovered in Version: 24.43.2566

Fixed in Release: 24.43.3608

4358188

Description: Fixed an issue where enabling DIM could lead to high IRQ/s in certain scenarios.

Keywords: vDPA, DIM

Discovered in Version: 24.43.2566

Fixed in Release: 24.43.3608

4355566

Description: Fixed high latency observed in IB_READ_LATENCY when eswitch scheduling is enabled and a rate limit is set.

Keywords: Data latency

Discovered in Version: 24.43.2566

Fixed in Release: 24.43.3608

© Copyright 2025, NVIDIA. Last updated on Jul 3, 2025.