DOCA Documentation v2.9.4 (2024 LTS U4)

Bug Fixes in This Version

Ref #

Issue

4469496

Description: In some environments, when an application deletes all flow rules from a template table and then attempts to re-add flow rules to the same table, a "fail to create rte flow" error is raised.

Keyword: Flow rules

Detected in version: 2.9.2

4391384

Description: The UCX package is built without GDR copy support due to an unintended change in the build system that excluded CUDA. As a result, applications relying on GDR functionality in UCX are unable to use it.

Keyword: GDR support; CUDA missing; build environment

Detected in version: 2.9.2

4403063

Description: When the package mlnx-fw-updater is installed, it runs its firmware loading script. That script will automatically try to start the MFT kernel support as part of its hardware scan loop. This may cause an issue on some devices.

Keyword: MFT; firmware

Detected in version: 2.9.2

4259675

Description: In rare cases, systems using shared receive queues (shared_rxq) may experience incorrect packet handling during high-throughput traffic.

Keyword: Shared RXQ; packet corruption; routing error

Detected in version: 2.9.2

4410028

Description: On SLES 15 SP5 with kernel version 5.14.21-150500.55.68-default or later, installation of the mlnx-ofa_kernel drivers fails to use weak-modules, causing the system to fall back to the inbox OFED modules. This occurs because the kernel used to build the drivers (5.14.21-150500.53-default) did not include the mana_ib driver, while newer kernels do. The missing replacement module triggers a weak-modules sanity check failure.

Keyword: Weak modules; Kernel version mismatch; inbox driver conflict

Detected in version: 2.9.0

Ref #

Issue Description

4728634 / 4648210

Description: Fixed an issue preventing DNF/YUM upgrades of rebuilt mlnx-ofa_kernel-devel packages (via doca-kernel-support) due to version sorting errors. The rebuilt package now appends ".1" and the kernel version to ensure proper RPM comparison and smooth upgrades.

Keywords: mlnx-ofa_kernel-devel package

Reported in version: 2.9.3
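
The upgrade problem above comes down to how RPM orders version and release strings segment by segment. The following is a minimal sketch of that style of comparison, not the actual librpm implementation: it ignores the special `~` and `^` characters, the function name `rpm_style_compare` is made up for illustration, and the release strings in the usage lines are hypothetical rather than taken from real mlnx-ofa_kernel-devel packages.

```python
import re

# Split a version/release string into runs of digits or runs of letters;
# separators such as '.', '_' and '-' only delimit segments.
_SEGMENT = re.compile(r"[0-9]+|[A-Za-z]+")


def rpm_style_compare(a: str, b: str) -> int:
    """Return 1 if a sorts newer than b, -1 if older, 0 if equal.

    Simplified sketch of RPM-style ordering (no '~' or '^' handling).
    """
    segs_a = _SEGMENT.findall(a)
    segs_b = _SEGMENT.findall(b)
    for x, y in zip(segs_a, segs_b):
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            # Numeric segments compare as integers, so 10 > 9.
            xi, yi = int(x), int(y)
            if xi != yi:
                return 1 if xi > yi else -1
        elif x_num != y_num:
            # A numeric segment sorts newer than an alphabetic one.
            return 1 if x_num else -1
        elif x != y:
            # Alphabetic segments compare as plain strings.
            return 1 if x > y else -1
    # All shared segments are equal: the string with segments left over is newer.
    if len(segs_a) == len(segs_b):
        return 0
    return 1 if len(segs_a) > len(segs_b) else -1


# Hypothetical release strings: a base build and a rebuilt one with ".1"
# plus a kernel version appended.
base = "OFED.24.10.1"
rebuilt = "OFED.24.10.1.1.kver.5.14.0_427"
print(rpm_style_compare(rebuilt, base))
```

Under these assumptions the rebuilt string has extra trailing segments and therefore sorts newer than the base string, which is the property an upgrade path relies on.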

Ref #

Issue Description

4403055

Description: Repeated power cycles cause corruption in the EXT4 file system.

Keywords: Power cycle; FS corruption

Reported in version:

Ref #

Issue Details

4658620

Description: In NIC mode, after Arm reboot, the BMC firmware and CEC versions are intermittently missing in the host query output.

Discovered in version: 24.10-LTSU3

4720835

Description: When creating a VLAN using ipmitool, the interface incorrectly appears as a second entry in the DedicatedNetworkPorts collection. This results in redundant port entries showing identical LLDP data.

Discovered in version: 24.10-LTSU3

4718432

Description: The BMC pending version may be displayed in a non-standard format (e.g., the version BF-24.10-36 may appear as 02410036).

Discovered in version: 24.10-LTSU3

4720826

Description: After a successful BFB update via SCP, subsequent updates may show incorrect progress messages (e.g., “Transfer progress: 100%”) that do not reflect the actual update status.

Discovered in version: 24.10-LTSU3

4730931

Description: The BMC Redfish API may intermittently return a 404 "Resource Not Found" error when querying the oob0 Ethernet interface, even though oob0 is correctly listed in the EthernetInterfaceCollection.

Discovered in version: 24.10-LTSU3

4658609

Description: When the BMC finishes booting while the DPU OS is still down, Port/EthernetInterface objects are not visible from Redfish.

Discovered in version: 24.10-LTSU3

Internal Ref.

Issue

44527750 / 4400949

Description: Fixed an issue where the MCTP context was incorrectly cleared when a new message arrived over a different transport.

Keywords: MCTP

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4774378 / 3921355

Description: Fixed a deadlock seen during stress traffic with small packets and modify actions, which could hang the SoC (e.g., testpmd with three SFs and hardware offload disabled).

Keywords: Firmware deadlock, small packets

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4507527 / 4518004 / 4633348 / 4662213

Description: Fixed an issue where, after installing the fw-bundle, the DOCA installer could not query BMC/CEC versions, so doca-installer --show-running-fw failed to display that version information.

Keywords: fw-bundle installation, BMC/CEC

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4507527

Description: Fixed an issue where multiple "NCSI Error" messages could appear in the DPU BMC journalctl logs during boot. These messages have no functional impact.

Keywords: BMC

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4566988 / 4570720

Description: Fixed an issue where the mlxconfig reset command resets the user's TLVs by swapping between NV_DATA0 and NV_DATA1 flash partitions, causing unnecessary flash wear. The reset now invalidates the TLVs directly, using the same approach as when resetting a single TLV.

Keywords: mlxconfig reset

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4719936 / 4731957 / 457884

Description: Fixed an issue where NV Config operations could hang or produce untracked register errors, and where flash operations longer than 2 seconds could trigger an unnecessary assert. Debug information has been added to help detect stuck NV Config operations and register access errors.

Keywords: Stability and diagnostic

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4370695 / 465353 / 4739817

Description: Fixed a corner case during warm reboot that could cause some TX lane elastic buffers to lose synchronization during link training.

Keywords: TX lane elastic buffers

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4507579 / 4498670

Description: Fixed an issue where destroying two emulation objects with the same VHCA ID could result in one destroy command returning syndrome 0xF3F880 due to a race condition.

Keywords: VHCA ID, syndrome 0xF3F880

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4640489

Description: Fixed an issue where the TX precoding state was changed to OFF during link cycle tests. TX precoding for LPO is now forced to remain ON.

Keywords: TX precoding state

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4641440 / 4636590

Description: Fixed a bug that could trigger assert 0x8cad when ROCE_CC_STEERING_EXT is enabled and roce_cc_shaper_coalescing is set to DEST_IP or 5_TUPLE.

Keywords: assert 0x8cad

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

4711949 / 4773333 / 4773334

Description: Fixed an issue where, in older versions and under specific configurations, the MEMIC BAR size could be unaligned to 64 bytes. The software now treats the address as aligned down to a 64-byte boundary.

Keywords: MEMIC BAR size

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.4100
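
Aligning a value down to a 64-byte boundary, as described for the MEMIC BAR above, is a standard bit-masking operation. The sketch below only illustrates the arithmetic; the function name `align_down` and the sample address are assumptions for illustration and do not come from the driver or firmware sources.

```python
def align_down(value: int, alignment: int = 64) -> int:
    """Round a non-negative value down to a multiple of a power-of-two alignment."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Clearing the low log2(alignment) bits rounds down to the boundary.
    return value & ~(alignment - 1)


# Illustrative address only: 0x1234 is not 64-byte aligned.
print(hex(align_down(0x1234)))  # 0x1200
```

Values that are already aligned are returned unchanged, so applying the mask unconditionally is safe.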

4508525 / 4480427

Description: Fixed incorrect calculation of start address and mode for the CQE buffer in DPA CQ, which could cause CQEs to be written to the wrong address when the buffer is not 4K-aligned and spans a second page boundary.

Keywords: CQ, CQE Buffer, DPA

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.4100
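
To make the failing condition above concrete, the sketch below counts how many 4K pages a buffer touches given its start address and length. It is only an illustration of the address arithmetic, not the firmware's calculation; the function name `pages_spanned` and the sample addresses are assumptions.

```python
PAGE_SIZE = 4096  # 4K page, as referenced in the issue description


def pages_spanned(start: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Return the number of pages touched by the byte range [start, start + length)."""
    if start < 0 or length <= 0:
        raise ValueError("start must be non-negative and length positive")
    first_page = start // page_size
    last_page = (start + length - 1) // page_size
    return last_page - first_page + 1


# A 4K-aligned, 4K-long buffer stays within one page.
print(pages_spanned(0x1000, 0x1000))  # 1
# An unaligned buffer of slightly more than 4K crosses two page boundaries.
print(pages_spanned(0x0F80, 0x1100))  # 3
```

A buffer crosses its second page boundary exactly when this count reaches three, which is the case the description singles out.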

4261968

Description: Performing a FW reactivation when there is no pending image on the flash will cause the PLDM upgrade to fail on the first attempt.

Keywords: PLDM

Discovered in Version: 32.43.2566

Fixed in Release: 32.43.4100

4507527 / 4508401

Description: During DPU BMC boot, users may observe multiple "NCSI Error" messages in the DPU BMC journalctl logs. These messages have no functional impact.

Keywords: BMC

Discovered in Version: 32.43.3608

Fixed in Release: 32.43.4100

© Copyright 2026, NVIDIA. Last updated on Jan 15, 2026