DOCA Documentation v3.0.0

Bug Fixes in This Version

Ref #

Issue

4186679

Description: Fixed an issue that could cause OVS to crash when sFlow was enabled with OVN.

Keyword: OVS

Detected in version: 3.0.0

4242133

Description: Fixed an issue where setting hw-offload to false while ports were configured could trigger errors in the OVS logs.

Keyword: OVS-DOCA

Detected in version: 3.0.0

4313727

Description: Disabled the "mark" action in switch mode; it remains supported in VNF mode.

Keyword: DPDK

Detected in version: 3.0.0

4264397

Description: Fixed an issue where OVS did not punt IPv6 Neighbor Advertisements with unicast MACs to the CPU, preventing MAC learning for completely silent IPv6 endpoints. This caused traffic to be software forwarded until the endpoint initiated communication.

Keyword: OVS

Detected in version: 3.0.0

4287011

Description: Disabling OVS CT (using ovs-vsctl set o . other_config:hw-offload-ct-size=0) and attempting to offload CT rules is not supported and could lead to OVS crashes.

Keyword: OVS

Detected in version: 2.10.0

4304103

Description: Fixed an issue where, in flows rewriting both inner and outer destination (dst) and/or source (src) MAC addresses to the same value, the outer MAC rewrite was skipped, resulting in an outer MAC address of 00:00:00:00:00:00.

Keyword: MAC addresses

Detected in version: 3.0.0

4340654

Description: Fixed an issue that prevented LLDP traffic from VFs or BF host PFs from passing through the representor kernel interfaces.

Keyword: LLDP

Detected in version: 3.0.0

Ref #

Issue

4385184

Description: Fixed an issue where buffer initialization became a performance bottleneck during the allocation of large buffers, typically when using a high number of QPs with large message sizes. The root cause was the inefficient use of rand(). This has been resolved by replacing it with a faster pseudo-random algorithm.

Keyword: Buffer initialization, performance

Detected in version: 3.0.0
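
For issue 4385184, the note does not name the pseudo-random algorithm that replaced rand(). Purely as an illustration, the sketch below fills a buffer with a xorshift32 generator, a common lightweight choice that produces four bytes per step. The function names and the default seed are assumptions and do not describe the actual implementation.

```python
def xorshift32(state: int) -> int:
    """Advance a 32-bit xorshift generator by one step (state must be non-zero)."""
    state ^= (state << 13) & 0xFFFFFFFF
    state ^= state >> 17
    state ^= (state << 5) & 0xFFFFFFFF
    return state & 0xFFFFFFFF


def fill_buffer(size: int, seed: int = 0x12345678) -> bytearray:
    """Fill `size` bytes with pseudo-random data, four bytes per generator step.

    Hypothetical helper: it stands in for whatever routine initializes the
    message buffers; the seed value is arbitrary.
    """
    if seed == 0:
        raise ValueError("xorshift32 needs a non-zero seed")
    buf = bytearray(size)
    state = seed
    for offset in range(0, size, 4):
        state = xorshift32(state)
        chunk = state.to_bytes(4, "little")
        # The last chunk is truncated when size is not a multiple of 4.
        buf[offset:offset + 4] = chunk[: size - offset]
    return buf
```

Drawing several bytes from each generator step, rather than one library call per byte, is the kind of change that removes the per-byte overhead when buffers are large.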

4390560

Description: Fixed a potential deadlock that could occur during the handling of peer memory registration failures.

Keyword: Deadlock, peer memory registration

Detected in version: 3.0.0

4405229

Description: Increased the size of the slow FDB table to prevent hitting the following error when switching to SwitchDev mode.

mlx5_core 0000:03:00.0: mlx5_cmd_out_err:835:(pid 24362): CREATE_FLOW_GROUP(0x933) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x4065f0), err(-22)

mlx5_core 0000:03:00.0: E-Switch: Failed to create peer miss flow group err(-22)

Keyword: Slow FDB table

Detected in version: 3.0.0
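
To check whether a system hit the error quoted for issue 4405229, the first log line can be picked apart with a regular expression. The sketch below is an assumption-laden helper, not an NVIDIA tool: the pattern is derived only from the single sample line above, and the function name `parse_cmd_err` is hypothetical.

```python
import re
from typing import Dict, Optional

# Pattern built from the one sample line in this note; other kernels may
# format mlx5_cmd_out_err messages differently.
CMD_ERR = re.compile(
    r"mlx5_core (?P<pci>[0-9a-f:.]+): mlx5_cmd_out_err:\d+:\(pid (?P<pid>\d+)\): "
    r"(?P<cmd>\w+)\((?P<opcode>0x[0-9a-f]+)\) op_mod\((?P<op_mod>0x[0-9a-f]+)\) failed, "
    r"status (?P<status>.+?)\((?P<status_code>0x[0-9a-f]+)\), "
    r"syndrome \((?P<syndrome>0x[0-9a-f]+)\), err\((?P<err>-?\d+)\)"
)


def parse_cmd_err(line: str) -> Optional[Dict[str, object]]:
    """Return the fields of an mlx5 command error line, or None if no match."""
    match = CMD_ERR.search(line)
    if match is None:
        return None
    fields: Dict[str, object] = dict(match.groupdict())
    fields["pid"] = int(match["pid"])
    fields["err"] = int(match["err"])
    return fields
```

A line that reports `CREATE_FLOW_GROUP` with syndrome `0x4065f0` and `err` -22 matches the symptom described in this entry.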

4296889

Description: Fixed a sysfs issue that occurred when accessing hardware counters from within a namespace.

Keyword: sysfs

Detected in version: 3.0.0

4320810

Description: Fixed an issue where ibstat would fail and crash when encountering a non-RoCE/IB device, preventing it from displaying information for the remaining valid RoCE/IB devices.

Keyword: ibstat

Detected in version: 3.0.0

4358857

Description: Increased the poll batch size as the number of QPs grows. This prevents bandwidth degradation with a high number of QPs, where polling only 16 CQEs per iteration may not be enough to process all completions in time.

Keyword: ib_write_bw performance

Detected in version: 3.0.0

3868222

Description: Fixed a race condition between firmware syndrome report and driver initialization during boot.

Keyword: Race condition, firmware syndrome report, driver initialization

Detected in version: 3.0.0

4125295

Description: Fixed an issue where the driver failed to load when a firmware syndrome was detected during boot.

Keyword: Driver load

Detected in version: 3.0.0

4369312

Description: Fixed an issue where the mlnx_tune -l command did not list several operating systems that were in fact supported.

Keyword: mlnx_tune, OSes

Detected in version: 3.0.0

4172481

Description: The kernel does not define TCA_TUNNEL_KEY_ENC_SRC_PORT. To align offload behavior with non-offload, the OVS community introduced a commit [1] that causes offload to fail if this tunnel attribute is used. Now, any rule with a tunnel set action that includes a tunnel source port can no longer be offloaded.

[1] netdev-offload-tc: Fix offload of tunnel key tp_src.

Keyword: Tunnel source port offload

Detected in version: 3.0.0

4375188

Description: Upstream kernel 6.11 introduced support for encapsulation control flags, which was also added in OVS 3.5.0. However, current hardware does not support matching on these flags, such as "don't fragment" and "checksum." Since these flags can be safely ignored, we reverted upstream commit [1] as a workaround to restore tunnel offload functionality.

[1] net/mlx5e: flower: validate encapsulation control flags

Keyword: Encapsulation control flags, tunnel offload

Detected in version: 3.0.0

4340654

Description: Fixed an issue where LLDP traffic from VFs or BF host PFs was not reaching the representor kernel interfaces.

Keyword: LLDP packets

Detected in version: 2.8.0, 2.9.0

4304103

Description: Fixed an issue where, in flows that rewrote both the inner and outer destination (dst) and/or source (src) MAC addresses to the same value, the outer MAC address rewrite was ignored, leading to an outer MAC address of 00:00:00:00:00:00.

Keyword: MAC addresses

Detected in version: 2.4.0

4264397

Description: OVS does not forward IPv6 Neighbor Advertisements with unicast destination MAC addresses to the CPU. This means the endpoint MAC address may not be learned on the VTEP if the endpoint is silent, causing traffic to be software forwarded. After the endpoint initiates traffic, it will be hardware forwarded. The issue persists only if the endpoint never initiates any traffic, only responding to IPv6 Neighbor Solicitations (rare). This issue has been fixed.

Keyword: Neighbor Advertisements, Neighbor Solicitations, OVS

Detected in version: 2.10.0

4186679

Description: Fixed an issue where enabling sFlow with OVN caused OVS to crash.

Keyword: sFlow

Detected in version: 2.9.2

4150662

Description: Fixed an issue where OVS crashed unexpectedly after DPUs repeatedly broadcast the error message “packet with own source address.”

Keyword: OVS, DPUs

Detected in version: 2.7

4242133

Description: Fixed an issue where changing the hw-offload setting from true to false while ports were configured could lead to errors being reported in the OVS log.

Keyword: OVS

Detected in version: 2.10

Ref #

Issue Description

4284756

Description: Boot option re-enabled by UEFI after power reset of DPU.

Keywords: Boot; UEFI; power reset

Detected in version: 4.10.0

4211513

Description: BlueField-3 SPDM Index 1 version is incorrect.

Keywords: BlueField-3; SPDM

Detected in version: 4.10.0

4196880

Description: DHCP issues may lead to an incomplete resolv.conf on the HBN container. The consequences can be DNS resolution failures and/or the hostname being set to 'localhost'.

Keyword: DHCP; resolv.conf; hostname; localhost; DNS

Detected in version: 4.10.0

4389380

Description: EXT4 file system corruption occurs following a power cycle.

Keywords: Power cycle; EXT4

Detected in version: 4.10.0

4384302

Description: Fixed a display issue where the log appeared to show only one partition being updated although both were updated.

Keywords: Partition; update

Detected in version: 4.10.0

4390904

Description: rcu-sched issue during NIC mode installation.

Keywords: NIC mode; installation

Detected in version: 4.10.0

4370524

Description: When run as a non-root user, bfver and bfvcheck would return extraneous trace messages such as "strings: /sys/firmware/acpi/tables/SSDT1: Permission denied".

Keyword: bfver; bfvcheck

Detected in version: 4.10.0

4353110

Description: bfb-install of bf-fwbundle-2.9.2.xxx via local rshim fails. To perform a full bfb upgrade over the last GA release NIC FW:

- Install the latest bfb-install, which is needed to inform Arm that it is in NIC mode

- Push the new release bfb

Keyword: bfb-install; rshim

Detected in version: 4.10.0

No bug fixes in this release.

Internal Ref.

Issue

4241238

Description: Fixed TX timeout issue related to the esw_scheduling QoS feature.

Keywords: esw_scheduling QoS

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016

4392587

Description: Adjusted the temperature sensors array size to match the number of sensors defined in the INI file.

Keywords: Temperature sensors

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016

4318537

Description: Fixed an issue where the AI and HAI parameters of the ZTR_RTTCC algorithm, when configured by users, were automatically overwritten upon link speed changes. With this fix, if AI/HAI values were tuned for link speeds other than 100Gb/s, users should now divide those values by (link_speed / 100) to maintain consistent congestion control algorithm behavior.

Keywords: Congestion control, ZTR_RTTCC

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016
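
The adjustment described for issue 4318537 is a single division. The sketch below works through it; the function name is hypothetical, link speed is assumed to be expressed in Gb/s, and any rounding to the format the firmware expects is left to the user.

```python
def rescale_cc_param(tuned_value: float, link_speed_gbps: float) -> float:
    """Divide an AI or HAI value tuned at `link_speed_gbps` by (link_speed / 100).

    Hypothetical helper illustrating the arithmetic in the note; it does not
    read or write any device configuration.
    """
    if link_speed_gbps <= 0:
        raise ValueError("link speed must be positive")
    return tuned_value / (link_speed_gbps / 100.0)


# Illustrative numbers only: a value of 40 tuned on a 400 Gb/s link.
adjusted = rescale_cc_param(40, 400)
```

With these illustrative numbers the divisor is 400 / 100 = 4, so the adjusted value is 10. A value tuned at 100 Gb/s is unchanged because the divisor is 1.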

4368450

Description: Fixed an issue where PCC_CNP_COUNT could not be reset using the pcc_counter.sh script in the DOCA tools.

Keywords: PCC

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016

4360191

Description: Fixed an issue where the CMDIF MNVDA could cause the NV Config mechanism to become stuck when the BMC enables Self Recovery mode.

Keywords: NV Config

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016

4346657

Description: Fixed a firmware issue to ensure that typical PPCC access register failures in DOCA PCC are no longer silently ignored. Users will now receive a syndrome notification when executing the command.

Keywords: DOCA PCC

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016

4257863

Description: Fixed an issue that could cause the DESTROY_MKEY command to take an excessively long time to execute, with the host driver displaying a "No done completion" message for this command.

Keywords: MKey

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016

4366438

Description: Fixed TX timeout issue when eSwitch scheduling is enabled and a rate limit is applied.

Keywords: TX timeout

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016

4370796

Description: Fixed an issue where the firmware could send an incorrect object_id in the device emulation object change event, causing the virtio-net controller to fail to respond to operations on the virtio device on the host. This issue commonly occurred after a software live upgrade when numerous events needed to be reported simultaneously (e.g., unbinding drivers on VFs in parallel).

Keywords: virtio-net controller

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016

4345431

Description: Fixed high latency observed in IB_READ_LATENCY when eSwitch scheduling is enabled and a rate limit is set.

Keywords: Data latency

Discovered in Version: 32.43.1014

Fixed in Release: 32.45.1016

3878086

Description: Congestion Control counters such as ECN and CNP will now be the sum of both ports when in LAG mode.

Keywords: Congestion Control counters

Discovered in Version: 32.42.1000

Fixed in Release: 32.45.1016
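
For issue 3878086, the reported value in LAG mode is the per-counter sum across the two ports. A minimal sketch of that aggregation follows; the counter names and the function name are placeholders, since the note does not list the exact counters.

```python
from typing import Dict


def lag_cc_counters(port1: Dict[str, int], port2: Dict[str, int]) -> Dict[str, int]:
    """Sum congestion control counters from two ports, counter by counter.

    A counter missing on one port is treated as 0 for that port.
    """
    names = sorted(set(port1) | set(port2))
    return {name: port1.get(name, 0) + port2.get(name, 0) for name in names}
```

When comparing readings taken before and after the firmware update on a LAG setup, expect the newer values to reflect both ports rather than one.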

4199274

Description: Fixed an issue where RTT packets with any destination MAC address were incorrectly treated as having a valid destination MAC. The new firmware now discards RTT packets if their destination MAC does not match the port's MAC.

Keywords: RTT, destination MAC

Discovered in Version: 32.44.1036

Fixed in Release: 32.45.1016
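
The behavior change in issue 4199274 amounts to an equality check on the destination MAC. The sketch below models that check in software for illustration only; the real filtering happens in firmware, and both function names are hypothetical.

```python
def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' separators so equal addresses compare equal."""
    return mac.strip().replace("-", ":").lower()


def should_accept_rtt(dst_mac: str, port_mac: str) -> bool:
    """Accept an RTT packet only when its destination MAC matches the port's MAC."""
    return normalize_mac(dst_mac) == normalize_mac(port_mac)
```

Before the fix, the equivalent of this check accepted every destination MAC; with the new firmware, a packet for which it fails is discarded.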

Internal Ref.

Issue

2899026 / 2853408

Description: Some pre-OS environments may fail when sensing a hot plug operation during their boot stage.

Keywords: BIOS; Hot plug; Virtio-net

Discovered in Version: 24.33.1048

Fixed in Release: 24.45.1016

3296463

Description: fwreset is currently supported on PCI Gen 4 devices only.

Keywords: fwreset, PCI Gen4

Discovered in Version: 24.37.1300

Fixed in Release: 24.45.1016

3457472

Description: Disabling the Relaxed Ordered (RO) capability (relaxed_ordering_read_pci_enabled=0) using the vhca_resource_manager is currently not functional.

Keywords: Relaxed Ordered

Discovered in Version: 24.37.1300

Fixed in Release: 24.45.1016

2169950

Description: When decapsulation on a packet occurs, the FCS indication is not calculated correctly.

Keywords: FCS

Discovered in Version: 24.42.1000

Fixed in Release: 24.45.1016

3638554

Description: Fixed an issue where, if the summary queue size on initiators exceeds the SRQ size on the NVMe-oF target, RNR NACKs are triggered. The Congestion Control (CC) mechanism significantly reduces the rate in response to the presence of RNR, leading to a substantial drop in bandwidth during NVMe WRITE operations and mixed tests.

Keywords: NVMe-oF target, RNR NACKs, Congestion Control (CC)

Discovered in Version: 24.43.2402

Fixed in Release: 24.45.1016

4262272

Description: Fixed an issue where the query_hca_cap timing could be increased on certain BlueField-2 systems.

Keywords: query_hca_cap timing

Discovered in Version: 24.43.2402

Fixed in Release: 24.45.1016

© Copyright 2025, NVIDIA. Last updated on May 5, 2025.