Date: 2025-11-24
4415267
Description: When DPA code run by Flex IO RPC or CMD-Q encounters a recoverable error, execution does not resume from the calling context, which results in a timeout.
Workaround: N/A
Keyword: DPA; Timeout; Flex IO SDK; RPC
Reported in version: 3.0.0
4452977
Description: The IPSec hardware offload Anti-Replay feature has a known issue and should not be used.
Workaround: N/A
Keyword: IPSec; Anti-Replay
Reported in version: 3.0.0
4426511
Description: Orchestrated reset mode (MLXConfig) will be released as a Beta feature. There's a known race condition between server reboot and the reset flow running in parallel, which can cause the reset to go out of sync.
Workaround: Power cycle the system to recover.
Keyword: Orchestrated reset mode
Reported in version: 3.0.0
2657392
Description: OFED installation caused CIFS to break in RHEL 8.4 and above. A dummy module was added so that CIFS will be disabled after OFED installation in RHEL 8.4 and above.
Workaround: N/A
Keyword: Installation; CIFS
Reported in version: 3.0.0
4297489
Description: Due to incompatibility between DPA and host libraries, a DPA device application must be recompiled after updating DOCA to a newer version.
Workaround: N/A
Keyword: DPA; host library; update
Reported in version: 2.10.0
4270602
Description: UEFI/ATF firmware does not upgrade as part of the Linux Standard Tool process when Secure Boot is disabled.
Workaround: Remove the PK key and initiate the UEFI/ATF firmware upgrade again.
To remove the PK key, use the UEFI menu to navigate to Device Manager → Secure Boot Configuration → Custom Secure Boot Options → PK Options → Delete Signature.
Keyword: UEFI/ATF; PK; Secure Boot; EFI Capsule Authentication
Reported in version: 2.10.0
4200690
Description: The fTPM trusted application is signed for testing purposes only (i.e., not securely) with a development key.
Workaround: N/A
Keyword: fTPM over OP-TEE
Reported in version: 2.10.0
3987526
Description: OVS-DOCA offload of meter with sFlow is not supported and may cause the OVS application to crash.
Workaround: N/A
Keyword: OVS-DOCA; meter; sFlow
Reported in version: 2.9.0
N/A
Description: Applications using DPA might not work with older firmware versions.
Workaround: Perform a full upgrade of all DOCA 2.9.0 components, including the firmware (i.e., doca-host and BF-Bundle).
Keyword: DPA; backward compatibility
Reported in version: 2.9.0
N/A
Description: Applications using FlexIO SDK API may have missing symbols during runtime.
Workaround: Re-compile FlexIO-based applications with the DOCA 2.9.0 release.
Keyword: FlexIO; backward compatibility
Reported in version: 2.9.0
4095728
Description: Corrupt create repo causes doca-kernel repo to not contain the repo data.
Workaround: If repo data is missing after installing the doca-kernel repo, run
Keyword: Kernel; repo
Reported in version: 2.9.0
4049034
Description: On openEuler 22.03 SP3 and openEuler 20.03 SP1, it is not possible to do yum update after BFB installation.
Workaround: To perform yum update with either openEuler 22.03 SP3 or openEuler 20.03 SP1, follow these procedures depending on the use case:
Keyword: openEuler
Reported in version: 2.9.0
4046180
Description: PCIe data IDs that require
Workaround: N/A
Keyword: DOCA Telemetry
Reported in version: 2.9.0
4129715
Description: Compiling Rocky 9.2 may fail when using GCC with the "native" arch flag.
Workaround: Upgrade to toolset 13 (gcc 13).
Keyword: Linux; GCC
Reported in version: 2.9.0
4035553
Description:
Workaround: N/A
Keyword: Core
Reported in version: 2.8.0
4023257
Description: If RDMA samples are compiled with memory sanitizer enabled, "read memory leak" errors are printed when running the samples with the RDMA CM flag and when running the client before the server.
Workaround: Make sure to start the RDMA server before the RDMA client.
Keyword: DOCA RDMA; samples
Reported in version: 2.8.0
4021752
4021748
Description: In all RDMA samples, if an error occurs in any of the following functions:
An error is printed but the sample resumes and might:
Workaround for 1: Either:
Workaround for 2: Ignore the mentioned address sanitizer violation when an error occurs in a relevant function.
Keyword: DOCA RDMA; samples
Reported in version: 2.8.0
4022563
Description: OVS-DOCA connection tracking with E2E enabled is not supported.
Workaround: N/A
Keyword: OVS-DPDK; connection tracking; E2E
Reported in version: 2.8.0
3837255
Description: When running Arm shutdown from the host OS it is expected to get the message
Workaround: Wait 2 more minutes before rebooting the host. Before proceeding with host OS reboot, it is recommended to query the operational state of the BlueField Arm cores from the BlueField BMC to verify that shutdown state has been reached. Run the following command:
Expected output is
Keyword: Host OS; reboot; error
Reported in version: 2.7.0
3844705
Description: In openEuler 20.03, Linux kernel version 4.19.90 is affected by an issue in the discard/trim functionality for the BlueField eMMC device, which may degrade the performance of the BlueField eMMC over time.
Workaround: Upgrade to Linux Kernel version 5.10 or later.
Keyword: eMMC discard; trim functionality
Reported in version: 2.7.0
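The workaround above depends on knowing whether the running kernel is older than 5.10. The following is a minimal sketch of such a check, assuming the kernel release string begins with `major.minor` (as in `4.19.90`). The helper names are illustrative and are not part of DOCA or the BlueField software.

```python
import re

MIN_KERNEL = (5, 10)  # minimum kernel version named in the workaround


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '4.19.90-2003.aarch64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_needs_upgrade(release: str) -> bool:
    """Return True when the release is older than the 5.10 minimum."""
    return parse_kernel_release(release) < MIN_KERNEL
```

On a live system, the release string can be obtained with `platform.release()` from the Python standard library and passed to `kernel_needs_upgrade`.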
3877725
Description: During BFB installation in NIC mode on BlueField-3, excessive information is written to the RShim log and fills it, so the Linux installation progress log does not appear in the RShim log.
Workaround: Monitor the BlueField-3 Arm's UART console to check whether BFB installation has completed or not for NIC mode.
Keyword: NIC mode; BFB install
Reported in version: 2.7.0
3855702
Description: Trying to jump from a steering level in the hardware to a lower level using software steering is not supported on
Workaround: N/A
Keyword: RDMA; SWS
Reported in version: 2.7.0
3855485
Description:
When enabling the
Workaround: N/A
Keyword: NVconfig; RShim; dmsg
Reported in version: 2.7.0
3831230
Description: In openEuler 20.03, Linux kernel version 4.19.90 is affected by an issue in the discard/trim functionality for the BlueField eMMC device, which may degrade the performance of the BlueField eMMC over time.
Workaround: Upgrade to Linux Kernel version 5.10 or later.
Keyword: eMMC discard; trim functionality
Reported in version: 2.7.0
3743879
Description:
Workaround:
Set
If host Linux kernel lockdown is enabled, then manually unbind the RShim driver before
Keyword: Timeout; mlxfwreset; INTx
Reported in version: 2.7.0
3678069
Description: If using BlueField with NVMe and mmcblk and configured to boot from mmcblk, users must create
Workaround: N/A
Keyword: NVMe
Reported in version: 2.5.0
3680538
Description: When using strongSwan or OVS-IPsec as explained in the NVIDIA BlueField DPU BSP, the IPsec Rx data path is not offloaded to hardware and instead runs in software on the Arm cores. As a result, bandwidth is substantially reduced.
Workaround: N/A
Keyword: IPsec
Reported in version: 2.5.0
N/A
Description: Execution unit partitions are not yet implemented and will be added in a future release.
Workaround: N/A
Keyword: EU tool
Reported in version: 2.5.0
3666160
Description: Installing BFB using
Workaround: Change
Keyword: SF;
Reported in version: 2.2.1
3594836
Description: When enabling Flex IO SDK tracer at high rates, a slow-down in processing may occur and/or some traces may be lost.
Workaround: Keep tracing limited to ~1M traces per second to avoid a significant processing slow-down. Use tracer for debug purposes and consider disabling it by default.
Keyword: Tracer FlexIO
Reported in version: 2.2.1
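The ~1M traces per second guideline above can be illustrated with a simple fixed-window limiter. This is a generic sketch only: the actual tracer is driven from DPA device code through the Flex IO SDK, and the class below is a hypothetical illustration of capping an event rate, not an SDK API.

```python
import time


class TraceRateLimiter:
    """Allows at most max_per_second events in each one-second window."""

    def __init__(self, max_per_second=1_000_000, clock=time.monotonic):
        self.max_per_second = max_per_second
        self._clock = clock  # injectable so the limiter can be tested
        self._window_start = clock()
        self._count = 0

    def allow(self) -> bool:
        """Return True if one more trace fits in the current window."""
        now = self._clock()
        if now - self._window_start >= 1.0:
            # A new one-second window begins; reset the budget.
            self._window_start = now
            self._count = 0
        if self._count < self.max_per_second:
            self._count += 1
            return True
        return False
```

A caller would emit a trace only when `allow()` returns True and drop or aggregate it otherwise.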
3592080
Description: When using UEK8 on the host in DPU mode, creating a VF on the host consumes about 100 MB of memory on BlueField.
Workaround: N/A
Keyword: UEK; VF
Reported in version: 2.2.1
3546202
Description: After rebooting a BlueField-3 DPU running Rocky Linux 8.6 BFB, the kernel log shows the following error:
This message indicates that the Ethernet driver will function normally in all aspects, except that PHY polling is enabled.
Workaround: N/A
Keyword: Linux; PHY; kernel
Reported in version: 2.2.0
3566042
Description: Virtio hotplug is not supported in GPU-HOST mode on the NVIDIA Converged Accelerator.
Workaround: N/A
Keyword: Virtio; Converged Accelerator
Reported in version: 2.2.0
3546474
Description: PXE boot over ConnectX interface might not work due to an invalid MAC address in the UEFI boot entry.
Workaround: On BlueField, create
Keyword: PXE; boot; MAC
Reported in version: 2.2.0
3561723
Description: Running
Workaround: N/A
Keywords: mlxfwreset
Reported in version: 2.2.0
3306489
Description: When performing longevity tests (e.g., mlxfwreset, DPU reboot, burning of new BFBs), a host running an Intel CPU may observe errors related to "CPU 0: Machine Check Exception".
Workaround: Add
Keywords: Longevity; mlxfwreset; DPU reboot
Reported in version: 2.2.0
3534219
Description: On BlueField-3 devices, after downgrading the firmware from DOCA 2.2.0 to 32.37.1306 (or lower), the host crashes when executing a partial Arm reset (e.g., Arm reboot; BFB push; mlxfwreset).
Workaround: Before downgrading the firmware:
Keyword: BlueField-3; downgrade
Reported in version: 2.2.0
3462630
Description: When trying to perform a PXE installation while UEFI Secure Boot is enabled, the following error messages may be observed:
Workaround: Download a Grub EFI binary from the Ubuntu website. For further information on Ubuntu UEFI Secure Boot PXE Boot, please visit Ubuntu's official website.
Keyword: PXE; UEFI Secure Boot
Reported in version: 2.0.2
3448841
Description: While running CentOS 8.2, switchdev Ethernet BlueField runs in "shared" RDMA net namespace mode instead of "exclusive".
Workaround: Use
Keyword: RDMA; isolation; Net NS
Reported in version: 2.0.2
2706803
Description: When an NVMe controller, SoC management controller, and DMA controller are configured, the maximum number of VFs is limited to 124.
Workaround: N/A
Keyword: VF; limitation
Reported in version: 2.0.2
3273435
Description: Changing the mode of operation between NIC and DPU modes results in different capabilities for the host driver which might cause unexpected behavior.
Workaround: Reload the host driver or reboot the host.
Keyword: Modes of operation; driver
Reported in version: 2.0.2
3264749
Description: In Rocky and CentOS 8.2 inbox-kernel BFBs, RegEx requires the following extra huge page configuration for it to function properly:
If these commands have executed successfully you should see
Workaround: N/A
Keyword: RegEx; hugepages
Reported in version: 1.5.1
3240153
Description: DOCA kernel support only works on a non-default kernel.
Workaround: N/A
Keyword: Kernel
Reported in version: 1.5.0
3217627
Description: The
Workaround: N/A
Keyword: DOCA core; InfiniBand
Reported in version: 1.5.0
4404719
Description: Splitting a DPU into 4 ports conflicts with the shared_rq feature.
Workaround:
Keyword: PCI information
Reported in version: 3.0.0
4273881
Description: PCI information is missing on RedHat p host.
Workaround:
Matching an interface name with its PCI address requires running:
Keyword: PCI information
Reported in version: 3.0.0
4155701
Description: When offloading xfrm states to hardware, the offloading device is linked to the skb's secpath. If an skb is freed or deferred, an unregister netdevice operation may hang because the netdevice is still being reference-counted.
Workaround: Remove the netdevice from the xfrm states when the netdevice is unregistered.
Keyword: IPSec Crypto Offload
Reported in version: 2.10.0
4604969
Description: Probe packets might be dropped at the transmission stage when multiple congestion control flows are active.
Workaround: N/A
Keywords: PCC, RTT, probe
Detected in version: 32.47.1026
4683823
Description: Some diagnostic data counters share hardware resources and cannot be configured simultaneously since 64-bit counter formats (e.g., DIAG_DATA_PARAMS_CONTEXT.output_format set to FORMAT_0 or FORMAT_1) consume more hardware resources per counter.
Workaround: If a NO_RESOURCES error occurs, use output_format FORMAT_2 to reduce resource usage.
Keywords: DOCA Telemetry Diagnostics
Detected in version: 32.47.1026
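The workaround above amounts to retrying with FORMAT_2 when a 64-bit format cannot be allocated. The following is a minimal sketch of that control flow. `NoResourcesError` and the `apply_config` callable are hypothetical stand-ins for however an application surfaces the NO_RESOURCES error and applies `output_format`; they are not DOCA Telemetry Diagnostics APIs.

```python
class NoResourcesError(Exception):
    """Hypothetical stand-in for a NO_RESOURCES error from the configuration step."""


def configure_with_fallback(apply_config, preferred_format: str) -> str:
    """Try the preferred output format, then fall back to FORMAT_2.

    apply_config is a caller-supplied function (an assumption of this sketch)
    that applies the given output_format and raises NoResourcesError on failure.
    Returns the format that was applied.
    """
    try:
        apply_config(preferred_format)
        return preferred_format
    except NoResourcesError:
        if preferred_format == "FORMAT_2":
            raise  # already using the smallest format; nothing left to try
        apply_config("FORMAT_2")
        return "FORMAT_2"
```

If FORMAT_2 also fails, the error propagates to the caller, which may then need to reduce the number of configured counters.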
4685736
Description: Creating a DPA process that allocates a 128 MB data segment and loads a dynamic library may fail with syndrome 0xdc30ac.
Workaround: Limit the DPA application’s data segment size to 64 MB.
Keywords: DPA
Detected in version: 32.47.1026
4535791
Description: When running sync2 on an EP configuration, the following error may appear during the MLXFWReset sync2 operation: ERROR: System Off: operation not handled.
Although sync2 is intended for use in a switch topology, it can technically run on an EP configuration. However, this is not the default mode, nor is it a typical or recommended use case.
Workaround: N/A
Keywords: sync2, EP configuration
Detected in version: 32.46.1006
4534767
Description: In multi-probe mode, only one slot of IFA1 or IFA2 is allowed, although IFA1 and IFA2 can operate together.
Workaround: N/A
Keywords: PCC, IFA1, IFA2
Detected in version: 32.46.1006
4394475
Description: The existing congestion control configuration applies globally, rather than on a per-priority basis.
Workaround: Ensure that the configuration values for all priorities are aligned in either
Keywords: Congestion control, ROCE_CC_PRIO
Detected in version: 32.45.1020
4422120
Description: Any BFB upgrade from the October GA (2.9.2) to the new BFB will trigger a 0x00b4 assert. Nonetheless, the update will complete successfully, and the customer can safely ignore the assert.
Workaround: N/A
Keywords: BFB upgrade
Detected in version: 32.45.1020
4216761
Description: For all host-related counters, the buffers used by the Arm are the same as those used by the host. Buffer usage is tracked collectively, combining both Arm and host consumption.
Workaround: N/A
Keywords: Counters
Detected in version: 32.45.1020
4125431
Description: The MKEY created by software (VirtIo.Net DPA App) is created with a length of 1 byte and used to access L2 memory. Since the minimum translation size is 64 bytes, using a 1-byte MKEY results in a translation error and triggers an exception.
Workaround: N/A
Keywords: MKEY
Detected in version: 32.45.1020
4303583
Description: The query_header_modify_pattern command may produce inaccurate results when specific fields are used.
Workaround: N/A
Keywords: query_header_modify_pattern command
Detected in version: 32.45.1020
4296168
Description: Running mlxfwreset fails when the DPU is configured as the root complex for NVMe drives. This issue impacts the configuration use case where the DPU acts as the root complex for NVMe drives, rather than the BF-3 in the host functioning as a PCIe Switch for the NVMe.
Workaround: To ensure the firmware reset works correctly, explicitly run the fwreset command from the host using the "--method 1" flag (hot reset).
Keywords: mlxfwreset
Detected in version: 32.45.1020
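As a sketch of the workaround above, the helper below assembles the host-side command line with the "--method 1" flag quoted in the workaround. The `-d <device>` option and the `reset` subcommand are assumed from common mlxfwreset usage, and the device path is a placeholder; confirm the exact syntax against the MFT documentation for the installed version. The helper only builds the argument list and does not run it.

```python
import shlex


def build_hot_reset_command(device: str) -> list:
    """Build an mlxfwreset argument list that requests method 1 (hot reset).

    device is a placeholder for the host-side device identifier.
    """
    return ["mlxfwreset", "-d", device, "--method", "1", "reset"]


def format_command(argv: list) -> str:
    """Render the argument list as a shell-quoted string for review or logging."""
    return shlex.join(argv)
```

Printing `format_command(build_hot_reset_command(device))` allows the command to be reviewed before it is run on the host.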
4193036
Description: The initial allocation of DPA_THREAD on group affinity allocates memory for all EUs, including stack, core dump, and other resources.
Workaround: N/A
Keywords: DPA
Detected in version: 32.44.1036
4007228
Description: NC-SI pass-through requires the user to allocate a MAC address to the platform BMC.
Workaround: N/A
Keywords: NC-SI pass-through
Discovered in Version: 32.41.1000
3787618
Description: NVIA register is not allowed for external host if any field of EXTERNAL_HOST_PRIV or EXTERNAL_HOST_PRIV_FAST TLVs is not set as the default.
Workaround: N/A
Keywords: Host privilege
Discovered in Version: 32.41.1000
3636631
Description: When configuring BlueField-3 Arm cores as PCIe root-complex, all non-mlx5 devices must always set the BlueField-3’s IOMMU to disabled or passthrough mode. Turning IOMMU “ON” requires special handling of interrupts in the driver or the use of polling. For further assistance, contact NVIDIA support.
Workaround: N/A
Keywords: IOMMU
Discovered in Version: 32.39.2048
3614529
Description: The supported DDR5 link speed in SKU B3220 is 5200 MT/s.
Workaround: N/A
Keywords: DDR5 link speed
Discovered in Version: 32.39.2048
3728450
Description: SW_RESET with a pending image is currently not supported.
Workaround: N/A
Keywords: SW_RESET
Discovered in Version: 32.39.2048
3614288
Description: Occasionally, the device may hang when a hot plug is performed from an unknown direction.
Workaround: N/A
Keywords: Hot-plug operation
Discovered in Version: 32.39.2048
-
Description: The I2C clock fall time is lower than the 12ns minimum defined in the I2C-bus specification.
For further information, refer to the I²C-bus Specification, Version 7.0, October 2021, https://www.i2c-bus.org/.
Workaround: N/A
Keywords: I2C clock
Discovered in Version: 32.39.2048
3439438
Description: When connecting to a High Speed Traffic Generator at 400G speed, the linkup time may take up to 3 minutes.
Workaround: N/A
Keywords: 400G linkup time
Discovered in Version: 32.38.1002
3534128
Description: External flash access such as flash read using the MFT tools will fail if there is a pending image on the flash.
Workaround: N/A
Keywords: Flash access
Discovered in Version: 32.38.1002
3534219
Description: On BlueField-3 devices, after downgrading the firmware from DOCA 2.2.0 to 32.37.1306 (or lower), the host crashes when executing a partial Arm reset (e.g., Arm reboot; BFB push; mlxfwreset).
Workaround: Before downgrading the firmware, perform:
Keywords: BlueField-3; downgrade
Discovered in Version: 32.38.1002
3547022
Description: When unloading the network drivers on an external host, sync1 reset may still be reported as 'supported' although it is not. Thus, initiating the reset flow may result in reset failure after a few minutes.
Workaround: N/A
Keywords: Sync1 reset
Discovered in Version: 32.38.1002
3439438
Description: When connecting to a Spirent switch at 400G speed, the linkup time may take up to 3 minutes.
Workaround: N/A
Keywords: Spirent, 400G, linkup time
Discovered in Version: 32.38.1002
3178339
Description: PCIe PML1 is disabled.
Workaround: N/A
Keywords: PCIe PML1
Discovered in Version: 32.38.1002
3525865
Description: Unexpected system behavior might be observed if the driver is loaded while reset is in progress.
Workaround: N/A
Keywords: Sync 1 reset, firmware reset
Discovered in Version: 32.38.1002
3275394
Description: When performing PCIe link secondary-bus-reset, disable/enable, or mlxfwreset on AMD-based Genoa systems, the device takes longer than expected to link up due to a PCIe receiver termination misconfiguration.
Workaround: N/A
Keywords: PCIe
Discovered in Version: 32.37.1306
2878841
Description: The firmware rollback fails for the signature retransmit flow if the QPN field is configured in the mkey. That setting allows only the given QP to use the mkey, whereas the firmware rollback flow relies on an internal QP that uses the mkey.
Workaround: N/A
Keywords: Signature retransmit flow
Discovered in Version: 32.37.1306
3412847
Description: Socket-Direct is currently not supported.
Workaround: N/A
Keywords: Socket-Direct
Discovered in Version: 32.37.1306
4394475
Description: The existing congestion control configuration applies globally, rather than on a per-priority basis.
Workaround: Ensure that the configuration values for all priorities are aligned in either
Keywords: Congestion control, ROCE_CC_PRIO
Detected in version: 24.45.1020
3754913
Description: PHYless Reset is currently not supported.
Workaround: N/A
Keywords: PHYless Reset
Discovered in Version: 24.40.1000
3525865
Description: Unexpected system behavior might be observed if the driver is loaded while reset is in progress.
Workaround: N/A
Keywords: Sync 1 reset, firmware reset
Discovered in Version: 24.39.2048
-
Description: When
This might also cause an error when using timestamps for delay measurements (e.g., delay measurements reported by a PTP daemon) and even negative delay measurements in some cases.
Workaround: N/A
Keywords: PTP path delay
Discovered in Version: 24.38.1002
2878841
Description: The firmware rollback fails for the signature retransmit flow if the QPN field is configured in the mkey. That setting allows only the given QP to use the mkey, whereas the firmware rollback flow relies on an internal QP that uses the mkey.
Workaround: N/A
Keywords: Signature retransmit flow
Discovered in Version: 24.37.1300
3329109
Description: MFS1S50-H003E cable supports only HDR rate when used as a split cable.
Workaround: N/A
Keywords: HDR, split cable, MFS1S50-H003E
Discovered in Version: 24.37.1300
3267506
Description: CRC is included in the traffic byte counters as a port byte counter.
Workaround: N/A
Keywords: Counters, CRC
Discovered in Version: 24.35.2000
3141072
Description: The "max_shaper_rate" configuration query via QEEC mlxreg returns a value translated to hardware granularity.
Workaround: N/A
Keywords: RX Rate-Limiter, Multi-host
Discovered in Version: 24.34.1002
2870970
Description: GTP encapsulation (flex parser profile 3) is limited to the NIC domain.
Encapsulating in the FDB domain results in a zero-length value in the GTP header.
Workaround: N/A
Keywords: GTP encapsulation
Discovered in Version: 24.34.1002
2870213
Description: Servers do not recover after configuring
Workaround: N/A
Keywords: VirtIO-net; power cycle
Discovered in Version: 24.33.1048
2855592
Description: When working with a third-party device (e.g., Paragon) at 25GbE speed, the 25GbE speed must be configured in force mode.
Workaround: N/A
Keywords: Force mode, 3rd party devices, 25GbE
Discovered in Version: 24.33.1048
2850003
Description: Occasionally, when raising a logical link, the link recovery counter increases by 1.
Workaround: N/A
Keywords: Link recovery counter
Discovered in Version: 24.33.1048
2616755
Description: Forward action for IPoIB is not supported on RX RDMA Flow Table.
Workaround: N/A
Keywords: Steering, IPoIB
Discovered in Version: 24.33.1048