Date: 2026-01-15

4784257 Description: The DOCA-HOST perftest package is not compatible with CUDA 13 and fails to run when compiled with it.

Keywords: DOCA-HOST perftest

Workaround: N/A

Discovered in Release: 24.10-4.1.4.0

4087492 Description: When using hardware LAG, the hardware's hash may differ from the kernel's software bond hash. This discrepancy can cause packets from the same stream to be sent out through different ports.

Keywords: LAG

Workaround: N/A

Discovered in Release: 24.07-0.6.1.0

3990230 Description: DOCA-HOST includes drivers for both the Ethernet and InfiniBand protocols. Installing the kernel part of DOCA-HOST (as with MLNX_OFED before it) replaces any InfiniBand drivers previously installed on the host server, such as ib_core and its dependencies, with the DOCA-HOST drivers.

Keywords: DOCA-HOST, kernel, inbox drivers

Workaround: To keep the non-DOCA drivers, use the relevant inbox drivers: https://docs.nvidia.com/networking/software/adapter-software/index.html#linux-inbox-drivers-upstream-releases

Discovered in Release: 24.07-0.6.1.0

4040187 Description: An mlx5_core probe error may occur on RHEL 9.4 after unloading the mlx5_core module.

Keywords: mlx5_core

Workaround: Upgrade to kernel 5.14.0-427.20.1.el9_4 or newer to resolve the kernel panics.

Discovered in Release: 24.07-0.6.1.0
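
For issue 4040187, the kernel requirement in the workaround can be checked with a short script. The following is a minimal sketch, not an official tool: it assumes a RHEL-style release string (as returned by `uname -r`) and compares only the numeric fields before `.el`, which approximates but does not replicate RPM version ordering. The function names are hypothetical.

```python
import re
from typing import Tuple

# Minimum kernel named in the workaround for issue 4040187.
MIN_FIXED_RELEASE = "5.14.0-427.20.1.el9_4"


def rhel_kernel_key(release: str) -> Tuple[int, ...]:
    """Return the numeric fields that precede '.el' in a RHEL kernel release.

    Example: '5.14.0-427.20.1.el9_4.x86_64' -> (5, 14, 0, 427, 20, 1).
    """
    head = release.split(".el", 1)[0]
    return tuple(int(n) for n in re.findall(r"\d+", head))


def has_probe_fix(release: str) -> bool:
    """True if the release is at or above the minimum named in the workaround."""
    return rhel_kernel_key(release) >= rhel_kernel_key(MIN_FIXED_RELEASE)
```

On a live system, `has_probe_fix(platform.release())` (with `import platform`) applies the same check to the running kernel.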

4022803 Description: The fwctl subsystem is supported on default kernels only. The "add-kernel-support" option does not build or install fwctl.

Keywords: fwctl, kernel

Workaround: N/A

Discovered in Release: 24.07-0.6.1.0

4001184 Description: When using QEMU versions older than 8.2, interrupts may be lost, resulting in firmware command timeouts and undefined behavior in the driver.

Keywords: QEMU

Workaround: To avoid this issue, make sure that both of the following conditions are met:

The running kernel includes the following change, which was accepted in 6.4-rc3: [Kernel Patch](https://lore.kernel.org/lkml/28521e1b0b091849952b0ecb8c118729fc8cdc4f.1683740667.git.reinette.chatre@intel.com/T/)

The QEMU version is 8.2 or newer: [Qemu Patch](https://lore.kernel.org/qemu-devel/20231009064900.1465361-5-clg@redhat.com/)

Discovered in Release: 24.07-0.6.1.0
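
For issue 4001184, the QEMU side of the workaround can be verified by parsing the version banner. This is a minimal sketch under the assumption that the banner printed by `qemu-system-x86_64 --version` contains the text `version <major>.<minor>`; the function names are hypothetical. It does not check the kernel, since distribution kernels may carry the patch as a backport under an older version number.

```python
import re
from typing import Optional, Tuple


def parse_qemu_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a QEMU version banner, or None if absent."""
    match = re.search(r"version\s+(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def qemu_is_new_enough(banner: str) -> bool:
    """True if the banner reports QEMU 8.2 or newer, as the workaround requires."""
    version = parse_qemu_version(banner)
    return version is not None and version >= (8, 2)
```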

3856101 Description: In Debian 12, using dhcpcd instead of dhclient to configure the network interface (through NetworkManager) results in an incorrect network interface configuration.

Keywords: dhcpcd, dhclient, Debian 12, NetworkManager

Workaround: Use dhclient to configure the network interface.

Discovered in Release: 24.04-0.6.6.0
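
For issue 3856101, one way to select dhclient is the `dhcp` key in the `[main]` section of the NetworkManager configuration. The sketch below renders such a drop-in and reads the setting back from existing configuration text. It is an illustration under assumptions: the helper names are hypothetical, the drop-in file name is the administrator's choice, and Python's `configparser` only approximates NetworkManager's keyfile syntax.

```python
import configparser
import io
from typing import Optional


def render_dhclient_dropin() -> str:
    """Return drop-in text that asks NetworkManager to use dhclient."""
    cfg = configparser.ConfigParser()
    cfg["main"] = {"dhcp": "dhclient"}
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


def configured_dhcp_client(conf_text: str) -> Optional[str]:
    """Return the [main] dhcp value from configuration text, or None if unset."""
    cfg = configparser.ConfigParser(strict=False, interpolation=None)
    cfg.read_string(conf_text)
    return cfg.get("main", "dhcp", fallback=None)
```

The rendered text would typically be saved under `/etc/NetworkManager/conf.d/` and NetworkManager restarted for the change to take effect.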

3964215 Description: The driver might try to access privileged registers, resulting in an error with syndrome.

Keywords: N/A

Workaround: Unbind and bind the function or restart the driver.

Discovered in Release: 24.04-0.6.6.0
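
For issue 3964215, unbinding and rebinding a function is normally done through the PCI driver's sysfs `unbind` and `bind` files. The following is a minimal sketch assuming the function is bound to mlx5_core and that the caller has root privileges; the helper name and the example address are hypothetical.

```python
import re
from pathlib import Path

# PCI address in domain:bus:device.function form, e.g. 0000:08:00.0
BDF_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")

# Standard sysfs location of the mlx5_core PCI driver.
MLX5_DRIVER_DIR = Path("/sys/bus/pci/drivers/mlx5_core")


def rebind_function(bdf: str, driver_dir: Path = MLX5_DRIVER_DIR) -> None:
    """Unbind, then bind, one PCI function by writing its address to sysfs."""
    if not BDF_RE.match(bdf):
        raise ValueError(f"not a PCI address: {bdf!r}")
    (driver_dir / "unbind").write_text(bdf)
    (driver_dir / "bind").write_text(bdf)
```

For example, `rebind_function("0000:08:00.0")` would rebind the function at that address; the address of the affected function can be found with `lspci -D`.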

3640907 Description: When using a kernel version lower than v5.5, application termination on PCIe Gen5 servers could lead to kernel problems, such as IOMMU call traces, because of a lack of support in the AMD IOMMU kernel component.

Keywords: PCIe Gen5, IOMMU, Call Trace

Workaround: To resolve the issue, either add the kernel command-line parameter "iommu=pt" or use a kernel that includes this upstream patchset: https://patchwork.kernel.org/project/linux-mediatek/cover/20190908165642.22253-1-murphyt7@tcd.ie/

Discovered in Release: 24.04-0.6.6.0
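
For issue 3640907, whether the first workaround is already active can be read from the kernel command line. This is a minimal sketch; the function name is hypothetical, and it only detects the exact token `iommu=pt`, not vendor-specific variants.

```python
def has_iommu_pt(cmdline: str) -> bool:
    """True if the kernel command line contains the exact token 'iommu=pt'."""
    return "iommu=pt" in cmdline.split()
```

On a running system the input would come from `/proc/cmdline`, for example `has_iommu_pt(open("/proc/cmdline").read())`.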

3004304 Description: Setting the NVMe num_p2p_queues module parameter to a value greater than 0 may cause a harmless warning, "irq #XXX: nobody cared", followed by a call trace.

Keywords: NVMe, Call Trace, num_p2p_queues

Workaround: N/A

Discovered in Release: 24.01-0.3.3.1

3735400 Description: The NVMF connect command does not work on IB setups when AR (Adaptive Routing) is enabled, because PI (the Protection Information used by NVMF) and AR are not supported simultaneously.

Keywords: NVMF connect, PI, Adaptive Routing

Workaround: Disable AR in opensm or, alternatively, disable PI in nvme_rdma using a new module parameter.

Discovered in Release: 24.01-0.3.3.1

3774149 Description: In some cases, a race condition between RDMA_WRITE and a shared memory write can cause MPI to receive invalid data with large messages or collective operations between ranks on the same node.

Keywords: Race condition, RDMA_WRITE, shared memory write

Workaround: Set UCX_RNDV_SCHEME=get_zcopy to force use of the RDMA_READ protocol.

Discovered in Release: 24.01-0.3.3.1
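
For issue 3774149, the variable must be present in the environment of every MPI rank. The sketch below builds such an environment for a launcher started from Python; the helper name is hypothetical, and how the variable is propagated to remote ranks depends on the MPI launcher in use.

```python
import os
from typing import Dict, Mapping, Optional


def ucx_workaround_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with UCX_RNDV_SCHEME=get_zcopy set.

    The base mapping (os.environ by default) is not modified.
    """
    env = dict(os.environ if base is None else base)
    env["UCX_RNDV_SCHEME"] = "get_zcopy"
    return env
```

A launch could then look like `subprocess.run(["mpirun", "-np", "2", "./my_app"], env=ucx_workaround_env())`, where `./my_app` stands in for the real application.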

3565433 Description: An error may occur when creating a DCI due to oversized WQEs. This is caused by loose enforcement of the maximum allowed number of SGEs.

Keywords: DCI, SGEs

Workaround: N/A

Discovered in Release: 24.01-0.3.3.1

3732632 Description: Geneve offload does not operate together with FLEX_PARSER.

Keywords: Geneve offload, FLEX_PARSER

Workaround: Make sure that the firmware is appropriately configured by verifying that the FLEX_PARSER_PROFILE_ENABLE mlxconfig flag is set to 0.

Discovered in Release: 24.01-0.3.3.1
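
For issue 3732632, the flag can be inspected with the mlxconfig `query` command. The following sketch builds that command and extracts the value from its output. It assumes the output lists each parameter name followed by its value on one line; the helper names and the device argument are hypothetical, and the exact output layout may differ between mlxconfig versions.

```python
import re
from typing import List, Optional

PARAM = "FLEX_PARSER_PROFILE_ENABLE"


def query_command(device: str) -> List[str]:
    """Build the mlxconfig command that queries the flag on one device."""
    return ["mlxconfig", "-d", device, "query", PARAM]


def parse_param_value(query_output: str, param: str = PARAM) -> Optional[str]:
    """Return the first token after the parameter name, or None if not listed."""
    pattern = rf"^\s*{re.escape(param)}\s+(\S+)"
    match = re.search(pattern, query_output, re.MULTILINE)
    return match.group(1) if match else None


def geneve_offload_ready(query_output: str) -> bool:
    """True if the flag is reported as 0, as the workaround requires."""
    return parse_param_value(query_output) == "0"
```

If the reported value is not 0, it can be changed with `mlxconfig -d <device> set FLEX_PARSER_PROFILE_ENABLE=0`; firmware configuration changes generally take effect only after a firmware reset or reboot.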

3644590 Description: When working in switchdev mode, the number of XFRM IN rules that can be added is limited to 2047.

Keywords: switchdev mode, XFRM IN rules

Workaround: N/A

Discovered in Release: 24.01-0.3.3.1

3563584 Description: If a steering loop occurs, the packet loops indefinitely, causing a device hang.

Keywords: Steering loop

Workaround: Enable firmware infinite loop protection.