Known Issues
The following table lists the known issues and limitations for this release of DOCA SDK.
Reference |
Description |
4032924 |
Description: When upgrading to DOCA 2.8.0 on RPM-based OSes, a conflict between strongswan-bf or libreswan and strongSwan may occur. |
Workaround: Before upgrading, delete strongswan-bf and libreswan:
|
|
Keyword: strongSwan; upgrade |
|
Reported in version: 2.8.0 |
|
4035553 |
Description: oper_sample_period does not always reflect the correct sample period. In some cases, it reflects admin_sample_period instead. |
Workaround: N/A |
|
Keyword: Core |
|
Reported in version: 2.8.0 |
|
4023257 |
Description: If RDMA samples are compiled with memory sanitizer enabled, "read memory leak" errors are printed when running the samples with the RDMA CM flag and when running the client before the server. |
Workaround: Start the RDMA server before the RDMA client. |
|
Keyword: DOCA RDMA; samples |
|
Reported in version: 2.8.0 |
|
4021752 4021748 |
Description: In all RDMA samples, if an error occurs in any of the following functions:
An error is printed but the sample resumes and might:
|
Workaround for 1: Either:
Workaround for 2: The address sanitizer violation can be ignored if it follows an error in one of the relevant functions. |
|
Keyword: DOCA RDMA; samples |
|
Reported in version: 2.8.0 |
|
3961940 |
Description: OVS-DOCA connection tracking with E2E enabled is not supported. |
Workaround: N/A |
|
Keyword: OVS-DPDK; connection tracking; E2E |
|
Reported in version: 2.8.0 |
|
3989851 |
Description: When a DOCA Flow pipe has multiple actions, attempting to create an entry causes a crash if the action index is not 0 and the action includes a shared endecap action. |
Workaround: N/A |
|
Keyword: DOCA Flow |
|
Reported in version: 2.8.0 |
|
3988904 |
Description: Creating a control entry with a shared endecap action fails. |
Workaround: N/A |
|
Keyword: DOCA Flow |
|
Reported in version: 2.8.0 |
|
3886674 |
Description: Installing doca-all and other DOCA metapackages does not install the mlnx-nvme driver. |
Workaround: mlnx-nvme is only needed for NVMe-over-RDMA remote storage support. If you wish to install it, add the mlnx-nvme package to the install command.
|
|
Keyword: NVMe; DOCA profile |
|
Reported in version: 2.7.0 |
|
3885930 |
Description: When installing DOCA-Host on a system using NVMe storage (typically local NVMe disk), and the script doca-kernel-support is used to rebuild and install kernel modules, unloading the mlx5 drivers is only possible after also unmounting the NVMe storage, which would typically necessitate a reboot. |
Workaround: N/A |
|
Keyword: NVMe; doca-kernel-support; DOCA for host |
|
Reported in version: 2.7.0 |
|
3837255 |
Description: When shutting down the Arm cores from the host OS, the message -E- Failed to send Register MRSI is expected and can be ignored. |
Workaround: Wait 2 more minutes before rebooting the host. Before proceeding with host OS reboot, it is recommended to query the operational state of the BlueField Arm cores from the BlueField BMC to verify that shutdown state has been reached. Run the following command:
Expected output is "06". |
|
Keyword: Host OS; reboot; error |
|
Reported in version: 2.7.0 |
|
3844705 |
Description: In OpenEuler 20.03, the Linux Kernel version 4.19.90 is affected by an issue that impacts the discard/trim functionality for the BlueField eMMC device which may cause degraded performance of the BlueField eMMC over time. |
Workaround: Upgrade to Linux Kernel version 5.10 or later. |
|
Keyword: eMMC discard; trim functionality |
|
Reported in version: 2.7.0 |
|
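For issue 3844705 above, the workaround depends on the running kernel being version 5.10 or later. The following is a minimal sketch of such a check, not part of DOCA; the helper name kernel_at_least is hypothetical, and the sketch assumes the release string starts with "major.minor", as reported by uname -r on typical Linux systems.

```python
import platform
import re


def kernel_at_least(release: str, minimum=(5, 10)) -> bool:
    """Return True if a kernel release string is at or above `minimum` (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    release = platform.release()
    if kernel_at_least(release):
        print(f"{release}: kernel is 5.10 or later")
    else:
        print(f"{release}: kernel is older than 5.10; consider upgrading")
```

Only the major and minor numbers are compared, so distribution-specific suffixes in the release string are ignored.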
3877725 |
Description: During BFB installation in NIC mode on BlueField-3, excessive output fills the RShim log, so the Linux installation progress log does not appear in it.
|
Workaround: Monitor the BlueField-3 Arm UART console to check whether the NIC-mode BFB installation has completed.
|
|
Keyword: NIC mode; BFB install |
|
Reported in version: 2.7.0 |
|
3855702 |
Description: Trying to jump from a steering level in the hardware to a lower level using software steering is not supported on rdma-core lower than 48.x. |
Workaround: N/A |
|
Keyword: RDMA; SWS |
|
Reported in version: 2.7.0 |
|
3855485 |
Description: When the PCI_SWITCH_EMULATION_ENABLE NVconfig is enabled, the mlx devices, and potentially the RShim devices, disappear. The kernel log (viewed with dmesg) also shows the following messages:
|
Workaround: N/A |
|
Keyword: NVconfig; RShim; dmesg |
|
Reported in version: 2.7.0 |
|
3831230 |
Description: In OpenEuler 20.03, the Linux Kernel version 4.19.90 is affected by an issue that impacts the discard/trim functionality for the BlueField eMMC device which may cause degraded performance of the BlueField eMMC over time. |
Workaround: Upgrade to Linux Kernel version 5.10 or later. |
|
Keyword: eMMC discard; trim functionality |
|
Reported in version: 2.7.0 |
|
3743879 |
Description: mlxfwreset may time out on servers where the RShim driver is running and INTx is not supported. The following error message is printed: BF reset flow encountered a failure due to a reset state error of negotiation timeout. |
Workaround: Set PCIE_HAS_VFIO=0 and PCIE_HAS_UIO=0 in /etc/rshim.conf and restart the RShim driver. Then re-run the mlxfwreset command. If host Linux kernel lockdown is enabled, then manually unbind the RShim driver before mlxfwreset and bind it back after mlxfwreset:
|
|
Keyword: Timeout; mlxfwreset; INTx |
|
Reported in version: 2.7.0 |
|
3665070 |
Description: Virtio-net controller fails to load if DPA_AUTHENTICATION is enabled. |
Workaround: N/A |
|
Keyword: Virtio-net; DPA |
|
Reported in version: 2.5.0 |
|
3678069 |
Description: If BlueField has both NVMe and mmcblk storage and is configured to boot from mmcblk, users must create a bf.cfg file with device=/dev/mmcblk0, then install the *.bfb as normal. |
Workaround: N/A |
|
Keyword: NVMe |
|
Reported in version: 2.5.0 |
|
3680538 |
Description: When using strongSwan or OVS-IPsec as explained in the NVIDIA BlueField DPU BSP, the IPsec Rx data path is not offloaded to hardware and runs in software on the Arm cores. As a result, bandwidth is substantially reduced. |
Workaround: N/A |
|
Keyword: IPsec |
|
Reported in version: 2.5.0 |
|
N/A |
Description: Execution unit partitions are not yet implemented and will be added in a future release. |
Workaround: N/A |
|
Keyword: EU tool |
|
Reported in version: 2.5.0 |
|
3666160 |
Description: Installing a BFB using bfb-install when the mlxconfig parameter PF_TOTAL_SF is greater than 1700 triggers an immediate server reboot. |
Workaround: Change PF_TOTAL_SF to 0, perform a graceful shutdown, power cycle the server, then install the BFB. |
|
Keyword: SF; PF_TOTAL_SF; BFB installation |
|
Reported in version: 2.2.1 |
|
3594836 |
Description: When enabling Flex IO SDK tracer at high rates, a slow-down in processing may occur and/or some traces may be lost. |
Workaround: Keep tracing limited to ~1M traces per second to avoid a significant processing slow-down. Use tracer for debug purposes and consider disabling it by default. |
|
Keyword: Tracer FlexIO |
|
Reported in version: 2.2.1 |
|
3592080 |
Description: When using UEK8 on the host in DPU mode, creating a VF on the host consumes about 100 MB of memory on BlueField. |
Workaround: N/A |
|
Keyword: UEK; VF |
|
Reported in version: 2.2.1 |
|
3546202 |
Description: After rebooting a BlueField-3 DPU running Rocky Linux 8.6 BFB, the kernel log shows the following error:
This message indicates that the Ethernet driver will function normally in all aspects, except that PHY polling is enabled. |
Workaround: N/A |
|
Keyword: Linux; PHY; kernel |
|
Reported in version: 2.2.0 |
|
3566042 |
Description: Virtio hotplug is not supported in GPU-HOST mode on the NVIDIA Converged Accelerator. |
Workaround: N/A |
|
Keyword: Virtio; Converged Accelerator |
|
Reported in version: 2.2.0 |
|
3546474 |
Description: PXE boot over a ConnectX interface might not work due to an invalid MAC address in the UEFI boot entry. |
Workaround: On BlueField, create /etc/bf.cfg file with the relevant PXE boot entries, then run the command bfcfg. |
|
Keyword: PXE; boot; MAC |
|
Reported in version: 2.2.0 |
|
3561723 |
Description: Running mlxfwreset sync 1 on NVIDIA Converged Accelerators may be reported as supported although it is not. Executing the reset will fail. |
Workaround: N/A |
|
Keyword: mlxfwreset |
|
Reported in version: 2.2.0 |
|
3306489 |
Description: When performing longevity tests (e.g., mlxfwreset, DPU reboot, burning of new BFBs), a host running an Intel CPU may observe errors related to "CPU 0: Machine Check Exception". |
Workaround: Add intel_idle.max_cstate=1 entry to the kernel command line. |
|
Keyword: Longevity; mlxfwreset; DPU reboot |
|
Reported in version: 2.2.0 |
|
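For issue 3306489 above, it can be useful to confirm after a reboot that intel_idle.max_cstate=1 is actually on the kernel command line. The following is a minimal sketch, not part of DOCA; the helper name has_kernel_param is hypothetical, and the sketch assumes a Linux host that exposes the command line at /proc/cmdline.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, name: str, value: str) -> bool:
    """Return True if `name=value` appears as a whole token on the kernel command line."""
    return f"{name}={value}" in cmdline.split()


if __name__ == "__main__":
    cmdline_path = Path("/proc/cmdline")
    if cmdline_path.exists():
        present = has_kernel_param(
            cmdline_path.read_text(), "intel_idle.max_cstate", "1"
        )
        print("intel_idle.max_cstate=1 present:", present)
    else:
        print("/proc/cmdline not found; not a Linux host?")
```

Matching whole tokens avoids a false positive on a value such as intel_idle.max_cstate=10.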
3538486 |
Description: When removing LAG configuration from BlueField, a kernel warning for uverbs_destroy_ufile_hw is observed if virtio-net-controller is still running. |
Workaround: Stop virtio-net-controller service before cleaning up bond configuration. |
|
Keyword: Virtio-net; LAG |
|
Reported in version: 2.2.0 |
|
3534219 |
Description: On BlueField-3 devices, when downgrading the firmware from the DOCA 2.2.0 version to 32.37.1306 (or lower), the host crashes when executing a partial Arm reset (e.g., Arm reboot, BFB push, mlxfwreset). |
Workaround: Before downgrading the firmware:
|
|
Keyword: BlueField-3; downgrade |
|
Reported in version: 2.2.0 |
|
3462630 |
Description: When trying to perform a PXE installation when UEFI Secure Boot is enabled, the following error messages may be observed:
|
Workaround: Download a Grub EFI binary from the Ubuntu website. For further information on Ubuntu UEFI Secure Boot PXE Boot, please visit Ubuntu's official website. |
|
Keyword: PXE; UEFI Secure Boot |
|
Reported in version: 2.0.2 |
|
3448841 |
Description: While running CentOS 8.2, switchdev Ethernet BlueField runs in "shared" RDMA net namespace mode instead of "exclusive". |
Workaround: Use ib_core module parameter netns_mode=0. For example:
|
|
Keyword: RDMA; isolation; Net NS |
|
Reported in version: 2.0.2 |
|
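For issue 3448841 above, the RDMA net namespace mode currently in effect can be inspected before and after applying the workaround. The following is a minimal sketch, not part of DOCA; the helper name netns_mode_from_param is hypothetical. It assumes that the ib_core netns_mode parameter is readable at /sys/module/ib_core/parameters/netns_mode and is reported as Y or 1 for shared mode and N or 0 for exclusive mode.

```python
from pathlib import Path


def netns_mode_from_param(raw: str) -> str:
    """Map the raw ib_core netns_mode parameter value to 'shared' or 'exclusive'."""
    value = raw.strip()
    if value in ("Y", "1"):
        return "shared"
    if value in ("N", "0"):
        return "exclusive"
    raise ValueError(f"unexpected netns_mode value: {value!r}")


if __name__ == "__main__":
    # Assumed sysfs location of the module parameter; absent if ib_core is not loaded.
    param_path = Path("/sys/module/ib_core/parameters/netns_mode")
    if param_path.exists():
        print("RDMA netns mode:", netns_mode_from_param(param_path.read_text()))
    else:
        print("ib_core netns_mode parameter not found")
```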
2706803 |
Description: When an NVMe controller, SoC management controller, and DMA controller are configured, the maximum number of VFs is limited to 124. |
Workaround: N/A |
|
Keyword: VF; limitation |
|
Reported in version: 2.0.2 |
|
3273435 |
Description: Changing the mode of operation between NIC and DPU modes results in different capabilities for the host driver which might cause unexpected behavior. |
Workaround: Reload the host driver or reboot the host. |
|
Keyword: Modes of operation; driver |
|
Reported in version: 2.0.2 |
|
3264749 |
Description: In Rocky and CentOS 8.2 inbox-kernel BFBs, RegEx requires the following extra huge page configuration for it to function properly:
If these commands execute successfully, the last line of the output shows active (running). |
Workaround: N/A |
|
Keyword: RegEx; hugepages |
|
Reported in version: 1.5.1 |
|
3240153 |
Description: DOCA kernel support only works on a non-default kernel. |
Workaround: N/A |
|
Keyword: Kernel |
|
Reported in version: 1.5.0 |
|
3217627 |
Description: The doca_devinfo_rep_list_create API returns success on the host instead of Operation not supported. |
Workaround: N/A |
|
Keyword: DOCA core; InfiniBand |
|
Reported in version: 1.5.0 |