Known Issues

For the list of old Known Issues, please see the relevant Release Notes version.

Internal Ref.

Issue

3640110

Description: Setting the VLAN Trunk mode while the feature is disabled and the resiliency feature is enabled triggers the resiliency flow for the maximal number of allowed resets, and consequently the driver fails to load.

Workaround: To avoid the issue, perform the following:

  1. Do not configure VLAN trunk mode when it is disabled and resiliency is enabled.

  2. Disable resiliency.

  3. Load the driver.

  4. Delete the trunk mode configuration.

Keywords: VLAN Trunk mode, Resiliency, Yellow bang

Detected in version: 23.10.50000

3554731

Description: On BlueField devices, the network adapter is disabled when performing a cold boot right after restarting the DPU.

Workaround: To resolve the issue, perform one of the following:

  • Wait for the DPU to load and link up before executing cold boot.

  • Cold boot and also restart the DPU so a DPU restart is not required in case of cold boot.

Keywords: BlueField, cold boot, restart

Detected in version: 23.7.50000

3035275

Description: VMQoS statistics counter does not count RDMA traffic running on the PF.

Workaround: N/A

Keywords: VMQoS statistics counter, RDMA

Detected in version: 3.20.50010

3316447

Description: When using a firmware version between xx.34.1000 and xx.36.1000, the driver reports the wrong number of SQs (adds 1).

Workaround: Update the firmware version to xx.36.1000 or above.

Keywords: VMQoS SR-IOV

Detected in version: 3.20.50010
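
The firmware range in issue 3316447 can be checked programmatically. The sketch below is illustrative only: the function name and the sample version strings are hypothetical, it assumes a three-field "major.minor.sub" version string, and it assumes the lower bound (xx.34.1000) is affected while xx.36.1000 is already fixed, as the workaround suggests.

```python
def sq_count_affected(fw_version: str) -> bool:
    """Return True if the firmware falls in the range named in issue 3316447.

    Assumption: the version looks like '16.35.2000'. The major field
    (the 'xx' in the release notes) is device specific and is ignored.
    Assumption: xx.34.1000 is affected and xx.36.1000 is not.
    """
    _major, minor, sub = (int(part) for part in fw_version.split("."))
    return (34, 1000) <= (minor, sub) < (36, 1000)


# Hypothetical sample versions, used only to exercise the check.
for version in ("16.33.1048", "16.35.2000", "22.36.1000"):
    print(version, sq_count_affected(version))
```

Comparing (minor, sub) as a tuple keeps the ordering correct without string comparison, which would mis-sort values such as "9" and "10".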

3240702

Description: Anti-spoofing counters are not supported on BlueField-2 in DPU mode.

Workaround: The Arm user can add an anti-spoofing rule via Linux on the DPU, as Arm is the manager of the eSwitch in embedded mode.

Keywords: Anti-spoofing counters, Smart mode, Embedded mode, DPU

Detected in version: 3.10.50000

3046630

Description: The device will stay associated with the currently installed driver when downgrading to a package that does not support the device.

Workaround: Uninstall the device and scan for new hardware. The device will appear as an unknown device.

Keywords: Installation, downgrade

Detected in version: 3.10.50000

3150126

Description: On ConnectX-4 and ConnectX-4 Lx, when using Hardware QoS Offload revision 2 and running RDMA from the VF, only the RX inbound counters will be increased in the RDMA activity counters, and the bytes and frames counters will be the same.

Note: There is no functional impact on the actual traffic; only the counter values are wrong.

Workaround: N/A

Keywords: Hardware QoS Offload, RDMA activity counters

Detected in version: 3.0.50000

3040551

Description: The Set/Get/Enable-NetAdapterEncapsulatedPacketTaskOffload PowerShell commands are not supported by default when working in NIC mode on NVIDIA BlueField-2 DPU.

These commands will fail in this mode because the encapsulation registry keys (*EncapsulatedPacketTaskOffload, *EncapsulatedPacketTaskOffloadNvgre, *EncapsulatedPacketTaskOffloadVxlan) are missing.

However, as the encapsulation is still enabled by default, the user can configure encapsulation without these commands.

Workaround: Manually add these keys. Note that the keys must be removed or disabled when switching to Smart NIC mode. For instructions on how to add the keys, please see Configuring the Driver Registry Keys.

Keywords: NVIDIA BlueField-2, NIC Mode, NVGRE, VXLAN

Detected in version: 2.90.50010

2891364

Description: When working with ConnectX-4 or ConnectX-4 Lx dual-port adapter cards, the value of the "EnableVmQoSOffloadRev2" registry key must be the same on both ports, otherwise one port will fail to load.

Workaround: Set the same value for the "EnableVmQoSOffloadRev2" registry key on both ports.

Keywords: EnableVmQoSOffloadRev2, VMQoS

Detected in version: 2.80.50000

2854943

Description: When using Hardware QoS Offload Rev 2 in VMQ mode with the VM traffic mapped to TC != 0, the rate limit will be enforced only on NVIDIA ConnectX-4 Lx. For all other devices, the rate limit will be enforced only for TC = 0.

Note: When in SR-IOV mode, it works for all devices as expected.

Workaround: N/A

Keywords: Hardware QoS Offload, VMQ

Detected in version: 2.80.50000

2302247

Description: mlx5cmd exposes the system GUID information of an NVIDIA BlueField Virtual Function irrespective of its trusted state.

Workaround: N/A

Keywords: mlx5cmd, VF, NVIDIA BlueField, GUID

Detected in version: 2.80.50000

2491846

Description: As oversubscription of QP parameters (entries and depth) is allowed, it could cause run-time failure when running out of resources.

Workaround: N/A

Keywords: QP creation

Detected in version: 2.70.50000

2380684

Description: Although the IPOIB failover team gets the correct DHCP address when first created, if the team is disabled and then enabled, Windows requests and rejects the DHCP address as BAD_ADDRESS.

Workaround: When the issue is seen, restart the secondary member(s) of the team.

Keywords: IPOIB teaming, DHCP

Detected in version: 2.70.50000

2603423

Description: When in ETH mode, setting the MTU (JumboPacket) lower than 1514 results in the Received Packets Error counters not being increased when receiving packets with a larger frame size that is less than or equal to 1518 bytes (such as a ping with a data size of 1476).

Workaround: N/A

Keywords: MTU, traffic, counters

Detected in version: 2.70.50000
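
The 1518-byte figure in issue 2603423 is consistent with the ping example if the frame is counted as Ethernet header plus IPv4 header plus ICMP header plus data, with no VLAN tag and no FCS. That accounting is an assumption, as the release notes do not spell it out. The arithmetic can be sketched as:

```python
ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType (no VLAN tag, no FCS)
IPV4_HEADER = 20      # IPv4 header without options
ICMP_HEADER = 8       # ICMP echo request header


def ping_frame_size(data_size: int) -> int:
    """Frame size in bytes for an IPv4 ping carrying data_size bytes of data."""
    return ETHERNET_HEADER + IPV4_HEADER + ICMP_HEADER + data_size


print(ping_frame_size(1476))  # 1518
```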

2374101

Description: After upgrade, *PtpHardwareTimestamp remains enabled. When *PtpHardwareTimestamp is enabled, the UDP performance feature (URO) will be automatically disabled.

This is an OS limitation. If you do not use the HW timestamp feature, it is recommended to disable it by setting *PtpHardwareTimestamp to 0.

Workaround: Disable HW timestamping by setting *PtpHardwareTimestamp to 0.

Keywords: *PtpHardwareTimestamp, UDP performance feature, URO

Detected in version: 2.60.50000

2306807

Description: When the Decouple VmSwitch protocol is enabled, VM's friendly given name is not displayed when running the "Get-NetAdapterSriovVf" and "mlnx5hpccmd -DriverVersion" commands.

Workaround: N/A

Keywords: HPC, SR-IOV

Detected in version: 2.60.50000

2205722

Description: WinOF-2 driver does not support IB MTU lower than 614.

Workaround: N/A

Keywords: IB MTU

Detected in version: 2.60.50000

2180714

Description: If the user configures TCP to priority 0 with no VlanID, the packets will be sent without a VLAN header, since the miniport cannot distinguish between priority 0 with VlanId 0 and no VLAN tag.

Workaround: N/A

Keywords: TCP QOS

Detected in version: 2.50.50000
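
One way to picture the ambiguity in issue 2180714 is through the 802.1Q Tag Control Information layout (3-bit priority, 1-bit DEI, 12-bit VLAN ID). The sketch below is not the driver's code: the helper name is hypothetical, and it assumes that "no tag requested" reaches the miniport as an all-zero value.

```python
def pack_tci(priority: int, vlan_id: int, dei: int = 0) -> int:
    """Pack 802.1Q Tag Control Information: 3-bit PCP, 1-bit DEI, 12-bit VID."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    if dei not in (0, 1):
        raise ValueError("dei must be 0 or 1")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("vlan_id must be in 0..4095")
    return (priority << 13) | (dei << 12) | vlan_id


# Assumption: an untagged request is represented as all zeros.
NO_TAG = 0

print(pack_tci(priority=0, vlan_id=0) == NO_TAG)  # True: same as "no tag"
print(pack_tci(priority=3, vlan_id=0) == NO_TAG)  # False: priority is visible
```

Under that assumption, only a non-zero priority or VLAN ID yields a value that differs from the untagged case.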

2216232

Description: As ConnectX-5 adapter cards do not create counters for RX PACKET MARKED PCIe BUFFERS, its value will be 0.

Workaround: N/A

Keywords: ECN Marking

Detected in version: 2.50.50000

2243909

Description: The driver sends a wrong CNP priority counter while running RDMA.

Workaround: Change the CNP priority using mlxconfig.

Keywords: RDMA, CNP

Detected in version: 2.50.50000

2118837

Description: Performance degradation might be experienced during UDP traffic when using container networking and the UDP message size is larger than the MTU size.

Workaround: N/A

Keywords: Nested Virtualization, container networking

Detected in version: 2.50.50000

2137585

Description: When working in IPoIB mode with *JumboPacket set in the range [256, 614], the driver issues a warning event log message (Event ID: 25). This is a false alarm and can be ignored.

Workaround: N/A

Keywords: JumboPacket

Detected in version: 2.50.50000

2148077

Description: Explicitly disabling the *NetworkDirect key when using the HyperV mode disables NDSPI as well as NDK.

Workaround: Enable NetworkDirect (ND).

Keywords: ND, HyperV

Detected in version: 2.50.50000

2117964

Description: A delay in connection establishment might be experienced when the ND application is started immediately after restarting the adapter card. This scenario occurs because the ND application requires the ARP table to find the destination MAC and generate the ARP request.

Workaround: Use static ARP. Ping the system before starting the ND application.

Keywords: ND, RDMA

Detected in version: 2.40.51000

2117636

Description: On a native setup, when setting JumboPacket to be less than 1514, the Large Receive Offload (LRO) feature might be disabled, and all its counters will not be valid.

Workaround: N/A

Keywords: LRO, RSC

Detected in version: 2.40.51000

2083686

Description: As PCIe Write Relaxed Ordering is enabled by default, some older Intel processors might observe up to 5% packet loss at high packet rates with small packets. (https://lore.kernel.org/patchwork/patch/820922/)

Workaround: Disable the Relaxed Ordering Write option by setting the RelaxedOrderingWrite registry key to 0 and restart the adapter.

Keywords: PCIe Write Relaxed Ordering

Detected in version: 2.40.50000

1763379

Description: On Windows Server 19H1, running "netstat -axn" when RDMA is enabled and a vNIC is present results in RDMA being disabled on the port with the VMSwitch.

Workaround: N/A

Keywords: VMSwitch, RDMA, Windows Server 2019

Detected in version: 2.40.50000

1908862

Description: When running RoCE traffic with a different RoceFrameSize configuration, and the fabric (jumbo packet size) is large enough, the MTU will be taken from the initiator even when it supports a larger size than the server.

Workaround: N/A

Keywords: RoCE, MTU

Detected in version: 2.40.50000

1846356

Description: The driver ignores the value set by the "*NumVfs" key. The maximal number of VFs is the maximal number of VFs supported by the hardware.

Workaround: N/A

Keywords: SR-IOV NUMVFs

Detected in version: 2.30.50000

1598716

Description: Issues with the OS's "SR-IOV PF/VF Backchannel Communication" mechanism in Windows Server 2019 Hyper-V affect the VF-Counters functionality as well.

Workaround: N/A

Keywords: Mellanox WinOF-2 VF Port Traffic, VF-Counters

Detected in version: 2.30.50000

1702662

Description: On Windows Server 2019, the physical media type of the IPoIB NIC will be 802.3 and not InfiniBand.

Workaround: Use the mlx5cmd tool ("mlx5cmd -stat"), which is part of the driver package, to display the link_layer type.

Keywords: Windows Server 2019, IPoIB NdisPhysicalMedium

Detected in version: 2.20

1718201

Description: Heavy traffic causes the Sniffer's file limit to be the same as the buffer size (100M by default).

Workaround: N/A

Keywords: Sniffer, heavy traffic

Detected in version: 2.20

1580985

Description: iSCSI boot over IPoIB is currently not supported.

Workaround: N/A

Keywords: iSCSI Boot, IPoIB

Detected in version: 2.10

1536971

Description: The RscIPv4 and RscIPv6 keys’ values are set to 0 for the host in Windows Server 2019. As the values for those keys are already written by the Inbox Driver in Windows Server 2019, they will not be changed when upgrading.

Workaround: N/A

Keywords: RscIPv4, RscIPv6, Windows Server 2019

Detected in version: 2.10

1336097

Description: Due to an OID timeout, the miniport reset is executed.

Workaround: Increase the timeout value in such a way that 2 * CheckForHangTOInSeconds > Max OID time.

For further information, refer to section General Registry Keys in the User Manual.

Keywords: Resiliency

Detected in version: 1.90
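
The inequality in the workaround for issue 1336097 (2 * CheckForHangTOInSeconds > Max OID time) can be turned into a small calculation. This is a sketch under assumptions: the function name is hypothetical, the key is assumed to take a whole number of seconds, and any minimum or maximum the driver enforces on the key is not considered here.

```python
import math


def min_check_for_hang_seconds(max_oid_seconds: float) -> int:
    """Smallest whole number x (seconds) such that 2 * x > max_oid_seconds."""
    if max_oid_seconds < 0:
        raise ValueError("max_oid_seconds must be non-negative")
    # floor(t / 2) is the largest integer x with 2 * x <= t,
    # so one more is the smallest x that makes the inequality strict.
    return math.floor(max_oid_seconds / 2) + 1


print(min_check_for_hang_seconds(10))    # 6, since 2 * 6 = 12 > 10
print(min_check_for_hang_seconds(10.5))  # 6, since 2 * 6 = 12 > 10.5
```

The longest OID time has to be measured on the affected system; the values above are placeholders.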

The table below summarizes the SR-IOV working limitations and the driver’s expected behavior in unsupported configurations.

| WinOF-2 Version | NVIDIA® ConnectX®-4 Firmware Ver. | Adapter Mode: InfiniBand, SR-IOV On | Adapter Mode: InfiniBand, SR-IOV Off | Adapter Mode: Ethernet, SR-IOV On/Off |
|---|---|---|---|---|
| Earlier versions | Up to 12.16.1020 | Driver will fail to load and show "Yellow Bang" in the device manager. | Driver will fail to load and show "Yellow Bang" in the device manager. | No limitations |
| 1.50 and 1.60 | Between 1x.16.1020 and 1x.19.2002 (IPoIB supported) | "Yellow Bang" unsupported mode - disable SR-IOV via mlxconfig | OK | No limitations |
| 1.70 and onwards | 1x.19.2002 and onwards (IPoIB supported) | OK | OK | No limitations |

For further information on how to enable/disable SR-IOV, please refer to section Single Root I/O Virtualization (SR-IOV).

© Copyright 2023, NVIDIA. Last updated on Nov 3, 2023.