NVIDIA WinOF-2 Documentation v24.07.50000

Known Issues

For the list of older Known Issues, please see the relevant Release Notes version.

Each entry below lists the internal reference number, followed by the issue description, workaround, keywords, and the version in which the issue was detected.

4030457

Description: This release does not support InfiniBand (IB) on Windows when using the ConnectX-7 MCX75310AAS-NEAT and MCX75310AAC-NEAT OPNs.

Workaround: N/A

Keywords: InfiniBand, Windows

Detected in version: 24.7.50000

3816081

Description: On Windows Server 2025/Windows Client 11 24H2, device installation may take a few minutes on a disabled device.

Workaround: N/A

Keywords: Installation

Detected in version: 24.7.50000

4013319

Description: ND polling returns an incorrect request context (ND2_RESULT.RequestContext).

Workaround: Use 2 CQs per QP (one for receive and one for send).

Keywords: ND polling

Detected in version: 24.7.50000

3974778

Description: The inbox driver of Windows Server 2025 and Windows Client 11 24H2 fails to load when REAL_TIME_TIMESTAMP_ENABLE in the firmware is set to 1.

By default, REAL_TIME_TIMESTAMP_ENABLE is set to 0.

Workaround: Disable the feature before downgrading to the Inbox driver.

Keywords: Inbox, 2.53, REAL_TIME_TIMESTAMP_ENABLE, Yellow bang, code 31

Detected in version: 24.7.50000

3876612

Description: When creating a SET over two InfiniBand ports, a Duplicate IP error is received and ping does not work.

Workaround: Perform the following sequence of operations (a PowerShell sketch follows the list):

  1. Create the SET over one InfiniBand port first.

  2. Assign the IP.

  3. Add the second port into the SET.
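
A minimal PowerShell sketch of this sequence, assuming two IPoIB adapters named "IB1" and "IB2" and an example IP address (all names and addresses below are placeholders):

    # 1. Create the SET over a single InfiniBand port.
    New-VMSwitch -Name "IBSet" -NetAdapterName "IB1" -EnableEmbeddedTeaming $true -AllowManagementOS $true

    # 2. Assign the IP to the host vNIC created for the switch.
    New-NetIPAddress -InterfaceAlias "vEthernet (IBSet)" -IPAddress 192.168.1.10 -PrefixLength 24

    # 3. Add the second InfiniBand port to the SET.
    Add-VMSwitchTeamMember -VMSwitchName "IBSet" -NetAdapterName "IB2"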

Keywords: InfiniBand, SET

Detected in version: 24.4.50000

3859759

Description: On Windows Server 2025/Windows 11 24H2, a disabled device remains disabled after driver installation.

Workaround: N/A

Keywords: Installation, Windows Server 2025/Windows 11 24H2

Detected in version: 24.4.50000

3888785

Description: In Windows Server 2019, when using SET over both ports of a dual-port device and ms_ndislwf is bound, sending OID_TCP_OFFLOAD_PARAMETERS to port 1 does not enable VXLAN Task Offload.

Workaround: Send OID_TCP_OFFLOAD_PARAMETERS in the following order: enable port 2 -> disable port 2 -> enable port 1 -> enable port 2.
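
The OID is normally issued by the OS when the encapsulation offload setting changes. One possible way to drive this sequence from PowerShell, assuming adapter names "Port1" and "Port2" (placeholders) and that toggling the offload with the inbox NetAdapter cmdlets results in OID_TCP_OFFLOAD_PARAMETERS being sent:

    # Enable/disable order follows the workaround above.
    Enable-NetAdapterEncapsulatedPacketTaskOffload -Name "Port2"
    Disable-NetAdapterEncapsulatedPacketTaskOffload -Name "Port2"
    Enable-NetAdapterEncapsulatedPacketTaskOffload -Name "Port1"
    Enable-NetAdapterEncapsulatedPacketTaskOffload -Name "Port2"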

Keywords: Switch Embedded Team, ETH, Windows Server 2019

Detected in version: 24.4.50000

3976810

Description: Occasionally, changing the IP address while restarting the driver may lead to a BSOD.

Workaround: Do not restart the driver while changing the IP.

Keywords: BSOD, Restart, Change IP, Reset

Detected in version: 24.4.50000

3876612

Description: When creating a SET over two ports on BlueField devices in DPU mode, a Duplicate IP error is received and ping does not work.

Workaround: Disable one of the ports and remove the IP address from that port.
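
For example, with placeholder names and addresses, the second SET member can be taken out of service as follows:

    # Remove the IP address from one of the SET member ports, then disable that port.
    Remove-NetIPAddress -InterfaceAlias "Port2" -IPAddress 192.168.1.11 -Confirm:$false
    Disable-NetAdapter -Name "Port2" -Confirm:$false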

Keywords: BlueField, DPU mode

Detected in version: 24.4.50000

3732709

Description: The inbox driver of Windows Server 2022/Windows Client 11 (2.42/2.53) adds the "roceframesize" key to the registry with a value of 1024, meaning the RoCE Frame Size is not changed automatically when changing the MTU size.

Note: In version 24.1 and above, the *NetworkDirectRoCEFrameSize key, which replaces roceframesize, is added automatically.

Workaround: To set the RoCE frame size automatically based on the MTU size, delete both the roceframesize and the *NetworkDirectRoCEFrameSize keys if they exist.

The keys can be deleted either before or after installing the new driver over the inbox driver.

Changes are applied only after the driver is restarted.
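
A sketch of removing both keys with PowerShell, assuming the adapter is named "Ethernet 2" (a placeholder) and that the keys are removable through Remove-NetAdapterAdvancedProperty; they can also be deleted directly under the adapter's registry class key:

    # Remove both RoCE frame-size keys so the frame size follows the MTU again.
    Remove-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword "roceframesize" -AllProperties -NoRestart
    Remove-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword "*NetworkDirectRoCEFrameSize" -AllProperties -NoRestart

    # Changes are applied only after the driver is restarted.
    Restart-NetAdapter -Name "Ethernet 2"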

Keywords: RoceFrameSize, *NetworkDirectRoCEFrameSize, WS2022, inbox driver, Windows 11

Detected in version: 24.1.50000

3682841

Description: Configuration of the RoCE MTU using the RoceFrameSize registry key does not work when the *NetworkDirectRoCEFrameSize key exists.

Workaround: To configure the RoCE MTU, use the *NetworkDirectRoCEFrameSize registry key instead of RoceFrameSize.
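
For example, to set the RoCE MTU to 4096 on an adapter named "Ethernet 2" (both values are placeholders):

    # Configure the RoCE MTU through the *NetworkDirectRoCEFrameSize advanced property.
    Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -AllProperties -RegistryKeyword "*NetworkDirectRoCEFrameSize" -RegistryValue 4096
    Restart-NetAdapter -Name "Ethernet 2"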

Keywords: RoCE MTU registry, *NetworkDirectRoCEFrameSize, RoceFrameSize

Detected in version: 24.1.50000

3554731

Description: On BlueField devices, the network adapter is disabled when performing a cold boot right after restarting the DPU.

Workaround: To resolve the issue, perform one of the following:

  • Wait for the DPU to load and link up before executing the cold boot.

  • Perform the cold boot together with a DPU restart, so that a separate DPU restart is not required after the cold boot.

Keywords: BlueField, cold boot, restart

Detected in version: 23.7.50000

3035275

Description: VMQoS statistics counter does not count RDMA traffic running on the PF.

Workaround: N/A

Keywords: VMQoS statistics counter, RDMA

Detected in version: 3.20.50010

3316447

Description: When using a firmware version between xx.34.1000 and xx.36.1000, the driver reports an incorrect number of SQs (one more than the actual number).

Workaround: Update the firmware version to xx.36.1000 or above.

Keywords: VMQoS SR-IOV

Detected in version: 3.20.50010

3240702

Description: Anti-spoofing counters are not supported on BlueField-2 in DPU mode.

Workaround: The Arm user can add anti-spoofing rules via Linux on the DPU, as the Arm side manages the eSwitch in embedded (DPU) mode.

Keywords: Anti-spoofing counters, Smart mode, Embedded mode, DPU

Detected in version: 3.10.50000

3046630

Description: The device will stay associated with the currently installed driver when downgrading to a package that does not support the device.

Workaround: Uninstall the device and scan for new hardware. The device will appear as an unknown device.

Keywords: Installation, downgrade

Detected in version: 3.10.50000

3150126

Description: On ConnectX-4 and ConnectX-4 Lx, when using Hardware QoS Offload revision 2 and running RDMA from the VF, only the RX inbound counters will be increased in the RDMA activity counters, and the bytes and frames counters will be the same.

Note: There is no functional impact on the actual traffic, only an incorrect counter value.

Workaround: N/A

Keywords: Hardware QoS Offload, RDMA activity counters

Detected in version: 3.0.50000

3040551

Description: The Set/Get/Enable-NetAdapterEncapsulatedPacketTaskOffload PowerShell commands are not supported by default when working in NIC mode on the NVIDIA BlueField-2 DPU.

These commands fail in this mode because the encapsulation registry keys (*EncapsulatedPacketTaskOffload, *EncapsulatedPacketTaskOffloadNvgre, *EncapsulatedPacketTaskOffloadVxlan) are missing.

However, as encapsulation is still enabled by default, the user can configure encapsulation without these commands.

Workaround: Manually add these keys; however, the keys must be removed or disabled when switching to Smart NIC mode. For instructions on how to add the keys, please see Configuring the Driver Registry Keys.
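
A sketch of adding the keys from PowerShell, assuming an adapter named "Ethernet 2" and a value of 1 (enabled) for each key; both the adapter name and the values are placeholders, and the authoritative values are described in Configuring the Driver Registry Keys:

    # Add the missing encapsulation offload keys (value 1 = enabled).
    New-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword "*EncapsulatedPacketTaskOffload" -RegistryValue "1" -RegistryDataType REG_SZ
    New-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword "*EncapsulatedPacketTaskOffloadNvgre" -RegistryValue "1" -RegistryDataType REG_SZ
    New-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword "*EncapsulatedPacketTaskOffloadVxlan" -RegistryValue "1" -RegistryDataType REG_SZ

    # Remove or disable these keys before switching back to Smart NIC mode.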

Keywords: NVIDIA BlueField-2, NIC Mode, NVGRE, VXLAN

Detected in version: 2.90.50010

2891364

Description: When working with ConnectX-4 or ConnectX-4 Lx dual-port adapter cards, the value of the "EnableVmQoSOffloadRev2" registry key must be the same on both ports; otherwise, one port will fail to load.

Workaround: Set the same value for the "EnableVmQoSOffloadRev2" registry key on both ports.
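
For example, with placeholder port names, both ports can be set to the same value (here 1) and then restarted:

    # Keep EnableVmQoSOffloadRev2 identical on both ports of the dual-port adapter.
    Set-NetAdapterAdvancedProperty -Name "Port1" -AllProperties -RegistryKeyword "EnableVmQoSOffloadRev2" -RegistryValue 1
    Set-NetAdapterAdvancedProperty -Name "Port2" -AllProperties -RegistryKeyword "EnableVmQoSOffloadRev2" -RegistryValue 1
    Restart-NetAdapter -Name "Port1","Port2"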

Keywords: EnableVmQoSOffloadRev2, VMQoS

Detected in version: 2.80.50000

2854943

Description: When using Hardware QoS Offload Rev 2 in VMQ mode and the VM traffic is mapped to TC != 0, the rate limit is enforced only on NVIDIA ConnectX-4 Lx. For all other devices, the rate limit is enforced only for TC = 0.

Note: When in SR-IOV mode, it works for all devices as expected.

Workaround: N/A

Keywords: Hardware QoS Offload, VMQ

Detected in version: 2.80.50000

2302247

Description: mlx5cmd exposes the system GUID information of an NVIDIA BlueField Virtual Function irrespective of its trusted state.

Workaround: N/A

Keywords: mlx5cmd, VF, NVIDIA BlueField, GUID

Detected in version: 2.80.50000

2491846

Description: As oversubscription of QP parameters (entries and depth) is allowed, it may cause a run-time failure when resources are exhausted.

Workaround: N/A

Keywords: QP creation

Detected in version: 2.70.50000

2380684

Description: Although the IPoIB failover team gets the correct DHCP address when first created, if the team is disabled and then re-enabled, Windows requests the DHCP address and rejects it as BAD_ADDRESS.

Workaround: When the issue is seen, restart the secondary member(s) of the team.
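
For example, assuming the secondary team member is the adapter named "IB2" (a placeholder):

    # Restart the secondary member of the IPoIB failover team.
    Restart-NetAdapter -Name "IB2"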

Keywords: IPOIB teaming, DHCP

Detected in version: 2.70.50000

2603423

Description: In ETH mode, setting the MTU (JumboPacket) lower than 1514 results in the Received Packets Error counters not being increased when receiving packets with a larger frame size that is still less than or equal to 1518 bytes (for example, a ping with a data size of 1476).

Workaround: N/A

Keywords: MTU, traffic, counters

Detected in version: 2.70.50000

2374101

Description: After an upgrade, *PtpHardwareTimestamp remains enabled. When *PtpHardwareTimestamp is enabled, the UDP performance feature (URO) is automatically disabled.

This is an OS limitation. If you do not use the HW timestamp feature, it is recommended to disable it by setting *PtpHardwareTimestamp to 0.

Workaround: Disable HW timestamping by setting *PtpHardwareTimestamp to 0.
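
A sketch of disabling it on an adapter named "Ethernet 2" (a placeholder):

    # Turn off PTP hardware timestamping so that URO is not disabled.
    Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -AllProperties -RegistryKeyword "*PtpHardwareTimestamp" -RegistryValue 0
    Restart-NetAdapter -Name "Ethernet 2"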

Keywords: *PtpHardwareTimestamp, UDP performance feature, URO

Detected in version: 2.60.50000

2306807

Description: When the Decouple VmSwitch protocol is enabled, the VM's friendly name is not displayed when running the "Get-NetAdapterSriovVf" and "mlnx5hpccmd -DriverVersion" commands.

Workaround: N/A

Keywords: HPC, SR-IOV

Detected in version: 2.60.50000

2205722

Description: The WinOF-2 driver does not support an IB MTU lower than 614.

Workaround: N/A

Keywords: IB MTU

Detected in version: 2.60.50000

2180714

Description: If the user configures TCP to priority 0 with no VlanID, the packets are sent without a VLAN header, since the miniport cannot distinguish between priority 0 with VlanID 0 and no VLAN tag.

Workaround: N/A

Keywords: TCP QOS

Detected in version: 2.50.50000

2216232

Description: As ConnectX-5 adapter cards do not create counters for RX PACKET MARKED PCIe BUFFERS, the counter value will be 0.

Workaround: N/A

Keywords: ECN Marking

Detected in version: 2.50.50000

2243909

Description: The driver sends an incorrect CNP priority counter while running RDMA.

Workaround: Change the CNP priority using mlxconfig.
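
A hedged example of changing the CNP priority with mlxconfig; the device name and the parameter names (for example CNP_802P_PRIO_P1/CNP_802P_PRIO_P2) are assumptions to verify with a query on your device first:

    # Query the current per-port CNP priority, then set it (example value 6; placeholder device name).
    mlxconfig -d mt4125_pciconf0 query CNP_802P_PRIO_P1 CNP_802P_PRIO_P2
    mlxconfig -d mt4125_pciconf0 set CNP_802P_PRIO_P1=6 CNP_802P_PRIO_P2=6

    # A firmware reset or host reboot is required for the new value to take effect.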

Keywords: RDMA, CNP

Detected in version: 2.50.50000

2118837

Description: Performance degradation might be experienced during UDP traffic when using container networking and the UDP message size is larger than the MTU size.

Workaround: N/A

Keywords: Nested Virtualization, container networking

Detected in version: 2.50.50000

2137585

Description: While working in IPoIB mode with *JumboPacket set in the range [256, 614], the driver issues a warning event log message (Event ID: 25). This is a false alarm and can be ignored.

Workaround: N/A

Keywords: JumboPacket

Detected in version: 2.50.50000

2148077

Description: Explicitly disabling the *NetworkDirect key when using Hyper-V mode disables NDSPI as well as NDK.

Workaround: Enable NetworkDirect (ND).

Keywords: ND, HyperV

Detected in version: 2.50.50000

2117964

Description: A delay in connection establishment might be experienced when the ND application is started immediately after restarting the adapter card. This scenario occurs because the ND application requires the ARP table to find the destination MAC and generate the ARP request.

Workaround: Use a static ARP entry, or ping the peer system before starting the ND application.
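
Either option can be scripted; for example, with placeholder interface, IP, and MAC values:

    # Option 1: add a static (permanent) ARP entry for the remote peer.
    New-NetNeighbor -InterfaceAlias "Ethernet 2" -IPAddress 192.168.1.20 -LinkLayerAddress "00-11-22-33-44-55" -State Permanent

    # Option 2: ping the peer once so its ARP entry is resolved before the ND application starts.
    Test-Connection -ComputerName 192.168.1.20 -Count 1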

Keywords: ND, RDMA

Detected in version: 2.40.51000

2117636

Description: On a native setup, when setting JumboPacket to less than 1514, the Large Receive Offload (LRO) feature might be disabled, and none of its counters will be valid.

Workaround: N/A

Keywords: LRO, RSC

Detected in version: 2.40.51000

2083686

Description: As PCIe Write Relaxed Ordering is enabled by default, some older Intel processors might observe up to 5% packet loss at high packet rates with small packets. (https://lore.kernel.org/patchwork/patch/820922/)

Workaround: Disable the Relaxed Ordering Write option by setting the RelaxedOrderingWrite registry key to 0 and restart the adapter.
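
A sketch with a placeholder adapter name, using the RelaxedOrderingWrite keyword named in the workaround:

    # Disable PCIe Relaxed Ordering Write, then restart the adapter to apply the change.
    Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -AllProperties -RegistryKeyword "RelaxedOrderingWrite" -RegistryValue 0
    Restart-NetAdapter -Name "Ethernet 2"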

Keywords: PCIe Write Relaxed Ordering

Detected in version: 2.40.50000

1763379

Description: On Windows Server 19H1, running "netstat -axn" when RDMA is enabled and a vNIC is present results in RDMA being disabled on the port with the VMSwitch.

Workaround: N/A

Keywords: VMSwitch, RDMA, Windows Server 2019

Detected in version: 2.40.50000

1908862

Description: When running RoCE traffic with different RoceFrameSize configurations, and the fabric (jumbo packet size) is large enough, the MTU will be taken from the initiator even when it supports a larger size than the server.

Workaround: N/A

Keywords: RoCE, MTU

Detected in version: 2.40.50000

1846356

Description: The driver ignores the value set by the "*NumVfs" key. The maximum number of VFs is the maximum number supported by the hardware.

Workaround: N/A

Keywords: SR-IOV NUMVFs

Detected in version: 2.30.50000

1598716

Description: Issues with the OS's "SR-IOV PF/VF Backchannel Communication" mechanism in Windows Server 2019 Hyper-V affect VF-Counters functionality as well.

Workaround: N/A

Keywords: Mellanox WinOF-2 VF Port Traffic, VF-Counters

Detected in version: 2.30.50000

1702662

Description: On Windows Server 2019, the physical media type of the IPoIB NIC will be 802.3 and not InfiniBand.

Workaround: Use the mlx5cmd tool ("mlx5cmd -stat"), which is part of the driver package, to display the link_layer type.

Keywords: Windows Server 2019, IPoIB NdisPhysicalMedium

Detected in version: 2.20

1718201

Description: Heavy traffic causes the Sniffer's limit file to be the same as the buffer size (100M by default).

Workaround: N/A

Keywords: Sniffer, heavy traffic

Detected in version: 2.20

1580985

Description: iSCSI boot over IPoIB is currently not supported.

Workaround: N/A

Keywords: iSCSI Boot, IPoIB

Detected in version: 2.10

1536971

Description: The RscIPv4 and RscIPv6 keys’ values are set to 0 for the host in Windows Server 2019. As the values for those keys are already written by the Inbox Driver in Windows Server 2019, they will not be changed when upgrading.

Workaround: N/A

Keywords: RscIPv4, RscIPv6, Windows Server 2019

Detected in version: 2.10

1336097

Description: Due to an OID timeout, a miniport reset is executed.

Workaround: Increase the timeout value such that 2 * CheckForHangTOInSeconds > the maximum OID time.

For further information, refer to the General Registry Keys section in the User Manual.
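
A sketch of raising the value, assuming a placeholder adapter name and that 600 seconds satisfies 2 * CheckForHangTOInSeconds > the maximum OID time in your environment:

    # Increase the check-for-hang timeout (in seconds) and restart the adapter.
    Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -AllProperties -RegistryKeyword "CheckForHangTOInSeconds" -RegistryValue 600
    Restart-NetAdapter -Name "Ethernet 2"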

Keywords: Resiliency

Detected in version: 1.90

The table below summarizes the SR-IOV limitations and the driver's expected behavior in unsupported configurations.

WinOF-2 Version | NVIDIA® ConnectX®-4 Firmware Ver. | InfiniBand, SR-IOV On | InfiniBand, SR-IOV Off | Ethernet, SR-IOV On/Off
Earlier versions | Up to 12.16.1020 | Driver will fail to load and show "Yellow Bang" in the device manager | Driver will fail to load and show "Yellow Bang" in the device manager | No limitations
1.50 and 1.60 | Between 1x.16.1020 and 1x.19.2002 (IPoIB supported) | "Yellow Bang" - unsupported mode; disable SR-IOV via mlxconfig | OK | No limitations
1.70 and onwards | 1x.19.2002 and onwards (IPoIB supported) | OK | OK | No limitations

For further information on how to enable/disable SR-IOV, please refer to section Single Root I/O Virtualization (SR-IOV).

© Copyright 2024, NVIDIA. Last updated on Sep 18, 2024.