NVIDIA® WinOF VPI Documentation v5.50.54000

Ethernet Related Troubleshooting

For further performance-related information, refer to the Performance Tuning Guide and to Performance Tuning and Counters.

Issue

Cause

Solution

Low performance.

The system configuration might not be optimal.

See section Performance Tuning and Counters to take advantage of Mellanox 10/40/56 Gbit NIC performance.

The driver fails to start.

There might have been an RSS configuration mismatch between the TCP stack and the Mellanox adapter.

  1. Open the event log and look under "System" for the "mlx4ethX" source.

  2. If found, enable RSS by running: "netsh int tcp set global rss=enabled".

Alternatively (less recommended, as it lowers performance):

  • Disable RSS on the adapter by running: "netsh int tcp set global rss = no dynamic balancing".

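Before changing the RSS setting, it can help to confirm the current state of the TCP stack. The following is a minimal Python sketch, not part of the driver package, that parses the output of "netsh int tcp show global". The function names are hypothetical, and the "Receive-Side Scaling State" line format is an assumption about the netsh output on the target Windows version.

```python
import re
import subprocess
import sys
from typing import Optional


def rss_state(show_global_output: str) -> Optional[str]:
    """Return the RSS state found in `netsh int tcp show global` output, or None."""
    # Assumed line format: "Receive-Side Scaling State          : enabled"
    match = re.search(
        r"^\s*Receive-Side Scaling State\s*:\s*(\S+)",
        show_global_output,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    return match.group(1).lower() if match else None


def query_rss_state() -> Optional[str]:
    """Run netsh on the local Windows host and parse its output."""
    result = subprocess.run(
        ["netsh", "int", "tcp", "show", "global"],
        capture_output=True,
        text=True,
        check=True,
    )
    return rss_state(result.stdout)


if __name__ == "__main__" and sys.platform == "win32":
    print("RSS state:", query_rss_state())
```

If the reported state is not "enabled", the mismatch described above is a plausible cause of the failed driver start.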
The driver fails to start and a yellow sign appears near the "Mellanox ConnectX 10Gb Ethernet Adapter" in the Device Manager display. (Code 10)

A hardware error might have occurred.

Disable and re-enable "Mellanox ConnectX Adapter" from the Device Manager display. If this does not work, contact support.

The driver fails to start and in the Event log, under the mlx4_bus source, the following error message appears: “RUN_FW command failed with error - 22”

A wrong firmware image might have been programmed on the adapter card.

See Firmware Upgrade.

No connectivity to a Fault Tolerance team while using network capture tools (e.g., Wireshark).

The network capture tool might have captured the network traffic of the non-active adapter in the team. This is not allowed since the tool sets the packet filter to "promiscuous", thus causing traffic to be transferred on multiple interfaces.

Close the network capture tool on the physical adapter card, and set it on the team interface instead.

No Ethernet connectivity on 10Gb adapters after activating Performance Tuning (part of the installation).

A TcpWindowSize registry value might have been added.

  • Remove the value key under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpWindowSize

Or

  • Set its value to 0xFFFF.
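To see whether the TcpWindowSize value is present before editing the registry, a read-only check along the following lines may be useful. This is a sketch under stated assumptions: the helper names are hypothetical, the value is assumed to be stored as a REG_DWORD, and the script only reports the state without modifying anything.

```python
import sys
from typing import Optional

TCPIP_PARAMS = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUE_NAME = "TcpWindowSize"
RECOMMENDED = 0xFFFF


def window_size_needs_fix(value: Optional[int]) -> bool:
    """True if the value exists and differs from 0xFFFF."""
    return value is not None and value != RECOMMENDED


def read_tcp_window_size() -> Optional[int]:
    """Read TcpWindowSize from the registry; return None if the value is absent."""
    import winreg  # Windows-only standard-library module

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TCPIP_PARAMS) as key:
        try:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
        except FileNotFoundError:
            return None
    return value


if __name__ == "__main__" and sys.platform == "win32":
    current = read_tcp_window_size()
    if window_size_needs_fix(current):
        # Assumes a REG_DWORD, so the value formats as an integer.
        print(f"{VALUE_NAME} = {current:#x}: remove it or set it to 0xFFFF")
    else:
        print(f"{VALUE_NAME} is absent or already 0xFFFF")
```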

Packets are being lost.

The port MTU might have been set to a value higher than the maximum MTU supported by the switch.

Change the MTU according to the maximum MTU supported by the switch.
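One way to check whether a given MTU survives the path through the switch is to send a ping with the "don't fragment" flag set and a payload sized to fill the MTU exactly. The sketch below builds such a Windows ping command. It assumes an IPv4 path without IP options (20-byte IP header plus 8-byte ICMP header) and treats the MTU as the IP-layer value; the adapter's configured value may also count the Ethernet header. The function names and the example address are hypothetical.

```python
from typing import List

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo request")
    return payload


def windows_ping_command(host: str, mtu: int) -> List[str]:
    """Build a Windows ping command: -f sets don't-fragment, -l sets payload size."""
    return ["ping", "-f", "-l", str(icmp_payload_for_mtu(mtu)), "-n", "1", host]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; replace it with a host behind the switch.
    print(" ".join(windows_ping_command("192.0.2.10", 9000)))
```

If the ping at the configured MTU fails while a smaller size succeeds, the port MTU likely exceeds what the switch supports.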

NVGRE changes made on a running VM are not propagated to the VM.

The configuration changes might not take effect until the OS is restarted.

Stop the VM, then perform any NVGRE configuration changes on the VM connected to the SR-IOV-enabled virtual switch.

© Copyright 2023, NVIDIA. Last updated on Oct 26, 2023.