MFT Known Issues

NVIDIA ConnectX-5 Adapter Cards Firmware Release Notes v16.35.3502 LTS

The following table provides a list of known issues and limitations.

Internal Ref. No.

Issue

3549141

Description: mlxfwreset usage by the Prometheus PCIe switch is currently not supported.

Workaround: N/A

Keywords: mlxfwreset, Prometheus PCIe switch

Discovered in Version: 4.25.0

3262855

Description: The mlxfwreset tool might fail when using PPC64LE on the RH 8.7 operating system.

Workaround: N/A

Keywords: mlxfwreset, PPC64LE, RH 8.7

Discovered in Version: 4.25.0

3090162

Description: The PCIe Error Injection feature is not supported due to a security limitation.

Workaround: N/A

Keywords: PCI Error Injection

Discovered in Version: 4.22.0

3446066

Description: When using ConnectX-7 and later cards, the link must be fully down (not in polling state) before the loopback configuration can be applied.

Workaround: N/A

Keywords: mlxlink

Discovered in Version: 4.23.0

3352983

Description: mlxfwreset does not work on MLNX-OS, SONiC, or Cumulus.

Workaround: N/A

Keywords: mlxfwreset

Discovered in Version: 4.23.0

3418112

Description: Loading new firmware may require running mlxfwreset and, in some cases, rebooting or power-cycling the system.

Workaround: N/A

Keywords: mlxfwreset

Discovered in Version: 4.24.0

3314750

Description: When entering link speed values, you can specify a single value (e.g., "HDR") or a list of values separated by commas (e.g., "HDR, FDR, SDE"). In the current MFTshell version, the autocomplete feature that suggests possible values works only for the first value in a comma-separated list.

Additionally, the autocompletion list includes all possible speeds, some of which may not be supported by the device. After you run the command, you are notified if the selected speed is not supported.

Affected shell commands are:

port speed

port autonegotiation on speed

port autonegotiation off speed

Any inconvenience caused by these limitations will be addressed in future MFTshell updates.

Workaround: When entering a link speed, you may press the key first. This will provide you with all possible values. You can then select the desired link speed value, copy and paste it into the command prompt, and type a comma (,) to select the next speed. Repeat the process to form a list of all desired link speed values separated by commas.

Once the command is fired, the underlying MFT tool will inform you if the selected speed is not supported by the device.

In addition, the help context for the affected shell commands includes detailed explanations of the available options.

Keywords: mft-shell, link-speed

Discovered in Version: 4.23.0
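
The manual workaround above builds the comma-separated list one value at a time. As a minimal sketch of the same idea outside the shell, the helper below normalizes and joins speed names before they are pasted into the prompt. The function name, the `KNOWN_SPEEDS` set, and the separator are assumptions made for illustration; the real list of values is the one the shell's autocompletion shows, and device support is still checked only when the command runs.

```python
def build_speed_list(speeds, known_speeds, separator=","):
    """Join link speed names into one comma-separated string.

    `known_speeds` is a caller-supplied set (hypothetical here); it is not
    read from MFT and does not reflect what a given device supports.
    """
    cleaned = []
    for speed in speeds:
        name = speed.strip().upper()
        if name not in known_speeds:
            raise ValueError(f"unknown link speed: {speed!r}")
        if name not in cleaned:  # drop duplicates, keep first-seen order
            cleaned.append(name)
    return separator.join(cleaned)


# Hypothetical example set; replace with the values the shell suggests.
KNOWN_SPEEDS = {"SDR", "FDR", "EDR", "HDR", "NDR"}

print(build_speed_list(["hdr", " FDR"], KNOWN_SPEEDS))
```

The resulting string can be pasted after one of the affected commands, which avoids relying on autocompletion for the second and later values.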

3188577

Description: Some firmware scratchpad registers have been moved to a different location. Therefore, if you use your own utility to dump mstdumps, you must update your CSV file with the latest CSV and CSV2 files included in the MFT package.

Otherwise, the mstdumps device will not retrieve the firmware version, and the FAEs will not be able to use NVIDIA internal tools to debug the error.

Workaround: N/A

Keywords: CSV, mstdump

Discovered in Version: 4.22.0

2787479

Description: mlxcables shows the wrong firmware version for OSFP cables.

Workaround: N/A

Keywords: mlxcables, OSFP, firmware version

Discovered in Version: 4.18.0

2823492

Description: mlxfwreset is not supported on DPU with GPU boards.

Workaround: N/A

Keywords: mlxfwreset

Discovered in Version: 4.18.0

2715716

Description: mlxfwreset is not supported on secure-boot host devices.

Workaround: N/A

Keywords: mlxfwreset

Discovered in Version: 4.18.0

2752916

Description: Information for the IB and ETH protocols should not be stored in the same CSV file. Doing so results in a mismatch in the columns of the CSV file.

Workaround: N/A

Keywords: mlxlink

Discovered in Version: 4.18.0
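
One way to follow this limitation when post-processing collected link data is to route each protocol to its own file. The sketch below assumes records are already available as Python dictionaries with a `protocol` key; the field names and file names are hypothetical and are not the columns that mlxlink itself produces.

```python
import csv
import os


def write_per_protocol(records, out_dir):
    """Write each protocol's records to a separate CSV file.

    Keeping IB and ETH rows apart means every file has one consistent
    header. Returns a mapping of protocol name to the file written.
    """
    by_protocol = {}
    for record in records:
        by_protocol.setdefault(record["protocol"], []).append(record)

    paths = {}
    for protocol, rows in by_protocol.items():
        path = os.path.join(out_dir, f"link_{protocol.lower()}.csv")
        with open(path, "w", newline="") as handle:
            # Header comes from the first row of this protocol only.
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        paths[protocol] = path
    return paths
```

For example, passing one record with `"protocol": "IB"` and another with `"protocol": "ETH"` yields two files, each with its own column set.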

2838222

Description: mlxfwreset is not supported on kernel 3.10.0-1062.el7.x86_64 due to a kernel bug that causes the PCI 'rescan' operation to take a few minutes.

Workaround: N/A

Keywords: mlxfwreset

Discovered in Version: 4.18.0
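
Automation that calls mlxfwreset can guard against this kernel before attempting a reset. The following is a minimal sketch; the function name is an assumption, and it checks only the single kernel release named in this issue, not any other mlxfwreset limitation in this table.

```python
import platform

# Kernel release named in this known issue.
AFFECTED_KERNEL = "3.10.0-1062.el7.x86_64"


def kernel_allows_fw_reset(release=None):
    """Return False when running on the kernel release listed above.

    `release` defaults to the running kernel, as reported by
    platform.release(); pass a string to check another host's value.
    """
    if release is None:
        release = platform.release()
    return release != AFFECTED_KERNEL


if not kernel_allows_fw_reset():
    print("mlxfwreset is not supported on this kernel; plan a reboot instead.")
```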

2703663

Description: Running flint commands on the hypervisor while a Virtual Machine is running with the same device (pass-through) may cause a kernel panic.

Workaround: N/A

Keywords: flint, kernel, VM

Discovered in Version: 4.17.0

2670833

Description: Burning firmware using DMA might fail on virtual FreeBSD machines.

Workaround: N/A

Keywords: Firmware burning, DMA, FreeBSD, VM

Discovered in Version: 4.17.0

2484780

Description: Configuring TX/RX_rate to 200GbE in test mode fails.

Workaround: To work with the new speeds, specify the number of lanes as shown below:

  • 100G_1X/200G_2X/400G_4X/800G_8X for NDR speeds

  • 50G_1X/100G_2X/200G_4X/400G_4X for HDR speeds

Keywords: 200GbE, Tx/Rx

Discovered in Version: 4.17.0
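
The speed values in the workaround above appear to follow a `<rate>G_<lanes>X` pattern. Under that assumption (the pattern is inferred from the listed values, not taken from a formal grammar), a script can check a value before passing it to the tool. The function names below are hypothetical.

```python
import re

# Assumed token shape, inferred from values such as "200G_4X".
_SPEED_TOKEN = re.compile(r"^(\d+)G_(\d+)X$")


def parse_speed_token(token):
    """Split a token such as '200G_4X' into (rate in Gb/s, lane count)."""
    match = _SPEED_TOKEN.match(token.strip().upper())
    if match is None:
        raise ValueError(f"expected <rate>G_<lanes>X, got {token!r}")
    return int(match.group(1)), int(match.group(2))


def per_lane_rate(token):
    """Return the per-lane rate in Gb/s implied by the token."""
    rate, lanes = parse_speed_token(token)
    return rate / lanes
```

A bare value without a lane count, such as `200G`, is rejected by this check, which mirrors the failure described in the issue.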

2392334

Description: Installing MFT with the --with-pcap option, which installs the stedump utility, requires the following third-party dependencies:

  • Libraries and header files for the libpcap library

  • Libraries and header files for Python development library

  • Package Installer for Python (PIP) available

Workaround: To install the third-party dependencies, perform the following:

  1. Install libpcap-devel (libpcap-dev on Debian-based distributions).

  2. Install python3-devel (python3-dev on Debian-based distributions).

  3. Bootstrap the PIP installer in one of the following ways:

Keywords: stedump utility

Discovered in Version: 4.16.0
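
Before running the installation, it can help to confirm that the three dependencies listed above are present. The sketch below only checks; it does not install anything. The header search directories are assumptions about a typical Linux layout and may differ on your distribution, and the function name is hypothetical.

```python
import importlib.util
import os
import sysconfig


def check_stedump_prerequisites(
    pcap_header_dirs=("/usr/include", "/usr/local/include"),
):
    """Report which of the three listed dependencies appear to be present.

    The directories searched for pcap.h are an assumption; adjust them
    if your distribution installs headers elsewhere.
    """
    python_include = sysconfig.get_paths()["include"]
    return {
        "libpcap headers": any(
            os.path.isfile(os.path.join(directory, "pcap.h"))
            for directory in pcap_header_dirs
        ),
        "Python headers": os.path.isfile(
            os.path.join(python_include, "Python.h")
        ),
        "pip": importlib.util.find_spec("pip") is not None,
    }


for name, present in check_stedump_prerequisites().items():
    print(f"{name}: {'found' if present else 'missing'}")
```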

2376425

Description: The Direct Device Assignment (DDA, a.k.a. pass-through) facility is not supported in MFT; its usage may cause the host to reboot.

Workaround: Burn the firmware in PF and then attach the HCA to the VM.

Keywords: DDA

Discovered in Version: 4.16.0

2208845/2099263

Description: mlxlink does not support test mode for 50GE-KR4 speed.

Workaround: N/A

Keywords: mlxlink

Discovered in Version: 4.16.0

-

Description: Port toggling with Inband devices using mlxlink fails and the following error is presented: "Unknown MAD error".

Workaround: To avoid this issue, do one of the following:

  • Use OpenSM (with or without -o)

  • Use only active ports

Keywords: Port toggling, mlxlink, Inband devices

Discovered in Version: 4.14.0-105

2234589

Description: For Multi-Host systems, enabling the PRBS test mode causes network connectivity disconnection.

Workaround: Keep another interface available for re-enabling the link.

Keywords: mlxlink

Discovered in Version: 4.15.0

2167841

Description: "mlxfwmanager --download" and "mlxfwmanager --online" commands are currently not functional on ESXi 7.0.

Workaround: N/A

Keywords: mlxup/mlxfwmanager

Discovered in Version: 4.14.3

2149437

Description: When the SLTP configuration is set incorrectly, only an error indication is presented to the user; the “Bad status” explanation is not shown.

Workaround: N/A

Keywords: SLTP configuration

Discovered in Version: 4.14.2

1780276

Description: "mst server start" runs in the foreground instead of in the background on FreeBSD and VMware ESXi OSes.

Workaround: Append '&' to run the command in the background: 'mst server start &'

Keywords: 'mst server start', FreeBSD, VMware ESXi

Discovered in Version: 4.14.0-105
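
When the server is started from a script rather than an interactive shell, the same effect as appending '&' can be approximated by launching the command without waiting for it. This is a minimal sketch for POSIX systems; the helper name is an assumption, and discarding the output streams is a design choice of the sketch rather than something MFT requires.

```python
import subprocess


def start_in_background(command):
    """Start a command and return immediately, similar to 'command &'.

    Output is discarded and the child gets its own session so that it
    is not tied to the caller's terminal (POSIX only).
    """
    return subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )


# Usage on an affected host (not executed here):
#   server = start_in_background(["mst", "server", "start"])
```

The returned `Popen` object can be kept to stop the server later with `terminate()`.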

2001890

Description: The argparse module is installed by default in Python versions >=2.7 and >=3.2. If an older Python version is used, the argparse module is not installed by default and must be installed manually.

Workaround: N/A

Keywords: Python, argparse module

Discovered in Version: 4.13.3
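
A script can detect this condition up front instead of failing at import time. The following sketch encodes the version rule stated above; the function name is an assumption, and the message printed on failure is only an illustration.

```python
import sys


def argparse_is_bundled(version_info=None):
    """Return True if this Python version ships argparse by default.

    argparse is part of the standard library from Python 2.7 on the
    2.x line and from Python 3.2 on the 3.x line.
    """
    major, minor = (version_info or sys.version_info)[:2]
    if major == 2:
        return minor >= 7
    if major == 3:
        return minor >= 2
    return major > 3


try:
    import argparse  # noqa: F401  (imported only to test availability)
except ImportError:
    print("argparse is missing; install it manually for this Python version.")
```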

1923665 / 1939791

Description: Force Mode does not work when using mlxlink in ConnectX-6 InfiniBand adapter cards.

Workaround: N/A

Keywords: mlxlink, Force Mode, ConnectX-6 IB

Discovered in Version: 4.13.3

1802662

Description: Due to the mst signing process, some executions might be slower than expected.

Workaround: N/A

Keywords: mst

Discovered in Version: 4.13.0

1431471

Description: In ConnectX-5 adapter cards, the time-stamp capability using flint is supported only on the device (using the "-d" flag), and not on the binary (using the "-i" flag).

Workaround: Use the “-d” flag to set the time-stamp.

Keywords: flint

Discovered in Version: 4.11.0

1442454

Description: Occasionally, when running mstfwreset over a Multi-Host device, the driver remains down if the mstfwreset operation fails.

Workaround: N/A

Keywords: mstfwreset

Discovered in Version: 4.11.0

-

Description: Running mstfwreset on ConnectX-5 Socket-Direct adapter cards on Windows OS is currently not functional.

Workaround: Reboot the server

Keywords: mstfwreset, ConnectX-5 Socket-Direct

Discovered in Version: 4.8.0

© Copyright 2023, NVIDIA. Last updated on Oct 23, 2023.