Bug Fixes History

Warning

This section covers the bug fixes from the last 3 major releases. For the history of older releases, please refer to the Release Notes of the relevant firmware version at https://docs.mellanox.com/category/adapterfw.

Internal Ref.

Issue

2435442

Description: RDMA write may experience performance degradation when working with Adaptive Routing and DCT half-handshake mode.

Keywords: DC, DCT, AR

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2858666

Description: Fixed an issue in which the default value of TX_SCHEDULER_BURST was ignored when its value in the INI was different from "0".

Keywords: TX_SCHEDULER_BURST, NVCONFIG

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2802943

Description: Implemented SLD detection code. The Surprise Down Error Reporting Capable value was changed from 1 to 0 in boards where the downstream PERST was not controlled, thus causing SLD detection not to function properly.

Keywords: SLD detection, Surprise Down Error Reporting Capable

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2513453

Description: Fixed a rare lane skew issue that caused the CPU to time out in Rec.idle.

Keywords: PCIe

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2850198

Description: Fixed RDMA_WRITE traffic performance degradation that occurred when working with DC on an Adaptive Routing network.

Keywords: Performance, DC, AR

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2951894

Description: Fixed bad cache invalidations of destroyed QPs.

Keywords: destroy_qp

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2801850

Description: Fixed a rare case where asserts and ext_synd appeared in dmesg after a driver restart.

Keywords: Driver restart

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2822046

Description: Fixed an issue related to host isolation on multi-host systems.

Keywords: Multi-host systems, isolation

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2860409

Description: Enabled delay drop for hairpin packets. If a hairpin QP is created with delay_drop_en enabled, the feature will be enabled across all GVMIs, based on the delay drop status.

Keywords: Hairpin delay drop

Discovered in Version: 20.32.1010

Fixed in Release: 20.33.1048

2796324

Description: Fixed an issue that caused the firmware to get stuck and behave unexpectedly when an optical transceiver that supports RXLOS was connected and the remote side port was down.

Keywords: cables, RXLOS

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2748800

Description: Fixed an issue where a wrong definition of the RX_LOS polarity in the INI caused the link status to be reported incorrectly and, consequently, the link to go down.

Keywords: RX_LOS polarity

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2784304

Description: Fixed an issue that prevented the system from creating more than 128K QPs.

Keywords: QP

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2798556

Description: Modified the Effective BER calculation method. As a result, the reported Effective BER value is slightly higher, but the link quality remains the same as before this change.

Note: The Symbol BER is unchanged.

Keywords: BER

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2797035

Description: Removed an incorrect PCIe link-down indication in the AER registers on the PCIe switch upstream port.

Keywords: PCIe

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2678394

Description: Limited the external loopback speed according to the used module's capabilities.

Keywords: Cables

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2733080

Description: This firmware version includes the following PCIe changes:

  • Fixed the ACS Port Number field in DSPs and ACS Egress Control Vector field in DSPs

  • Added support for VSC on USP of PCIe Switch

  • Fixed the mapping of Legacy Interrupts in the PCIe Switch

  • Fixed MRRS & MPS configurations in DSPs

Keywords: PCIe

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2823281

Description: Fixed an issue that resulted in a wrong RNR timeout when trying to set it during the rts2rts_qp transition.

Keywords: RNR timeout

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2038112

Description: Fixed an issue that caused the flash frequency on boot to be lower than expected (under 50 MHz) by enabling the firmware to raise it to the normal frequency during boot2.

Note: On boards that use Winbond flash, the firmware is blocked if using a firmware that does not include this fix.

Keywords: Flash frequency, firmware boot, Winbond flash

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2798627

Description: Added support for DSFP AOC (CMIS) v4 when error code is not reported by the module.

Keywords: Cables

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2740158

Description: Fixed a race between DC QP flush and DC packets that led to stuck slices in the hardware. To avoid this situation, the firmware keeps the TCU drop set until the QP flush is done.

Keywords: DC

Discovered in Version: 20.31.1014

Fixed in Release: 20.32.1010

2648336

Description: Disabled the CNP counter "rp_cnp_ignored" (triggered by out-of-sequence (OOS) packets) when all ports are IB.

Note: In a mixed IB/ETH scenario, the behavior depends on the RoCE configuration. The counter on the IB port may still increase, but this does not affect regular use.

Keywords: CNP counter, IB

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2178949

Description: Improved PortXmitWait IB counter accuracy.

Keywords: Counters

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2641734

Description: Fixed the rate select mechanism in QSFP modules.

Keywords: Cables

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2600783

Description: Fixed classification issues for "Passive" cables to be more robust.

Keywords: Cables

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2574322

Description: Fixed an occasional performance issue affecting RC QPs that use E2E-credits (QPs not connected to an SRQ and carrying send/receive traffic) when the ROCE_ACCL tx_window was enabled.

Keywords: Bandwidth, performance

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2391109

Description: Fixed an issue that caused a fatal error, and eventually resulted in the HCA hanging when a packet was larger than a strided receive WQE that was being scattered.

Keywords: Strided RQ, MTU

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2569999

Description: Fixed a rare issue that caused RX pipe to hang.

Keywords: RX pipe

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2621704

Description: Fixed the handling of the resource number (a 64-bit number) so that it is no longer overwritten with a 32-bit number, which erased its high bits, when the resource number is de-allocated.

In this scenario, two resource numbers with identical low 32 bits resolved to the same idx once the high bits were cleared. As a result, the same idx was freed twice.

Keywords: Resource number size, free_4k page

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014
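The failure mode described for issue 2621704 can be illustrated with a minimal sketch. It is not firmware code: the function names, the way an idx is derived from a resource number, and the sample values are all assumptions made purely to show how dropping the high 32 bits of a 64-bit number can make two distinct resources resolve to the same idx.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def idx_truncated(resource_number: int) -> int:
    """Hypothetical buggy path: only the low 32 bits survive."""
    return resource_number & MASK32


def idx_full(resource_number: int) -> int:
    """Hypothetical fixed path: all 64 bits are preserved."""
    return resource_number & MASK64


def find_double_frees(resource_numbers, to_idx):
    """Free each resource once and report every idx that gets freed twice."""
    freed = set()
    double_frees = []
    for number in resource_numbers:
        idx = to_idx(number)
        if idx in freed:
            double_frees.append(idx)
        freed.add(idx)
    return double_frees


# Two illustrative resource numbers that differ only in their high 32 bits.
RES_A = (1 << 32) | 5
RES_B = (2 << 32) | 5

if __name__ == "__main__":
    print(find_double_frees([RES_A, RES_B], idx_truncated))
    print(find_double_frees([RES_A, RES_B], idx_full))
```

With the truncating helper, both sample numbers map to the same idx and the second free is reported as a double free. With the full-width helper, the two numbers stay distinct.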

2619161

Description: Initialized the rate table in the static configuration so that it is configured even in link-not-up scenarios.

Keywords: RoCE, static configuration, rate table

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2589430

Description: CRT_DCR with index larger than 1 << 21 can collide with the CRT_SW_RESERVED address.

Keywords: DCR

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2447160

Description: In InfiniBand non-virtualization systems, traffic fails after a warm reboot due to a corrupted steering root.

Keywords: Steering, Traffic

Discovered in Version: 20.30.1004

Fixed in Release: 20.31.1014

2684071

Description: Changing the default host chaining buffer size or WQE size (HOST_CHAINING_DESCRIPTORS, HOST_CHAINING_TOTAL_BUFFER_SIZE) using NVconfig might result in driver initialization failure.

Keywords: Host chaining

Discovered in Version: 20.29.2002

Fixed in Release: 20.30.1004

2507096

Description: Removed the option to create unnecessary internal CNP operation for the Lossy ADP retransmission feature.

Keywords: RoCE, Lossy, Adp_retrans

Discovered in Version: 20.29.1016

Fixed in Release: 20.30.1004

2447334

Description: Fixed an issue related to unused port LEDs when no cable is connected to the adapter card.

Keywords: Cables, LEDs

Discovered in Version: 20.29.1016

Fixed in Release: 20.30.1004

2444837

Description: Set the cap to 0 for high index functions to avoid too many parallel VF NODNIC functions.

Keywords: NODNIC, VF, ETH PXE

Discovered in Version: 20.29.1016

Fixed in Release: 20.30.1004
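To check whether a given firmware version already contains one of the fixes listed above, the "Fixed in Release" value can be compared with the running version. The following is a rough sketch only. It assumes that version strings are dotted numeric fields that order component by component, and that both versions belong to the same product family (the same leading field). The helper names are hypothetical and not part of any NVIDIA tool.

```python
from typing import Tuple


def parse_fw_version(text: str) -> Tuple[int, ...]:
    """Split a dotted version such as '20.32.1010' into integer fields."""
    return tuple(int(part) for part in text.strip().split("."))


def includes_fix(running: str, fixed_in: str) -> bool:
    """Return True if the running version is at or past the fixing release.

    Assumption: releases are ordered field by field, left to right.
    """
    return parse_fw_version(running) >= parse_fw_version(fixed_in)


if __name__ == "__main__":
    # Example: issue 2784304 is listed as fixed in release 20.32.1010.
    print(includes_fix("20.33.1048", "20.32.1010"))
    print(includes_fix("20.31.1014", "20.32.1010"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes when fields have different digit counts (for example, "20.9.x" versus "20.10.x").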

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.