NVIDIA ConnectX-8 SuperNIC Firmware Release Notes v40.47.1026

Bug Fixes in this Firmware Version

Internal Ref.

Issue

4570205

Description: Fixed a firmware issue where the ZTR_RTTCC algorithm parameters AI and HAI did not support a sufficient range.

Keywords: PCC, ZTR_RTTCC

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4629077

Description: Fixed an issue where coalescing regular SX events with SX RTT events under ZTR_RTTCC could keep improper event fields, which could impact congestion control behavior.

Keywords: PCC, ZTR_RTTCC

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4683328

Description: Fixed an issue in the ZTR_RTTCC algorithm where probe-abortion handling could behave improperly under high-stress network conditions. The fix ensures proper congestion control and stable traffic performance.

Keywords: PCC, ZTR_RTTCC

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4501554

Description: Fixed an assertion failure that could occur with the E-Switch uplink in specific configurations where the E-Switch was disabled and either Path Migration was active or GVMIs were using SRQ loopback in SQs. The issue occurred because the firmware attempted to perform cleanup operations when the uplink configuration lacked sufficient capacity.

Now, when the E-Switch is disabled and no actions are available in the uplink STE, the firmware connects to the uplink STE instead of copying it.

Keywords: Path migration, steering

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4506854

Description: Added Scaling Factor "read" field. To obtain correct values in mlxlink, MFT version 4.33.0 or later is required.

Keywords: Scaling Factor, mlxlink, MFT

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4468319

Description: Fixed an issue where the ConnectX-8 downstream port failed to send a NACK when rejecting an L1 entry request from the upstream port.

Keywords: NACK, downstream port

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4550782

Description: Fixed an issue on GB200 systems with two symmetrical ConnectX-8 SuperNICs, which caused DPN numbering differences on the HCA upstream port. Legacy drivers issued accesses with dpn=0,0,0, which could result in attempts to access the wrong DPN node in socket-direct systems.

The firmware now automatically determines the correct pcie_index based on the accessed link in direct-NIC systems.

Keywords: DPN numbering

Detected in version: 40.45.1020

Fixed in Release: 40.47.1026

4571079

Description: Fixed an issue where invoking the resourcedump tool with segment type DPA_PROCESS_LST returned invalid data when the parameter n1 == 1 and no processes existed on the current vhca_id.

The fix adds a proper check, and the resourcedump tool now reports the correct error in this scenario.

Keywords: DPA PROCESS, RESOURCE DUMP

Detected in version: 40.45.1020

Fixed in Release: 40.47.1026

4529293

Description: Fixed an issue where, during failover or restart, the SM sending a PortInfo MAD to the HCA firmware triggered reinitialization of port buffers, momentarily halting ingress traffic and causing packet drops.

The firmware now avoids reconfiguring port buffers when the new configuration matches the current one.

Keywords: OpenSM

Detected in version: 40.45.1020

Fixed in Release: 40.47.1026
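
The behavior described for issue 4529293 (skip the reconfiguration when the requested port buffer configuration matches the current one) can be illustrated with a minimal sketch. This is not the firmware implementation: the `PortBufferConfig` fields, the `Port` class, and the `reinit_count` counter are hypothetical names chosen only to show the compare-before-apply pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortBufferConfig:
    # Hypothetical fields; the real firmware structures are not described in these notes.
    buffer_sizes: tuple
    lossless_mask: int


class Port:
    def __init__(self, config):
        self.config = config
        self.reinit_count = 0  # counts disruptive buffer reinitializations

    def apply(self, new_config):
        """Reinitialize buffers only when the requested configuration differs."""
        if new_config == self.config:
            return False  # identical configuration: leave ingress traffic undisturbed
        self.config = new_config
        self.reinit_count += 1
        return True


port = Port(PortBufferConfig((64, 64), 0b01))
unchanged = port.apply(PortBufferConfig((64, 64), 0b01))  # same values, no reinit
changed = port.apply(PortBufferConfig((128, 64), 0b01))   # different values, reinit
```

In this sketch a repeated request with identical values returns early, so only a genuinely different configuration increments `reinit_count`.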

4641215

Description: Fixed a rare issue where MFRL operations could fail due to a timeout.

Keywords: MFRL

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4683346

Description: Fixed an issue where, under the ZTR_RTTCC algorithm, a flow that reached its minimum rate due to heavy congestion would not recover its rate once the congestion cleared.

Keywords: PCC, ZTR_RTTCC

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4575692

Description: Fixed an issue where a missing interrupt from the module IO (Expander) could prevent the module from being raised.

Keywords: Module IO (Expander)

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4620765

Description: Fixed an issue where reading debug registers could cause link BER (Bit Error Rate) degradation over time.

Keywords: BER

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4658434

Description: Fixed an issue where ports connected via 4 or 8 lanes and configured for 200G_2x (using only 2 lanes) would fail to link when using a mix of new firmware (with “Non Tx-Squelch” support) and older firmware versions.

Note: On both sides of the link, the Switch (local device) and the Switch/NIC (peer device), make sure to:

  • Deploy the new firmware release versions as a matched bundle on both Switch and NIC devices.

  • Configure the port to use 2 lanes (instead of 4 or 8 lanes) while keeping the 200G_2x speed setting.

Keywords: Port speed

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026
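
The two conditions in the note for issue 4658434 can be expressed as a small pre-deployment checklist. The sketch below is only an illustration: the `PortSide` record and its fields (`non_tx_squelch_fw`, `lanes`, `speed`) are hypothetical and must be filled in by the operator from their own inventory; it does not query any device.

```python
from dataclasses import dataclass


@dataclass
class PortSide:
    name: str
    non_tx_squelch_fw: bool  # hypothetical flag: firmware includes "Non Tx-Squelch" support
    lanes: int               # hypothetical: configured port width in lanes
    speed: str               # configured speed, e.g. "200G_2x"


def check_200g_2x_link(local, peer):
    """Return a list of findings; an empty list means both note conditions hold."""
    problems = []
    # Condition 1: the new firmware must be deployed on both sides as a matched bundle.
    if not (local.non_tx_squelch_fw and peer.non_tx_squelch_fw):
        problems.append("firmware with Non Tx-Squelch support is not on both sides")
    # Condition 2: a 200G_2x port must be configured for 2 lanes, not 4 or 8.
    for side in (local, peer):
        if side.speed == "200G_2x" and side.lanes != 2:
            problems.append(f"{side.name}: 200G_2x configured with {side.lanes} lanes")
    return problems


switch = PortSide("switch", non_tx_squelch_fw=True, lanes=2, speed="200G_2x")
old_nic = PortSide("nic", non_tx_squelch_fw=False, lanes=4, speed="200G_2x")
findings = check_200g_2x_link(switch, old_nic)
```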

4401684

Description: Fixed an issue in Arch diagnostic data counters where the pcie_link_outbound_data_bytes counter always returned zero.

Keywords: Arch diagnostic data counters

Detected in version: 40.45.1020

Fixed in Release: 40.47.1026

4575696

Description: Fixed an issue where multiple long-running process registers could cause aborted access and timeouts. The internal state is now properly handled.

Keywords: ibdiagnet2

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026

4583940

Description: Fixed an issue where enabling the CCMAD custom header on one PCC probe slot caused other slots to malfunction when multiple slots were configured.

Note: If using firmware versions older than the 40.47.10xx GA release, disable the CCMAD custom header when multiple probe slots are enabled.

Keywords: PCC CCMAD custom header

Detected in version: 40.46.1006

Fixed in Release: 40.47.1026
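
To decide whether the workaround in the note for issue 4583940 applies to a given device, the firmware version string can be compared numerically rather than as text. The following is a minimal sketch under stated assumptions: it treats the "Fixed in Release" value 40.47.1026 as the threshold, and the function names and the `probe_slots_enabled` parameter are hypothetical.

```python
FIXED_RELEASE = (40, 47, 1026)  # assumption: the "Fixed in Release" value is the cutoff


def parse_fw_version(text):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for ordered comparison."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"unexpected firmware version: {text!r}")
    return tuple(int(p) for p in parts)


def must_disable_ccmad_header(fw_version, probe_slots_enabled):
    """True when firmware predates the fix and more than one probe slot is enabled."""
    return parse_fw_version(fw_version) < FIXED_RELEASE and probe_slots_enabled > 1
```

For example, `must_disable_ccmad_header("40.46.1006", 2)` evaluates to `True`, while the same call with `"40.47.1026"` evaluates to `False`. Comparing integer tuples avoids the ordering mistakes that plain string comparison makes when version fields have different digit counts.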

4610740

Description: Fixed a firmware issue where a CQE error with vendor_syndrome RDE_MAL_WQE (0xd6) could cause traffic disruption on the affected QP.

Keywords: RDMA, transport

Detected in version: 40.45.1020

Fixed in Release: 40.47.1026

© Copyright 2025, NVIDIA. Last updated on Nov 20, 2025