NVIDIA ConnectX-7 Adapter Cards Firmware Release Notes v28.40.1000

Bug Fixes History

Warning

This section covers the bug fix history of the last three major releases. For the history of older releases, refer to the release notes of the relevant firmware versions.

Internal Ref.

Issue

3652874

Description: Fixed firmware measurements calculation.

Keywords: Firmware measurements calculation

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3664415

Description: Fixed an issue that caused Live Migration to hang during the "save" stage.

Keywords: Live migration

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3629353

Description: Fixed the cr_space in the port configuration to prevent incorrect CQE timestamps.

Keywords: Hardware timestamp

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3582559

Description: Added support for LED scheme #2 to MCX750500B-0D0K / MCX750500B-0D00 adapter cards.

Keywords: LED

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3669258

Description: Fixed a rare issue that prevented changes in mlxconfig from taking effect upon warm reboot.

Keywords: mlxconfig

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3670719 / 3676590

Description: Added a small delay after the power-up process to fix an issue that occasionally caused the module to be unstable after power-up.

Keywords: Link up

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3629562

Description: Fixed a code mismatch in the handling of the link-down cause when remote faults were received.

Keywords: Link down

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3532508

Description: Fixed a wrong parameter in the cable info MAD that resulted in unnecessary messages in the log.

Keywords: Cable info MAD

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3634350

Description: Disabled PCI power event messages on OCP 3.0 adapter cards according to the spec requirements.

Keywords: PCI, OCP 3.0

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3636714

Description: Fixed an issue that caused the PLDM firmware update buffer, used for pending NIC requests, not to be properly locked in the case of PLDM-over-NC-SI and consequently to be corrupted by other flows.

Keywords: PLDM, buffer

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3592276

Description: Fixed an issue that prevented MSI interrupts from being advertised correctly, resulting in the wrong MSI being sent.

Keywords: MSI

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3605363

Description: "Get Temperature" OEM command now always returns a unified temperature.

Keywords: Temperature

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3531972

Description: Changed the BAR configuration algorithm so that, when the host configures the same BAR address for two different PFs, the last update to the BAR address is the one that takes effect.

Keywords: Network Interface

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3626872

Description: Fixed an issue that caused the firmware to miscalculate the value of the maximum current temperature measured from all the diodes (found in the Internal_sensor_curr_temp field).

Keywords: Sensor, temperature

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3544340 / 3537706 / 3639178

Description: Improved SPDM v1.0 compatibility. SPDM measurements signature additional fixes.

Keywords: SPDM

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3587821

Description: Fixed a HW bug that resulted in transaction loss when a cache replacement transaction occurred in parallel to code transcoding.

Keywords: HW bug, transaction loss

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3610861

Description: The EEPROM module got stuck in polling in about 20% of cases after reset. To resolve the issue, a delay was added after configuring the module to high power.

Keywords: Polling, module, reset

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3507928

Description: Fixed a linkup failure issue that occurred when connecting to a 25GbE transceiver by clearing the PSI Aging before trying to open Tx power.

Keywords: Cables, PSI Aging, 25GbE transceiver

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3602379

Description: The "Bad Signal Integrity" message seen after a power cycle can be safely ignored. The user should monitor the BER value.

Keywords: Bad Signal Integrity, BER

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3605686

Description: Fixed a statics issue that caused the i2c access to the module to lock, leaving the switch stuck.

Keywords: i2c, switch

Discovered in Version: 28.38.1900

Fixed in Release: 28.39.2048

3482251

Description: Added support for hairpin drop counter in QUERY_VNIC_ENV command.

Keywords: Hairpin

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3539437

Description: Fixed an issue that prevented the get_func_num_from_pci_func_num function from returning the value "-1" for undefined function type.

Keywords: get_func_num_from_pci_func_num

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3570478

Description: Fixed Signal-to-Noise Ratio (SNR) value calculation for correct readings from the MMA4Z00 optical cable module.

Keywords: SNR

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3602169

Description: Added a locking mechanism to protect the firmware from a race condition between insertion and deletion of the same rule in parallel. Such behavior occasionally resulted in the firmware accessing memory that had already been released, causing an IOMMU / translation error.

Note: This fix will not impact insertion rate for tables owned by SW steering.

Keywords: Firmware steering

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048
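
The race described in 3602169 can be illustrated with a minimal Python sketch. This is not the firmware's implementation: the `RuleTable` class, its method names, and the single table-wide lock are hypothetical choices, used only to show how serializing insertion and deletion of the same rule keeps one path from touching an entry the other has already released.

```python
import threading
from typing import Dict, Optional


class RuleTable:
    """Hypothetical steering table; one lock guards every rule mutation."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._rules: Dict[int, str] = {}

    def insert(self, rule_id: int, action: str) -> None:
        # Insertion holds the lock, so a parallel delete of the same
        # rule cannot release the entry halfway through the update.
        with self._lock:
            self._rules[rule_id] = action

    def delete(self, rule_id: int) -> bool:
        # Deletion holds the same lock; returns False if the rule is
        # already gone instead of releasing it a second time.
        with self._lock:
            if rule_id not in self._rules:
                return False
            del self._rules[rule_id]
            return True

    def lookup(self, rule_id: int) -> Optional[str]:
        with self._lock:
            return self._rules.get(rule_id)
```

A coarse table-wide lock is the simplest way to show the idea. The note above states only that the actual fix does not affect the insertion rate for tables owned by SW steering; it does not describe the lock granularity.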

3588515 / 3409806

Description: Fixed a race condition that led to a firmware assert upon driver removal, or when changing the ETH flow control scheme under a stress of ingress packets larger than the MTU.

Keywords: Race condition, firmware assert

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

3610169

Description: Fixed QoS Shaper handling behavior for non-transmitting applications.

Keywords: QoS Shaper

Discovered in Version: 28.38.1002

Fixed in Release: 28.39.2048

Internal Ref.

Issue

3537571

Description: Fixed SPDM measurements signature.

Keywords: SPDM

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3439757

Description: Fixed an issue that prevented the system from detecting the PCIe device during slot DC power cycle tests.

Keywords: PCIe device, DC power cycle tests

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3534473

Description: Added a new field/slot ID to PRS pcie_cfg_data.pci_cfg_space.pciex.pcie_switch_ini_defined_base_slot_id = 3 to define a specific slot number for GPU bridge DSP.

Keywords: Slot ID

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3331179

Description: Improved token calculation.

Keywords: Token calculation

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3299420

Description: Upgrading from firmware v28.38.1014 and below to v28.38.1002 no longer requires an upgrade to an intermediate version.

Keywords: Firmware upgrade

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3394841

Description: Updated the plug in/out events' reporting method to report only when the last recorded event is the opposite of the current event.

Keywords: Port events

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002
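
The reporting rule in 3394841 can be sketched in a few lines of Python. The `PortEventReporter` class and the event names are hypothetical. The sketch also assumes that the very first event on a port is reported because no earlier event is recorded, which the note above does not state.

```python
from typing import List, Optional

PLUG_IN = "plug_in"
PLUG_OUT = "plug_out"


class PortEventReporter:
    """Hypothetical reporter: emit an event only if it flips the last one."""

    def __init__(self) -> None:
        self.last_event: Optional[str] = None
        self.reported: List[str] = []

    def handle(self, event: str) -> bool:
        if event not in (PLUG_IN, PLUG_OUT):
            raise ValueError("unknown port event: " + repr(event))
        # A repeat of the last recorded event is suppressed.
        if event == self.last_event:
            return False
        # The event is the opposite of the last one (or the first seen,
        # by assumption), so record and report it.
        self.last_event = event
        self.reported.append(event)
        return True
```

With this rule, a sequence of plug-in, plug-in, plug-out produces two reports rather than three.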

3469311

Description: Fixed the SPDM operations order according to the spec. v1.1.0.

Keywords: SPDM operations

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3527987

Description: Added support for NC-SI channel on both ports.

Keywords: NC-SI channel

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3459317

Description: Changed the protection mechanism for BAR configuration.

Keywords: BAR configuration

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3345150

Description: Fixed an issue that caused a packet with invalid/bad padcount to be silently dropped instead of sending a bad nack error.

Keywords: Packet drop

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3418627

Description: Fixed wrong credits configuration that occurred when MAX_ACC_OUT_READ was configured.

Keywords: Performance

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3466088

Description: Updated the SX root to work with driverless mode in vport0 gvmi teardown.

Keywords: Driverless mode

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3487313

Description: Fixed a rare deadlock case between 2 DC packets in the RX side.

Keywords: Firmware deadlock

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3495889

Description: Fixed a QoS host port rate limit shaper inaccuracy that occurred when the shaper was configured via the QSHR access register.

Keywords: Port rate limit shaper

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

3449451

Description: When using ConnectX-7 adapter card as InfiniBand, the port must be configured to use the Auto-Negotiation mode.

Keywords: Auto-Negotiation, InfiniBand

Discovered in Version: 28.37.1014

Fixed in Release: 28.38.1002

Internal Ref.

Issue

3272599

Description: Removed the option to clear "Tx disable cap" for all non-baseT SFP modules.

Keywords: Tx disable cap

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3339087

Description: Added a split mask verification process to check whether or not a module is split in HCA.

Keywords: Cables, split module

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3411270

Description: Fixed an issue that resulted in firmware crash when setting large payload length values (more than ~1500) in NC-SI command's header.

Keywords: NC-SI

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014
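
The class of problem in 3411270 is commonly guarded against by checking a declared length before using it. The Python sketch below illustrates that general technique only and does not describe the actual firmware fix. The function name and the `MAX_PAYLOAD_LEN` limit of 1500 are assumptions, the latter chosen to match the approximate threshold mentioned above.

```python
# Assumed upper bound for illustration; the real limit is not given in the note.
MAX_PAYLOAD_LEN = 1500


def payload_length_is_valid(declared_len: int, received_payload: bytes,
                            max_len: int = MAX_PAYLOAD_LEN) -> bool:
    """Return True only if the header's declared length is safe to use."""
    # Reject lengths outside the range the receiver is prepared to handle.
    if declared_len < 0 or declared_len > max_len:
        return False
    # Reject lengths that claim more bytes than were actually received.
    if declared_len > len(received_payload):
        return False
    return True
```

A command whose header fails this check would be rejected before any payload bytes are copied or parsed.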

3405790

Description: Fixed an issue that resulted in the interface type being shown as "unsupported" in CMIS modules.

Keywords: CMIS

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3418889

Description: Updated the NEGOTIATE_ALGORITHMS response according to the SPDM specification.

Keywords: SPDM

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3409686

Description: Added the option to clear the DPC registers after warm reboot.

Keywords: DPC

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3411116

Description: Fixed the configuration of the TS1s sent by the DownStream port (DSP) when moving to EQLZ.ph2.

Keywords: DSP

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3138665

Description: Changed the initial Tx preset configuration for the DownStream port (DSP).

Keywords: Tx, DSP

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3138665

Description: The PLDM firmware update process fails when a chunk size of 1304 bytes is chosen.

Keywords: PLDM firmware update

Discovered in Version: 28.34.4000

Fixed in Release: 28.37.1014

3336619

Description: Fixed an issue that occurred during secure firmware update when decrypting and authenticating each chunk of data using its authentication tag. The issue appeared when the main code chunk was split between the user chunks and any GCM operation (e.g., flash read with decryption). This GCM operation broke the GCM context for main chunk authentication and therefore failed.

Keywords: Secure firmware update, GCM, code chunk

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3327847

Description: The CNP received, handled, and ignored hardware counters do not work after moving to Programmable Congestion Control mode.

Keywords: CNP, Programmable Congestion Control

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3336610

Description: Fixed a rare issue that prevented the hardware from handling an error flow that occurred when accessing the DPA cluster L2 cache from the firmware processor. In this case the firmware processor hardware requested a VA=>PA translation from the internal mmio, and the address translation was broken by the mmio on the 4K page boundary.

Keywords: Error handling, mmio, firmware processor

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

3073517

Description: Raising a 10G/40G link over a 100G optical cable is not supported when connecting a ConnectX-7 adapter card to a ConnectX-5 or an NVIDIA Spectrum switch.

Keywords: Optical cables, ConnectX-5, NVIDIA Spectrum

Discovered in Version: 28.33.4030

Fixed in Release: 28.37.1014

3358994

Description: Fixed an issue that prevented the hardware from consuming Port-VL and credits, which consequently blocked traffic from being transmitted due to a race condition between the firmware and the hardware when accessing the chip memory (CR space).

Keywords: Firmware race, CR space, Port-VL

Discovered in Version: 28.36.1010

Fixed in Release: 28.37.1014

© Copyright 2023, NVIDIA. Last updated on Feb 7, 2024.