NVIDIA ConnectX-8 SuperNIC Firmware Release Notes v40.46.1006

Changes and New Features

Info

To generate PLDM packages for firmware updates, users must install and use the MFT version that corresponds to the respective firmware release.

Feature/Change

Description

40.46.1006

PCIe TLP Processing Hints (TPH) and Steering Tag (ST)

Enabled PCIe TLP Processing Hints (TPH) and Steering Tag (ST) during MKey creation.

Note: The steering tag index specified at MKey creation must reference an MSI-X entry that contains the actual steering tag value.

PCIe Congestion Events

Added support for the general PCIe congestion object to monitor and receive events related to inbound and outbound PCIe congestion. A threshold can be configured to specify when the firmware should send an event to the software.

This capability is activated by setting the mlxconfig parameter PCIE_CONGESTION_MONITOR.

RDMA QP

When an RDMA QP encounters a memory access issue caused by address translation, it can recover without transitioning to an error state. The QP sends an error CQE to notify the software while continuing to serve other VMs and functions.

PPCNT Counters

Firmware now supports new counters in the PPCNT register to track multicast and unicast packets transmitted and received. The counters include:

  • port_multicast_xmit_pkts_high

  • port_multicast_xmit_pkts_low

  • port_multicast_rcv_pkts_high

  • port_multicast_rcv_pkts_low

  • port_unicast_xmit_pkts_high

  • port_unicast_xmit_pkts_low

  • port_unicast_rcv_pkts_high

  • port_unicast_rcv_pkts_low
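
Each counter above is exposed as a _high/_low pair. The following is a minimal sketch of recombining one pair into a single value. It assumes that each half is an unsigned 32-bit word, which these notes do not state explicitly. The combine_counter helper and the sample readings are hypothetical and are not part of any NVIDIA tool.

```python
def combine_counter(high: int, low: int) -> int:
    """Join the two halves of a split counter into one 64-bit value.

    Assumption: `high` and `low` are each unsigned 32-bit words.
    """
    for name, word in (("high", high), ("low", low)):
        if not 0 <= word <= 0xFFFFFFFF:
            raise ValueError(f"{name} word out of 32-bit range: {word:#x}")
    return (high << 32) | low


# Hypothetical raw readings for port_unicast_xmit_pkts_high/_low.
sample = {
    "port_unicast_xmit_pkts_high": 0x1,
    "port_unicast_xmit_pkts_low": 0x10,
}
total = combine_counter(
    sample["port_unicast_xmit_pkts_high"],
    sample["port_unicast_xmit_pkts_low"],
)
print(f"port_unicast_xmit_pkts = {total}")
```

Reading the high word, then the low word, then the high word again is a common way to detect a rollover between the two reads; whether that is needed here depends on how the register is sampled.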

Safely Identify DPUs/SmartNICs in a Machine and PCIe Slot

A new access register is introduced that accepts a type, length, and R/W command.

  • Write operation: Allocates a new ICMC buffer of the specified size (aligned to 64B) and stores the provided data. If a buffer for the given type already exists, the data in the ICMC is overwritten, and the locked area is adjusted accordingly.

  • Read operation: If a buffer exists, its data is copied out. If not, the access register returns a size of 0 or an explicit error.

The length can be stored within the data in the ICMC, and the type is mapped to 256B chunks (due to access register limitations), so the VA of the buffer is calculated as (base + (type << 8)). The first 4 bytes store a validity flag and the length. If length storage is unnecessary (e.g., null-terminated data), a hardware read can use a cache-line hit as a validity bit.
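
The address arithmetic described above can be sketched as follows. This is an illustration only: the base address is a made-up value, the function names are hypothetical, and "aligned to 64B" is assumed to mean rounding the requested size up to the next multiple of 64.

```python
CHUNK_SHIFT = 8   # each type maps to a 256B chunk: 1 << 8 == 256
ALIGNMENT = 64    # buffer sizes are aligned to 64B


def align_up(size: int, alignment: int = ALIGNMENT) -> int:
    """Round size up to the next multiple of alignment (a power of two).

    Assumption: alignment means rounding up, never truncating.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    return (size + alignment - 1) & ~(alignment - 1)


def buffer_va(base: int, buf_type: int) -> int:
    """Return the buffer VA for a given type: base + (type << 8)."""
    if buf_type < 0:
        raise ValueError("type must be non-negative")
    return base + (buf_type << CHUNK_SHIFT)


base = 0x1000_0000  # hypothetical base address, for illustration only
print(hex(buffer_va(base, 3)))
print(align_up(100))
```

Because consecutive types are 256B apart, a buffer larger than 256B would overlap the chunk of the next type under this formula; the notes do not say how the firmware handles that case.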

This feature is designed for limited use cases and does not address multi-host scenarios or broader ICMC utilization implications.

Latency Histogram Counter

Introduced a new latency histogram counter that measures the distribution of read operation latencies from the device to the PCIe link, providing better visibility into PCIe read performance and potential bottlenecks.

Incoming NC-SI Messages Validation for the payload_len Field

Added an extra validation for the payload_len field in incoming NC-SI messages. Previously, invalid packets might have been accepted; now, such packets are silently dropped.
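
The notes do not specify which check the firmware performs. As a rough illustration of this kind of validation, the sketch below rejects a packet whose declared payload length does not fit in the bytes actually received. It assumes the NC-SI control packet header layout from DMTF DSP0222 (16-byte header, 12-bit payload length in the low bits of bytes 6-7) and that the Ethernet header has already been removed. The function names are hypothetical.

```python
import struct

NCSI_HEADER_LEN = 16       # NC-SI control packet header size (per DSP0222)
PAYLOAD_LEN_MASK = 0x0FFF  # payload length occupies the low 12 bits


def payload_len_is_valid(ncsi_packet: bytes) -> bool:
    """Return True if the declared payload_len fits in the received bytes.

    `ncsi_packet` starts at the NC-SI header (Ethernet header removed).
    This is an assumed check, not the firmware's documented rule.
    """
    if len(ncsi_packet) < NCSI_HEADER_LEN:
        return False
    # Bytes 6-7: 4 reserved bits followed by the 12-bit payload length.
    (field,) = struct.unpack_from(">H", ncsi_packet, 6)
    payload_len = field & PAYLOAD_LEN_MASK
    return NCSI_HEADER_LEN + payload_len <= len(ncsi_packet)


def handle(ncsi_packet: bytes):
    """Silently drop invalid packets; return valid ones for processing."""
    return ncsi_packet if payload_len_is_valid(ncsi_packet) else None
```

A stricter check could also account for payload padding and the trailing checksum, which this sketch leaves out.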

RSS with Crypto Offload

Added support for RSS with crypto offload, enabling the NIC to parallelize packet processing across CPU cores while performing encryption/decryption in hardware. Additionally, introduced a new l4_type_ext parameter with the following values: 0 (None), 1 (TCP), 2 (UDP), 3 (ICMP).
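
For software that decodes this parameter, the four values listed above can be captured in a small enumeration. This is a sketch: the L4TypeExt class and describe helper are hypothetical names rather than part of any NVIDIA API, and only the numeric values come from this release note.

```python
from enum import IntEnum


class L4TypeExt(IntEnum):
    """Values of the l4_type_ext parameter as listed in the release note."""
    NONE = 0
    TCP = 1
    UDP = 2
    ICMP = 3


def describe(value: int) -> str:
    """Map a raw l4_type_ext value to its name, tolerating unknown values."""
    try:
        return L4TypeExt(value).name
    except ValueError:
        return f"unknown({value})"


for raw in (0, 1, 2, 3, 7):
    print(raw, describe(raw))
```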

SPDM

Updated SPDM measurements report to version 1.1.

Bug Fixes

See Bug Fixes in this Firmware Version section.

© Copyright 2025, NVIDIA. Last updated on Aug 25, 2025.