Changes and New Feature History




Precision Time Protocol (PTP)

Added support for Precision Time Protocol (PTP), the protocol used to synchronize clocks throughout a computer network as part of 5T Technology.

Mergeable Buffer

Added mergeable buffer support (VIRTIO_NET_F_MRG_RXBUF in the virtio spec) for VDPA kernel mode to improve performance when a large MTU, such as 9K, is used. The feature is disabled by default and must be manually enabled when creating or modifying the virtio device.

Note: For best performance, it is NOT recommended to enable the feature if the VDPA MTU is set to the default value (1500).
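The following is a minimal sketch of how a management script might reason about this feature. It assumes the feature bit number 15 that the virtio specification assigns to VIRTIO_NET_F_MRG_RXBUF. The helper names and the "enable only above the default MTU" policy are hypothetical illustrations and are not part of the firmware or of any NVIDIA tool.

```python
# Feature bit number of VIRTIO_NET_F_MRG_RXBUF in the virtio specification.
VIRTIO_NET_F_MRG_RXBUF = 15

# Default VDPA MTU mentioned in the note above.
DEFAULT_MTU = 1500


def mrg_rxbuf_negotiated(features: int) -> bool:
    """Return True if the mergeable-buffer bit is set in a feature bitmask."""
    return bool(features & (1 << VIRTIO_NET_F_MRG_RXBUF))


def should_enable_mrg_rxbuf(mtu: int) -> bool:
    """Hypothetical policy: enable the feature only for MTUs above the default."""
    return mtu > DEFAULT_MTU


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: enable mergeable buffers = {should_enable_mrg_rxbuf(mtu)}")
```

The policy function only mirrors the recommendation in the note. The actual enablement is done when creating or modifying the virtio device.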

Monitoring Cloud Guest RoCE Statistics on Cloud Provider

This new capability enables the VM to track and limit its Vport's activity. This is done using the new q_counters counter, which enables aggregation of other Vports from the PF GVMI.

NVME Device Emulation

Enables the firmware to generate a Device Change Event upon any change in the NVME Device Emulation object (BAR change, HotPlug power state change, NVME Function reset, etc.).

PCC Algorithms

Enables smooth, static switching between PCC algorithms. In addition, the user can now switch between PCC algorithms while running traffic.

Hardware Steering: Bulk Allocation

Added support for 32 actions in the header modify pattern using bulk allocation.

Bug Fixes

See Bug Fixes in this Firmware Version section.




PCC Algorithm

Enables users to collect more information from the NP to the RP for the PCC algorithm. To achieve this, the NP ingress bytes information was added to the RTT response packet sent from the NP side.

HPCC: Support per-IP and per-QP methods

Enables the user to select the PCC algorithm shaper coalescing mode for IB and RoCE using nvconfig.

Bug Fixes

See Bug Fixes in this Firmware Version section.




HPCC, Programmable Congestion Control

HPCC-related configurations are now supported via the mlxconfig utility.


Added support for copy modify header steering action to/from the UDP field.

Resource Dump

Added the following resource dump segments:

  • SEG_HW_STE_FULL, which includes a dump of the STE and all its dependencies

  • SEG_FW_STE_FULL, which includes a dump of FW_STE and of HW_STE_FULL in range

Striding WQE - Headroom and Tail-room

For striding RQ, the software requires additional space before and after each scattered packet for its processing. The hardware now allocates the required headroom and tailroom while scattering packets, which spares a copy.
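As a rough illustration of the buffer accounting this implies, the sketch below computes how many strides one packet would consume once headroom and tailroom are reserved around it. The layout (headroom, then packet, then tailroom, packed contiguously from the start of a stride), the function names, and all sizes are assumptions made for the example. They are not taken from the firmware or the PRM.

```python
def strides_needed(packet_len: int, headroom: int, tailroom: int, stride_size: int) -> int:
    """Strides consumed by one packet under the assumed contiguous layout."""
    if stride_size <= 0:
        raise ValueError("stride_size must be positive")
    if min(packet_len, headroom, tailroom) < 0:
        raise ValueError("lengths must be non-negative")
    total = headroom + packet_len + tailroom
    # Integer ceiling division; avoids floating-point rounding.
    return -(-total // stride_size)


def packet_offset(headroom: int) -> int:
    """Offset of the first packet byte from the start of its first stride."""
    return headroom


if __name__ == "__main__":
    # Hypothetical sizes: 1500-byte packet, 128 B headroom, 64 B tailroom, 256 B strides.
    print(strides_needed(1500, 128, 64, 256))
```

Because the room is reserved at scatter time, the software can prepend or append data in place instead of copying the packet to a larger buffer.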

Connections per Second (CPS)

Improved the security offload's Connections per Second (CPS) rate using the general object DEK (PSP, TLS, etc.).

VirtIO vDPA Performance Virtualization

Increased the VirtIO hardware offload message rate to 20/20 MPPS for 256 virtual devices by optimizing the datapath application code.

Open SNAPI: Communication Channel

Open SNAPI steering hop optimization:

  • Only packets with dmac/dlid 0 will enter the SNAPI steering channel. This means that if the Open SNAPI communication channel exists, all other traffic incurs a penalty of only 1 steering hop

  • SW must use dlid/dmac == 0 in address vector of a SNAPI QP

  • Cross-eSwitch Open SNAPI is now allowed if LAG is enabled

SHA Calculations

Added support for SHA calculations via the MMO engine.

UPT Performance

Optimized latency for UPT traffic.


Added LAG support on NODNIC interfaces to enable traffic functionality on other ports after the LAG is enabled.

QoS Priority Trust Default State

QoS priority trust default state can now be changed using the new nvconfig below:



The values that can be used to set the default state are:





VirtIO Full Emulation

Implemented a transitional device that supports both drivers conforming to virtio spec 1.x and legacy drivers (virtio 0.95).

Bug Fixes

See Bug Fixes in this Firmware Version section.




LLDP Properties Implementation on RDE

Added LLDPEnable, LLDPTransmit and LLDPReceive properties to the RDE Port schema implementation.

PPS Offset

Added 22 nanoseconds of propagation delay to the cable delay of the PPS signal when using PPS out.
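The arithmetic can be sketched as follows. The 22 ns constant comes from the note above. The function name and the idea of passing the cable delay as a plain integer in nanoseconds are assumptions for illustration only.

```python
# Propagation delay stated in the release note, in nanoseconds.
PPS_OUT_PROPAGATION_DELAY_NS = 22


def total_pps_out_delay_ns(cable_delay_ns: int) -> int:
    """Cable delay plus the fixed propagation delay applied on PPS out."""
    if cable_delay_ns < 0:
        raise ValueError("cable_delay_ns must be non-negative")
    return cable_delay_ns + PPS_OUT_PROPAGATION_DELAY_NS


if __name__ == "__main__":
    # Hypothetical cable delay of 100 ns.
    print(total_pps_out_delay_ns(100))
```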

Programmable CC, PPCC, MAD, IBCC

Added support for the PPCC register with bulk operations, and for MAD-based algorithm configuration and tunable parameters.

Programmable Counters

Added support for programmable counters for PCC via PPCC register and MAD.

Bug Fixes

See Bug Fixes section.




NVIDIA BlueField-2 DPU NIC Operation Mode

In this mode, the DPU behaves exactly like a ConnectX SmartNIC from the perspective of the external host.

As part of this operation mode, this new capability allows the host Physical Function:

  • to supply pages to the Host eSwitch functions

  • to execute the same flows as a Host PF on a regular ConnectX device

For further information, see section "NVIDIA BlueField-2 DPU NIC Operation Mode" in this document.

200Gb/s Throughput on Crypto Capable Devices

Enabled 200Gb/s out-of-the-box throughput on crypto capable devices.

Note: If any crypto offload is in use, 200Gb/s throughput can be achieved only after the next firmware reset.

VF Migration

Added support for VF migration. The hypervisor can now suspend its VF, meaning that from that point the VF cannot perform actions such as sending/receiving traffic or running any command. In this firmware version, only the suspend/resume mode is supported (on the same VM).


Added a new MAD of class SMP with the attribute hierarchy_Info, as defined in the IB Specification. It is used to query the hierarchy information stored on the node and the physical port.

NV Configurations via the Relevant Reset Flow

Added pci_rescan_needed field to the MFRL access register to indicate whether a PCI rescan is needed based on the NV configurations issued by the software.

Note: If the Keep Link Up NV configuration is changed, phyless reset will be blocked.

Increasing Firmware's Queue Depth Limit

Changing the mlxconfig parameter allows the user to expose a different value of the maximum queue depth in the NVMe CAP on static emulated functions, which results in the NVMe module creating longer NVMe queues.

RSHIM PF Interrupts Implementation

Added support for interrupts on the RSHIM PFs to enable a graceful stop of the RSHIM host driver.

ICM Pages

Added a new register (vhca_icm_ctrl access_reg) to enable querying and limiting the ICM pages in use.

VF Migration

Added support for VF migration.

NetworkPort Schema Replacement

Replaced the deprecated NetworkPort schema with Port schema in NIC RDE implementation.

RShim PF

Added support for "mlxfwreset --sync 1" for the RShim PF.

Steering Definer

Added support for creating a steering definer with a dword selector using create_match_definer_object and the "SELECT" format.

XRQ QP Errors Enhancements

Enhanced the XRQ QP error information provided to the user when a QP goes into an error state. In such a case, QUERY_QP will provide information on the syndrome type and which side caused the error.

HW Steering: WQE Insertion Rules

[Beta] Added HW Steering support for the following:

  • set, add and copy inline STC action

  • set and copy actions for several fields using the modify_pattern object and the inline STC modify action

  • FDB mode in HW steering using FDB_RX and FDB_TX flow table types

  • ASO flow meter action via STC

  • flow counter query using ASO WQE

  • allocation of large bulks for the objects: STE, ASO flow meter and modify argument

  • jumbo match RTC

  • count action in STC

Holdover Mode

Added support for holdover mode to comply with the SyncE specifications (EEC compliance) and limit the maximum phase transient response upon link loss.

SyncE Enhancements

Added support for noise filtering to comply with the SyncE specification requirements.


Updated the ibstat status reported when the phy link is down. QUERY_VPORT_STATE.max_tx_speed of UPLINK is no longer reported as 0.


Added support for advanced ZTR_RTTCC algorithm based on the Programmable CC platform to achieve better congestion control without dependency on the switch ECN marking.


Disabled the option to send SMPs from unauthorized hosts.

Dynamic Completion Event Moderation for vDPA

DIM is used to tune the moderation parameters dynamically and can be controlled using an mlxreg command.

To disable this capability, run:

mlxreg -d /dev/mst/mt41686_pciconf0 --reg_id 0xc00d --reg_len 0x8 -s "0x4.1:1=0x0"
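Since the MST device path differs between systems, a script may want to assemble this command for a given device. The sketch below only builds and prints the argument list using the exact register ID, length, and set string from the command above. It does not run mlxreg. The function name is hypothetical, and the device path in the usage example is simply the one shown in this document.

```python
import shlex
from typing import List


def build_dim_disable_command(device: str) -> List[str]:
    """Return the mlxreg argument list that disables DIM on the given device."""
    if not device:
        raise ValueError("device path must not be empty")
    return [
        "mlxreg",
        "-d", device,
        "--reg_id", "0xc00d",
        "--reg_len", "0x8",
        "-s", "0x4.1:1=0x0",  # set string copied verbatim from the release note
    ]


if __name__ == "__main__":
    cmd = build_dim_disable_command("/dev/mst/mt41686_pciconf0")
    # Print a shell-safe rendering; pass `cmd` to subprocess.run() to execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting issues if the device path ever contains unusual characters.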

UPT Performance Improvement

Improved UPT PPS performance.

SW Steering Cache

Modified the TX or RX cache invalidation behavior. TX or RX cache invalidation no longer occurs automatically, but only when the software performs the sync operation using the sync_steering command.

Mega Allocations in Bulk Allocator Mechanism

Removed the previous limit on the maximum bulk size per single allocation ("log_table_size - log_num_unisizes"). A single allocation can now cover any range size, which removes limitations that HWS objects such as counters and modify arguments might encounter.

Dynamic Flex Parser over a VF

Added support for creating a dynamic flex parser on an untrusted function, and changed the flex parser cap for untrusted functions to the following:

  • maximum flex parser node = 2

  • maximum dw sample = 4

Changing all the Crypto Features to Wrapped or Cleartext

Crypto features can be in either wrapped or unwrapped mode. That is, the key can be wrapped or in plaintext when running the CREATE_DEK PRM command. To comply with the requirements specified in the FIPS publication, all created DEKs must be wrapped.

This feature adds a new per-device NV_CONFIG to control this mode, and enables the user to change all the crypto features to wrapped or cleartext.

ICM Direct Access by the Software to write/modify the DEK Objects

[Beta] This new capability enables the software to directly access ICM and write/modify the DEK objects. This change improves the DEK object update rate by reusing a DEK object instead of creating a new one.

In addition, added the following:

  • New for the DEK object: bulk allocation, the modify_dek command, and a new mode, sw_wrapped

  • New general object INT_KEK

Bug Fixes

See Bug Fixes section.

© Copyright 2023, NVIDIA. Last updated on Sep 5, 2023.