NVIDIA BlueField-2 DPU Firmware Release Notes v24.35.3502 LTS
NVIDIA BlueField-2 DPU Firmware Release Notes v24.35.3502 LTS

Changes and New Feature History

Feature/Change

Description

24.35.3006

Bug Fixes

See Bug Fixes in this Firmware Version section.

Feature/Change

Description

24.35.2000

PCC Algorithm

Enables the users to collect more information from NP to RP for PCC algorithm. To achieve this, the NP ingress bytes information was added to the RTT response packet sent from the NP side.

HPCC: Support per-IP and per-QP methods

Enables the user to configure the PCC algorithm shaper coalescing mode using nvconfig to select CC algorithm shaper coalescing for IB and ROCE.

The new parameters are IB_CC_SHAPER_COALESCE and ROCE_CC_SHAPER_COALESCE.

Bug Fixes

See Bug Fixes in this Firmware Version section.

Feature/Change

Description

24.35.1012

HPCC, Programmable Congestion Control

HPCC related configurations in is now supported via the mlxconfig utility.

UDP

Added support for copy modify header steering action to/from the UDP field.

Resource Dump

Added the following resource dump segments:

  • SEG_HW_STE_FULL that includes dump to STE and all its dependencies

  • SEG_FW_STE_FULL that include dump to FW_STE and to HW_STE_FULL in range

Striding WQE - Headroom and Tail-room

As the software requires additional space before and after a packet is scattered for its processing for stridden RQ, the hardware will allocate the required room while scattering packets to spare a copy.

Connections per Second (CPS)

Improved security offload's Connections per Second (CPS) rate using the general object DEK (PSP TLS etc).

VirtIO vDPA Performance Virtualization

Increased the VirtIO hardware offload message rate to 20/20 MPPS for 256 virtual devices by optimizing the datapath application code.

Open SNAPI: Communication Channel

Open SNAPI steering hop optimization:

  • Only packets with dmac/dlid 0 will enter the SNAPI steering channel. Meaning if the Open SNAPI communication channel exists, all other traffic will have only 1 steering hop penalty

  • SW must use dlid/dmac == 0 in address vector of a SNAPI QP

  • cross eSwitch, Open SNAPI is now allowed if LAG is enabled

SHA Calculations

Added support for SHA calculations via the MMO engine.

UPT Performance

Optimized latency for UPT traffic.

NONDNIC, LAG

Added LAG support on NODNIC interfaces to enable traffic functionality on other ports after the LAG is enabled.

QoS Priority Trust Default State

QoS priority trust default state can now be changed using the new nvconfig below:

  • QOS_TRUST_STATE_P1

  • QOS_TRUST_STATE_P2

The values that can be used to set the default state are:

  • TRUST_PORT

  • TRUST_PCP

  • TRUST_DSCP

  • TRUST_DSCP_PCP

VirtIO Full Emulation

Implemented a transitional device which supports both drivers conforming to the spec 1.x and allows legacy drivers to support virtio legacy driver (virtio 0.95).

Bug Fixes

See Bug Fixes in this Firmware Version section.

Feature/Change

Description

24.34.1002

LLDP Properties Implementation on RDE

Added LLDPEnable, LLDPTransmit and LLDPReceive properties to the RDE Port schema implementation.

PPS Offset

Added a 22 nanosecond of propagation delay to the cable delay of the PPS signal when using PPS out.

Programmable CC, PPCC, MAD, IBCC

Added support for PPCC register with bulk operations, MAD for algorithm configuration and tunable parameters.

Programmable Counters

Added support for programmable counters for PCC via PPCC register and MAD.

Bug Fixes

See Bug Fixes section.

Feature/Change

Description

24.33.1048

NVIDIA BlueField-2 DPU NIC Operation Mode

In this mode, the DPU behaves exactly like a ConnectX SmartNIC from the perspective of the external host.

As part of this operation mode, this new capability allows the host Physical Function:

  • to supply pages to the Host eSwitch functions

  • to execute the same flows as a Host PF on a regular ConnectX device

For further information, see section "NVIDIA BlueField-2 DPU NIC Operation Mode" in this document.

200Gb/s Throughput on Crypto Capable Devices

Enabled 200Gb/s out-of-the-box throughput on crypto capable devices.

Note: If any crypto offloads is in use, 200Gb/s throughput can be achieved only after the next firmware reset

VF Migration

Added support for VF migration. The hypervisor can now suspend its VF, meaning from that point the VF cannot perform action such as send/receive traffic or run any command. In this firmware version only the suspend resume mode is supported (on the same VM).

MADs

Added a new MAD of class SMP that has the attributes hierarchy_Info as defined in the IB Specification and is used to query the hierarchy information stored on the node and the physical port.

NV Configurations via the Relevant Reset Flow

Added pci_rescan_needed field to the MFRL access register to indicate whether a PCI rescan is needed based on the NV configurations issued by the software.

Note: If the Keep Link Up NV configuration is changed, phyless reset will be blocked.

Increasing Firmware's Queue Depth Limit

Changing the mlxconfig parameter allows the user to expose different value of max queue depth in NVMe CAP on static emulated functions which results in NVMe module creating longer NVMe queues.

RSHIM PF Interrupts Implementation

Added support for interrupts on the RSHIM PFs to enable gracefully stop of the RSHIM host driver.

ICM Pages

Added a new register (vhca_icm_ctrl access_reg) to enable querying and limiting the ICM pages in use.

VF Migration

Added support for VF migration.

NetworkPort Schema Replacement

Replaced the deprecated NetworkPort schema with Port schema in NIC RDE implementation.

RShim PF

Added support for "mlxfwreset --sync 1" for the RShim PF.

Steering Definer

Added support for creating a steering definer with a dword selector using create_match_definer_object and the "SELECT" format.

XRQ QP Errors Enhancements

Enhanced the XRQ QP error information provided to the user in case QP goes into an error state. In such case, QUERY_QP will provide information on the syndrome type and which side caused

the error.

HW Steering: WQE Insertion Rules

[Beta] Added HW Steering support for the following:

  • set, add and copy inline STC action

  • set and copy actions for several fields using modify_pattern object and inline stc modify action

  • FDB mode in HW steering using FDB_RX and FDB_TX flow table types

  • ASO flow meter action via STC

  • flow counter query using ASO WQE

  • allocation of large bulks for the objects: STE, ASO flow meter and modify argument

  • jumbo match RTC

  • count action in STC

Holdover Mode

Added support for holdover mode to comply to SyncE specifications (EEC compliance) to limit the maximum phase transient response upon link loss.

SyncE Enhancements

Added support for noise filtering to comply to the SyncE specifications requirements.

ibstat

Updated the ibstat status reported when the phy link is down. Now QUERY_VPORT_STATE.max_tx_speed of UPLINK will not be reported as 0 anymore.

ZTRCC

Added support for advanced ZTR_RTTCC algorithm based on the Programmable CC platform to achieve better congestion control without dependency on the switch ECN marking.

SMPs

Disabled the option to send SMPs from unauthorized hosts.

Dynamic Completion Event Moderation for vDPA

DIM is used to tune moderation parameter dynamically using an mlxreg command.

To disable this capability, run:

mlxreg -d /dev/mst/mt41686_pciconf0 --reg_id 0xc00d --reg_len 0x8 -s "0x4.1:1=0x0"

UPT Performance Improvement

Improved UPT PPS performance.

SW Steering Cache

Modified the TX or RX cache invalidation behavior. TX or RX cache invalidation now does not occur automatically but only when the software performs the sync operation using the using sync_steering command.

Mega Allocations in Bulk Allocator Mechanism

Modified the maximum bulk size per single allocation from "log_table_size - log_num_unisizes", to allocate any range size, to remove limitations that HWS objects such as counters and modify arguments might encounter.

Dynamic Flex Parser over a VF

Added support for creating a dynamic flex parser on untrusted function, and changed the flex parser cap for untrusted function to the following:

  • maximum flex parser node = 2

  • maximum dw sample = 4

Changing all the Crypto Features to Wrapped or Cleartext

Crypto features can be in either wrapped or unwrapped mode. Meaning, the key can be wrapped or in plaintext when running the CREATE_DEK PRM command. To comply with the requirements specified in FIPS publication, all the created DEKs must be wrapped.

This feature adds new NV_CONFIG per device to control this mode, and enables the user to change all the crypto features to wrapped or cleartext.

ICM Direct Access by the Software to write/modify the DEK Objects

[Beta] This new capability enables the software to directly access ICM and write/modify the DEK objects. Such change improves the DEK object update rate by re-using DEK object instead of creating a new one.

In addition, added the following:

  • New for DEK object: bulk allocation, modify_dek cmd, and new mode - sw_wrapped.

  • New general object INT_KEK

Bug Fixes

See Bug Fixes section.

© Copyright 2023, NVIDIA. Last updated on Dec 28, 2023.