NVIDIA BlueField-3 DPU NIC Firmware Release Notes v32.41.1000

Changes and New Feature History

Feature/Change

Description

32.40.1000

Socket Direct Single netdev Mapped to Two PCIe Devices

Enabled Single Netdev mapping to two PCIe devices (Socket Direct).

Multiple devices (PFs) of the same port can now be combined under a single netdev instance. Traffic passes through different devices that belong to different NUMA sockets, which avoids cross-NUMA traffic and lets applications that use the same netdev from different NUMA nodes stay close to the device and achieve better performance.

The netdev is destroyed once any of the PFs is removed. For best results, each application or CPU should use the PF on its closest NUMA node.

Currently, this capability is limited to PFs only and to at most two devices (sockets). To enable the feature, configure the same non-zero Socket Direct group for both PFs through the mlxconfig SD_GROUP parameter.
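As a rough sketch of the configuration step, the snippet below only builds the mlxconfig command lines that would assign the same non-zero SD_GROUP to two PFs. The PCIe addresses are hypothetical placeholders and the helper name is illustrative; nothing is executed against a device.

```python
def sd_group_commands(devices, group):
    """Return mlxconfig command lines that place both PFs in one Socket Direct group."""
    if group == 0:
        raise ValueError("SD_GROUP must be non-zero to enable the feature")
    if len(devices) != 2:
        raise ValueError("the feature is currently limited to two PFs")
    return [f"mlxconfig -d {dev} set SD_GROUP={group}" for dev in devices]


# Hypothetical PCIe addresses of the two PFs that belong to the same port.
commands = sd_group_commands(["0000:03:00.0", "0000:83:00.0"], group=1)
for line in commands:
    print(line)
```

Both PFs must end up with the same group value; a mismatch or a value of zero would leave them as separate netdevs.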

ACL

Added support for egress ACL to the uplink by adding a new bit to the Set Flow Table Entry: allow_fdb_uplink_hairpin.

Port Rate Limiting

Added a new access register (PBWS) to set the port maximum bandwidth to a value between 95% and 100%.
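The PBWS register layout is not described here, so the following is only a sketch of the arithmetic behind the 95% to 100% range. The function name and the use of Gb/s are assumptions made for illustration.

```python
def capped_bandwidth_gbps(link_speed_gbps, max_percent):
    """Return the port bandwidth ceiling for a given percentage of link speed."""
    if not 95 <= max_percent <= 100:
        raise ValueError("port maximum bandwidth must be between 95 and 100 percent")
    return link_speed_gbps * max_percent / 100


# Example: a 200 Gb/s port limited to 95% of its line rate.
print(capped_bandwidth_gbps(200, 95))
```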

mlxconfig

Added a new NVConfig parameter to force Congestion Control algorithm to be SW-DCQCN.

Bug Fixes

See Bug Fixes in this Firmware Version section.

Feature/Change

Description

32.39.2048

FEC Configuration

Changed the default FEC configuration for the "Protocol Aware" and "Active DME Modules" (ETH cables).

For the list of cable identifiers, see tables below.

NC-SI Channels

Added support for two passthrough channels on dual-port adapter cards.

Expansion ROM

Added a caching mechanism to improve expansion ROM performance and to avoid slow boot occurrences when loading the expansion ROM driver.

Live Migration Support for Image Size above 4GB

Added support for image sizes above 4GB when performing a live migration by splitting the image into chunks.
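The idea of splitting a large image into chunks can be sketched as below. The chunk size and the helper name are hypothetical; the firmware's actual chunking scheme is not specified in these notes.

```python
GIB = 1 << 30


def split_into_chunks(image_size, chunk_size):
    """Yield (offset, length) pairs that cover image_size bytes without gaps."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    while offset < image_size:
        length = min(chunk_size, image_size - offset)
        yield offset, length
        offset += length


# Hypothetical 5 GiB migration image moved in chunks of at most 2 GiB.
chunks = list(split_into_chunks(5 * GIB, 2 * GIB))
for offset, length in chunks:
    print(hex(offset), length)
```

Each chunk stays below the 4GB boundary even though the image as a whole exceeds it.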

Crypto Algorithms

Extended the role-based authentication to cover all crypto algorithms. TLS, IPsec, MACsec, GCM, mem2mem, and NISP now work when nv_crypto_conf.crypto_policy = CRYPTO_POLICY_FIPS_LEVEL_2, meaning all cryptographic engines can also work in wrapped mode and not only in plaintext mode.

DSCP (priority) of ACK Packets

Added the ability to configure the DSCP (priority) of ACK packets using the ROCE_ACCL access register.

Performance Improvements

Added support for large MTU for force loopback QPs to improve performance (using the aes_xts_tweak_inc_64 parameter). This capability is enabled by mlxconfig LARGE_MTU_TWEAK_64 parameter.

DDR Poison: DDR Uncorrectable Error

When there is DDR poison (uncorrectable ECC error), the firmware reports the health syndrome ICM_FETCH_PCI_DATA_POISONED_ERR (0x14) and triggers an FLR on the function that caused the error.

Because the DDR data is most likely corrupted after this error, the firmware blocks other operations on this function.

Live Firmware Patch

Added support for Live Firmware Patch.

Reserved mkey

Added support for a reserved mkey index range. When enabled, a range of mkey indexes is reserved for mkey by name use.

Admin Queue

Added support for admin queue in virtio device object.

Enhanced NIC Mode: GGA Modules

Enabled GGA modules for all working modes (except for RXP) when using Enhanced NIC Mode.

Bug Fixes

See Bug Fixes in this Firmware Version section.

| Byte 192 of Page 0 for sff cables | Name | Auto Detect FEC | Current Default FEC | Previous Default FEC | P/N - Example of one module |
|---|---|---|---|---|---|
| 0x1A | 100GBase DWDM2 | No | NO FEC | RS FEC | |
| 0x21 | 100G BIDI PAM4 | No | NO FEC | RS FEC | SFBR-89BDDZ-CS4 |
| 0x25 | 100GBASE-DR | No | NO FEC | RS FEC | MMS1V70-CM |
| 0x26 | 100GBASE-FR | No | NO FEC | RS FEC | QSFP28-FR-C |
| 0x27 | 100GBASE-LR | No | NO FEC | RS FEC | SPTSBP4LLCDF |

Protocol Aware ETH Cables

| Byte 192 of Page 0 for sff cables | Name | Auto Detect FEC | Current Default FEC | Previous Default FEC | P/N - Example of one module |
|---|---|---|---|---|---|
| 0x1 | 100G AOC / 25GAUI C2M AOC | Yes | RS FEC | RS FEC | |
| 0x2 | 100GBASE-SR4 / 25GBASE-SR | Yes | RS FEC | RS FEC | MMA2P00-AS |
| 0x3 | 100GBASE-LR4 | Yes | NO FEC | RS FEC | MMA1L10-CR |
| 0x3 | 25GBASE-LR | Yes | RS FEC | FC FEC | MMA2L20-AR |
| 0x4 | 100GBASE-ER4 | Yes | NO FEC | RS FEC | SPQCEERCDFLM Source Photonics |
| 0x5 | 100GBASE-SR10 | Yes | NO FEC | RS FEC | |
| 0x6 | 100G CWDM4 MSA with FEC | Yes | RS FEC | RS FEC | MMA1L30-CM |
| 0x7 | 100G PSM4 Parallel SMF | Yes | RS FEC | RS FEC | MMS1C10-CM |
| 0x8 | 100G ACC / 25GAUI C2M ACC | Yes | RS FEC | RS FEC | |
| 0x9 | 100G CWDM4 MSA without FEC | Yes | NO FEC | RS FEC | LQ210CR-CPA2 |
| 0x17 | 100G CLR4 | Yes | RS FEC | RS FEC | |
| 0x18 | 100G AOC | Yes | NO FEC | RS FEC | MFA1A00-C010 |
| 0x19 | 100G ACC | Yes | NO FEC | RS FEC | |
| 0x20 | 100G SWDM4 | Yes | RS FEC | RS FEC | FTLC9152RGPL |
| 0x22 / 0x23 / 0x24 | 4WDM-10 MSA / 4WDM-20 MSA / 4WDM-40 MSA | Yes | RS FEC | RS FEC | |

Active DME Modules ETH Cables

Warning

To configure a FEC or speed that differs from the default, you must configure both sides of the link.

The following are examples of when FEC detection capability is available:

  • when a 25G SFP module is connected to a card, it will support FEC detection in 25G

  • when a 100G QSFP module is connected to a card, it will support FEC detection in 100G, but not in 50G or 25G
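The two examples above suggest a simple rule, sketched below: FEC detection is available only when the link runs at the module's nominal speed. This rule is a generalization inferred from the two examples, not a statement from the firmware documentation, and the function name is hypothetical.

```python
def fec_detection_available(module_speed_gbps, link_speed_gbps):
    """Assumed rule: detection works only at the module's own nominal speed."""
    return module_speed_gbps == link_speed_gbps


# The two cases listed above, expressed as (module speed, link speed) pairs.
for module, link in [(25, 25), (100, 100), (100, 50), (100, 25)]:
    print(module, link, fec_detection_available(module, link))
```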

Feature/Change

Description

32.38.3056

DPA Signing

Added support for customer-signed DPA application authentication.

Bug Fixes

See Bug Fixes in this Firmware Version section.

Feature/Change

Description

32.38.1002

DOCA Programmable Congestion Control

This new capability enables the user to program congestion control through DOCA, which provides APIs, libraries, reference applications, and advanced features such as high availability.

Header Modification

Added support for the metadata registers reg_c 8-11 (packet fields) in header matching and modification, and in Advanced Steering Operation (ASO) actions.

Precision Time Protocol (PTP)

Added support for PTP on 200G port link speed. PTP uses an algorithm and method for synchronizing clocks on various devices across packet-based networks to provide sub-microsecond accuracy. NVIDIA Spectrum supports PTP in both one-step and two-step modes and can serve either as a boundary or a transparent clock.

INT Packets

Added support for forwarding INT packets to the user application for monitoring purposes by matching the BTH acknowledge request bit (bth_a).

Crypto Support (GCM algorithm)

Added crypto support (GCM algorithm) via the Memory-to-Memory offload (MMO) engine.

NC-SI, Strap Values

Implemented NVIDIA NC-SI OEM command query_strap_options (command 0x0, parameter 0x34).

mlxconfig

Implemented the following mlxconfig parameters related to the sideband interface enable/disable method:

  • PCIE_IN_BAND_VDM_DISABLE: When TRUE, the management processor will disable PCIe in-band VDM (MCTP over PCIe) interface.

  • PCIE_SMBUS_DISABLE: When TRUE, the management processor will disable SMBUS (embedded on the PCIe connector) interface.

  • RBT_DISABLE: When TRUE, the management processor will disable RBT interface.

  • PLDM_FW_UPDATE_DISABLE: When TRUE, PLDM FW update over PCIe and SMBUS are disabled.

  • HM_RDE_DISABLE: When TRUE, RDE over PCIe and SMBUS are disabled.

AES-XTS

Added the ability to increase the tweak for every block by (1<<64) instead of by 1 in AES-XTS.
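To illustrate the difference between the two increment modes, the sketch below treats the tweak as a 128-bit unsigned integer that wraps on overflow. That representation and the function name are assumptions; the notes do not describe byte order or how the hardware stores the tweak.

```python
TWEAK_BITS = 128
TWEAK_MASK = (1 << TWEAK_BITS) - 1


def block_tweak(initial_tweak, block_index, large_increment):
    """Return the tweak of a given block under either increment mode."""
    step = (1 << 64) if large_increment else 1
    return (initial_tweak + block_index * step) & TWEAK_MASK


# Tweaks of the first three blocks in each mode, starting from tweak 5.
for index in range(3):
    print(index, hex(block_tweak(5, index, False)), hex(block_tweak(5, index, True)))
```

With the (1<<64) step, the block counter advances in the upper 64 bits of the tweak and the lower 64 bits keep their initial value.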

DPA PROCESS ERROR

Added support for a new value for the coredump_type field in DPA_PROCESS_COREDUMP: FIRST_ERROR_THREAD_DUMP (1).

Bug Fixes

See Bug Fixes in this Firmware Version section.

Feature/Change

Description

32.37.3012

General

This is the initial firmware release of NVIDIA BlueField-3 SmartNICs.

Return DPU to 'out of factory' State

Enables the user to return the DPU to its 'out of factory' state. This capability makes it possible to re-use DPUs and switch tenants easily in bare-metal deployments by clearing all the DPU data and then re-provisioning the DPU.

1k Emulated virtio-blk Devices

The virtio-blk device presents a block device to the Virtual Machine and offers high performance due to a thin software stack.

This version supports 1k emulated virtio-blk devices.

A typical configuration for this capability is:

  • 4 virtio-blk PFs and 253 virtio-blk VFs on each PF

or

  • 8 virtio-blk PFs and 126 virtio-blk VFs on each PF
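The two configurations above can be checked with a short calculation. It assumes that each PF counts as one emulated device alongside its VFs, which is an interpretation of the figures rather than something stated in these notes.

```python
def total_virtio_blk_devices(num_pfs, vfs_per_pf):
    """Count emulated devices as PFs plus all of their VFs (assumed counting rule)."""
    return num_pfs * (1 + vfs_per_pf)


# The two typical configurations listed above.
print(total_virtio_blk_devices(4, 253))
print(total_virtio_blk_devices(8, 126))
```

Under that assumption both layouts come to 1016 devices, which matches the "1k" figure.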

Geneve

GENEVE hardware offload enables the traditional offloads to be performed on the encapsulated traffic. The data center operators can decouple the overlay network layer from the physical NIC performance, thus achieving native performance in the new network architecture.

Monitoring Cloud Guest RoCE Statistics on Cloud Provider

This new capability enables the VM to track and limit its Vport's activity. This is done using the new q_counters counter, which enables aggregation of other Vports from the PF GVMI.

Linux Bridge Offload

Added a flow rule that enables offloading of multicast traffic by broadcasting it to multi-Flow-Table in FDB.

Selective Repeat

Selective repeat improves network utilization in case of a lossy fabric. This feature is enabled by default.

Provisioning Flow

Provisioning flow enables the user to "clean" flash data, and reprogram the flash and the NIC.

Dynamic VF MSIX Allocation

Added support for dynamic MSIX modification on a VF NVME device emulation.

If a PF NVME device emulation is created with dynamic_vf_msix_control = 1, then the dynamic_vf_msix_reset can set the PF device emulation's VF MSIX number to 0. The num_msix is used in the modified VF device emulation to modify the MSIX number of the VF device emulation.

InfiniBand Congestion Control (IB CC)

Enabled IB CC per Service Level (SL) for RC/UC on the HCA side.

Now different SLs can be configured to be CC on/off according to the bitmask decided by the software.
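A per-SL on/off bitmask can be sketched as follows. InfiniBand defines 16 service levels (0 to 15); the mapping of bit i to SL i and the function names are assumptions made for illustration, since the actual field layout is not given here.

```python
NUM_SLS = 16  # InfiniBand service levels 0..15


def cc_sl_bitmask(enabled_sls):
    """Build a bitmask with bit i set when CC is enabled for SL i (assumed layout)."""
    mask = 0
    for sl in enabled_sls:
        if not 0 <= sl < NUM_SLS:
            raise ValueError(f"invalid service level: {sl}")
        mask |= 1 << sl
    return mask


def cc_enabled(mask, sl):
    """Return True when the bitmask enables CC for the given SL."""
    return bool((mask >> sl) & 1)


# Enable congestion control on SLs 0, 3 and 5 only.
mask = cc_sl_bitmask([0, 3, 5])
print(bin(mask), cc_enabled(mask, 3), cc_enabled(mask, 4))
```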

Hardware Steering: Bulk Allocation

Added support for 32 actions in the header modify pattern using bulk allocation.

InfiniBand Congestion Control - RTT Response Service Level

The software can explicitly set the SL of an RTT response packet, instead of it being taken from the RTT request packet's SL.

The RTT response packet SL may be set/queried via the CONGESTION_CONTROL_HCA_NP_PARAMETER MAD.

PCC Algorithms

Enables smooth, static switching between PCC algorithms. In addition, the user can now switch between PCC algorithms while running traffic.

IPSEC Side Acceleration with DPDK

[Beta] Added support for crypto (GCM) via the MMO engine.

AES-XTS

Added the ability to increase the tweak for every AES-XTS block by (1<<64) instead of by 1.

Last updated on May 5, 2024.