NVIDIA MLNX_OFED Documentation v5.9-0.5.6.0.107
Linux Kernel Upstream Release Notes v5.17

Changes and New Features

The following are the new features and changes that were added in this version. The supported adapter cards are specified as follows:

Supported Cards

Description

All HCAs

Supported in the following adapter cards unless specifically stated otherwise:

ConnectX-4 / ConnectX-4 Lx / ConnectX-5 / ConnectX-6 / ConnectX-6 Dx / ConnectX-6 Lx / ConnectX-7 / BlueField-2

ConnectX-6 Dx and above

Supported in the following adapter cards unless specifically stated otherwise:

ConnectX-6 Dx / ConnectX-6 Lx / ConnectX-7 / BlueField-2

ConnectX-6 and above

Supported in the following adapter cards unless specifically stated otherwise:

ConnectX-6 / ConnectX-6 Dx / ConnectX-6 Lx / ConnectX-7 / BlueField-2

ConnectX-5 and above

Supported in the following adapter cards unless specifically stated otherwise:

ConnectX-5 / ConnectX-6 / ConnectX-6 Dx / ConnectX-6 Lx / ConnectX-7 / BlueField-2

ConnectX-4 and above

Supported in the following adapter cards unless specifically stated otherwise:

ConnectX-4 / ConnectX-4 Lx / ConnectX-5 / ConnectX-6 / ConnectX-6 Dx / ConnectX-6 Lx / ConnectX-7 / BlueField-2

Feature/Change

Description

5.9-0.5.6.0.107

Firmware

Updated firmware version to 28.36.2020 to be used for DGX H100 systems.

5.9-0.5.6.0

ASAP2 Features

Linux Bridge VLAN Filtering of 802.1Q Packets

[ConnectX-6 Dx] Extended mlx5 Linux bridge VLAN offload to support packets tagged with the 802.1Q VLAN ethertype.

Offloading sFlow Sampling Rules

[ConnectX-5 and above] Added support for sFlow sampling rules offloads. sFlow is an industry standard technology for monitoring high speed switched networks. Open vSwitch integrated sFlow to extend the visibility into virtual servers, ensuring data center visibility and control.

Core Features

Configuring Shared Buffer Size

[ConnectX-6 Dx and above] Enabled the user to control the shared buffer size and configuration implicitly.

Each time the user triggers a port buffer command, the driver updates the shared buffer configuration accordingly.

Control SF Class

[All HCAs] Added support for Control SF Class. By default, each PCI function (PF, VF, or SF) has its netdevice, RDMA, and vdpa-net devices always enabled. This feature enables the user to control which of these device functionalities to enable or disable.

Note: Requires kernel 5.18 or higher.
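As a rough illustration of what this control can look like from user space, the sketch below builds (but does not run) devlink command lines that toggle the device classes of a function. It assumes the upstream generic devlink parameters enable_eth, enable_rdma, and enable_vnet, set with cmode driverinit and applied by a devlink dev reload. The mapping of the classes named above to those parameters and the device handle in the usage lines are illustrative assumptions; refer to the MLNX_OFED User Manual for the supported procedure.

```python
# Assumed mapping from the device classes named in the text to upstream
# generic devlink parameter names (verify against your kernel's devlink docs).
CLASS_PARAMS = {
    "netdev": "enable_eth",
    "rdma": "enable_rdma",
    "vdpa-net": "enable_vnet",
}


def build_class_commands(dev, classes):
    """Return devlink argv lists that set each class on `dev`, then reload it.

    `dev` is a devlink handle such as "pci/0000:06:00.0".
    `classes` maps a class name from CLASS_PARAMS to True (enable) or False.
    Nothing is executed; the caller decides whether to run the commands.
    """
    cmds = []
    for cls, enabled in classes.items():
        cmds.append([
            "devlink", "dev", "param", "set", dev,
            "name", CLASS_PARAMS[cls],
            "value", "true" if enabled else "false",
            "cmode", "driverinit",
        ])
    # driverinit parameters take effect only after the device is reloaded.
    cmds.append(["devlink", "dev", "reload", dev])
    return cmds


if __name__ == "__main__":
    # Hypothetical SF handle: keep the netdevice, disable RDMA and vdpa-net.
    wanted = {"netdev": True, "rdma": False, "vdpa-net": False}
    for argv in build_class_commands("auxiliary/mlx5_core.sf.1", wanted):
        print(" ".join(argv))
```

Building the argument lists separately from running them makes it easy to review the exact commands before applying them to a live device.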

Installation Features

ip2gid Tool

[All HCAs] Added support for ip2gid tool.

This tool does the following:

  1. Resolves a destination IP into a destination GID, which is needed when running rdmacm applications (ip2gid).

  2. Resolves a GID into one PR (PathRecord) or multiple PRs if needed (gid2lid).

This tool is needed when rdmacm is used to initiate InfiniBand traffic between nodes on different IP subnets in InfiniBand fabrics.

NetDev Features

Support RSS over XSK Queues

[All HCAs] Use default RSS functionality to spread traffic across different XSK queues instead of having to provide explicit steering rules.

TLS TIS Pool

[TLS-Enabled Devices] Per-connection hardware TIS objects are used to maintain the device TLS TX context. A software TIS pool is now used to recycle the TIS objects instead of destroying and re-creating them. This reduces the interaction with the device via the FW command interface, which increases the TLS connection rate.
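The recycling idea can be illustrated with a toy object pool. This is only a conceptual sketch, not the driver's implementation: TisPool, create_fn, and the fw_commands counter are hypothetical names, and create_fn merely stands in for the firmware command that creates a TIS object. Details a real pool would need, such as bounding its size or resetting per-connection context before reuse, are not modeled.

```python
from collections import deque
from itertools import count


class TisPool:
    """Toy software pool that recycles objects instead of re-creating them."""

    def __init__(self, create_fn):
        self._create = create_fn   # stand-in for a firmware "create TIS" command
        self._free = deque()       # released objects waiting to be reused
        self.fw_commands = 0       # how many times the stand-in was invoked

    def acquire(self):
        """Reuse a released object if one exists, otherwise create a new one."""
        if self._free:
            return self._free.popleft()
        self.fw_commands += 1
        return self._create()

    def release(self, tis):
        """Return an object to the pool rather than destroying it."""
        self._free.append(tis)


if __name__ == "__main__":
    ids = count(1)
    pool = TisPool(lambda: next(ids))
    # Simulate 1000 short-lived connections, one at a time.
    for _ in range(1000):
        tis = pool.acquire()
        pool.release(tis)
    print(pool.fw_commands)  # prints 1
```

In this sequential example only the first connection pays for a create call; every later connection reuses the same object, which is the effect the pool relies on to raise the connection rate.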

RDMA Features

Expand Rep Counters

[ConnectX-5 and above] Added RDMA traffic-only counters for rep devices. These counters can now be read from the host with ethtool or from sysfs, and not only from the container.
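A minimal sketch of reading such counters from the host with ethtool follows. It assumes the counters appear in the output of ethtool -S and that their names contain the substring "rdma"; the interface name and the counter name in the usage lines are hypothetical, so check the actual names on your system.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output ("name: value" lines) into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and any non-numeric lines.
        if sep and name and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def rdma_counters(stats):
    """Keep only counters whose name contains "rdma" (naming is an assumption)."""
    return {name: value for name, value in stats.items() if "rdma" in name}


def read_rep_rdma_counters(ifname):
    """Run `ethtool -S <ifname>` and return the RDMA-related counters."""
    out = subprocess.run(
        ["ethtool", "-S", ifname],
        check=True, capture_output=True, text=True,
    ).stdout
    return rdma_counters(parse_ethtool_stats(out))


if __name__ == "__main__":
    # Hypothetical sample output; real counter names may differ.
    sample = (
        "NIC statistics:\n"
        "     rx_packets: 10\n"
        "     rx_vport_rdma_unicast_packets: 7\n"
    )
    print(rdma_counters(parse_ethtool_stats(sample)))
```

On a live system, read_rep_rdma_counters would be called with the name of the representor interface.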

UMR QP Resiliency

[ConnectX-5 and above] Added a recovery flow for the driver's UMR logic so that other UMR requests can be processed after the faulty UMR is dropped and the UMR QP is reset. Previously, a faulty UMR request would move the QP to the error state and prevent any further UMRs from being issued.

General

Bug fixes

For additional information on the new features, please refer to the MLNX_OFED User Manual.

Customer Affecting Change

Description

5.9-0.5.6.0

Deprecation, LAG Mode via Sysfs

Setting LAG mode via Sysfs will be deprecated in a future release. Instead, LAG Hash mode will be used by default, similar to upstream behavior.

LAG Configuration, PCI Error

From version 5.9, the LAG configuration is lost if the driver incurs a PCI error. Make sure to reconfigure the bond after the driver completes its recovery from the PCI error.

In releases prior to 5.9, in case of a PCI error (EEH injections on a PPC setup), the driver recovered the LAG bond and reconfigured it automatically if it was configured before the error occurred.

MLNX_OFED Verbs API Migration

As of the MLNX_OFED v5.0 release (Q1 2020), the MLNX_OFED Verbs APIs have migrated from the legacy version of the user space verbs libraries (libibverbs, libmlx5, etc.) to the upstream version, rdma-core.

For the list of MLNX_OFED verbs APIs that have been migrated, refer to the Migration to RDMA-Core document.

© Copyright 2023, NVIDIA. Last updated on Nov 27, 2023.