Changes and New Features

The following are the new features and changes that were added in this version.

Feature/Change

Description

5.5-1.0.3.2

ASAP2 Features

Bridge Offloads with VLAN

[ConnectX-4 and above] Added support for bridge offloads with VLAN support that works on top of mlx5 representors in switchdev mode.
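
As a rough illustration of the kind of setup this offload applies to, the sketch below creates a VLAN-aware Linux bridge and attaches representors to it with standard iproute2 commands. The bridge name, representor names, VLAN ID, and the IP and BRIDGE override variables are assumptions made for the example, not values from this release, and the devices are assumed to already be in switchdev mode.

```shell
# Sketch only: all names are placeholders, not values from this release.
# IP and BRIDGE are hypothetical overrides so the commands can be previewed
# (e.g. IP=echo BRIDGE=echo) instead of executed.
IP="${IP:-ip}"
BRIDGE="${BRIDGE:-bridge}"

# setup_vlan_bridge <bridge> <vid> <representor>...
# Creates a VLAN-filtering bridge and adds each representor as an
# untagged access port in VLAN <vid>.
setup_vlan_bridge() {
    br="$1"; vid="$2"; shift 2
    $IP link add name "$br" type bridge vlan_filtering 1 || return 1
    $IP link set dev "$br" up || return 1
    for rep in "$@"; do
        $IP link set dev "$rep" master "$br" || return 1
        $IP link set dev "$rep" up || return 1
        $BRIDGE vlan add dev "$rep" vid "$vid" pvid untagged || return 1
    done
}
```

A call such as setup_vlan_bridge br0 100 <rep0> <rep1>, run as root, would apply the configuration; setting IP=echo and BRIDGE=echo first prints the commands without running them.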

Supporting OVS Groups in Fast-Failover Mode

[ConnectX-6 Dx] Improved OVS failover by adding support for OVS groups in fast-failover mode together with VF LAG configuration in OVS.

Exposing Hairpin Queues Information

[ConnectX-6 Dx and BlueField-2] Added support for exposing a per-device hairpin out-of-buffer drop counter. The counter reports only buffer drops on hairpin queues that were opened on the queried device.

To enable this counting mode, run the following before any hairpin rules are created:

echo "on <peer_devname>" > /sys/class/net/<dev>/hp_oob_cnt_mode

where <peer_devname> is the peer device to which traffic arriving at the configured device is forwarded for transmission.

To read the drop counter, use the following: cat /sys/class/net/<dev>/hp_oob_cnt
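
The two steps above could be wrapped in small helpers along the following lines. This is a minimal sketch: the function names and the SYSFS_NET override are hypothetical and exist only to make the example easy to try against a scratch directory; the sysfs file names are the ones given above.

```shell
# SYSFS_NET is a hypothetical override for testing; the default is the
# real sysfs location named in the text above.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# hp_oob_enable <dev> <peer_devname>
# Turns on hairpin out-of-buffer counting; must be called before any
# hairpin rules are created on <dev>.
hp_oob_enable() {
    dev="$1"; peer="$2"
    if [ ! -d "$SYSFS_NET/$dev" ]; then
        echo "no such device: $dev" >&2
        return 1
    fi
    echo "on $peer" > "$SYSFS_NET/$dev/hp_oob_cnt_mode"
}

# hp_oob_read <dev>
# Prints the current hairpin out-of-buffer drop counter for <dev>.
hp_oob_read() {
    cat "$SYSFS_NET/$1/hp_oob_cnt"
}
```

Typical use would be hp_oob_enable <dev> <peer_devname> once at setup time, followed by periodic hp_oob_read <dev> calls.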

Linux Bridge Offload

[ConnectX-6 Dx and BlueField-2] Extended bridge offloads to support bonding (VF LAG): the bond device is attached to the bridge instead of the uplink representors.

VLAN Pop/Push

[ConnectX-6 Dx] Added OOB support for VLAN push on Rx (wire to VF) and VLAN pop on Tx (VF to wire) in switchdev mode.

Offload Forwarding to Multiple Destinations

[ConnectX-5 and above] Added support for offloading packet replication to up to 32 destinations using a TC rule.
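
One way such a rule might be expressed is with chained mirred actions in a single flower filter. The sketch below builds that action chain for a list of destinations; the function name, the device names, the match on all IPv4 traffic, and the TC override variable are assumptions for illustration, and an ingress qdisc is assumed to already exist on the source device.

```shell
# TC is a hypothetical override (e.g. TC=echo previews the command).
TC="${TC:-tc}"
MAX_DESTS=32   # limit stated in the release note

# mirror_to_many <src_dev> <dest_dev>...
# Builds one flower rule on <src_dev> ingress that mirrors each packet to
# every destination but the last, and redirects it to the last one.
mirror_to_many() {
    src="$1"; shift
    if [ "$#" -lt 1 ] || [ "$#" -gt "$MAX_DESTS" ]; then
        echo "need 1..$MAX_DESTS destinations, got $#" >&2
        return 1
    fi
    actions=""
    while [ "$#" -gt 1 ]; do
        actions="$actions action mirred egress mirror dev $1 pipe"
        shift
    done
    actions="$actions action mirred egress redirect dev $1"
    # $actions is intentionally unquoted so it splits into separate words.
    $TC filter add dev "$src" ingress protocol ip flower skip_sw $actions
}
```

Running it with TC=echo shows the generated rule before it is applied, which is a convenient way to check the action chain for a long destination list.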

Slow Path Metering

[ConnectX-4 and above] Expanded the RDMA statistic tool to support setting vendor-specific optional counters dynamically using netlink.

Added the following optional counters to mlx5_ib:

cc_rx_ce_pkts, cc_rx_cnp_pkts, cc_tx_cnp_pkts.

Example:

$ rdma statistic mode supported link rocep8s0f0/1

link rocep8s0f0/1 supported optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts,cc_tx_cnp_pkts

$ sudo rdma statistic set link rocep8s0f0/1 optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts

$ rdma statistic mode link rocep8s0f0/1

link rocep8s0f0/1 optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts

$ sudo rdma statistic set link rocep8s0f0/1 optional-counters cc_rx_ce_pkts

$ rdma statistic mode link rocep8s0f0/1

link rocep8s0f0/1 optional-counters cc_rx_ce_pkts

$ sudo rdma statistic unset link rocep8s0f0/1 optional-counters

Using Specific Interface for Tunnel Route Lookup

[ConnectX-5 and above] Added the ability to use a specific interface for tunnel route lookup if the tunnel was created with the current device.

Core Features

Subfunction Trust Configuration Enhancement

[ConnectX-5 and above] Added support via mlxdevm to mark a given PCI subfunction (SF) or virtual function (VF) as a trusted function. The device/firmware decides how to define privileges and access to resources.

Prevent VF Memory Exhaustion

[All] Added support for preventing VF memory exhaustion. This feature exposes a sysfs entry that lets the system administrator set a limit on each VF's memory consumption.

Note: Currently only supported on Ethernet.

BlueField NIC Separate Reset

[BlueField-2] Added support for resetting the NIC domain of BlueField-2 while keeping ARM alive.

Multiple Steering Priorities for FDB Rules

[ConnectX-6 Dx and BlueField-2] Added support for multiple flow steering priorities for FDB rules.

NetDev Features

Traffic Engineering: Hierarchical QoS

[ConnectX-5 and above] Added support for offloading the HTB qdisc to the NIC, allowing it to scale better by eliminating a single locking point. The configuration is done with the TC commands.

Note: Kernel 5.15 or higher is required. Limited to 256 nodes.
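
A minimal sketch of what such a TC configuration might look like is shown below. The function name, class layout, classid numbering, rates, and the TC override variable are illustrative assumptions; it also assumes an iproute2 version whose htb qdisc accepts the offload keyword. Classifying traffic into the leaf classes is not shown.

```shell
# TC is a hypothetical override (e.g. TC=echo previews the commands).
TC="${TC:-tc}"

# htb_offload_setup <dev> <total_rate> <leaf_rate>...
# Installs an offloaded HTB root qdisc, one inner class carrying the total
# rate, and one leaf class per <leaf_rate> (classids 1:10, 1:11, ...).
htb_offload_setup() {
    dev="$1"; total="$2"; shift 2
    $TC qdisc replace dev "$dev" root handle 1: htb offload || return 1
    $TC class add dev "$dev" parent 1: classid 1:1 htb \
        rate "$total" ceil "$total" || return 1
    minor=10
    for rate in "$@"; do
        # Each leaf is guaranteed <rate> and may borrow up to <total_rate>.
        $TC class add dev "$dev" parent 1:1 classid "1:$minor" htb \
            rate "$rate" ceil "$total" || return 1
        minor=$((minor + 1))
    done
}
```

For example, htb_offload_setup <dev> 10gbit 1gbit 2gbit would create two leaf classes under a 10 Gbit/s parent; with TC=echo set, the same call only prints the tc commands.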

TLS RX Resynchronization Resiliency Feature Description

[ConnectX-6 Dx and above] Added support for driver resiliency against high load of RX resync operations.

Simultaneous PTP and CQE Compression

Added support for activating PTP and CQE compression simultaneously. Since CQE compression might harm PTP accuracy, the feature moves PTP packets to a dedicated queue where they are not subject to compression. However, this configuration conflicts with setting aRFS. Turning off CQE compression causes a hiccup in traffic, which may cause a loss of synchronization. To overcome this, restart the synchronization.

Note: This combination is supported only for Ethernet drivers. Other driver profiles, like IPoIB and representors, do not support this combination.

RDMA Features

ODP On Demand Synchronization

Added an option to prefetch an ODP MR without faulting. This updates the device page table with the CPU pages that are already present, reducing page faults in the system.

DV API for AES-XTS

[ConnectX-6 and above] Added DV API that allows configuration of MKey with AES-XTS crypto offloads. The MKey can be configured for both crypto and signature offloads.

Huge Page Support for DEVX UMEMs

[ConnectX-4 and above] Added support for creating a DEVX UMEM with page sizes larger than 4K. For some device objects (e.g., RegEx) this is a must. In addition, page sizes larger than 4K may need fewer MTTs, which may improve performance.

A new API, mlx5dv_devx_umem_reg_ex(), was added, which requests specific page sizes. It gives the application better control over the required UMEM page size. The new API will be part of rdma-core V35.

ODP Locking Optimization

[ConnectX-4 and above] Removed synchronize_srcu() from the ODP flow, because it was a time-consuming part of dereg_mr.

Note: This only affects the driver and not the firmware.

Export Object IDs to Users

[ConnectX-4 and above] Extended support for the "rdma res show" command to SRQ and context resources.

Raw WQE

[ConnectX-5 and above] Added support for Raw WQE (mlx5dv_wr_raw_wqe). This feature allows applications to build a custom work request (WQE) that is not supported by the verbs or the driver and post it on a normal QP. It extends the IBV work request API (ibv_wr_*) with mlx5-specific features for sending a work request.

mlx5 Over VFIO

Added support for mlx5 user space driver over VFIO.

This feature enables an application to take full ownership of the opened device and run any firmware command (e.g., port up/down) without risk of affecting other users.

The application looks and feels like a regular RDMA application over DEVX. It uses the verbs API to open and close a device, and then mostly uses DEVX APIs to interact with the device.

New mlx5 DV APIs were added to get ibv_device for a given mlx5 PCI name and to manage device specific events.

For a description of the relevant APIs and their expected usage, see the following:

mlx5dv_get_vfio_device_list()

mlx5dv_vfio_get_events_fd()

mlx5dv_vfio_process_events()

Software Steering Features

system_image_guid to Group Bonding Interfaces

[ConnectX-4 and above] Added support for using system_image_guid to group bonding interfaces.

With some specific NICs, each interface may have a different PCIe domain, bus, device, or function ID. For interfaces with the same system_image_guid, the driver assumes they reside on the same physical device and uses native_port_id to distinguish each interface's index. If this is unsupported, the driver falls back to the PCIe BDF.

Software Steering Dump File Parser Tool

[ConnectX-4 and above] Added the mlx_steering_dump tool, which parses software steering dump files. These files include information about the domains, tables, matchers, and rules created by software steering (mlx5dv_dr API). The tool can be used offline by providing a dump file as input, or it can trigger a DPDK application (such as testpmd) to generate the dump and then parse it.

Installation Features

Multiple Development Headers Packages

Added support for installing multiple mlnx-ofa_kernel development headers packages (for different kernel versions of the same mlnx-ofa_kernel package version) side by side on the same system.

Kernel Module Signature

Added signing of kernel modules in the EulerOS 2.0 SP8-SP10 (x86_64 and aarch64) builds of MLNX_OFED.

Enable sf-cfg-drv by Default in EulerOS2.0

Enabled SF_CFG (SF config dummy driver, --with-sf-cfg-drv) on EulerOS2.0 SP8 and SP10.

For additional information on the new features, please refer to the MLNX_OFED User Manual.

Customer Affecting Change

Description

5.5-1.0.3.2

Disabling RoCE While Using sysfs

When using sysfs to enable or disable RoCE on kernel 5.5 and later, the "devlink reload" command (from the iproute2 devlink tool) must be run to activate the RoCE status change.

Disable RoCE example:

1. echo 0 > /sys/bus/pci/devices/0000:08:00.0/roce_enable

2. devlink dev reload pci/0000:08:00.0
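
The two steps above can be combined into a small helper so the reload is never forgotten. This is a sketch under assumptions: the function name and the SYSFS_PCI and DEVLINK override variables are hypothetical and are there only to allow a dry run; the sysfs path and devlink invocation are the ones shown in the example.

```shell
# SYSFS_PCI and DEVLINK are hypothetical overrides for testing or dry runs;
# by default the real sysfs tree and devlink binary are used.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"
DEVLINK="${DEVLINK:-devlink}"

# set_roce <pci_bdf> <0|1>
# Writes the requested state to roce_enable, then reloads the device so
# the change takes effect.
set_roce() {
    bdf="$1"; state="$2"
    case "$state" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 1 ;;
    esac
    echo "$state" > "$SYSFS_PCI/$bdf/roce_enable" || return 1
    $DEVLINK dev reload "pci/$bdf"
}
```

For the device in the example, set_roce 0000:08:00.0 0 disables RoCE and set_roce 0000:08:00.0 1 enables it again.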

mlnx-ofa_kernel Installation

The source code for mlnx-ofa_kernel is no longer installed by default on RPM-based distributions (e.g., RHEL and SLES).

Notes:

• mlnx-ofa_kernel is included in the <> in the MLNX_OFED distributions under RPMS/ and may be manually installed from there.

• There is no change for deb-based distributions (Debian and Ubuntu). The full source is included, as before, in the package mlnx-ofed-kernel-dkms.

Software Encapsulation Compatibility

There is an encapL2 compatibility issue with accelerated reformat action creation using mlx5dv_dr API.

Using OFED 5.4 with firmware xx.32.1xxx and above, or using OFED 5.5 with firmware lower than xx.32.1xxx, will not allow the accelerated reformat action. (Using OFED 5.4 and 5.5 with their bundled firmware works properly.)

xpmem in RHEL8

Added xpmem packages in RHEL8 builds.

Python3

Starting with OVS-DPDK 2.15, the minimum required Python version is 3, and OVS-DPDK will not be compiled using Python 2.

MLNX_OFED Verbs API Migration

As of the MLNX_OFED v5.0 release (Q1 2020), the MLNX_OFED verbs API has migrated from the legacy user space verbs libraries (libibverbs, libmlx5, etc.) to the upstream rdma-core version.

For the list of MLNX_OFED verbs APIs that have been migrated, refer to the Migration to RDMA-Core document.

© Copyright 2023, NVIDIA. Last updated on Oct 23, 2023.