Changes and New Features

Linux Kernel Upstream Release Notes v5.17

The following are the new features and changes that were added in this version.

Feature/Change

Description

5.4-3.0.3.0

GPUDirect

Kernel Space GPUDirect from VM

[ConnectX-5 and above] Added support for kernel space GPUDirect from the VM. To use GDS with high performance in a VM, set the ATS capability in ib_alloc_mr.

ASAP2

CTC Metering

[ConnectX-6 Dx] Added support for per-flow metering, in PPS or BPS, using OVS or TC.

Slow Path Metering

[ConnectX-6 Dx] Added support for slow path metering on the representor, in PPS or BPS, using OVS or TC.

Core

VF LAG

[ConnectX-6 Dx and BlueField-2] Added support for selecting the physical port based on the hash function defined by the bond, so that different packets of the same flow egress from the same physical port.

To enable this feature, set this mode on both bonded devices through the sysfs entry below before the devices are in switchdev mode:

echo "hash" > /sys/class/net/enp8s0f0/compat/devlink/lag_port_select_mode

To restore the legacy behaviour (queue affinity based selection), echo the following:

echo "queue_affinity" > /sys/class/net/enp8s0f0/compat/devlink/lag_port_select_mode

This feature requires setting LAG_RESOURCE_ALLOCATION to 1 with mlxconfig.
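As a minimal sketch of the steps above, the following helper writes the same mode to every bonded device. The function name, the `SYSFS_NET` override (useful for dry runs against a scratch directory), and the second interface name `enp8s0f1` are illustrative assumptions and do not come from the release notes; only the sysfs path and the two mode strings do.

```shell
#!/bin/sh
# Sketch: apply one LAG port selection mode to each bonded device.
# SYSFS_NET is a hypothetical override for dry runs; it defaults to the real sysfs tree.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

set_lag_port_select_mode() {
    mode="$1"
    shift
    # Only the two modes named in these notes are accepted.
    case "$mode" in
        hash|queue_affinity) ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
    for dev in "$@"; do
        knob="$SYSFS_NET/$dev/compat/devlink/lag_port_select_mode"
        if [ ! -e "$knob" ]; then
            echo "missing $knob" >&2
            return 1
        fi
        echo "$mode" > "$knob" || return 1
    done
}

# Example (as root, before the devices enter switchdev mode);
# enp8s0f1 is an assumed name for the second bonded device:
#   set_lag_port_select_mode hash enp8s0f0 enp8s0f1
```

Writing the mode to both devices in one call keeps the two ports of the bond from being left in different modes if the command is retyped by hand.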

Single IRQ for PCI Function

[ConnectX-4 and above] Added support for a single IRQ per PCI function. Using a high number of VFs requires a large number of IRQs, which the device cannot always support. This feature enables VFs to function with a minimum of a single IRQ instead of two.

This is done via the dynamic MSI-X feature. If dynamic MSI-X is not supported (old kernels), the following configuration will probe all VFs with a single IRQ:

$ mlxconfig -d <pci_dev> s NUM_VF_MSIX=0 STRICT_VF_MSIX_NUM=1

Netdev

ethtool EEPROM Support for DSFP

[ConnectX-6 Dx] Added support for reading DSFP module information. The change includes adding new options to ethtool netlink EEPROM module read API to read a specific page and bank.
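A page and bank read of this kind can be sketched with the `page` and `bank` options of `ethtool -m`, assuming an ethtool build recent enough to use the netlink module EEPROM interface. The wrapper name and the `ETHTOOL` override (which allows a dry run by substituting `echo`) are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: hex-dump one upper page of a module's memory map.
# ETHTOOL is a hypothetical override for dry runs; it defaults to the real binary.
ETHTOOL="${ETHTOOL:-ethtool}"

read_module_page() {
    dev="$1"
    page="$2"
    bank="$3"
    # Offset 128 with length 128 covers the upper half of the 256-byte map,
    # which is where the selected page is exposed.
    $ETHTOOL -m "$dev" hex on offset 128 length 128 page "$page" bank "$bank"
}

# Example: dump page 1, bank 0 of the module behind enp8s0f0
#   read_module_page enp8s0f0 1 0
```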

RDMA

Dynamic VF MSI-X Allocation

[ConnectX-5 and above] Added support for dynamic assignment of MSI-X vector count.

The number of MSI-X vectors is a PCI property visible through lspci and is a read-only field configured by the device. A static assignment of MSI-X vectors does not make good use of a newly created VF, because the future load and configuration under which that VF will be used are not known to the device. The VFs are created on the hypervisor and forwarded to VMs that have different properties (for example, number of CPUs).

To overcome the inefficient distribution of MSI-X vectors, the kernel can now instruct the device with the needed number of vectors before the VF is initialized and bound to the driver.
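On upstream kernels this control is exposed through the PCI sysfs attributes `sriov_vf_total_msix` (on the PF) and `sriov_vf_msix_count` (on each VF). The sketch below assumes those attributes are present on the running kernel; the function name, the `PCI_SYSFS` override for dry runs, and the PCI address in the example are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: assign an MSI-X vector count to a VF before a driver binds to it.
# PCI_SYSFS is a hypothetical override for dry runs; it defaults to the real sysfs tree.
PCI_SYSFS="${PCI_SYSFS:-/sys/bus/pci/devices}"

set_vf_msix_count() {
    vf="$1"
    count="$2"
    vf_dir="$PCI_SYSFS/$vf"
    # The count has to be set while no driver is bound to the VF.
    if [ -e "$vf_dir/driver" ]; then
        echo "$vf is bound to a driver; unbind it first" >&2
        return 1
    fi
    # The PF reports the pool of vectors that can be distributed among its VFs.
    total=$(cat "$vf_dir/physfn/sriov_vf_total_msix") || return 1
    if [ "$count" -gt "$total" ]; then
        echo "requested $count vectors, but the PF pool holds only $total" >&2
        return 1
    fi
    echo "$count" > "$vf_dir/sriov_vf_msix_count"
}

# Example (as root; the PCI address is a placeholder):
#   set_vf_msix_count 0000:08:00.2 8
```

Sizing the count to the number of CPUs of the target VM is one natural choice, since the VMs receiving the VFs differ in exactly that property.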

DV API for DMA GGA memcpy

[BlueField-2 and above] DMA memcpy is one of several Memory-to-Memory Offloads (MMO) available from BlueField-2 onwards. It utilizes the GGA modules on the DPU to perform DMA memcpy, thus improving performance. The memcpy can be done locally, on the same host, or between the host and the Arm.

To use this feature, expose DV API.

Steering UserSpace

Set DR Matcher Layout

[ConnectX-6 Dx] Added support for a new RDMA CORE DR API to set the DR matcher layout by calling mlx5dv_dr_matcher_set_layout.

Setting the matcher layout allows presetting the matcher size and increasing matcher rule capacity, and enables other performance improvements when the matcher size is known.

Flex Parsers misc4

[ConnectX-5 and ConnectX-6 Dx] Added the ability to expose flex parsers 4-7, provided by misc4, to extend the matching ability of flex parsers. All flex parsers can now be matched at the same time.

Software encap Action

[ConnectX-5 and above] Added support for software encap action. There is a requirement for more than 1M encap actions, but encap action creation currently uses devx, which is very slow at that scale. A way for software to create encap actions is therefore needed.

The encap reformat action creation in rdma-core can now be done via software, rather than devx. It will use the new ICM memory type of software encap and directly copy encap data there, then use the memory pointer for flow creation.

Bug Fixes

See Bug Fixes in This Version.

For additional information on the new features, please refer to the MLNX_OFED User Manual.

Customer Affecting Change

Description

5.4-3.0.3.0

CUDA, UCX, HCOLL

For UCX-CUDA and hcoll-cuda, CUDA was upgraded from version 10.2 to 11.2.

MLNX_OFED Verbs API Migration

As of the MLNX_OFED v5.0 release (Q1 2020), the MLNX_OFED Verbs APIs have migrated from the legacy version of the user space verbs libraries (libibverbs, libmlx5, etc.) to the upstream version, rdma-core.

For the list of MLNX_OFED verbs APIs that have been migrated, refer to the Migration to RDMA-Core document.

© Copyright 2023, NVIDIA. Last updated on Oct 23, 2023.