MLNX_OFED Documentation Rev 4.6-1.0.1.1

Changes and New Features

The following are the changes and/or new features that have been added to this version of MLNX_OFED.

HCAs

Feature/Change

Description

ConnectX-3/ConnectX-3 Pro

Devlink Configuration Parameters Tool

Added support for a set of configuration parameters that can be changed by the user through the Devlink user interface.

ConnectX-4 and above

ODP Pre-fetch

Added support for pre-fetching a range of an on-demand paging (ODP) memory region (MR). Pre-fetching reduces latency by making pages present with RO/RW permissions before the actual I/O is conducted.

DevX Privilege Enforcement

DevX privileges are now enforced by firmware. This enables future device functionality without the need to make driver changes unless a new privilege type is introduced.

DevX Interoperability APIs

Added support for modifying and/or querying for a verb object (including CQ, QP, SRQ, WQ, and IND_TBL APIs) via the DevX interface.

This enables interoperability between verbs and DevX.

DevX Asynchronous Query Commands

Added support for running QUERY commands over the DevX interface in an asynchronous mode. This enables applications to issue many commands in parallel while firmware processes the commands.

DevX User-space PRM Handles Exposure

Exposed all PRM handles to user space so that DevX user applications can mix verbs objects with DevX objects.

For example, take the cqn from a created ibv_cq and use it in devx_create(QP).

Indirect Mkey ODP

Added the ability to create indirect Mkeys with ODP support over DevX interface.

XDP Redirect

Added support for the XDP_REDIRECT feature on both the ingress and egress sides. Using this feature, incoming packets on one interface can be redirected very quickly into the transmission queue of another capable interface. This is typically used for load balancing.

RoCE Disablement

Added the option to disable RoCE traffic handling. This enables forwarding of traffic over UDP port 4791, which is handled as RoCE traffic when RoCE is enabled.

When RoCE is disabled, there is no GID table, only the Raw Ethernet QP type is supported, and RoCE traffic is handled as regular Ethernet traffic.

Forward Error Correction (FEC) Encoding

Added the ability to query, modify, and disable Forward Error Correction (FEC) encoding via ethtool.
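As a minimal sketch of the ethtool workflow described above, the helper below prints the matching ethtool command instead of running it, so the command can be reviewed before it touches a link. The helper name fec_cmd and the interface name eth0 are hypothetical, and the encodings a given port accepts depend on the device and firmware.

```shell
# Print (do not run) the ethtool command for a FEC action.
# $1 = interface name (hypothetical: eth0)
# $2 = "show" to query, or one of auto|off|rs|baser to set the encoding
fec_cmd() {
    case "$2" in
        show)
            echo "ethtool --show-fec $1"
            ;;
        auto|off|rs|baser)
            # "off" disables FEC; the other values select an encoding
            echo "ethtool --set-fec $1 encoding $2"
            ;;
        *)
            echo "unsupported FEC action: $2" >&2
            return 1
            ;;
    esac
}

# Example: review, then execute with eval once the command looks right.
# fec_cmd eth0 show
# fec_cmd eth0 off
```

Printing rather than executing keeps the sketch safe to try on a machine without the adapter installed.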

RAW Per-Lane Counters Exposure

Exposed RAW error counters per cable-module lane via ethtool stats. The counters show the number of errors before FEC correction (if enabled).

For further information, see phy_raw_errors_lane[i] under the Physical Port Counters section of the Understanding mlx5 ethtool Counters Community post.
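One way to total the per-lane counters is to filter the ethtool statistics output, as in the sketch below. It assumes the counters appear in `ethtool -S` output as "name: value" lines and that the names contain phy_raw_errors_lane; the exact counter names printed by a given driver version may differ, so the match string is a parameter. The helper name sum_lane_errors is hypothetical.

```shell
# Sum all counters whose name contains the given string.
# Reads "ethtool -S <iface>" style output ("name: value" lines) on stdin.
# $1 = substring to match (assumed default: phy_raw_errors_lane)
sum_lane_errors() {
    awk -F: -v pat="${1:-phy_raw_errors_lane}" '
        index($1, pat) { total += $2 }   # add the value of each matching counter
        END            { print total + 0 }
    '
}

# Example (eth0 is a hypothetical interface name):
# ethtool -S eth0 | sum_lane_errors
```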

ConnectX-4 Lx and above

VF LAG

Added support for High Availability and load balancing for Virtual Functions of different physical ports in SwitchDev SR-IOV mode.

ConnectX-5 and above

ASAP2 Offloading VXLAN Decapsulation with HW LRO

Added support for performing hardware Large Receive Offload (HW LRO) on VFs with HW-decapsulated VXLAN.

For further information on the VXLAN decapsulation feature, please refer to ASAP2 User Manual under www.mellanox.com -> Products -> Software -> ASAP2.

PCI Atomic Operations

Added the ability to run atomic operations on local memory without involving verbs API or compromising the operation's atomicity.

ConnectX-5

Virtual Ethernet Port Aggregator (VEPA)

Added support for activating/deactivating Virtual Ethernet Port Aggregator (VEPA) mode on a single virtual function (VF). To turn on VEPA on the second VF, run:

echo ON > /sys/class/net/enp59s0/device/sriov/1/vepa
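The command above can be generalized to any VF, as in the following sketch. It assumes the sysfs layout shown in the command, with a zero-based VF index (so index 1 is the second VF). The helper names are hypothetical, and the OFF value is an assumption, since only ON appears above.

```shell
# Build the sysfs path of the VEPA control file for a VF.
# $1 = PF interface name, $2 = zero-based VF index (1 is the second VF)
vepa_path() {
    echo "/sys/class/net/$1/device/sriov/$2/vepa"
}

# Write a state to the VEPA control file, failing early if it is not writable.
# $3 = state string: "ON" is taken from the command above, "OFF" is assumed.
set_vepa() {
    if [ ! -w "$(vepa_path "$1" "$2")" ]; then
        echo "cannot write $(vepa_path "$1" "$2")" >&2
        return 1
    fi
    echo "$3" > "$(vepa_path "$1" "$2")"
}

# Example: turn on VEPA on the second VF of enp59s0.
# set_vepa enp59s0 1 ON
```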

VFs Rate Limit

Added support for setting a rate limit on groups of Virtual Functions rather than on an individual Virtual Function.

ConnectX-6

ConnectX-6 Support

[Beta] Added support for ConnectX-6 (VPI only) adapter cards.

NOTE: In HDR installations that are built with remotely managed Quantum-based switches, the switch’s firmware must be upgraded to version 27.2000.1142 prior to upgrading the HCA’s (ConnectX-6) firmware to version 20.25.1500. When using ConnectX-6 HCAs with firmware v20.25.1500 and connecting them to Quantum-based switches, make sure the Quantum firmware version is 27.2000.1142 in order to avoid any critical link issues.

Ethtool 200Gbps

ConnectX-6 hardware introduces support for 200Gbps and 50Gbps-per-lane link mode. MLNX_OFED supports full backward compatibility with previous configurations.

Note that in order to advertise the newly added link modes, the full bitmap of the link modes must be passed to ethtool, as described in the ethtool man page. For the full bitmap list per link mode, please refer to the MLNX_OFED User Manual.

NOTE: This feature is firmware-dependent. Currently, ConnectX-6 Ethernet firmware supports up to 100Gbps only. Thus, this capability may not function properly using the current driver and firmware versions.

PCIe Power State

Added support for the following PCIe power state indications to be printed to dmesg:

  1. Info message #1: PCIe slot power capability was not advertised.

  2. Warning message: Detected insufficient power on the PCIe slot (xxxW).

  3. Info message #2: PCIe slot advertised sufficient power (xxxW).

When indication #1 or #2 appears in dmesg, make sure to use a PCIe slot that is capable of supplying the required power.
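The three indications can be checked with a small filter over the kernel log, sketched below. It assumes the dmesg lines contain the message wording listed above; the exact strings printed by a given driver version may differ. The helper name and the status labels are hypothetical.

```shell
# Classify PCIe slot power indications found in kernel log text on stdin.
# Prints one of: insufficient, not-advertised, sufficient, no-indication.
pcie_power_status() {
    awk '
        # The warning takes priority over both info messages.
        /Detected insufficient power on the PCIe slot/  { s = "insufficient" }
        /PCIe slot power capability was not advertised/ { if (s == "") s = "not-advertised" }
        /PCIe slot advertised sufficient power/         { if (s == "") s = "sufficient" }
        END { print (s == "" ? "no-indication" : s) }
    '
}

# Example:
# dmesg | pcie_power_status
```

A result of insufficient or not-advertised corresponds to indications #2 and #1 respectively, which are the cases that call for checking the slot.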

mlx5 Driver

Message Signaled Interrupts-X (MSI-X) Vectors

Added support for using a single MSI-X vector for all control event queues instead of one MSI-X vector per queue in a virtual function driver. This frees extra MSI-X vectors to be used for completion event queues, allowing for additional traffic channels in the network device.

Send APIs

Introduced a new set of QP Send operations (APIs) that allow extensibility for new Send opcodes.

DC Data-path

Added DC QP data-path support using new Send APIs introduced in Direct Verbs (DV).

BlueField

BlueField Support

BlueField is now fully supported as part of the Mellanox OFED mainstream version, sharing the same code baseline as the rest of the adapter product line.

Representor Name Change

In SwitchDev mode:

  • Uplink representors are now called p0/p1

  • Host PF representors are now called pf0hpf/pf1hpf

  • VF representors are now called pf0vfN/pf1vfN
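The naming scheme above can be expressed as a small helper, which may be useful when updating scripts that referred to the old names. This is only a sketch of the pattern as listed; the helper name rep_name and its kind keywords are hypothetical.

```shell
# Print the SwitchDev representor name for a given kind.
# $1 = kind: uplink | hostpf | vf
# $2 = port/PF index (0 or 1 in the names above)
# $3 = VF index N (only for kind "vf")
rep_name() {
    case "$1" in
        uplink) echo "p${2}" ;;
        hostpf) echo "pf${2}hpf" ;;
        vf)     echo "pf${2}vf${3}" ;;
        *)
            echo "unknown representor kind: $1" >&2
            return 1
            ;;
    esac
}

# Example:
# rep_name uplink 0    # p0
# rep_name hostpf 1    # pf1hpf
# rep_name vf 0 3      # pf0vf3
```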

ECPF Net Devices

In SwitchDev mode, net devices enp3s0f0 and enp3s0f1 are no longer created.

Setting Host MAC and Tx Rate Limit from ECPF

Expanded to support VFs as well as the host PFs.

All

RDMA-CM Application Managed QP

Added support for the RDMA application to manage its own QPs and use RDMA-CM only for exchanging address information.

RDMA-CM QP Timeout Control

Added a new option to rdma_set_option that allows applications to override the RDMA-CM's QP ACK timeout value.
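When choosing an override value, it can help to see what a given setting means in time units. The sketch below assumes the value uses the standard InfiniBand ACK-timeout encoding, in which a value t corresponds to 4.096 µs × 2^t; this encoding is not stated in these release notes and should be confirmed against the rdma_set_option documentation for the installed version. The helper name is hypothetical.

```shell
# Convert an encoded ACK timeout value to microseconds.
# Assumes the IB encoding: timeout = 4.096 usec * 2^value, for value in 1..31.
# The value 0 is rejected here because it conventionally means "no timeout".
ack_timeout_usec() {
    case "$1" in
        ''|*[!0-9]*) echo "value must be an integer" >&2; return 1 ;;
    esac
    if [ "$1" -lt 1 ] || [ "$1" -gt 31 ]; then
        echo "value out of range (1..31): $1" >&2
        return 1
    fi
    awk -v t="$1" 'BEGIN { printf "%.3f\n", 4.096 * 2 ^ t }'
}

# Example:
# ack_timeout_usec 14    # about 67 ms
```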

MLNX_OFED Verbs API

As of the MLNX_OFED v5.0 release (Q1 2020), the MLNX_OFED Verbs API will be migrated from the legacy version of the user-space verbs libraries (libibverbs, libmlx5, etc.) to the upstream version, rdma-core.

More details are available in the MLNX_OFED User Manual under Installing Upstream rdma-core Libraries.

Bug Fixes

See the “Bug Fixes” section.

For additional information on the new features, please refer to MLNX_OFED User Manual.

MLNX_OFED Verbs API Migration

As of the MLNX_OFED v5.0 release (Q1 2020), the following MLNX_OFED Verbs APIs will migrate from the legacy version of the user-space verbs libraries (libibverbs, libmlx5, etc.) to the upstream version, rdma-core.
For further details on how to install the upstream rdma-core libraries, refer to the Installing Upstream rdma-core Libraries section in the User Manual.

  • ibv_exp_alloc_ec_calc

  • ibv_exp_dealloc_ec_calc

  • ibv_exp_ec_encode_async

  • ibv_exp_ec_encode_sync

  • ibv_exp_ec_decode_async

  • ibv_exp_ec_decode_sync

  • ibv_exp_ec_update_async

  • ibv_exp_ec_update_sync

  • ibv_exp_ec_poll

  • ibv_exp_ec_encode_send

  • ibv_exp_create_qp

  • ibv_exp_use_priv_env

  • ibv_exp_poll_dc_info

  • ibv_exp_setenv

  • ibv_exp_query_device

  • ibv_exp_create_dct

  • ibv_exp_destroy_dct

  • ibv_exp_query_dct

  • ibv_exp_arm_dct

  • ibv_exp_query_port

  • ibv_exp_post_task

  • ibv_exp_query_values

  • ibv_exp_cqe_ts_to_ns

  • ibv_exp_create_flow

  • ibv_exp_destroy_flow

  • ibv_exp_poll_cq

  • ibv_exp_post_send

  • ibv_exp_reg_shared_mr

  • ibv_exp_modify_cq

  • ibv_exp_create_cq

  • ibv_exp_modify_qp

  • ibv_exp_reg_mr

  • ibv_exp_bind_mw

  • ibv_exp_prefetch_mr

  • ibv_exp_get_provider_func

  • ibv_exp_create_mr

  • ibv_exp_query_mkey

  • ibv_exp_dealloc_mkey_list_memory

  • ibv_exp_alloc_mkey_list_memory

  • ibv_exp_create_srq

  • ibv_exp_create_res_domain

  • ibv_exp_destroy_res_domain

  • ibv_exp_query_intf

  • ibv_exp_release_intf

  • ibv_exp_create_wq

  • ibv_exp_modify_wq

  • ibv_exp_destroy_wq

  • ibv_exp_create_rwq_ind_table

  • ibv_exp_destroy_rwq_ind_table

  • ibv_exp_query_gid_attr

  • ibv_exp_open_device

  • ibv_exp_post_srq_ops

  • ibv_exp_alloc_dm

  • ibv_exp_free_dm

  • ibv_exp_memcpy_dm
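To estimate how much of an application is affected by the migration, the source can be scanned for the ibv_exp_ prefix shared by the APIs listed above. The following is a minimal sketch; the helper name list_exp_verbs and the src/ path in the example are hypothetical.

```shell
# List the distinct ibv_exp_* identifiers found in text on stdin,
# one per line, sorted.
list_exp_verbs() {
    grep -o 'ibv_exp_[a-z0-9_]*' | sort -u
}

# Example (src/ is a hypothetical source directory):
# cat src/*.c src/*.h | list_exp_verbs
```

The output can then be compared against the list above to see which calls need attention.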

Deprecated APIs

Warning

The following APIs are deprecated as of MLNX_OFED version 4.0 and are replaced by the new APIs listed in the table below.

Feature        Type        Current API                  New API
-------------  ----------  ---------------------------  ----------------------
Rereg MR       Verb        ibv_exp_rereg_mr             ibv_rereg_mr
Memory Window  Verb        ibv_exp_bind_mw              ibv_bind_mw
               Structure   ibv_exp_send_wr -> bind_mw   ibv_send_wr -> bind_mw
               Opcodes     IBV_EXP_WR_SEND_WITH_INV     IBV_WR_SEND_WITH_INV
                           IBV_EXP_WR_LOCAL_INV         IBV_WR_LOCAL_INV
                           IBV_EXP_WR_BIND_MW           IBV_WR_BIND_MW
               Capability  IBV_EXP_DEVICE_MEM_WINDOW    IBV_DEVICE_MEM_WINDOW
               Completion  IBV_EXP_WC_WITH_INV          IBV_WC_WITH_INV
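As a first pass over existing code, the verb, opcode, capability, and completion names in the table can be renamed textually, as sketched below. This is only a rename: function signatures and structure layouts may differ between the old and new APIs, and the ibv_exp_send_wr structure row is deliberately left out, so the result still needs manual review. The helper name is hypothetical.

```shell
# Rewrite deprecated identifiers from the table to their replacements.
# Reads source text on stdin and writes the renamed text to stdout.
rename_deprecated_verbs() {
    sed \
        -e 's/ibv_exp_rereg_mr/ibv_rereg_mr/g' \
        -e 's/ibv_exp_bind_mw/ibv_bind_mw/g' \
        -e 's/IBV_EXP_WR_SEND_WITH_INV/IBV_WR_SEND_WITH_INV/g' \
        -e 's/IBV_EXP_WR_LOCAL_INV/IBV_WR_LOCAL_INV/g' \
        -e 's/IBV_EXP_WR_BIND_MW/IBV_WR_BIND_MW/g' \
        -e 's/IBV_EXP_DEVICE_MEM_WINDOW/IBV_DEVICE_MEM_WINDOW/g' \
        -e 's/IBV_EXP_WC_WITH_INV/IBV_WC_WITH_INV/g'
}

# Example (app.c is a hypothetical file; inspect the diff before replacing it):
# rename_deprecated_verbs < app.c > app.c.new && diff -u app.c app.c.new
```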


The following are the unsupported functionalities/features/HCAs in MLNX_OFED:

  • ConnectX®-2 Adapter Card

  • Relational Database Service (RDS)

  • Ethernet over InfiniBand (EoIB) - mlx4_vnic

  • mthca InfiniBand driver

© Copyright 2023, NVIDIA. Last updated on Dec 26, 2023.