NVIDIA MLNX_OFED Documentation v23.04-1.1.3.0
v5.17

Bug Fixes History

This table lists the bugs fixed in the last three major GA releases. For a list of old bug fixes, please refer to the release notes of the desired version.

Internal Reference Number

Description

3247519

Description: On Ubuntu 22.04 systems, installing MLNX_OFED with Open vSwitch using the apt install method may fail if the distribution's Open vSwitch package was previously installed. The failure is caused by a left-over systemd-generated file: the symbolic link /etc/systemd/system/openvswitch-switch.service.requires/ovs-record-hostname.service -> /lib/systemd/system/ovs-record-hostname.service.

Keywords: Installation, Ubuntu 22.04, Open vSwitch

Discovered in Release: 5.8-1.0.1.1

Fixed in Release: 5.9-0.5.6.0
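
On releases affected by this issue, the left-over link can be checked for before installing. The following is a minimal sketch, not part of MLNX_OFED: the path is taken from the description above, and the helper name `leftover_link_target` is hypothetical.

```python
import os

# Path quoted in the bug description above.
OVS_LINK = (
    "/etc/systemd/system/openvswitch-switch.service.requires/"
    "ovs-record-hostname.service"
)

def leftover_link_target(path=OVS_LINK):
    """Return the symlink target if `path` is a symlink, otherwise None."""
    if os.path.islink(path):
        return os.readlink(path)
    return None

if __name__ == "__main__":
    target = leftover_link_target()
    if target is None:
        print("No left-over link found.")
    else:
        print("Left-over link points to:", target)
```

The sketch only reports the link; whether and how to remove it is left to the administrator.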

3296578

Description: Dapltest on RHEL9.x (ppc64le) could fail to run with a segmentation fault.

Keywords: Installation, RHEL9.x, Dapltest

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.9-0.5.6.0

3261289

Description: The host driver probe did not check whether SFs already existed on the device. As a result, the host driver did not re-create those SFs.

Keywords: Core, Scalable Functions

Fixed in Release: 5.9-0.5.6.0

3228719

Description: If there were multiple encapsulations and not all neighbors were valid, the kernel could panic.

Keywords: ASAP2, Kernel Panic

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.9-0.5.6.0

2946873

Description: Moving to switchdev mode while deleting namespace may cause a deadlock.

Keywords: ASAP2, Switchdev, Namespace

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.9-0.5.6.0

3239291

Description: In some topologies, like logical partitions, mlxfwreset is not supported.

Keywords: Core, mlxfwreset

Discovered in Release: 5.8-1.0.1.1

Fixed in Release: 5.9-0.5.6.0

3220855

Description: Creating external SFs on BF ARM when the host (x86) operating system does not support SFs may cause the host to crash.

Keywords: Core, Scalable Functions

Discovered in Release: 5.8-1.0.1.1

Fixed in Release: 5.9-0.5.6.0

Internal Reference Number

Description

3253500

Description: The redundant freeing of a list item could lead to memory corruption, potentially causing an application crash or incorrect traffic handling.

Keywords: Steering, Memory Corruption, List, Pattern/Argument

Fixed in Release: 5.8-1.1.2.1

3214161

Description: The knem-dkms package explicitly requires GCC to build the knem driver (at install time). Under some circumstances, on Debian systems, the apt install method may result in a system that has only gcc-<version> (e.g., gcc-10) installed.

Keywords: Installation, Debian, GCC

Fixed in Release: 5.8-1.1.2.1

3230613

Description: Installing MLNX_OFED_LINUX on an Ubuntu system with CUDA (version < 11.6) may result in an automatic installation of the ucx-cuda package that will fail with an error message in the log file ucx-cuda.debinstall.log about missing dependencies.

Keywords: Installation, Ubuntu, CUDA

Fixed in Release: 5.8-1.1.2.1

3235521

Description: The host driver probe did not check whether there are existing SFs which are present in the device, causing the host driver to not recreate those SFs.

Keywords: Core, Scalable Functions

Fixed in Release: 5.8-1.1.2.1

3228357

Description: If there were multiple encapsulations and not all neighbors were valid, the kernel could panic.

Keywords: ASAP2, Encapsulation

Discovered in Release: 5.5-1.0.3.2, 5.7-1.0.2.0

Fixed in Release: 5.8-1.1.2.1

3232445

Description: When using BlueField with old kernels, multiple OVS meters do not work.

Keywords: ASAP2, BlueField, Meter, OVS, Offload

Fixed in Release: 5.8-1.1.2.1

Internal Reference Number

Description

3234066

Description: When configuring IPsec full offload, after sending traffic for approximately 30 minutes, the traffic stops at some point and the connection gets lost.

Keywords: Steering, SMFS, Matcher Disconnect

Fixed in Release: 5.8-1.0.1.1

3179535

Description: SMFS will try to merge flow rules with the same matching criteria (as they share the same matcher) into one multi-destination rule.

If merging fails, the matcher is disconnected by mistake.

Keywords: Steering, SMFS, Matcher Disconnect

Fixed in Release: 5.8-1.0.1.1

3214198

Description: ibv_reg_mr for huge pages was optimized in kernel >= 5.12

Keywords: RDMA, ibv_reg_mr

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.8-1.0.1.1

2984134

Description: Moving to SwitchDev mode while deleting namespace over Linux-6.0 can sometimes cause a deadlock.

Keywords: RDMA, SwitchDev

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.8-1.0.1.1

3106228

Description: A net device validation issue prevented running IPv6 traffic using an RDMA communication manager between two interfaces on the same host in the same subnet.

Keywords: RDMA, IPv6, Communication Manager

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.8-1.0.1.1

3151843

Description: In the mlx5dv_mkey_check manpage, there was an inaccurate description of the signature error handling flow.

Keywords: RDMA, manpage

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.8-1.0.1.1

3229002

Description: Creating and deleting MRs caused a kernel slab cache leak.

Keywords: RDMA, Cache

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.8-1.0.1.1

3236217

Description: The rdma res show cm_id command does not list all cm_ids when some of them are in LISTEN state.

Keywords: RDMA, cm_ids

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.8-1.0.1.1

3146128

Description: In older kernel versions, PTP was not supported over VLAN interfaces.

Keywords: NetDev, PTP, VLAN

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.8-1.0.1.1

2969772

Description: HW-GRO feature was blocked due to firmware limitations.

Keywords: NetDev, HW-GRO

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.8-1.0.1.1

3096393

Description: STP packets failed to be transmitted.

Keywords: NetDev, STP

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.8-1.0.1.1

3236984

Description: When using sysfs to read the hash function used to distribute the traffic between the TIRs (Transport Interface Receive), the server occasionally crashed.

Keywords: NetDev, sysfs

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.8-1.0.1.1

3126000

Description: Upgrading from version 5.6-2 to 5.7 failed.

Keywords: Installation

Discovered in Release: 5.6-2.0.9.0

Fixed in Release: 5.8-1.0.1.1

3230524

Description: Building with KMP enabled fails due to missing packages. OFED packages will now be built with KMP disabled.

Keywords: Installation, KMP

Fixed in Release: 5.8-1.0.1.1

3158725

Description: The script install.pl, used for (re)building kernel modules, used the name "kernel-source" as the package of the kernel-source on SLES systems.

Keywords: Installation, SLES

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.8-1.0.1.1

3142212

Description: Starting firmware version xx.34.0350, a new NVCONFIG has been added to the ARM side only: MANAGEMENT_PF_MODE.

If this config is on, the user will see a PCI Function (PF) which failed to probe:


[    6.837102] mlx5_core 0000:03:00.2: mlx5_cmd_check:756:(pid 206): ENABLE_HCA(0x104) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x6ca1f5)
[    6.864227] mlx5_core 0000:03:00.2: mlx5_peer_pf_init:40:(pid 206): Failed to enable peer PF HCA err(-22)
[    6.883453] mlx5_core 0000:03:00.2: mlx5_load:1129:(pid 206): Failed to init embedded CPU
[    8.261268] mlx5_core 0000:03:00.2: init_one:1365:(pid 206): mlx5_load_one failed with error code -22
[    8.280056] mlx5_core: probe of 0000:03:00.2 failed with error -22

Keywords: Installation

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.8-1.0.1.1

3174928

Description: Using a single-CPU system could cause a command flush deadlock.

Keywords: Core

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.8-1.0.1.1

3228721/3228357

Description: An incorrect termination table was used with the uplink-to-uplink forward rule.

Keywords: ASAP2, eSwitch

Discovered in Release: 5.7-1.0.2.0

Fixed in Release: 5.8-1.0.1.1

3220120

Description: In old kernels, when a VXLAN tunnel is set up on one OVS bridge and PF is up on another OVS bridge, traffic does not offload as expected.

Keywords: ASAP2, VXLAN

Discovered in Release: 5.4-3.0.3.0

Fixed in Release: 5.8-1.0.1.1

Internal Reference Number

Description

3032335

Description: Creating multiple steering rules that modify a packet and match on the same packet headers can cause an error to be displayed in dmesg when deleting the steering rules.

Keywords: Steering Rules

Fixed in Release: 5.7-1.0.2.0

3011368

Description: Some IB spec QP state behaviour on post_send()/recv() was not fully enforced. The fix makes the QP compliant with the IB spec regarding when post_send()/recv() is allowed and when it should return an error.

Keywords: RDMA, IB spec QP

Fixed in Release: 5.7-1.0.2.0

3075125

Description: When changing the trust state from PCP to DSCP, the TC number changes to 8 by default, which in some cases disrupts traffic prioritization if the trust state is changed back to PCP.

Keywords: NetDev, QoS

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.7-1.0.2.0

3054413

Description: In the current release, the following OPNs/PSIDs should be manually upgraded:

MCX753106AS-HEA-N NVD0000000023

MCX75310AAS-HEA-N NVD0000000024

Keywords: ConnectX-7, Upgrade

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.7-1.0.2.0

3070653

Description: In versions of MLNX_OFED before 5.7, the xpmem kernel module was not signed. When it was installed on systems (mostly RHEL and other compatible systems) the following error message would appear: "xpmem: loading out-of-tree module taints kernel."

Keywords: Installation, xpmem

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.7-1.0.2.0

3075357

Description: In Debian-based distributions, in /etc/init.d/openibd, the path to enable the firmware tracer is /sys/kernel/debug/tracing/events/mlx5/fw_tracer/enable instead of /sys/kernel/debug/tracing/events/mlx5/mlx5_fw/enable. As a result, the firmware tracer will never get enabled, even when supported.

Keywords: Installation, Kernel Trace Debug

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.7-1.0.2.0

2688191

Description: The minimum Tx rate limit is not supported with link speed of 1Gb/s.

Keywords: Rate Limit, 1Gb/s

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.7-1.0.2.0

3044255

Description: Destroying mlxdevm group while SF is attached to it is not supported.

Keywords: ASAP2, mlxdevm, QoS, Group, Scalable Functions

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.7-1.0.2.0

3047142

Description: Using OVS offload with NIC mode (non switchdev mode) causes traffic to drop.

Keywords: ASAP2, Offload, NIC Mode, OVS

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.7-1.0.2.0

3123986

Description: In some cases VF metering configuration failure caused a deadlock.

Keywords: ASAP2, VF Metering

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.7-1.0.2.0

3053842

Description: A race condition may cause the aging time of some connections to be set to 24 hours instead of 30 seconds.

Keywords: ASAP2, Connection Tracking, Aging

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.7-1.0.2.0

Internal Reference Number

Description

3079038

Description: When an already-loaded 'non-mellanox' auxiliary device exists on the auxiliary bus, the OFED driver load may fail and cause a kernel panic.

Keywords: Driver Load

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.6-2.0.9.0

3066233

Description: On SLES15 systems that have both python3 and python2 installed, rebuilding kernel modules fails with an error in the mlnx-tools package, and specifically in the mlnx-tools build log, about missing ib2ibsetup.8.

Keywords: Installation

Discovered in Release: 5.6-1.0.3.3

Fixed in Release: 5.6-2.0.9.0

Internal Reference Number

Description

2858237

Description: NULL dereference may occur when performing port up/down.

Keywords: Host

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.6-1.0.3.3

2697443

Description: Reloading devlink in NetDev profile caused deadlock.

Keywords: Devlink

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.6-1.0.3.3

2947734

Description: When using NFS over RDMA, rpcrdma.ko created some entry files under the /proc folder (e.g., “-rw-r--r-- 1 root root 0 . . .”).

Keywords: NFS, RDMA

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2905055

Description: ECPF port was not properly recognized.

Keywords: ECPF

Fixed in Release: 5.6-1.0.3.3

2771739

Description: Gratuitous ARP during rdma_connect is not handled properly.

Keywords: Gratuitous ARP

Fixed in Release: 5.6-1.0.3.3

2888178

Description: A locking issue in steering rules deletion, at times, could cause a deadlock while inserting or deleting new rules.

Keywords: Deadlock, Steering

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2979734

Description: ibdump sources were not shared under the sources directory.

Keywords: ibdump Sources

Fixed in Release: 5.6-1.0.3.3

2820245

Description: Crypto offload of UDP traffic on top of IPv6 was unsupported.

Keywords: IPsec, Crypto, Offload

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2869109

Description: IPsec crypto offload for non TCP/UDP encapsulated traffic broke.

Keywords: IPsec, Crypto, Offload

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2905896

Description: Leaving a multicast group (rdma_leave_multicast) used the wrong address and left the interface in the multicast group.

Keywords: RoCE, Multicast

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2939691

Description: Unsupported parameters were ignored. Now, when using unsupported syntax or unsupported command line parameters, the application will fail with an error message.

Keywords: Command Line, Parameters

Fixed in Release: 5.6-1.0.3.3

3024670

Description: When many MRs are allocated, the driver searches for a free MR with the best size fit. When the cache is dry, a huge MR (1 MB) instead of a small one (32B) could be selected because there was no better fit. The memory overhead was limited to avoid synchronous MR creation.

Keywords: Memory, MR

Fixed in Release: 5.6-1.0.3.3

3020746

Description: In the rdma-core library, the CMA device was retrieved in the wrong way when libnl is not used.

Keywords: libnl, rdma-core

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.6-1.0.3.3

2939037

Description: Ethtool that is part of the original OFED package failed to dump correct EEPROM values when using -m flag.

Keywords: Ethtool, EEPROM

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2582616

Description: Manual pages were missing from infiniband-diags tools. Added manpages for all OSs that provide the python-docutils package.

Keywords: Manual Pages (manpages), infiniband-diags

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.6-1.0.3.3

2979137

Description: An increment of the count variable was missing when looping over the output buffer in mlx5e_self_test(), which resulted in garbage values in the ethtool -t output. Adding the increment resolved the issue.

Keywords: ethtool, selftest

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2752622

Description: On SLES 15, the inbox modules in the directory mlxsw (such as mlxsw_spectrum) were not supported. If they were installed when MLNX_OFED was installed, they no longer worked (as they depend on a different version of the mlx* modules) and could cause an error at installation time.

Keywords: Installation

Discovered in Release: 5.4-3.0.3.0

Fixed in Release: 5.6-1.0.3.3

2200320

Description: In cases where MLNX_OFED was reinstalled on a certain system without using --force, the installation could fail requiring the removal of the infiniband-diags package.

Keywords: Installation, Force, infiniband-diags

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.6-1.0.3.3

2984013

Description: When uninstalling the kmod-xpmem package, xpmem module was not unloaded. From now on, after uninstalling, xpmem module will be removed automatically.

Keywords: Installation, xpmem

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2984098

Description: OFED installation modified file "/etc/yum.conf" to exclude some packages from the Yum repositories. As of RHEL 8, /etc/yum.conf is a symlink to /etc/dnf.conf and this edit breaks the symlink. As there is no use in such an edit, OFED no longer edits this file.

Keywords: Installation, Yum Repositories, RHEL

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3
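
The entry does not say how the edit broke the symlink. One common mechanism, assumed here purely for illustration, is an in-place edit that writes a temporary file and renames it over the original path: the rename replaces the link itself rather than the file it points to. The sketch below reproduces this in a temporary directory. The file names are stand-ins, the `exclude=example-pkg` line is a made-up value, and no real system file is touched.

```python
import os
import tempfile

def edit_by_rename(path, new_text):
    """Rewrite `path` the way many in-place editors do: temp file + rename."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(new_text)
    # os.replace() swaps the directory entry, so a symlink at `path`
    # is replaced by a regular file instead of being followed.
    os.replace(tmp, path)

def demo():
    """Return (was_link_before, is_link_after, text_of_original_target)."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "dnf.conf")  # stand-in for /etc/dnf.conf
        link = os.path.join(d, "yum.conf")    # stand-in for /etc/yum.conf
        with open(target, "w") as f:
            f.write("[main]\n")
        os.symlink("dnf.conf", link)
        before = os.path.islink(link)
        edit_by_rename(link, "[main]\nexclude=example-pkg\n")
        after = os.path.islink(link)
        with open(target) as f:
            target_text = f.read()
        return before, after, target_text

if __name__ == "__main__":
    print(demo())
```

After the edit, the link has become a regular file and the original target is unchanged, so the two configuration files silently diverge.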

2934828

Description: PCIe device address to RDMA device name mapping on x86 host may change after the driver restarts in Arm.

Keywords: RDMA, Arm, Driver

Fixed in Release: 5.6-1.0.3.3

2946450

Description: In some cases, the firmware tracer did not work with NEO-Host.

Keywords: NEO-Host, Firmware Tracer

Fixed in Release: 5.6-1.0.3.3

2947645

Description: current_link_speed sysfs was missing.

Keywords: sysfs

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

3025582

Description: If the commands were not entered in the correct order when setting buffer size and allocation using the mlnx_qos command, on some occasions, the xoff_threshold calculation broke pausing functionality.

Keywords: Driver, xoff, Buffer

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2859206

Description: First generation BlueField SoC based DPUs were not supported.

Keywords: BlueField, SoC

Fixed in Release: 5.6-1.0.3.3

2936867

Description: Creating a TC rule with more than 30 actions caused a kernel panic.

Keywords: ASAP2, Call Trace, TC

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

3016685

Description: IP-in-IP packets were received on a single queue instead of being hashed across multiple queues.

Keywords: NetDev, Tunneling, RSS

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.6-1.0.3.3

3023304

Description: Fixed an mlnx_qos compatibility issue with the tostring/fromstring methods deprecated as of Python 3.9.

Keywords: Python3, Compatibility

Fixed in Release: 5.6-1.0.3.3
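
The entry does not name the affected calls. A likely reading, assumed here, is the `array.array.tostring()` and `array.array.fromstring()` methods, which were deprecated in Python 3.2 and removed in Python 3.9. The following minimal sketch shows the replacement calls; the helper names and the sample data are hypothetical and are not taken from mlnx_qos.

```python
import array

def pack_bytes(values):
    """Serialize a list of small integers (0-255) to bytes."""
    buf = array.array("B", values)
    # tobytes() replaces the tostring() method removed in Python 3.9.
    return buf.tobytes()

def unpack_bytes(raw):
    """Rebuild the list of integers from the serialized bytes."""
    buf = array.array("B")
    # frombytes() replaces the fromstring() method removed in Python 3.9.
    buf.frombytes(raw)
    return buf.tolist()

if __name__ == "__main__":
    raw = pack_bytes([0, 1, 2, 7])
    print(unpack_bytes(raw))
```

Both replacement methods have been available since Python 3.2, so code written this way runs on older and newer interpreters alike.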

2887387

Description: IPsec flow tables design caused the number of IPsec tunnels to be limited to 16K.

Changed the flow tables design to support up to 32K IPsec tunnels per protocol (IPv4/IPv6).

Keywords: IPsec

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.6-1.0.3.3

2887394/2887381

Description: Configuring over 1000 IPsec sessions caused performance issues.

Keywords: IPsec

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2906002

Description: Hairpin rules failed to send packets back to the wire when IPsec full offload was enabled.

Keywords: IPsec Full Offload, Hairpin

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2901506

Description: On rare occasions, the application did not use any raw WQE feature and unexpectedly got wc opcode IBV_WC_DRIVER2.

Keywords: RDMA, Raw WQE, IB_WC_DRIVER2

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2890024

Description: Under certain conditions, incorrect handling of resources caused memory corruption over software steering resources, leading to failure of OVS to offload the traffic to the hardware.

Keywords: ASAP2, Steering, OVS, Memory

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2874200

Description: Using hairpin tunnel traffic caused incorrect TC rules to be created.

Example:

tunnel(tun_id=0x65,src=10.10.11.3,dst=10.10.11.2,ttl=0/0,tp_dst=4789,flags(+key)),…,in_port(vxlan_sys_4789),…, actions:set(tunnel(tun_id=0x66,src=10.10.12.2,dst=10.10.12.3,tp_dst=4789,flags(key))),vxlan_sys_4789

Keywords: ASAP2, Hairpin, OVS, SwitchDev

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

2891499

Description: Adding a route with next hop object caused a warning in dmesg and could possibly lead to kernel panic.

Keywords: ASAP2, Route, SwitchDev, Call Trace, Nexthop

Discovered in Release: 5.5-1.0.3.2

Fixed in Release: 5.6-1.0.3.3

Internal Reference Number

Description

2842077

Description: The python3 header line (shebang line) could be inconsistent between scripts because some distributions may no longer have /usr/bin/python.

Keywords: Python3

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.5-1.0.3.2
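
One way to audit scripts for this kind of inconsistency is to read the first line of each file. The sketch below is illustrative only: `read_shebang` and `uses_unversioned_python` are hypothetical helpers and are not part of MLNX_OFED.

```python
import os

def read_shebang(path):
    """Return the interpreter line of a script without '#!', or None."""
    with open(path, "rb") as f:
        first = f.readline().decode("utf-8", errors="replace").strip()
    if not first.startswith("#!"):
        return None
    return first[2:].strip()

def uses_unversioned_python(path):
    """Return True if the script asks for plain 'python' (no version)."""
    shebang = read_shebang(path)
    if not shebang:
        return False
    parts = shebang.split()
    interpreter = parts[0]
    # Handle the '/usr/bin/env python' form by looking at env's argument.
    if os.path.basename(interpreter) == "env" and len(parts) > 1:
        interpreter = parts[1]
    return os.path.basename(interpreter) == "python"
```

A script flagged by `uses_unversioned_python` would fail to start on a distribution that ships only `python3`.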

2792432

Description: The driver didn't set the PCP-based priority for DCT, hence DCT response packets were transmitted without user priority.

Keywords: User Priority, DCT

Fixed in Release: 5.5-1.0.3.2

2434399

Description: Node reboots may trigger memory corruption in OFED CM.

Keywords: CM Migration

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.5-1.0.3.2

2696789

Description: Redesigned the locks around peer MR invalidation flow to avoid a potential deadlock as Peer-direct patch may cause deadlock due to lock inversion.

Notes:

  • For GPU drivers prior to r470, the user should update nv_peer_mem to the next version, probably 1.2.

  • For GPU drivers from r470 or later branches shipped with nvidia-peermem, the driver will have an option to update to newer releases which take advantage of the redesigned MLNX_OFED support.

Keywords: lock inversion, nv_peer_mem

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.5-1.0.3.2

2792480

Description: Running tcpdump on a bonding standby port resulted in the loss of the network.

Keywords: NetDev

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.5-1.0.3.2

2782406

Description: Running yum update will upgrade kylin-release to a higher version. The version of this package is used for kylin10sp2 detection so the script will detect kylin 10 instead of kylin10sp2 and use its repository by mistake.

Keywords: Upgrade, kylin

Discovered in Release: 5.4-3.0.3.0

Fixed in Release: 5.5-1.0.3.2

2736003

Description: Starting from GPU Driver version r465, nv_peer_mem was shipping in the GPU driver package under the name nvidia-peermem. Updating OFED required nvidia-peermem rebuild, otherwise it was stubbed out by the kernel.

Keywords: Installation, GPU Driver

Discovered in Release: 5.4-3.0.3.0

Fixed in Release: 5.5-1.0.3.2

2823700

Description: xpmem driver is not supported on PowerPC.

Keywords: Installation, xpmem, PowerPC

Fixed in Release: 5.5-1.0.3.2

2802508

Description: Suspend flow freed the VLAN data so the data was not restored during the resume flow.

Keywords: VLAN, Suspend Flow, Resume Flow

Fixed in Release: 5.5-1.0.3.2

2796010

Description: Connection tracking rules with fragmentation had 0 stats.

Keywords: BlueField, Connection Tracking, Fragments, ASAP2

Discovered in Release: 5.4-2.4.1.3

Fixed in Release: 5.5-1.0.3.2

2803403

Description: Traffic failed to pass when OVS bridge is configured with bond interface and IP is configured over the OVS internal (bridge) port.

Keywords: Bond, VF LAG, OVS, Internal Port, ASAP2

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.5-1.0.3.2

2438392

Description: VXLAN with IPsec crypto offload does not work.

Keywords: VXLAN; IPsec crypto

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.5-1.0.3.2

2677225

Description: Conducting a driver restart while in VF LAG mode may cause unwanted behaviour such as kernel crashes.

Keywords: ASAP2, Bonding, Driver Restart, VF LAG

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.5-1.0.3.2

Internal Reference Number

Description

2852904

Description: In version 5.4, there was some offload breakage when using OVS.

Keywords: TSO, UDP Tunnels

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.1.0.0

2792480

Description: Running tcpdump on a bonding standby port resulted in the loss of the network.

Keywords: NetDev

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2696789

Description: Redesigned the locks around peer MR invalidation flow to avoid a potential deadlock as Peer-direct patch may cause deadlock due to lock inversion.

Notes:

  • For GPU drivers prior to r470, the user should update nv_peer_mem to the next version, probably 1.2.

  • For GPU drivers from r470 or later branches shipped with nvidia-peermem, the driver will have an option to update to newer releases which take advantage of the redesigned MLNX_OFED support.

Keywords: lock inversion, nv_peer_mem

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2739689

Description: A race that resulted in a CQE with an error caused errors in the UMR QP. To prevent the UMR QP from entering an error state, the MR deregistration flow was fixed (e.g., a peer lkey is always revoked before it is destroyed).

Keywords: CQE, UMR

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2691656

Description: When using bonding, ibdev2netdev would sometimes match the InfiniBand device to the net device bonding interface, and sometimes to the underlying InfiniBand net device interface.

ibdev2netdev now skips InfiniBand net device bonding interfaces, and always matches InfiniBand devices to the underlying InfiniBand net device interfaces.

Keywords: ibdev2netdev Bonding

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.4-3.0.3.0

2687643

Description: Fixed Decap flows inner IP_ECN match to take into account software modification of the match value according to RFC 6040 4.2.

Keywords: decap, ASAP2, ECN, RoCE

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2691081

Description: Removed metadata from the rpm package mlnx-ofa_kernel where it claimed to Provide an older version of rdma-core. This made sense in older versions, where installing rdma-core had to be avoided, but it no longer makes sense and caused problems for some users installing rdma-core-devel through meta-packages.

Keywords: Installation

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2727062

Description: Removed manual build-time file list generation in mlnx-tools, keeping it only for python-installed files. This avoids guessing the version of python in use and the directory to which files are installed.

Keywords: Installation

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2708220

Description: Removed useless build-time editing of uninstall.sh in ofed-scripts that caused the build to fail (in the case of --add-kernel-support) in some rare cases.

Keywords: Installation

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2730547

Description: Some Dell OFED Factory Installation packages were missing dependencies. Removed the package rdma-core-devel from the Dell MLNX_OFED packages as it was not needed and some of its dependencies are not included.

Keywords: Installation, Dell

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2699662

Description: MLNX_OFED build scripts fixed to also build hcoll with CUDA support on RHEL8 x86_64 platforms.

Keywords: Installation, CUDA

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2686877

Description: Changing the MTU took too long. The number of calls to synchronize_net was reduced to one for all channels.

Keywords: mtu, synchronize_net

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2748328

Description: When trying to upgrade a kmp package, it conflicts and needs user help to choose whether to replace it or not. The fix avoids conflicts from /usr/lib/rpm/kernel-module-subpackage script which was changed in the builder. Building the packages with kmp enabled on the other image will cause the issue to reproduce.

Keywords: Upgrade, kmp Package

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2707023

Description: On Ubuntu and Debian systems, when openvswitch-switch is installed (e.g., using --ovs-dpdk or --with-openvswitch), the installer misses a run-time dependency on libpcap0.8.

Keywords: Installation, Ubuntu, Debian

Discovered in Release: 5.4-1.0.3.0

Fixed in Release: 5.4-3.0.3.0

2563366

Description: The full path to the directory that contains the installer must not contain a space or any similar white-space character, otherwise the installer will fail.

Keywords: Installation, White Space

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-3.0.3.0
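
For releases affected by this issue, a pre-flight check of the installer directory can be sketched as follows. The function name is hypothetical and the check is an illustration, not part of the MLNX_OFED installer.

```python
import os

def path_has_whitespace(directory):
    """Return True if the absolute path of `directory` contains whitespace."""
    full = os.path.abspath(directory)
    # str.isspace() covers spaces, tabs, newlines and similar characters.
    return any(ch.isspace() for ch in full)

if __name__ == "__main__":
    here = os.getcwd()
    if path_has_whitespace(here):
        print("Warning: installer path contains whitespace:", repr(here))
    else:
        print("Installer path is free of whitespace.")
```

Checking the absolute path matters because the problem applies to the full path, not only to the last directory name.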

Internal Reference Number

Description

2684302

Description: To support scalability, function representor channels were limited to 4. However, in scenarios where SFs are not used, certain use cases require representors to support a large number of channels.

Hence, the representor channel limit of 4 is applicable only when a PCI device, such as Scalable Function support, is enabled.

Keywords: Representor Channels

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-1.0.3.0

2644217

Description: Matching on ipv4_ihl (internet header length) was supported only for outer headers.

Support has been added for inner headers too.

Keywords: Internet Header Length, ipv4_ihl

Fixed in Release: 5.4-1.0.3.0

2626906

Description: When using one counter for both pop/push VLAN actions, the counter value is incorrect. Split the counter for pop_vlan_action_counter and push_vlan_action_counter.

Keywords: Pop/Push VLAN

Fixed in Release: 5.4-1.0.3.0

2653382

Description: Incorrect L3 decapsulation occurs when the original inner frame is small and was padded to comply with the minimum frame size of 64 bytes.

Keywords: SW Steering, Decapsulation, Padding

Fixed in Release: 5.4-1.0.3.0

2612725

Description: dapl and libmlx4 are needed by libdat2 and libdpdk. In order to remove or update dapl, its dependencies need to be removed.

Keywords: dapl, libmlx4

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.4-1.0.3.0

2649134

Description: An override of log_max_qp by other devices occurs if the devices share the same mlx5_core module.

Keywords: log_max_qp, mlx5_core

Fixed in Release: 5.4-1.0.3.0

2638029

Description: A synchronization issue where closing and opening channels (which may happen on configuration changes such as changing number of channels) may cause null pointer dereference in function mlx5e_select_queue.

Keywords: mlx5e_select_queue, Synchronization, Tx

Fixed in Release: 5.4-1.0.3.0

2678982

Description: Enabling tx-udp_tnl-csum-segmentation has no effect on the driver. tx-udp_tnl-csum-segmentation has been moved to "off [fixed]".

Keywords: tx-udp_tnl-csum-segmentation

Discovered in Release: 5.4-0.5.1.1

Fixed in Release: 5.4-1.0.3.0

2610870

Description: Some MLNX_OFED dkms packages ignored (install-time) build errors and considered the packages properly built.

Those errors are now not ignored and indicated as package installation errors.

Keywords: dkms, Installation

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.4-1.0.3.0

2617820

Description: Old udevd versions could get stuck renaming network devices, leaving interfaces named eth* instead of enp*.

Updating the systemd version resolves this issue. For example, if the issue is detected on RHEL 7.6 with systemd-219-62, updating the systemd version to systemd-219-67 resolves the issue.

Keywords: udev, systemd, RHEL

Discovered in Release: 5.4-0.5.1.1

Fixed in Release: 5.4-1.0.3.0

2632768

Description: Flows with a ct commit action and ct state -trk are not offloaded (i.e., table=0,ct_state=-trk,ip actions=ct(commit,table=1)).

Keywords: ASAP2, Connection Tracking

Fixed in Release: 5.4-1.0.3.0

2247143

Description: Connection tracking over VF LAG with tunnel encapsulation/decapsulation is not supported and may cause traffic drop.

Keywords: ASAP2, Connection Tracking, VF LAG, Tunnel

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.4-1.0.3.0

2597327

Description: When the stack size is limited to 1024, OFED compilation fails.

Keywords: Compilation, Stack

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-1.0.3.0

2609641

Description: Rate/burst values higher than 2,147,483,648 are rejected.

Keywords: VF Metering

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-1.0.3.0

2626920

Description: Offloaded remote mirroring flows on tunnel device caused forwarded traffic to VF to not be decapsulated.

Keywords: ASAP2, Offload, Remote Mirroring, Tunnel

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-1.0.3.0

2660247

Description: Trying to set VPort match mode on a VF (cat /sys/class/net/enp8s0f2/compat/devlink/vport_match_mode) leads to a kernel crash.

Keywords: ASAP2, Kernel Crash

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-1.0.3.0

2667484

Description: OVS flows are not being offloaded over socket-direct devices.

Keywords: ASAP2, Socket-Direct

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-1.0.3.0

2663042

Description: When VXLAN is configured and an illegal route is added, the system crashes with a call trace.

Keywords: ASAP2, Offload, Tunnel, Call Trace

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-1.0.3.0

2354761

Description: If any traffic is sent before the netdev goes up for the first time, a division by zero caused by a modulo operation may occur in ndo_select_queue, leading to a kernel panic.

Keywords: NetDev, ndo_select_queue

Discovered in Release: 5.3-1.0.0.1

Fixed in Release: 5.4-1.0.3.0
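
The failure mode in this entry can be illustrated with a minimal Python sketch. It is not the driver code: `select_queue`, `flow_hash`, and `num_queues` are hypothetical names, and falling back to queue 0 is an assumed policy, chosen only to show why a modulo needs a guard while the queue count can still be zero.

```python
def select_queue(flow_hash: int, num_queues: int) -> int:
    """Map a flow hash to a Tx queue index (hypothetical helper)."""
    # Before the device is brought up for the first time, the queue count
    # may still be 0. An unguarded `flow_hash % num_queues` would then
    # raise a division-by-zero error.
    if num_queues <= 0:
        return 0  # assumed fallback: use the first queue
    return flow_hash % num_queues
```

With the guard in place, `select_queue(10, 0)` returns the fallback index instead of raising `ZeroDivisionError`.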

2562053/2667551

Description: After restarting the driver, the x86 host may be in a grace period and may not recover on its own. As part of the fix, 5 FW_fatal recoveries are allowed within the 20-minute grace period. As a result, the grace period in the devlink health show command will appear as 0 for the FW_fatal reporter.

Keywords: BlueField Reload, recovery, reset flow

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.4-1.0.3.0
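
The "5 recoveries within a 20-minute grace period" policy in this entry can be sketched as a simple rate limiter. This is an illustration only: the `RecoveryBudget` class is hypothetical, and the sliding-window accounting is an assumption, since the entry does not say how the driver counts recoveries.

```python
from collections import deque

GRACE_PERIOD_S = 20 * 60  # 20-minute window, taken from the entry above
MAX_RECOVERIES = 5        # recoveries allowed per window, from the entry above


class RecoveryBudget:
    """Sliding-window limiter for recovery attempts (hypothetical helper)."""

    def __init__(self, limit=MAX_RECOVERIES, window_s=GRACE_PERIOD_S):
        self.limit = limit
        self.window_s = window_s
        self._stamps = deque()  # timestamps of recoveries still in the window

    def allow(self, now_s: float) -> bool:
        """Return True and record the attempt if the budget permits it."""
        # Forget attempts that have fallen out of the window.
        while self._stamps and now_s - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now_s)
        return True
```

Under these assumptions, five attempts in quick succession are allowed, a sixth is refused, and a new attempt is allowed again once the oldest one is 20 minutes old.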

2635638

Description: In a fork situation, if the parent and child processes happen to have the same virtual address, the doorbell mechanism may not work properly, which may lead to errors in application behavior.

Keywords: RDMA, Fork

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.4-1.0.3.0

2393352

Description: Using "--with-openvswitch" flag during MLNX_OFED installation may not work on Debian 10 systems.

Keywords: --with-openvswitch, Debian

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2445058

Description: ib_uverbs module parameter disable_raw_qp_enforcement is deprecated and should no longer be used.

Keywords: disable_raw_qp_enforcement, ib_uverbs

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2434650

Description: Fixed an issue on ConnectX-5 and earlier adapters where, when the module was missing, the driver reported a connector type other than OTHER.

Keywords: Module, Connector Type

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2434650

Description: Solved a compilation error by fixing a backport issue with unpin_user_pages_dirty_lock function.

Keywords: Memory, unpin_user_pages_dirty_lock

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2434650

Description: Fixed an issue that caused a deadlock on the IPoIB interface. The fix ensures that the IPoIB driver handles multicast groups entirely under a safe lock.

Keywords: IPoIB, Multicast

Discovered in Release: 4.6

Fixed in Release: 5.3-1.0.0.1

2505615

Description: Fixed an issue where VLAN header was not popped on VF Rx when the eSwitch priority tagging was configured.

Keywords: ASAP2, Priority Tagging, VLAN

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2494257

Description: Fixed connection tracking (CT) offload in NIC mode by using correct steering domain for the rules.

Keywords: ASAP2, Connection Tracking, NIC Mode

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2461213

Description: Fixed an issue where offload of rules from OVS internal port to uplink failed.

Keywords: ASAP2, OVS

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2444523

Description: Fixed an issue in the tunnel mishandling that can happen when the tunnel overlay device is an OVS internal port.

Keywords: ASAP2, OVS internal port offloading

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2566354

Description: Fixed incorrect parsing of network configuration when the option --net (-n) was given to mlnxofedinstall: get network configuration from the output of 'ip' instead of 'ifconfig'.

Keywords: Installation

Fixed in Release: 5.3-1.0.0.1

2495065

Description: Dropped unsupported devices from OFED rdma-core description.

Keywords: rdma-core

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2482696

Description: Backported MLNX_OFED kernel to support elrepo 5.8 kernel.

Keywords: add-kernel-support

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2481104

Description: Fixed ability to build xpmem on kernel version 5.6.

Keywords: add-kernel-support, xpmem

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2440062

Description: Fixed an issue where the configure scripts used for kernel builds assumed that SLES 15 systems have /etc/SuSE-release or /etc/SUSE-brand. These files no longer exist on SLES 15.

Keywords: add-kernel-support, SLES 15

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2445146

Description: Fixed an issue where running data on a Geneve tunnel on a VF may result in a CQE error and a failure to transmit data.

Keywords: Virtual Function

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2494008

Description: Fixed an issue where the driver silently ignores the settings of an already-set ECN value (0->0, 1->1) via sysfs.

Keywords: RDMA, ECN

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2581127

Description: Fixed an issue where KVS offload, under certain conditions, takes too long. Improved malloc performance by increasing the memory reuse and reducing the stress on malloc and free.

Keywords: MLNX5DR, Software Steering

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2502564

Description: Fixed an issue where when using switchdev mode with SMFS, inserting duplicate rules from userspace was not supported (required when there are a few instances of the same application). As part of the fix, added support for update_fte which is called in case a duplicate rule is being added.

Keywords: SwitchDev, Steering

Fixed in Release: 5.3-1.0.0.1

2433351

Description: Fixed an issue where creating 127 ports on each VF may fail as the current kernel does not support an RDMA device with more than 255 ports.

Keywords: VF, RDMA, virtualization

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2333971

Description: Fixed an issue where changing the "other" channels count by "ethtool -L other " command on Kernel 5.10 may cause a kernel panic.

Keywords: Kernel 5.10, kernel panic, ethtool, "other" channels

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2454952

Description: Fixed an issue where MLNX_OFED cannot be built on top of Kernel 5.4.87.

Keywords: operating system, kernel

Discovered in Release: 5.2-2.2.0.0

Fixed in Release: 5.3-1.0.0.1

2383355

Description: Fixed an issue where Switch and eSwitch offloads are not supported for SR-IOV and its sub functions when installing MLNX_OFED over upstream kernel v5.10 or higher.

Keywords: eSwitch, Kernel, SR-IOV

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.3-1.0.0.1

2278833

Description: Creating a bond via NetworkManager and restarting the driver (openibd restart) results in no pf0hpf and bond creation failure.

Keywords: Bond, LAG, network manager, driver reload

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-2.2.0.0

2293460

Description: In rare cases, mlx5_cmd_exec may lead to a refcount warning, such as:

WARNING: CPU: 1 PID: 30811 at lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0

This refcount warning can be ignored.

Keywords: mlx5_cmd_exec

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2328206

Description: Prior to stopping Open vSwitch, the following command should be run.

ovs-appctl exit --cleanup

Keywords: Open vSwitch, ovs-appctl

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.2-2.2.0.0

2278352

Description: In a container environment, having the "rdma-core" package installed on the hypervisor may conflict with loading updated MLNX_OFED drivers from within a container (using "/etc/init.d/openibd restart", for example). The "rdma-core" services may load the inbox drivers on the hypervisor, causing the loading of the MLNX_OFED drivers to fail, with errors about symbol incompatibility appearing in the dmesg log.

Keywords: openibd, symbol, rdma-core, driver load

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.2-2.2.0.0

2397281

Description: When disabling KEEP_ETH_LINK_UP_P1/2 configuration while in SwitchDev mode, packets will no longer be received.

Keywords: ASAP2, KEEP_ETH_LINK_UP_P, SwitchDev

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.2-2.2.0.0

2359599

Description: When an error occurs during adding a new flow rule (e.g. FW command failure in DMFS, unsupported sequence in SMFS, etc.), the driver might fail to release some resources.

Keywords: Steering, SMFS, DMFS

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.2-2.2.0.0

2407076

Description: On EulerOS 2.0 SP9 Aarch64 with errata kernel kernel-4.19.90-vhulk2009.2.0.h269.eulerosv2r9.aarch64, MLNX_OFED installs correctly, but driver restart fails.

Keywords: OpenEuler, Aarch64, installation, --add-kernel-support

Discovered in Release: 5.2-1.0.4.0

Fixed in Release: 5.2-2.2.0.0

2244416

Description: Configuring "other" channels over one representor is not supported and may cause a call trace.

Keywords: ASAP, SwitchDev, ethtool, representor

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2209987

Description: aRFS feature (activated using "ethtool ntuple on") is disabled for kernel 4.1 or below.

Keywords: aRFS

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2312015

Description: Fixed the issue where clearing min_rate on all SR-IOV legacy VFs after setting min_rate to at least one of the VFs did not disable QoS min_rate.

Keywords: SR-IOV, Legacy, QoS, VF, min_rate

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.2-1.0.4.0

2322044

Description: Allowed installation of KMP (kmod) on Lustre kernels using --add-kernel-support --kmp on RHEL 7.x and above systems.

Keywords: Installation, KMP, kmod, RHEL, Lustre, add-kernel-support

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2325883

Description: An empty /usr/src/mlnx-ofa_kernel/default is no longer created.

Keywords: mlnx-ofa_kernel

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.2-1.0.4.0

2326961

Description: Enabled installation of mlnx-ofa_kernel package to /bin/python3 instead of /usr/bin/python3 on RHEL 8.x systems.

Keywords: /bin/python3, mlnx-ofa_kernel, RHEL, installation

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.2-1.0.4.0

2329152

Description: Fixed the issue where installing packages through a repository failed after generation of metapackages using --add-kernel-support. The failure occurred due to excessive and incorrect Obsolete headers in those metapackages.

Keywords: --add-kernel-support, Installation, metapackage, repository

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2152217

Description: Fixed the issue where all IPoIB offload packets were wrongly counted as rx_csum_complete. These packets are now identified as rx_csum_unnecessary packets.

Keywords: IPoIB, rx_csum_complete, rx_csum_unnecessary

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2392338

Description: Fixed an issue where a segmentation fault in mlx5dv_dr_rule_destroy could occur in rare cases with multiple rules on the same matcher that have different numbers of actions. This could happen after reusing the memory of an already deleted rule that had fewer actions.

Keywords: mlx5dv_dr, software steering, RDMA-Core

Discovered in Release: 5.1-2.3.7.1

Fixed in Release: 5.2-1.0.4.0

2297535

Description: Fixed the OFED udev script to not modify non-NVIDIA NIC names.

Keywords: udev, SwitchDev

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2333242

Description: Fixed an issue of moving to SwitchDev mode after configuring DSCP.

Keywords: DSCP, SwitchDev

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2368330

Description: Fixed the issue where global traffic class configuration did not take effect on DC QPs.

Keywords: DC, QP, TC

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2241983

Description: Fixed an issue where ib_send_bw traffic frequently dropped to zero when RDMA CM was used, because of incorrect min_rnr_timer setting on the responder side. The min_rnr_timer setting is now aligned with the setting in non-RDMA CM cases.

Keywords: RDMA CM, ib_send_bw

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.2-1.0.4.0

2334009

Description: Fixed an issue where traffic did not pass over VFs when the VST QinQ feature was enabled.

Keywords: VST, QinQ

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2336091

Description: Fixed an issue where traffic sent over Geneve VLAN with Tx VLAN offload enabled and TSO or Tx csum enabled could be dropped and not sent to the wire.

Keywords: Geneve, VLAN, Tx, TSO, offload

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2317257

Description: Fixed an issue that caused the firmware to restart upon installing mlnx-ofed-dpdk-upstream-libs package manually.

Keywords: Firmware, mlnx-ofed-dpdk-upstream-libs, installation

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2365123

Description: Fixed bad utility paths in rdma-core "dracut" hooks on SLES systems, which used to result in the following errors when running "dracut" with the "--add rdma" option.

dracut-install: ERROR: installing ‘/usr/libexec/mlx4-setup.sh’

dracut: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.UdCOSJ/initramfs /usr/libexec/mlx4-setup.sh

dracut-install: ERROR: installing ‘/usr/libexec/rdma-set-sriov-vf’

dracut: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.UdCOSJ/initramfs /usr/libexec/rdma-set-sriov-vf

Keywords: rdma-core dracut

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2379620

Description: Fixed an issue where MLNX_OFED installation using 'yum' tool failed with the following errors.

Requires: rdma-core(x86-64) = 50mlnx1-1.50218

Removing: rdma-core-22.4-1.el7.x86_64 (@anaconda)

rdma-core(x86-64) = 22.4-1.el7

Obsoleted By: mlnx-ofed-all-5.0-2.1.8.0.rhel7.8.noarch (MLNX_LOCAL_REPO)

Not found

Updated By: rdma-core-50mlnx1-1.50218.x86_64 (MLNX_LOCAL_REPO)

rdma-core(x86-64) = 50mlnx1-1.50218

Available: rdma-core-22.4-5.el7.x86_64 (base)

rdma-core(x86-64) = 22.4-5.el7

Keywords: YUM, rdma-core conflict

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.2-1.0.4.0

2299982

Description: Fixed the issue where traffic class value was not updated in DCT when set via sysfs.

Keywords: DCT, sysfs, TC

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.2-1.0.4.0

2335165

Description: Fixed a doorbell loss issue on AMD platforms with Secure Memory Encryption (SME).

Keywords: AMD, SME

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2355878

Description: Fixed an issue with registering memory using mlx5dv_devx_umem_reg while forking. Without this fix, applications which use fork() or similar syscalls while using a memory registered with umem_reg could hang due to incorrect physical page mapping. This fix requires setting the IBV_FORK_SAFE environment variable.

Keywords: mlx5dv_dr, SW steering, RDMA core

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0
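
Since this fix depends on the IBV_FORK_SAFE environment variable, the variable has to be present in the environment of the process that uses fork(). The Python sketch below shows one way a launcher could arrange that. `fork_safe_env` is a hypothetical helper, and the value "1" is an assumption, since the entry only says the variable must be set.

```python
import os


def fork_safe_env(base=None):
    """Return a copy of an environment with IBV_FORK_SAFE set (hypothetical helper)."""
    # Copy so that the caller's environment mapping is left untouched.
    env = dict(os.environ if base is None else base)
    env["IBV_FORK_SAFE"] = "1"  # assumed value; the entry only requires it to be set
    return env
```

The resulting mapping can be passed as the `env` argument of `subprocess.run` when starting the application.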

2288599

Description: Fixed the issue where unbinding the device resulted in the following message being printed to the dmesg: "failed to disable DC tracer"

Keywords: Unbind

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2083942

Description: Fixed the issue where the content of file /sys/class/net//statistics/multicast may have been out of date and may have displayed values lower than the real values.

Keywords: Multicast counters

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.2-1.0.4.0

2282316

Description: Fixed the issue where ERSPAN protocol was available only when turning off Tx checksum offload.

Keywords: ERSPAN, TX checksum offload

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2310695

Description: Fixed a udev script issue which caused non-NVIDIA devices to be renamed.

Keywords: udev, naming

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.2-1.0.4.0

2334518

Description: Fixed missing representor statistics when using ifconfig.

Keywords: SwitchDev, representor, statistics, ifconfig

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2342348

Description: Fixed wrong value of skb mark of received packets on representors.

Keywords: SwitchDev, skb mark

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2363982

Description: Fixed an issue which caused second port representors to be named as first port representors.

Keywords: SwitchDev, udev, representor

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.2-1.0.4.0

2292762

Description: Fixed a kernel panic scenario that may have taken place when using sysfs to cancel the probing of VFs and performing reboot while the VFs are still managed by the mlx5 driver.

Keywords: Probed VFs

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.1-2.5.8.0

2302010

Description: Fixed a GPUDirect locking bug that may have caused instability and communication loss in Peer-direct applications.

Keywords: Peer-direct, GPUDirect

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.1-2.5.8.0

2298308

Description: Fixed the issue where pinned pages were not handled properly by peer flow, which resulted in ENOMEM error.

Keywords: Peer flow

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.1-2.5.8.0

2323446

Description: Fixed the issue where num_free_callbacks counter was not functional, and querying the counters returned the value of 0 all the time.

Keywords: GPUDirect

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.1-2.5.8.0

2288779

Description: Fixed a KMP support issue in EulerOS 2.0 SP9 with non-default kernel.

Keywords: KMP, EulerOS, OS

Discovered in Release: 5.1-2.5.8.0

Fixed in Release: 5.1-2.5.8.0

2298285

Description: Fixed the following VST issues.

  1. Ingress VST traffic tagging issue caused by incorrect configuration settings over VF interface.

  2. Enabled proper reset of ingress and egress legacy ACL.

Keywords: VST

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2247363

Description: Added command interface resilience by manually polling the async commands EQE in case of command timeout.

Keywords: Command interface, mlx5_core

Discovered in Release: 4.2-1.2.0.0

Fixed in Release: 5.1-2.5.8.0

2244729

Description: Enabled on-demand device memory sync to free cached memory.

Keywords: Device memory, cache

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2229683

Description: Eliminated long mlx5 recovery-flow delay on a VF driver when PCI interface goes down.

Keywords: mlx5, VF, PCI

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

Description: ibv_get_device_list returns only accessible devices now.

Keywords: ibv_get_device_list

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2257363

Description: The install script will now also install the package rpm-build (that includes basic support for building RPM packages) on SLES15.x systems when trying to rebuild packages, as is the case with the rest of the distributions.

Keywords: RPM, SLES, OS

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2244336

Description: AF_XDP is now functional.

Keywords: AF_XDP

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2192791

Description: Fixed the issue where packages neohost-backend and neohost-sdk were not properly removed by the uninstallation procedure and may have required manual removal before re-installing or upgrading the MLNX_OFED driver.

Keywords: NEO-Host, SDK

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2247404

Description: srptools installation no longer fails in case the srp_daemon service fails to start on Debian and Ubuntu systems.

Keywords: SRP, srptool, srp_daemon, Debian, Ubuntu, OS

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2248201

Description: Fixed the issue where during MLNX_OFED installation, warning messages related to modules iw_cxgb3 and iw_nes may have appeared in the log.

Keywords: SLES, RHEL, KMP, weak updates, kmod

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2261475

Description: Simplified the sanity test that verifies which systems must use python3 and not python2, so that when rebuilding the kernel packages using --add-kernel-support, the systems based on CentOS 8 are detected.

Keywords: CentOS 8, RHEL 8, OS, sanity test, python

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2288099/2257445

Description: Simplified sanity tests for building MLNX_OFED/MLNX_EN with kernel v5.7.11 or newer.

Keywords: Kernel, sanity test, installation

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2288783

Description: MLNX_OFED/MLNX_EN installer will no longer automatically remove the spdk package when installing the driver.

Keywords: spdk, installation

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2292273

Description: Marked additional packages as obsolete so they would be removed when installing the HPE-roce mlnx-ofa-kernel package.

Keywords: RPM, HPE-roce

Discovered in Release: 5.1-1.0.7.1

Fixed in Release: 5.1-2.5.8.0

2265094

Description: Fixed VM driver to avoid command timeouts when the Hypervisor disables the VF's PCI interface. This reduces the device removal time to 2 seconds.

Keywords: VF disable

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

2272539

Description: Added back rx_dct_connect counter that queries the number of received connection requests for the associated DCTs.

Keywords: rx_dct_connect, DCT

Discovered in Release: 5.1-0.6.6.0

Fixed in Release: 5.1-2.5.8.0

1731939

Description: Get/Set Forward Error Correction (FEC) configuration is not supported on ConnectX-6 HCAs with 200Gbps speed rate.

Keywords: Forward Error Correction, FEC, 200Gbps

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.1-0.6.6.0

1980884

Description: Setting VF VLAN, state and spoofchk using ip link tool is not supported in SwitchDev mode.

Keywords: ASAP, ip tool, VF, SwitchDev

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

2117845

Description: Relaxed ordering memory regions are not supported when working with CAPI. Registering a memory region with relaxed ordering while CAPI is enabled will result in a registration failure.

Keywords: Relaxed ordering, memory region, MR, CAPI

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2118956

Description: mlx5dv_dr API does not support sub functions (SFs) as destination actions.

Keywords: mlx5dv_dr, sub functions, SF

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2097045

Description: Userspace Software Steering using mlx5dv_dr API support on ConnectX-6 Dx adapter cards is now at GA level.

Keywords: Software Steering, SW, mlx5dv_dr, ConnectX-6 Dx

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2132332

Description: Fixed a sporadic bandwidth reporting issue when running with the --run_infinitely flag.

Keywords: perftest, bandwidth

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2151658

Description: Optimized XRC target lookup by modifying the locking scheme to enable multiple readers and changing the linked list that holds the QPs to xarray.

Keywords: XRC, QP, xarray

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2196118

Description: Fixed a driver issue that led to panic after DPDK application crashes.

Keywords: DPDK, panic

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.1-0.6.6.0

2245228

Description: Fixed an issue of a crash when attempting to access roce_enable sysfs in unprobed VFs.

Keywords: roce_enable, unprobed VFs

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2061294

Description: Fixed a race of commands executed by command interface in parallel to AER recovery causing the kernel to crash.

Keywords: mlx5e, AER

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.1-0.6.6.0

2131951

Description: Fixed an issue in the MLNX_OFED build system that broke the RPM signing process for random packages. All RPMs are now signed properly.

Keywords: RPM, sign

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.1-0.6.6.0

2143067

Description: If Openibd was configured to enable the SRP daemon, it now also enables srp_daemon from rdma-core.

Keywords: Openibd, SRP daemon, srp_daemon, rdma-core

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.1-0.6.6.0

2143094

Description: Regenerated package repository in the correct location after rebuilding the kernel using add-kernel-support. This allows for installing the newly generated packages with a package manager.

Keywords: add-kernel-support, RPM, deb

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.1-0.6.6.0

2172130

Description: Fixed an issue with metadata packages generation in the eth-only directory. This allows using the directory as a repository for package managers.

Keywords: Metadata packages

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2214543

Description: Moved ibdev2netdev script from /usr/bin to /usr/sbin in the RPM package to avoid package conflict with RHEL 8 and consequent MLNX_OFED installation failure on some systems.

Keywords: ibdev2netdev, RPM, RHEL, RedHat

Discovered in Release:

Fixed in Release: 5.1-0.6.6.0

2211311

Description: Fixed an issue where Rx port buffers cell size was wrong, leading to wrong buffers size reported by mlnx_qos/netdev qos/buffer_size sysfs.

Keywords: mlx5e, RX buffers, mlnx_qos

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2111349

Description: Fixed the issue where ethtool --show-fec/--get-fec were not supported over ConnectX-6 and ConnectX-6 Dx adapter cards.

Keywords: Ethtool, ConnectX-6 Dx

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

2165668

Description: Fixed an issue related to mlx5 command interface that in some scenarios caused the driver to hang.

Keywords: ConnectX-5, mlx5, panic

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.1-0.6.6.0

2119984

Description: Fixed the issue where IPsec crypto offloads did not work when ESN was enabled.

Keywords: IPsec, ESN

Discovered in Release: 5.0-2.1.8.0

Fixed in Release: 5.1-0.6.6.0

1630228

Description: Fixed the issue where tunnel stateless offloads were wrongly forbidden for E-Switch manager function.

Keywords: Stateless offloads cap

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.1-0.6.6.0

2089996

Description: Fixed the issue where dump flows were not supported and may have been corrupted when using tc tool with connection tracking rules.

Keywords: ASAP, iproute2, tc, connection tracking

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.1-0.6.6.0

2094216

Description: Fixed the issue where LAG deactivation failed when one of the LAG slaves went down, ultimately causing bandwidth degradation.

Keywords: RoCE LAG

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0

2133778

Description: The mlx5 driver maintains a subdirectory for every open eth port in /sys/kernel/debug/. For the default network namespace, the subdirectory name is the name of the interface, like "eth8". The new convention for network interfaces moved to non-default network namespaces is the interface name followed by "@" and the port's PCI ID. For example: "eth8@0000:af:00.3".

Keywords: Namespace

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0
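
The naming convention in this entry can be expressed as a small Python sketch. `debugfs_dir_name` and its parameters are hypothetical and simply mirror the rule stated in the description. They are not part of any driver or tool interface.

```python
def debugfs_dir_name(ifname: str, pci_id: str, in_default_netns: bool = True) -> str:
    """Build the per-port subdirectory name described above (hypothetical helper)."""
    if in_default_netns:
        # Default network namespace: the interface name alone.
        return ifname
    # Non-default namespace: interface name, "@", then the port's PCI ID.
    return f"{ifname}@{pci_id}"
```

For the example in the description, `debugfs_dir_name("eth8", "0000:af:00.3", in_default_netns=False)` yields the name "eth8@0000:af:00.3".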

2076546

Description: Fixed the issue where in RPM-based OSs with non-default kernels, using repositories after re-creating the installer (using --add-kernel-support) would result in improper installation of the drivers.

Keywords: Installation, OS

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.1-0.6.6.0

2114957

Description: Fixed the issue where MLNX_OFED installation may have depended on python2 package even when attempting to install it on OSs whose default package is python3.

Keywords: Installation, python

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0

2122684

Description: Fixed the issue where OFED uninstallation resulted in the removal of dependency packages, such as qemu-system-* (qemu-system-x86).

Keywords: Uninstallation, dependency, qemu-system-x86

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0

2135476

Description: Added KMP ability to install MLNX_OFED Kernel modules on SLES12 SP5 and SLES15 kernel maintenance updates.

Keywords: KMP, SLES, kernel

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0

2143258

Description: Fixed a typo in perftest package where help messages wrongly displayed the conversion result between Gb/s and MB/s (20^2 instead of 2^20).

Keywords: perftest

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0
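
The difference between the two exponents is large: 20^2 is 400, whereas 2^20 is 1,048,576. The Python sketch below works through the conversion under the assumption that MB means 2^20 bytes and Gb means 10^9 bits. These unit definitions and the function names are assumptions for illustration and are not taken from this entry.

```python
BYTES_PER_MB = 2 ** 20  # assumed: 1 MB = 2^20 bytes
BITS_PER_GB = 10 ** 9   # assumed: 1 Gb = 10^9 bits


def gbps_to_mbps(gbps: float) -> float:
    """Convert a bandwidth in Gb/s to MB/s under the assumptions above."""
    return gbps * BITS_PER_GB / (8 * BYTES_PER_MB)


def mbps_to_gbps(mbps: float) -> float:
    """Convert a bandwidth in MB/s to Gb/s under the assumptions above."""
    return mbps * 8 * BYTES_PER_MB / BITS_PER_GB
```

Under these assumptions, 1 Gb/s corresponds to roughly 119.2 MB/s.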

2149577

Description: Fixed the issue where openibd script load used to fail when esp6_offload module did not load successfully.

Keywords: openibd, esp6_offload

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0

2163879

Description: Added dependency of package mpi-selectors on perl-Getopt-Long system package. On minimal installs of RPM-based OSs, installing mpi-selectors will also install the required system package perl-Getopt-Long.

Keywords: Dependency, perl-Getopt-Long

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0

2119017

Description: Fixed the issue where injecting EEH may cause extra Kernel prints, such as: “EEH: Might be infinite loop in mlx5_core driver”.

Keywords: EEH, kernel

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0

2107532

Description: Fixed the issue where in certain rare scenarios, due to Rx page not being replenished, the same page fragment mistakenly became assigned to two different Rx descriptors.

Keywords: Memory corruption, Rx page recycle

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-2.1.8.0

2116234

Description: Fixed the issue where ibsim was missing after OFED installation.

Keywords: ibsim, installation

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2116233

Description: Fixed an issue where ucx-kmem was missing after OFED installation.

Keywords: ucx-kmem, installation

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2109716

Description: Fixed a dependency issue between systemd and RDMA-Core.

Keywords: Dependency, RDMA-Core

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2107776

Description: Fixed a driver load issue with Errata-kernel on SLES15 SP1.

Keywords: Load, SLES, Errata

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2105536

Description: Fixed an issue in the Hairpin feature which prevented adding hairpin flows using TC tool.

Keywords: Hairpin, TC

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2090321

Description: Fixed the issue where WQ queue flushing was not handled properly in the event of EEH.

Keywords: WQ, EEH

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.0-2.1.8.0

2076311

Description: Fixed a rare kernel crash scenario when exiting an application that uses RMPP mads intensively.

Keywords: MAD RMPP

Discovered in Release: 4.0-1.0.1.0

Fixed in Release: 5.0-2.1.8.0

2094545

Description: Fixed the issue where perftest applications (ib_read_*, ib_write_* and others) supplied with MLNX_OFED v5.0 and above did not work correctly if corresponding applications on another side of client-server communication were supplied with previous versions of MLNX_OFED due to an interoperability issue.

Keywords: perftest, interoperability

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2096998

Description: Fixed the issue where NEO-Host could not be installed from the MLNX_OFED package when working on Ubuntu and Debian OSs.

Keywords: NEO-Host, Ubuntu, Debian

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2094012

Description: Fixed the issue where MLNX_OFED installation failed to upgrade firmware version on ConnectX-6 Dx NICs with secure-fw.

Keywords: ConnectX-6 Dx, installation, firmware, NIC

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2057076

Description: Added support for installing MLNX_OFED using --add-kernel-support option over RHEL 8 OSs.

Keywords: --add-kernel-support, installation, RHEL

Discovered in Release: 5.0-1.0.0.0

Fixed in Release: 5.0-2.1.8.0

2090186

Description: Fixed a possible kernel crash scenario when AER/slot reset is done in parallel to user space commands execution.

Keywords: mlx5_core, AER, slot reset

Discovered in Release: 4.3-1.0.1.0

Fixed in Release: 5.0-2.1.8.0

2093410

Description: Added missing ECN configuration under sysfs for PFs in SwitchDev mode.

Keywords: sysfs, ASAP, SwitchDev, ECN

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-2.1.8.0

1731005

Description: Fixed the issue where MLNX_OFED v4.6 YUM and Zypper installations failed on RHEL8.0, SLES15.0 and PPCLE OSs.

Keywords: YUM, Zypper, installation, RHEL, RedHat, SLES, PPCLE

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.0-1.0.0.0

1779150

Description: Fixed the issue where upgrading the MLNX_OFED version over SLES 15 SP0 and SP1 OSs on PPCLE platforms might have failed due to an isert-kmp-default issue.

Keywords: Installation, SLES, PPCLE

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.0-1.0.0.0

1897199

Description: Fixed the issue where, when using the RDMA statistics feature and attempting to unbind a QP from a counter, omitting the counter-id argument in the CLI resulted in a segmentation fault.

Keywords: RDMA, QP, segfault, unbinding

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.0-1.0.0.0

1916029

Description: Fixed the issue where, when firmware response time to commands became very long, some commands failed upon timeout. The driver may then have triggered a timeout completion on the wrong entry, leading to a NULL pointer call trace.

Keywords: Firmware, timeout, NULL

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

2036394

Description: Added driver support for kernels with the old XDP_REDIRECT infrastructure that uses the following NetDev operations: .ndo_xdp_flush and .ndo_xdp_xmit.

Keywords: XDP_REDIRECT, Soft lockup

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

1973238

Description: Fixed the issue where ib_core unload may fail on Ubuntu 18.04.2 OS with the following error message:

"Module ib_core is in use"

Keywords: ib_core, Ubuntu, ibacm

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.1-0.6.6.0

2072871

Description: Fixed an issue where the usage of --excludedocs Open MPI RPM option resulted in the removal of non-documentation related files.

Keywords: --excludedocs, Open MPI, RPM

Discovered in Release: 4.5-1.0.1.0

Fixed in Release: 5.0-1.0.0.0

2060216

Description: Legacy mlnx-libs are now installed by default on SLES11 SP3 OS, as building MLNX_OFED on RDMA-Core based packages with this OS is not supported.

Keywords: mlnx-libs, SLES, RDMA-Core

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

2072884

Description: Removed all cases of automated loading of MLNX_OFED kernel modules outside of openibd to preserve the startup process of previous MLNX_OFED versions. These loads conflict with openibd, which has its own logic to overcome issues such as the inbox driver being loaded instead of the MLNX_OFED driver, or a module being loaded with a wrong parameter value. They might also load modules while openibd is trying to unload the driver stack.

Keywords: Installation, openibd, RDMA-Core

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

2052037

Description: Disabled automated loading of some modules through udev triggers to preserve the startup process of previous MLNX_OFED versions.

Keywords: Installation, udev, RDMA-Core

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

2022634

Description: Fixed a typo in the packages build command line which could cause the installation of MLNX_OFED on SLES OSs to fail when using the option --without-depcheck.

Keywords: Installation, SLES

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

2022619

Description: Fixed the issue where uninstallation of MLNX_OFED would hang due to a bug in the package dependency check.

Keywords: Uninstallation, dependency

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.0-1.0.0.0

1995843

Description: ibdump is now provided with the default rdma-core-based build.

Keywords: ibdump, RDMA-Core

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

1995631

Description: Proper package dependencies are now set on Debian and Ubuntu libibverbs-dev package that is generated from RDMA-Core.

Keywords: Dependency, libibverbs, RDMA-Core

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

2047221

Description: Reference count (refcount) for RDMA connection ID (cm_id) was not incremented in rdma_resolve_addr() function, resulting in a cm_id use-after-free access.

A fix was applied to increment the cm_id refcount.

Keywords: rdma_resolve_addr(), cm_id

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.0-1.0.0.0
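
The reference-counting pattern behind fix 2047221 can be sketched as a toy model. This is only an illustration and not the kernel code: the CmId class, the resolve_addr helper, and the assumption that address resolution completes in a later callback are all hypothetical stand-ins introduced here.

```python
class CmId:
    """Toy stand-in for an RDMA connection ID; not the kernel structure."""

    def __init__(self):
        self.refcount = 1   # reference held by the creator
        self.freed = False

    def get(self):
        # Take an additional reference.
        self.refcount += 1

    def put(self):
        # Drop a reference; the object is "freed" when the count hits zero.
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True


def resolve_addr(cm_id, take_ref):
    """Hypothetical async resolution: returns the completion callback."""
    if take_ref:
        cm_id.get()         # the fix: hold a reference while work is pending

    def on_resolved():
        touched_freed = cm_id.freed   # True means the callback used a freed ID
        if take_ref:
            cm_id.put()
        return touched_freed

    return on_resolved


# Without the extra reference, the owner's put() frees the ID too early.
buggy = CmId()
callback = resolve_addr(buggy, take_ref=False)
buggy.put()
buggy_uaf = callback()

# With the extra reference, the ID outlives the pending callback.
fixed = CmId()
callback = resolve_addr(fixed, take_ref=True)
fixed.put()
fixed_uaf = callback()
```

In this sketch, buggy_uaf reports an access after free, while fixed_uaf does not, and the fixed object is released only after the callback drops its reference.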

2045181

Description: Fixed a race condition which caused kernel panic when moving two ports to SwitchDev mode at the same time.

Keywords: ASAP, SwitchDev, race

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.0-1.0.0.0

2004488

Description: Allowed accessing sysfs hardware counters in SwitchDev mode.

Keywords: ASAP, hardware counters, sysfs, SwitchDev

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.0-1.0.0.0

2030943

Description: Function smp_processor_id() is called in the RX page recycle flow to determine the core to run on. This is intended to run in NAPI context. However, due to a bug in backporting, the RX page recycle was mistakenly also called in the RQ close flow, where it is not needed.

Keywords: Rx page recycle, smp_processor_id

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.0-1.0.0.0

2074487

Description: Fixed an issue where port link state was automatically changed (without admin state involvement) to "UP" after reboot.

Keywords: Link state, UP

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.0-1.0.0.0

2064711

Description: Fixed an issue where RDMA CM connection failed when port space was small.

Keywords: RDMA CM

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.0-1.0.0.0

2076424

Description: Traffic mirroring with OVS offload and non-offload over VxLAN interface is now supported.

Note: For kernel 4.9, make sure to use a dedicated OVS version.

Keywords: VxLAN, OVS

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

1828321

Description: Fixed the issue where, when working with VF LAG while the bond device was in active-active mode, running fwreset resulted in unequal traffic on both PFs, and the PFs did not reach line rate.

Keywords: VF LAG, bonding, PF

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.0-1.0.0.0

1975293

Description: Installing OFED with --with-openvswitch flag no longer requires manual removal of the existing Open vSwitch.

Keywords: OVS, Open vSwitch, openvswitch

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

1939719

Description: Fixed an issue where running openibd restart after installing MLNX_OFED on SLES12 SP5 and SLES15 SP1 OSs with the latest Kernel (v4.12.14) resulted in an error that the modules did not belong to that Kernel. This was because the module installed by MLNX_OFED was incompatible with the new Kernel's module.

Keywords: SLES, operating system, OS, installation, Kernel, module

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

2001966

Description: Fixed an issue where, when a bond was created over VF netdevices in SwitchDev mode, the VF netdevice was treated as a representor netdevice. This caused the mlx5_core driver to crash when it received netdevice events related to the bond device.

Keywords: PF, VF, SwitchDev, netdevice, bonding

Discovered in Release: 4.7-3.2.9.0

Fixed in Release: 5.0-1.0.0.0

1816629

Description: Fixed an issue where following a bad affinity occurrence in VF LAG mode, traffic was sent after the port went up/down in the switch.

Keywords: Traffic, VF LAG

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.0-1.0.0.0

1718531

Description: Added support for VLAN header rewrite on CentOS 7.2 OS.

Keywords: VLAN, ASAP, switchdev, CentOS 7.2

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 5.0-1.0.0.0

1556337

Description: Fixed the issue where adding VxLAN decapsulation rule with enc_tos and enc_ttl failed.

Keywords: VxLAN, decapsulation

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 5.0-1.0.0.0

1921799

Description: Fixed the issue where MLNX_OFED installation over SLES15 SP1 ARM OSs failed unless --add-kernel-support flag was added to the installation command.

Keywords: SLES, installation

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 4.7-3.2.9.0

1949260

Description: Fixed a race condition that resulted in kernel panic when running IPoIB traffic in Connected mode.

Keywords: IPoIB

Discovered in Release: 4.5-1.0.1.0

Fixed in Release: 4.7-3.2.9.0

1973828

Description: Fixed wrong EEPROM length for small form factor (SFF) 8472 from 256 to 512 bytes.

Keywords: EEPROM, SFF

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 4.7-3.2.9.0

1915553

Description: Fixed the issue where errno was not set in all error flows of the ibv_reg_mr API.

Keywords: ibv_reg_mr

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 4.7-3.2.9.0

1970901

Description: Fixed the issue where mlx5 IRQ name did not change to express the state of the interface.

Keywords: Ethernet, PCIe, IRQ

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 4.7-3.2.9.0

1915587

Description: Udaddy application is now functional in Legacy mode.

Keywords: Udaddy, MLNX_OFED legacy, RDMA-CM

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 4.7-3.2.9.0

1931421

Description: Added support for E-Switch (SR-IOV Legacy) mode in RHEL 7.7 OSs.

Keywords: E-Switch, SR-IOV, RHEL, RedHat

Discovered in Release: 4.7-1.0.0.1

Fixed in Release: 4.7-3.2.9.0

1945411/1839353

Description: Fixed the issue where, when XDP_REDIRECT failed, pages were double-freed due to a bug in the refcnt_bias feature.

Keywords: XDP, XDP_REDIRECT, refcnt_bias

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 4.7-3.2.9.0

1715789

Description: Fixed the issue where NVIDIA Firmware Tools (MFT) package was missing from Ubuntu v18.04.2 OS.

Keywords: MFT, Ubuntu, operating system

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 4.7-1.0.0.1

1547200

Description: Fixed an issue where IPoIB Tx queue may get stuck, leading to timeout warnings in dmesg.

Keywords: IPoIB

Discovered in Release: 4.5-1.0.1.0

Fixed in Release: 4.7-1.0.0.1

1817636

Description: Fixed the issue where disabling one port on the Server side caused VF-LAG Tx Affinity to stop working on the Client side.

Keywords: VF-LAG, Tx Affinity

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 4.7-1.0.0.1

1800525

Description: When configuring the Time-stamping feature, CQE compression will be disabled. This fix entails the removal of a warning message that appeared upon attempting to disable CQE compression when it has already been disabled.

Keywords: Time-stamping, CQE compression

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 4.7-1.0.0.1

1431282

Description: Fixed the issue where software reset may have resulted in an order inversion of interface names.

Keywords: Software reset

Discovered in Release: 4.4-1.0.0.0

Fixed in Release: 4.7-1.0.0.1

1843020

Description: Server reboot may result in a system crash.

Keywords: reboot, crash

Discovered in Release: 4.2-1.2.0.0

Fixed in Release: 4.7-1.0.0.1

1734102

Description: Fixed the issue where Ubuntu v16.04.05 and v16.04.05 OSs could not be used with their native kernels.

Keywords: Ubuntu, Kernel, OS

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 4.7-1.0.0.1

1811973

Description: VF mirroring offload is now supported.

Keywords: ASAP2, VF mirroring

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 4.7-1.0.0.1

1841634

Description: The number of guaranteed counters per VF is now calculated based on the number of ports mapped to that VF. This allows more VFs to have counters allocated.

Keywords: Counters, VF

Discovered in Release: 4.4-1.0.0.0

Fixed in Release: 4.7-1.0.0.1

1758983

Description: Installing MLNX_OFED on RHEL 7.6 OSs platform x86_64 and RHEL 7.6 ALT OSs platform PPCLE using YUM is now supported.

Keywords: RHEL, RedHat, YUM, OS, operating system

Discovered in Release: 4.6-1.0.1.1

Fixed in Release: 4.7-1.0.0.1

1523548

Description: Fixed the issue where RDMA connection persisted even after dropping the network interface.

Keywords: Network interface, RDMA

Discovered in Release: 4.4-1.0.0.0

Fixed in Release: 4.6-1.0.1.1

1712870

Description: Fixed the issue where small packets with non-zero padding were wrongly reported as "checksum complete" even though the padding was not covered by the csum calculation. These packets now report "checksum unnecessary".

In addition, an ethtool private flag has been introduced to control the "checksum complete" feature: ethtool --set-priv-flags eth1 rx_no_csum_complete on/off

Keywords: csum error, checksum, mlx5_core

Discovered in Release: 4.5-1.0.1.0

Fixed in Release: 4.6-1.0.1.1
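
Why non-zero padding matters for fix 1712870 can be shown with a minimal sketch of the 16-bit ones' complement sum used by Internet checksums (RFC 1071). The packet contents and padding bytes below are made-up values, and the sketch does not model the driver or hardware; it only shows that zero padding leaves the sum unchanged while non-zero padding alters it, so a sum that skips the padding no longer matches one computed over the whole frame.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum over data, as in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # an odd-length input is extended with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


payload = bytes(range(1, 21))          # hypothetical 20-byte packet contents
zero_padded = payload + bytes(26)      # padded with zero bytes
nonzero_padded = payload + b"\xab" * 26  # padded with non-zero bytes

sum_payload = ones_complement_sum(payload)
sum_zero_padded = ones_complement_sum(zero_padded)
sum_nonzero_padded = ones_complement_sum(nonzero_padded)

# Zero padding is invisible to the sum; non-zero padding is not.
zero_padding_is_harmless = sum_zero_padded == sum_payload
nonzero_padding_changes_sum = sum_nonzero_padded != sum_payload
```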

1648597

Description: Fixed the wrong wording in the FW tracer ownership startup message (from "FW Tracer Owner" to "FWTracer: Ownership granted and active").

Keywords: FW Tracer

Discovered in Release: 4.5-1.0.1.0

Fixed in Release: 4.6-1.0.1.1

1581631

Description: Fixed the issue where GID entries referenced to by a certain user application could not be deleted while that user application was running.

Keywords: RoCE, GID

Discovered in Release: 4.5-1.0.1.0

Fixed in Release: 4.6-1.0.1.1

1368390

Description: Fixed the issue where MLNX_OFED could not be installed on RHEL 7.x Alt OSs using YUM repository.

Keywords: Installation, YUM, RHEL

Discovered in Release: 4.3-3.0.2.1

Fixed in Release: 4.6-1.0.1.1

1531817

Description: Fixed an issue where, when the number of channels configured was less than the number of CPUs available, some of the CPUs were not used by Tx queues.

Keywords: Performance, Tx, CPU

Discovered in Release: 4.4-1.0.0.0

Fixed in Release: 4.5-1.0.1.0

1571977

Description: Fixed an issue where, when the same CQ was connected to some QPs with SRQ and some without, a wrong wr_id might be reported by ibv_poll_cq.

Keywords: libmlx5, wr_id

Discovered in Release: 4.4-1.0.0.0

Fixed in Release: 4.5-1.0.1.0

1380135

Description: Fixed the issue where IB port link used to flap due to MAD heartbeat response delay when using new CQ API.

Keywords: IB port link, CQ API, MAD heartbeat

Discovered in Release: 4.2-1.2.0.0

Fixed in Release: 4.5-1.0.1.0

1498931

Description: Fixed the issue where establishing TCP connection took too long due to failure of SA PathRecord query callback handler.

Keywords: TCP, SA PathRecord

Discovered in Release: 4.4-1.0.0.0

Fixed in Release: 4.5-1.0.1.0

1514096

Description: Fixed the issue where lack of high order allocations caused driver load failure. All high order allocations are now changed to order-0 allocations.

Keywords: mlx5, high order allocation

Discovered in Release: 4.0-2.0.2.0

Fixed in Release: 4.5-1.0.1.0
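
The difference between a high order allocation and order-0 allocations in fix 1514096 comes down to simple arithmetic: an order-n request asks for 2^n physically contiguous pages, whereas order-0 requests ask for single pages. A minimal sketch, assuming a 4 KiB page size and using hypothetical helper names (this is not driver code):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pages_needed(size_bytes: int) -> int:
    """Number of pages required to hold size_bytes (rounded up)."""
    return -(-size_bytes // PAGE_SIZE)


def min_order_for(size_bytes: int) -> int:
    """Smallest order whose 2**order contiguous pages cover size_bytes."""
    order = 0
    while (1 << order) < pages_needed(size_bytes):
        order += 1
    return order


buffer_size = 64 * 1024                      # hypothetical 64 KiB buffer
single_request_order = min_order_for(buffer_size)
contiguous_pages = 1 << single_request_order
order0_requests = pages_needed(buffer_size)
```

For the 64 KiB example, one contiguous request needs an order-4 block (16 adjacent pages), which can fail on a fragmented system, while the order-0 approach issues 16 independent single-page requests.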

1524932

Description: Fixed a backport issue on some OSs, such as RHEL v7.x, where the mlx5 driver supported the old command ip link set DEVICE vf NUM rate TXRATE instead of the new command ip link set DEVICE vf NUM max_tx_rate TXRATE min_tx_rate TXRATE.

Keywords: mlx5 driver

Discovered in Release: 4.0-2.0.2.0

Fixed in Release: 4.5-1.0.1.0

1498585

Description: Fixed the issue where mlx5e counter values were reset when performing configuration changes.

Keywords: Ethernet counters

Discovered in Release: 4.0-2.0.2.0

Fixed in Release: 4.5-1.0.1.0

1425027

Description: Fixed the issue where attempting to establish a RoCE connection on the default GID or on IPv6 link-local address might have failed when two or more netdevices that belong to HCA ports were slaves under a bonding master.

This might also have resulted in the following error message in the kernel log: “__ib_cache_gid_add: unable to add gid fe80:0000:0000:0000:f652:14ff:fe46:7391 error=-28”.

Keywords: RoCE, bonding

Discovered in Release: 4.4-1.0.0.0

Fixed in Release: 4.5-1.0.1.0

© Copyright 2023, NVIDIA. Last updated on Nov 27, 2023.