Bug Fixes
This table lists the bugs fixed in this release.
For the list of old bug fixes, please refer to MLNX_OFED Archived Bug Fixes file at:
http://www.mellanox.com/pdf/prod_software/MLNX_OFED_Archived_Bug_Fixes.pdf
Internal Reference Number |
Description |
1547200 |
Description: Fixed an issue where the IPoIB Tx queue might get stuck, leading to timeout warnings in dmesg. |
Keywords: IPoIB |
|
Discovered in Release: 4.4-2.0.7.0 |
|
Fixed in Release: 4.6-1.0.1.1 |
|
1523548 |
Description: Fixed the issue where RDMA connection persisted even after dropping the network interface. |
Keywords: Network interface, RDMA |
|
Discovered in Release: 4.4-1.0.0.0 |
|
Fixed in Release: 4.6-1.0.1.1 |
|
1712870 |
Description: Fixed the issue where small packets with non-zero padding were wrongly reported as "checksum complete" even though the padding was not covered by the csum calculation. These packets now report "checksum unnecessary". In addition, an ethtool private flag has been introduced to control the "checksum complete" feature: ethtool --set-priv-flags eth1 rx_no_csum_complete on/off |
Keywords: csum error, checksum, mlx5_core |
|
Discovered in Release: 4.5-1.0.1.0 |
|
Fixed in Release: 4.6-1.0.1.1 |
|
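To confirm how the private flag from bug 1712870 is currently set, the output of `ethtool --show-priv-flags eth1` can be parsed. The following is a minimal sketch only: the helper name `parse_priv_flags` is hypothetical, and the sample text is an illustrative assumption about the output layout rather than a capture from a real system.

```python
import re
from typing import Dict


def parse_priv_flags(text: str) -> Dict[str, bool]:
    """Parse `ethtool --show-priv-flags <dev>` style output into {flag: enabled}."""
    flags = {}
    for line in text.splitlines():
        # Flag lines are assumed to look like "name : on" or "name : off".
        match = re.match(r"^\s*(\S+)\s*:\s*(on|off)\s*$", line)
        if match:
            flags[match.group(1)] = match.group(2) == "on"
    return flags


# Illustrative sample only; a real device lists whatever flags its driver exposes.
SAMPLE = """Private flags for eth1:
rx_cqe_moder        : on
rx_no_csum_complete : off
"""

flags = parse_priv_flags(SAMPLE)
print(flags["rx_no_csum_complete"])  # prints False for the sample above
```

On a live system, the text would come from running the ethtool command itself, for example through `subprocess.run`.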
1648597 |
Description: Fixed the wrong wording in the FW tracer ownership startup message (from "FW Tracer Owner" to "FWTracer: Ownership granted and active"). |
Keywords: FW Tracer |
|
Discovered in Release: 4.5-1.0.1.0 |
|
Fixed in Release: 4.6-1.0.1.1 |
|
1581631 |
Description: Fixed the issue where GID entries referenced by a certain user application could not be deleted while that user application was running. |
Keywords: RoCE, GID |
|
Discovered in Release: 4.5-1.0.1.0 |
|
Fixed in Release: 4.6-1.0.1.1 |
|
1368390 |
Description: Fixed the issue where MLNX_OFED could not be installed on RHEL 7.x Alt OSs using YUM repository. |
Keywords: Installation, YUM, RHEL |
|
Discovered in Release: 4.3-3.0.2.1 |
|
Fixed in Release: 4.6-1.0.1.1 |
|
1531817 |
Description: Fixed an issue where, when the number of configured channels was less than the number of available CPUs, some of the CPUs were not used by Tx queues. |
Keywords: Performance, Tx, CPU |
|
Discovered in Release: 4.4-1.0.0.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
1571977 |
Description: Fixed an issue where ibv_poll_cq might report a wrong wr_id when the same CQ was connected to some QPs with an SRQ and some without. |
Keywords: libmlx5, wr_id |
|
Discovered in Release: 4.4-1.0.0.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
1380135 |
Description: Fixed the issue where the IB port link used to flap due to a MAD heartbeat response delay when using the new CQ API. |
Keywords: IB port link, CQ API, MAD heartbeat |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
1498931 |
Description: Fixed the issue where establishing a TCP connection took too long due to a failure of the SA PathRecord query callback handler. |
Keywords: TCP, SA PathRecord |
|
Discovered in Release: 4.4-1.0.0.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
1514096 |
Description: Fixed the issue where lack of high order allocations caused driver load failure. All high order allocations are now changed to order-0 allocations. |
Keywords: mlx5, high order allocation |
|
Discovered in Release: 4.0-2.0.2.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
1524932 |
Description: Fixed a backport issue on some OSs, such as RHEL v7.x, where the mlx5 driver supported the old command (ip link set DEVICE vf NUM rate TXRATE) instead of the new command (ip link set DEVICE vf NUM max_tx_rate TXRATE min_tx_rate TXRATE). |
Keywords: mlx5 driver |
|
Discovered in Release: 4.0-2.0.2.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
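As an illustration of the new command form named in bug 1524932, the sketch below assembles the argument list for a given VF. The function name `vf_rate_command`, the device name `eth1`, and the rate values are hypothetical. The sketch only builds the command and does not run it.

```python
from typing import List


def vf_rate_command(device: str, vf: int, max_tx_rate: int, min_tx_rate: int) -> List[str]:
    """Build the new-style `ip link` command that sets per-VF Tx rate limits."""
    if vf < 0:
        raise ValueError("VF number must be non-negative")
    if max_tx_rate < 0 or min_tx_rate < 0:
        raise ValueError("rates must be non-negative")
    return [
        "ip", "link", "set", device,
        "vf", str(vf),
        "max_tx_rate", str(max_tx_rate),
        "min_tx_rate", str(min_tx_rate),
    ]


# Hypothetical device and rates, chosen only for illustration.
command = vf_rate_command("eth1", 0, 1000, 100)
print(" ".join(command))  # ip link set eth1 vf 0 max_tx_rate 1000 min_tx_rate 100
```

Returning a list instead of a string allows the result to be passed to `subprocess.run` without shell quoting.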
1498585 |
Description: Fixed the issue where mlx5e counter values were reset when configuration changes were performed. |
Keywords: Ethernet counters |
|
Discovered in Release: 4.0-2.0.2.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
1484603 |
Description: Fixed the issue where, when using the ibv_exp_cqe_ts_to_ns verb to convert a packet's hardware timestamp to UTC time in nanoseconds, the result might appear to go backwards compared to the converted time of a previous packet. |
Keywords: libibverbs |
|
Discovered in Release: 4.4-1.0.0.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
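The symptom described in bug 1484603 can be detected in an application by checking that converted timestamps never decrease in arrival order. This is a minimal sketch over plain integers: the function name `find_backward_steps` and the sample values are hypothetical, and the code does not call the verbs API.

```python
from typing import Iterable, List, Tuple


def find_backward_steps(timestamps_ns: Iterable[int]) -> List[Tuple[int, int, int]]:
    """Return (index, previous_ns, current_ns) for every timestamp lower than its predecessor."""
    regressions = []
    previous = None
    for index, current in enumerate(timestamps_ns):
        if previous is not None and current < previous:
            regressions.append((index, previous, current))
        previous = current
    return regressions


# Hypothetical converted timestamps in arrival order; the third one steps backwards.
samples = [1_000, 1_250, 1_200, 1_400]
print(find_backward_steps(samples))  # [(2, 1250, 1200)]
```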
1425027 |
Description: Fixed the issue where attempting to establish a RoCE connection on the default GID or on IPv6 link-local address might have failed when two or more netdevices that belong to HCA ports were slaves under a bonding master. This might also have resulted in the following error message in the kernel log: “__ib_cache_gid_add: unable to add gid fe80:0000:0000:0000:f652:14ff:fe46:7391 error=-28”. |
Keywords: RoCE, bonding |
|
Discovered in Release: 4.4-1.0.0.0 |
|
Fixed in Release: 4.5-1.0.1.0 |
|
1480206 |
Description: Modified mlx5_ib SRQs behavior. Now the SRQs are allocated to “order 1” pages instead of contiguous ones to lower the probability of out-of-memory scenarios. |
Keywords: SRQ, mlx5_ib |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.4-2.0.7.0 |
|
1363375 |
Description: Modified mlx5_ib QPs behavior. Now the QPs are allocated to “order 1” pages instead of contiguous ones to lower the probability of out-of-memory scenarios. |
Keywords: IPoIB, mlx5_ib |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.4-2.0.7.0 |
|
1332080 |
Description: Modified mlx4_ib QPs behavior. Now the QPs are allocated to “order 1” pages instead of contiguous ones to lower the probability of out-of-memory scenarios. |
Keywords: IPoIB, mlx4_ib |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.4-1.0.0.0 |
|
1412468 |
Description: Added support for multi-host connection on mstflint’s mstfwreset. |
Keywords: mstfwreset, mstflint, MFT, multi-host |
|
Discovered in Release: 4.3-1.0.1.0 |
|
Fixed in Release: 4.4-1.0.0.0 |
|
1423319 |
Description: Removed the following prints on server shutdown:
mlx5_core 0005:81:00.1: mlx5_enter_error_state:96:(pid1): start
mlx5_core 0005:81:00.1: mlx5_enter_error_state:109:(pid1): end |
Keywords: mlx5, fast shutdown |
|
Discovered in Release: 4.3-1.0.1.0 |
|
Fixed in Release: 4.4-1.0.0.0 |
|
1433092 |
Description: Fixed an issue where querying for IBV_EXP_VALUES_HW_CLOCK_NS (using the ibv_exp_query_values function) without also querying for IBV_EXP_VALUES_HW_CLOCK returned a value of 0. |
Keywords: mlx5, CQE time-stamping |
|
Discovered in Release: 4.3-1.0.1.0 |
|
Fixed in Release: 4.4-1.0.0.0 |
|
1318251 |
Description: Fixed the issue where a call trace in nvme_rdma_remove_one or nvmet_rdma_remove_one might occur when bringing mlx4/mlx5 devices up or down. |
Keywords: NVMEoF, mlx4, mlx5, call trace |
|
Discovered in Release: 4.3-1.0.1.0 |
|
Fixed in Release: 4.4-1.0.0.0 |
|
1181815 |
Description: Fixed an issue where 4K UD packets were dropped when working with 4K MTU on mlx4 devices. |
Keywords: mlx4, 4K MTU, UD |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.3-1.0.1.0 |
|
1247458 |
Description: Added support for VLAN Tag (VST) creation on RedHat v7.4 with new iproute2 packages (iptool). |
Keywords: SR-IOV, VST, RedHat |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.3-1.0.1.0 |
|
1229554 |
Description: Enabled RDMA CM to honor incoming requests coming from ports of different devices. |
Keywords: RDMA CM |
|
Discovered in Release: 4.2-1.0.0.0 |
|
Fixed in Release: 4.3-1.0.1.0 |
|
1262257 |
Description: Fixed an issue where sending Work Requests (WRs) with multiple entries where the first entry is less than 18 bytes used to fail. |
Keywords: ConnectX-5; libibverbs; Raw QP |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.3-1.0.1.0 |
|
1249358/1261023 |
Description: Fixed the issue where ethtool counters ceased to increase when the interface was down. As a result, RoCE traffic counters were not always counted. |
Keywords: Ethtool counters, mlx5 |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.3-1.0.1.0 |
|
1244509 |
Description: Fixed MLNX_OFED compilation errors on kernels where the CONFIG_PTP_1588_CLOCK parameter was not set. |
Keywords: PTP, mlx5e |
|
Discovered in Release: 4.2-1.2.0.0 |
|
Fixed in Release: 4.3-1.0.1.0 |
|
1266802 |
Description: Fixed an issue where the system used to hang when trying to allocate multiple device memory buffers from different processes simultaneously. |
Keywords: Device memory programming |
|
Discovered in Release: 4.2-1.0.0.0 |
|
Fixed in Release: 4.3-1.0.1.0 |
|
1120424 |
Description: Fixed incorrect SGE number of RSS QP. |
Keywords: RSS, SGE |
|
Discovered in Release: 4.1-1.0.2.0 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
1078887 |
Description: Fixed an issue where the post_list and CQ_mod features in perftest did not function when running with the --run_infinitely flag. |
Keywords: perftest, --run_infinitely |
|
Discovered in Release: 4.2-1.0.1.0 |
|
Fixed in Release: 4.2-1.2.0.0 |
|
1186260 |
Description: Fixed the issue where CNP counters exposed under /sys/class/infiniband/mlx5_bond_0/ports/1/hw_counters/ did not aggregate both physical functions when working in RoCE LAG mode. |
Keywords: RoCE, LAG, ECN, Congestion Counters |
|
Discovered in Release: 4.2-1.0.1.0 |
|
Fixed in Release: 4.2-1.2.0.0 |
|
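To inspect the CNP counters mentioned in bug 1186260, the files under the hw_counters directory can be read directly. The sketch below assumes each counter is a file holding a single integer and that the relevant counter names contain the substring "cnp"; both are assumptions, and the helper name `read_cnp_counters` is hypothetical.

```python
from pathlib import Path
from typing import Dict


def read_cnp_counters(hw_counters_dir: str) -> Dict[str, int]:
    """Read every counter file whose name contains 'cnp' from the given directory."""
    counters = {}
    for entry in sorted(Path(hw_counters_dir).iterdir()):
        # Assumption: CNP-related counters carry "cnp" in their file names.
        if entry.is_file() and "cnp" in entry.name:
            counters[entry.name] = int(entry.read_text().strip())
    return counters


if __name__ == "__main__":
    # Path taken from the bug description; it exists only on a host in RoCE LAG mode.
    path = "/sys/class/infiniband/mlx5_bond_0/ports/1/hw_counters"
    if Path(path).is_dir():
        for name, value in read_cnp_counters(path).items():
            print(f"{name}: {value}")
```

Because the directory is passed as a parameter, the same helper can be pointed at another device or port.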
1178129 |
Description: Fixed an issue that prevented Windows virtual machines running over MLNX_OFED Linux hypervisors from operating ConnectX-3 IB ports. When such failures occurred, the following message (or similar) appeared in the Linux HV message log when users attempted to start up a Windows VM running a ConnectX-3 VF: “mlx4_core 0000:81:00.0: vhcr command 0x1a slave:1 in_param 0x793000 in_mod=0x210 op_mod=0x0 failed with error:0, status -22” |
Keywords: SR-IOV, RDMA, VM, KVM, Windows |
|
Discovered in Release: 4.2-1.0.1.0 |
|
Fixed in Release: 4.2-1.2.0.0 |
|
1192374 |
Description: Fixed wrong calculation of max_device_ctx capability in ConnectX-4, ConnectX-4 Lx, and ConnectX-5 HCAs. |
Keywords: ibv_exp_query_device, max_device_ctx, mlx5 |
|
Discovered in Release: 4.2-1.0.1.0 |
|
Fixed in Release: 4.2-1.2.0.0 |
|
1084791 |
Description: Fixed the issue where occasionally, after reboot, rpm commands used to fail and create a core file, with messages such as “Bus error (core dumped)”, causing the openibd service to fail to start. |
Keywords: rpm, openibd |
|
Discovered in Release: 3.4-2.0.0.0 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
960642/960653 |
Description: Added support for min_tx_rate and max_tx_rate limits per virtual function on ConnectX-5 and ConnectX-5 Ex adapter cards. |
Keywords: SR-IOV, mlx5 |
|
Discovered in Release: 4.0-1.0.1.0 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
866072/869183 |
Description: Fixed the issue where RoCE v2 multicast traffic using RDMA-CM with IPv4 address was not received. |
Keywords: RoCE |
|
Discovered in Release: 3.4-1.0.0.0 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
1163835 |
Description: Fixed an issue where ethtool -P output was 00:00:00:00:00:00 when using old kernels. |
Keywords: ethtool, Permanent MAC address, mlx4, mlx5 |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
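A script can detect the symptom from bug 1163835 by checking whether `ethtool -P` reports an all-zero permanent address. This is a minimal sketch that parses command output supplied as text. The helper names are hypothetical, and the "Permanent address:" prefix is an assumption about the output format.

```python
import re
from typing import Optional

ZERO_MAC = "00:00:00:00:00:00"


def permanent_mac(ethtool_output: str) -> Optional[str]:
    """Extract the MAC address from `ethtool -P <dev>` style output, or None if absent."""
    match = re.search(
        r"Permanent address:\s*((?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})",
        ethtool_output,
    )
    return match.group(1).lower() if match else None


def is_unset(mac: Optional[str]) -> bool:
    """True when no address was found or the address is all zeros."""
    return mac is None or mac == ZERO_MAC


print(is_unset(permanent_mac("Permanent address: 00:00:00:00:00:00")))  # True
```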
1067158 |
Description: Replaced a few “GPL only” legacy libibverbs functions with upstream implementation that conforms with libibverbs GPL/BSD dual license model. |
Keywords: libibverbs, license |
|
Discovered in Release: 4.1-1.0.2.0 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
1119377 |
Description: Fixed an issue where an ACCESS_REG command failure used to appear in dmesg upon RoCE Multihost driver restart. The error message looked as follows: mlx5_core 0000:01:00.0: mlx5_cmd_check:705:(pid 20037): ACCESS_REG(0x805) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x15c356) |
Keywords: RoCE, multihost, mlx5 |
|
Discovered in Release: 4.1-1.0.2.0 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
1122937 |
Description: Fixed an issue where concurrent client requests got corrupted when working in persistent server mode due to a race condition on the server side. |
Keywords: librdmacm, rping |
|
Discovered in Release: 4.1-1.0.2.0 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
1102158 |
Description: Fixed an issue where client side did not exit gracefully in RTT mode when the server side was not reachable. |
Keywords: librdmacm, rping |
|
Discovered in Release: 4.1-1.0.2.0 |
|
Fixed in Release: 4.2-1.0.0.0 |
|
1038933 |
Description: Fixed a backport issue where IPv6 procedures were called while they were not supported in the underlying kernel. |
Keywords: iw_cm |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
1064722 |
Description: Added log debug prints when changing HW configuration via DCB. To enable log debug prints, run: ethtool -s <devname> msglvl hw on/off |
Keywords: DCB, msglvl |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
1013076 |
Description: Fixed the issue where reassembly of packets larger than 64k might have failed when ipfrag threshold was low. This issue was present only on RHEL 6.3, 6.4, 6.5, and Ubuntu 12.04. This packet drop could be seen from the netstat tool, indicated by the “packet reassembles failed” counter. |
Keywords: IPoIB, Packet Fragmentation |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
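The counter named in bug 1013076 can be extracted from `netstat -s` output to check whether reassembly failures are occurring. The following is a minimal sketch: the function name `reassembly_failures` is hypothetical, and the sample text is an illustrative assumption about the output layout.

```python
import re
from typing import Optional


def reassembly_failures(netstat_output: str) -> Optional[int]:
    """Return the 'packet reassembles failed' count from `netstat -s` style text, if present."""
    match = re.search(
        r"^\s*(\d+)\s+packet reassembles failed\s*$",
        netstat_output,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


# Illustrative sample only; real output contains many more counters.
SAMPLE = """Ip:
    1200 total packets received
    3 packet reassembles failed
"""

print(reassembly_failures(SAMPLE))  # prints 3 for the sample above
```

The function returns None when the counter line is absent, so callers can distinguish "not reported" from a count of zero.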
1022251 |
Description: Fixed SKB memory leak issue that was introduced in kernel 4.11, and added warning messages to the Soft RoCE driver for easy detection of future SKB leaks. |
Keywords: Soft RoCE |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
1044546 |
Description: Fixed the issue where a kernel crash used to occur when RXe device was coupled with a virtual (dummy) device. |
Keywords: Soft RoCE |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
1047617 |
Description: Fixed the issue where a race condition in the RoCE GID cache used to cause the loss of IP-based GIDs. |
Keywords: RoCE, GID |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
1006768 |
Description: Fixed the issue where an rdma_cm connection between a client and a server that were on the same host was not possible when working over VLAN interfaces. |
Keywords: RDMACM |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
801807 |
Description: Fixed an issue where RDMACM connections used to fail under a high connection rate, accompanied by the error message: RDMA_CM_EVENT_UNREACHABLE. |
Keywords: RDMACM |
|
Discovered in Release: 3.0-2.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
869768 |
Description: Fixed the issue where SR-IOV was not supported in systems with a page size greater than 16KB. |
Keywords: SR-IOV, mlx5, PPC |
|
Discovered in Release: 4.0-2.0.0.1 |
|
Fixed in Release: 4.1-1.0.2.0 |
|
1155972 |
Description: Fixed mlx4 kernel crash upon server shutdown due to NULL pointer dereference. |
Keywords: mlx4, shutdown |
|
Discovered in Release: 3.3-1.0.4.0 |
|
Fixed in Release: 4.0-2.0.0.1 |
|
919545 |
Description: Fixed the issue where the kernel could crash on SLES 12 SP2 when it ran out of memory upon driver start. |
Keywords: mlx5 Eth Driver |
|
Discovered in Release: 3.4-2.0.0.0 |
|
Fixed in Release: 4.0-2.0.0.1 |
|
970668 |
Description: Fixed the issue where very high stress on DC QP transport might have triggered NMI messages on specific servers. |
Keywords: mlx5 Driver |
|
Discovered in Release: 4.0-1.0.1.0 |
|
Fixed in Release: 4.0-2.0.0.1 |
|
966134 |
Description: Allowed Ethernet VFs to open Raw Ethernet QPs even if RoCE is not supported for the VF. |
Keywords: mlx4_ib |
|
Discovered in Release: 3.0-1.0.1 |
|
Fixed in Release: 4.0-2.0.0.1 |
|
864063 |
Description: Fixed the issue where spoof-check might have been turned on for MAC address 00:00:00:00:00:00. |
Keywords: mlx4 |
|
Discovered in Release: 3.4-1.0.0.0 |
|
Fixed in Release: 4.0-2.0.0.1 |
|
869209 |
Description: Fixed an issue that caused TCP packets to be received out of order when Large Receive Offload (LRO) was on. |
Keywords: mlx5_en |
|
Discovered in Release: 3.3-1.0.0.0 |
|
Fixed in Release: 4.0-2.0.0.1 |
|
913319 |
Description: Fixed the issue of low performance when creating many address handles. |
Keywords: libibverbs |
|
Discovered in Release: 3.4-1.0.0.0 |
|
Fixed in Release: 4.0-1.0.1.0 |
|
912897 |
Description: Added debug prints to the ib_umem_get function to address the lack of error indication when this function fails. |
Keywords: InfiniBand |
|
Discovered in Release: 3.4-1.0.0.0 |
|
Fixed in Release: 4.0-1.0.1.0 |
|
945887 |
Description: [ConnectX-3] Fixed the issue where multicast traffic over Raw Ethernet QP on virtual functions was received on the same QP (loopback). |
Keywords: SR-IOV |
|
Discovered in Release: 3.4-1.0.0.0 |
|
Fixed in Release: 4.0-1.0.1.0 |
|
920292 |
Description: Fixed three issues in libmlx5 that were found by NVIDIA in the patches that are part of MLNX_OFED v3.4:
|
Keywords: libmlx5 |
|
Discovered in Release: 3.4-1.0.0.0 |
|
Fixed in Release: 4.0-1.0.1.0 |
|
890285 |
Description: Fixed the issue where memory allocation for CQ buffers used to fail when increasing the RX ring size. |
Keywords: mlx5_core |
|
Discovered in Release: 3.4-1.0.0.0 |
|
Fixed in Release: 4.0-1.0.1.0 |
|
867094 |
Description: Fixed the issue where MLNX_OFED used to fail to load on 4K page Arm architecture. |
Keywords: Arm |
|
Discovered in Release: 3.4-1.0.0.0 |
|
Fixed in Release: 4.0-1.0.1.0 |