RSS Support
The device can use XOR as the RSS distribution function instead of the default Toeplitz function.
With a small number of streams, the XOR function can spread traffic more evenly among the driver's receive queues, because it directs each TCP/UDP stream to a different queue.
mlx4 RSS Hash Function
MLNX_OFED provides the following options to change the working RSS hash function from Toeplitz to XOR, and vice versa:
Through ethtool priv-flags, in case mlx4_rss_xor_hash_function is part of the priv-flags list:
ethtool --set-priv-flags eth<x> mlx4_rss_xor_hash_function on/off
Through ethtool, provided as part of the MLNX_OFED package, in case mlx4_rss_xor_hash_function is not part of the priv-flags list:
/opt/mellanox/ethtool# ./sbin/ethtool -X ens4 hfunc xor
/opt/mellanox/ethtool# ./sbin/ethtool --show-rxfh ens4
Output:
RX flow hash indirection table for ens4 with 8 RX ring(s):
    0:      0     1     2     3     4     5     6     7
RSS hash key:
7c:9c:37:de:18:dc:43:86:d9:27:0f:6f:26:03:74:b8:bf:d0:40:4b:78:72:e2:24:dc:1b:91:bb:01:1b:a7:a6:37:6c:c8:7e:d6:e3:14:17
RSS hash function:
    toeplitz: off
    xor: on
For further information, please refer to the Ethtool Supported Options table.
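The choice between the two options above, and the check of the resulting state, can be sketched with two small shell helpers. This is a minimal sketch, not part of MLNX_OFED: the function names has_xor_priv_flag and active_hfunc are hypothetical, and both read command output from standard input on the assumption that it has the "name: state" layout shown in the example output above.

```shell
# Hypothetical helper: succeeds if the priv-flags listing on stdin
# contains the mlx4_rss_xor_hash_function flag.
has_xor_priv_flag() {
    grep -q '^[[:space:]]*mlx4_rss_xor_hash_function[[:space:]]*:'
}

# Hypothetical helper: prints the name of every hash function reported
# as "on" in the "RSS hash function" part of a --show-rxfh listing on stdin.
active_hfunc() {
    awk -F: '
        /^RSS hash function/ { in_funcs = 1; next }
        in_funcs {
            name = $1; state = $2
            gsub(/[ \t]/, "", name)
            gsub(/[ \t]/, "", state)
            if (state == "on") print name
        }
    '
}

# Possible usage on a live system (interface names are examples only):
#   ethtool --show-priv-flags eth0 | has_xor_priv_flag && echo "use priv-flags"
#   ethtool --show-rxfh ens4 | active_hfunc
```

Reading from standard input keeps the helpers independent of which ethtool binary produced the listing, so the same check works for the system ethtool and the one shipped under /opt/mellanox/ethtool.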
mlx5 RSS Hash Function
MLNX_OFED provides the following option to change the working RSS hash function from Toeplitz to XOR, and vice versa:
Through sysfs, located at: /sys/class/net/eth*/settings/hfunc.
To query the operational and supported hash functions:
cat /sys/class/net/eth*/settings/hfunc
Example:
cat /sys/class/net/eth2/settings/hfunc
Operational hfunc: toeplitz
Supported hfuncs: xor toeplitz
To change the operational hash function:
echo xor > /sys/class/net/eth*/settings/hfunc
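The query and change steps above can be combined so that a hash function is written only when the device lists it as supported. The following is a minimal sketch under the assumption that the hfunc file has the two-line "Operational hfunc" / "Supported hfuncs" layout shown in the example; the function names hfunc_operational, hfunc_supports and set_hfunc are hypothetical and are not provided by MLNX_OFED.

```shell
# Print the operational hash function recorded in the hfunc file $1.
hfunc_operational() {
    sed -n 's/^Operational hfunc:[[:space:]]*//p' "$1"
}

# Succeed if hash function $2 appears in the "Supported hfuncs" list of file $1.
hfunc_supports() {
    sed -n 's/^Supported hfuncs:[[:space:]]*//p' "$1" | tr ' ' '\n' | grep -qxF "$2"
}

# Write hash function $2 to file $1, but only if it is listed as supported.
set_hfunc() {
    if hfunc_supports "$1" "$2"; then
        echo "$2" > "$1"
    else
        echo "hash function '$2' is not supported" >&2
        return 1
    fi
}

# Possible usage (eth2 is the interface from the example above):
#   set_hfunc /sys/class/net/eth2/settings/hfunc xor
#   hfunc_operational /sys/class/net/eth2/settings/hfunc
```

Passing the file path as an argument avoids relying on the eth* wildcard, which would match more than one file on a host with several interfaces.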
RSS Support for IP Fragments
Supported in ConnectX-3 and ConnectX-3 Pro only.
As of MLNX_OFED v2.4-1.0.0, RSS distributes incoming IP fragmented datagrams according to its hash function, considering the L3 IP header values. Different IP fragmented datagram flows are directed to different rings.
When the first packet in an IP fragment chain contains an upper layer transport header (e.g. UDP packets larger than the MTU), it is directed to the same target as the IP fragments that follow it, to prevent out-of-order processing.
Receive Side Scaling (RSS) technology allows spreading incoming traffic between different receive descriptor queues. Assigning each queue to a different CPU core allows better load balancing of the incoming traffic and improves performance.
This technology was extended to user space by the verbs layer and can be used for RAW ETH QP.
RSS Flow Steering
Steering rules classify incoming packets and deliver a specific traffic type (e.g. TCP/UDP, IP only) or a specific flow to an "RX Hash" QP. The "RX Hash" QP is responsible for spreading the traffic it handles between the Receive Work Queues using the RX hash and the Indirection Table. The Receive Work Queues can point to different CQs that can be associated with different CPU cores.
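As a rough illustration of the lookup described above, the following sketch treats the Indirection Table as a list of Receive Work Queue numbers and uses the RX hash value to select one entry. The function name pick_wq and the modulo-based indexing are assumptions made for illustration only; on a real system the selection is performed by the device.

```shell
# pick_wq HASH WQ0 WQ1 ... : print the table entry selected by HASH.
# Assumption: the table index is the hash value modulo the table size.
pick_wq() {
    hash=$1
    shift
    index=$(( $hash % $# ))   # position inside the indirection table
    shift "$index"            # drop the entries before that position
    echo "$1"
}

# Example: an 8-entry table mapping onto four work queues (values are illustrative).
#   pick_wq 0x1d 0 1 2 3 0 1 2 3
```

Because all packets of a flow produce the same hash value, they select the same table entry and therefore land on the same Receive Work Queue.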
Verbs
The experimental verbs below should be used for this task in both the control and data paths. For details on each verb, refer to its man page.
Control path:
ibv_exp_create_wq, ibv_exp_modify_wq, ibv_exp_destroy_wq
ibv_exp_create_rwq_ind_table, ibv_exp_destroy_rwq_ind_table
ibv_exp_create_qp with specific RX configuration to create the "RX hash" QP.
ibv_exp_query_device
Data path:
The accelerated verbs should be used to post receives to the created WQ(s) and to poll for completions. Specifically, ibv_exp_query_intf should be used to get the IBV_EXP_INTF_WQ family and work with its functions to post receives.
For polling, use the IBV_EXP_INTF_CQ family with the poll_length function, which fits the receive side.