NVIDIA MLNX_EN Documentation Rev 4.9-6.0.6.0 LTS

RSS Support

The device can use XOR as the RSS distribution function instead of the default Toeplitz function.
With a small number of streams, the XOR function can spread traffic more evenly across the driver's receive queues, because it distributes each TCP/UDP stream to a different queue.

mlx4 RSS Hash Function

MLNX_OFED provides the following options to change the working RSS hash function from Toeplitz to XOR, and vice versa:

  • Through ethtool priv-flags, in case mlx4_rss_xor_hash_function is part of the priv-flags list:

    ethtool --set-priv-flags eth<x> mlx4_rss_xor_hash_function on/off

  • Through ethtool, provided as part of the MLNX_OFED package, in case mlx4_rss_xor_hash_function is not part of the priv-flags list:

    /opt/mellanox/ethtool# ./sbin/ethtool -X ens4 hfunc xor
    /opt/mellanox/ethtool# ./sbin/ethtool --show-rxfh ens4

    Output:

    RX flow hash indirection table for ens4 with 8 RX ring(s):
        0:      0     1     2     3     4     5     6     7
    RSS hash key:
    7c:9c:37:de:18:dc:43:86:d9:27:0f:6f:26:03:74:b8:bf:d0:40:4b:78:72:e2:24:dc:1b:91:bb:01:1b:a7:a6:37:6c:c8:7e:d6:e3:14:17
    RSS hash function:
        toeplitz: off
        xor : on
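The active hash function can be read back from the --show-rxfh output. The following is a minimal sketch, assuming ethtool prints an "RSS hash function:" heading followed by one "name: on|off" line per function. The helper name active_hfunc is hypothetical and not part of ethtool.

```shell
# active_hfunc: read `ethtool --show-rxfh <dev>` output on stdin and print
# the name of every hash function reported as "on".
# Assumption: the functions are listed after an "RSS hash function:" line.
active_hfunc() {
    awk '
        /^RSS hash function:/ { in_section = 1; next }
        in_section && /:[ \t]*on[ \t]*$/ {
            name = $0
            sub(/[ \t]*:.*$/, "", name)   # drop everything from the colon on
            sub(/^[ \t]+/, "", name)      # drop leading indentation
            print name
        }
    '
}

# Usage (requires the device; not run here):
#   ./sbin/ethtool --show-rxfh ens4 | active_hfunc
```

Reading from stdin keeps the helper independent of where the ethtool binary is installed.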

For further information, please refer to the Ethtool Supported Options table.
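Which of the two mlx4 options applies depends on whether mlx4_rss_xor_hash_function appears in the interface's priv-flags list. The following is a minimal sketch of that check, assuming the listing comes from ethtool --show-priv-flags and prints one "flag: on|off" line per flag. The function name and the labels it prints ("priv-flags", "hfunc") are hypothetical.

```shell
# choose_mlx4_method: read `ethtool --show-priv-flags <dev>` output on stdin
# and print which option to use for switching the RSS hash function.
#   "priv-flags" -> ethtool --set-priv-flags ... mlx4_rss_xor_hash_function on/off
#   "hfunc"      -> ethtool -X ... hfunc xor
choose_mlx4_method() {
    if grep -q '^mlx4_rss_xor_hash_function[[:space:]]*:'; then
        echo "priv-flags"
    else
        echo "hfunc"
    fi
}

# Usage (requires the device; not run here):
#   ethtool --show-priv-flags eth<x> | choose_mlx4_method
```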

mlx5 RSS Hash Function

MLNX_OFED provides the following option to change the working RSS hash function from Toeplitz to XOR, and vice versa:

Through sysfs, located at: /sys/class/net/eth*/settings/hfunc.

To query the operational and supported hash functions:

cat /sys/class/net/eth*/settings/hfunc

Example:

cat /sys/class/net/eth2/settings/hfunc
Operational hfunc: toeplitz
Supported hfuncs: xor toeplitz

To change the operational hash function:

echo xor > /sys/class/net/eth*/settings/hfunc
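The query and the change can be combined so that a hash function is written only if the device reports it as supported. The following is a minimal sketch, assuming the file has the "Operational hfunc:" / "Supported hfuncs:" layout from the example and that eth* stands for a single interface name. The helper names operational_hfunc and set_hfunc are hypothetical.

```shell
# operational_hfunc FILE: print the hash function currently in use.
operational_hfunc() {
    sed -n 's/^Operational hfunc:[[:space:]]*//p' "$1"
}

# set_hfunc FILE FUNC: write FUNC to FILE only if FILE lists it under
# "Supported hfuncs:"; otherwise report an error and leave FILE untouched.
set_hfunc() {
    local file=$1 want=$2 supported
    supported=$(sed -n 's/^Supported hfuncs:[[:space:]]*//p' "$file")
    case " $supported " in
        *" $want "*) echo "$want" > "$file" ;;
        *) echo "unsupported hash function: $want" >&2; return 1 ;;
    esac
}

# Usage (requires the device and root privileges; not run here):
#   set_hfunc /sys/class/net/eth2/settings/hfunc xor
```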

RSS Support for IP Fragments

Warning

Supported in ConnectX-3 and ConnectX-3 Pro only.

As of MLNX_OFED v2.4-1.0.0, RSS distributes incoming fragmented IP datagrams according to its hash function, based on the L3 IP header values. Different fragmented IP datagram flows are directed to different rings.

Warning

When the first packet in an IP fragment chain contains an upper-layer transport header (e.g. UDP packets larger than the MTU), it is directed to the same target as the IP fragments that follow it, to prevent out-of-order processing.

Receive Side Scaling (RSS) technology allows spreading incoming traffic between different receive descriptor queues. Assigning each queue to different CPU cores allows better load balancing of the incoming traffic and improves performance.
This technology has been extended to user space through the verbs layer and can be used with RAW ETH QPs.

RSS Flow Steering

Steering rules classify incoming packets and deliver a specific traffic type (e.g. TCP/UDP, IP only) or a specific flow to an "RX Hash" QP. The "RX Hash" QP is responsible for spreading the traffic it handles across the Receive Work Queues using the RX hash and an Indirection Table. The Receive Work Queues can point to different CQs, which can be associated with different CPU cores.
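The spreading step can be pictured as a table lookup. The following is a minimal sketch, assuming the RX hash is reduced modulo the table size to select an entry. The table contents and the names ind_table and pick_wq are hypothetical and only illustrate the idea. They do not represent the device's or the verbs layer's implementation.

```shell
# Hypothetical indirection table: 8 entries spread over 4 Receive Work Queues.
ind_table="0 1 2 3 0 1 2 3"

# pick_wq HASH: print the Receive Work Queue index that the table entry
# selected by HASH points to.
pick_wq() {
    hash=$1
    set -- $ind_table            # load the table entries into $1..$N
    shift $(( hash % $# ))       # skip to the entry selected by the hash
    echo "$1"
}

# Packets of one flow share a hash value, so they always reach the same queue.
pick_wq 5
```

Because the queue is a function of the hash alone, changing which queues the table entries point to rebalances traffic without touching the hash function.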

Verbs

The experimental verbs below should be used to achieve this in both the control path and the data path. For details on each verb, refer to its man page.

  • Control path:

    • ibv_exp_create_wq, ibv_exp_modify_wq, ibv_exp_destroy_wq

    • ibv_exp_create_rwq_ind_table, ibv_exp_destroy_rwq_ind_table

    • ibv_exp_create_qp with specific RX configuration to create the "RX hash" QP.

    • ibv_exp_query_device

  • Data path:
    The accelerated verbs should be used to post receives to the created WQ(s) and to poll for completions. Specifically, ibv_exp_query_intf should be used to get the IBV_EXP_INTF_WQ family, whose functions are then used to post receives.
    For polling, use the IBV_EXP_INTF_CQ family with poll_length, which fits the receive side.

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.