Flow Control

A global pause frame is an Ethernet control frame (IEEE 802.3x) that the receiver sends to tell the sender to stop transmitting temporarily because the receive buffer is about to overflow.

Warning

Both Tx and Rx global pause frames are enabled by default.

Priority Flow Control (PFC), IEEE 802.1Qbb, applies pause functionality to specific classes of traffic on the Ethernet link, identified by their IEEE 802.1p priority. For example, PFC can provide lossless service for RoCE traffic and best-effort service for standard Ethernet traffic on the same link.

PFC Local Configuration

  1. Disable global pause frames. Example:

    # ifconfig mce<N> media autoselect mediaopt full-duplex

  2. Enable PFC on the desired priority by using the sysctl utility:

    dev.mce.<N>.rx_priority_flow_control_<prio>: <enabled:1 disabled:0>
    dev.mce.<N>.tx_priority_flow_control_<prio>: <enabled:1 disabled:0>
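As an illustration of step 2, the sketch below prints the two sysctl assignments needed to enable or disable PFC for one priority. It is a dry run: nothing is changed until the printed lines are executed as root. The helper name `pfc_cmds` is hypothetical and not part of the driver; the sysctl names are the ones listed above, and the 0-7 range is the standard set of IEEE 802.1p priorities.

```shell
# Print (do not run) the sysctl commands that enable or disable PFC
# for one priority on one mce interface.
# pfc_cmds is a hypothetical helper name used only in this sketch.
pfc_cmds() {
    unit=$1    # interface unit number, the <N> in mce<N>
    prio=$2    # IEEE 802.1p priority, 0 through 7
    state=$3   # 1 to enable, 0 to disable
    case $prio in
        [0-7]) ;;
        *) echo "priority must be 0-7" >&2; return 1 ;;
    esac
    case $state in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 1 ;;
    esac
    echo "sysctl dev.mce.${unit}.rx_priority_flow_control_${prio}=${state}"
    echo "sysctl dev.mce.${unit}.tx_priority_flow_control_${prio}=${state}"
}

# Example: commands to enable PFC for priority 3 on mce0
pfc_cmds 0 3 1
```

Printing the commands instead of running them makes it possible to review the change first; piping the output to `sh` as root would apply it.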

To check PFC statistics:

# sysctl -a dev.mce.<N>.pstats | grep prio<prio>

The following sysctls tune the hardware buffers whose exhaustion triggers xoff:

  • dev.mce.<N>.conf.qos.buffers_size - the hardware supports up to eight configurable buffers, and this sysctl sets the size of each one. The sum of all buffer sizes must not exceed the hardware memory size; this limit is enforced automatically. When a buffer runs out of space, the card sends xoff to the other side of the link.
  • dev.mce.<N>.conf.qos.buffers_prio - maps each buffer index to a hardware-defined priority. Note that this priority is the internal number obtained after translation from the external QoS parameters.
  • dev.mce.<N>.conf.qos.cable_length - to determine more precisely when xoff should be issued, the user can specify the cable length in meters, which is used to calculate the signal propagation delay.
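To see why cable length affects the moment xoff should be issued, the sketch below estimates how much data can still arrive after xoff is sent. It is not the driver's formula. It assumes a propagation delay of about 5 ns per meter of cable and a full round trip before the sender stops; the helper name `inflight_bytes` is hypothetical.

```shell
# Rough estimate of the data still in flight after xoff is sent.
# Assumptions (not taken from the driver): about 5 ns of propagation
# delay per meter of cable, and one full round trip before the sender stops.
inflight_bytes() {
    length_m=$1     # cable length in meters
    rate_gbps=$2    # link rate in Gb/s
    ns_per_m=5      # assumed propagation delay per meter
    rtt_ns=$((2 * length_m * ns_per_m))
    # 1 Gb/s is 1 bit per ns, so bits = rtt_ns * rate_gbps; divide by 8 for bytes
    echo $((rtt_ns * rate_gbps / 8))
}

# Example: 100 m cable on a 100 Gb/s link
inflight_bytes 100 100
```

Under these assumptions, a 100 m cable on a 100 Gb/s link leaves 12500 bytes in flight, which is headroom the buffer needs on top of the xoff threshold. Longer cables need proportionally more.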

Explicit Congestion Notification (ECN) in ConnectX-4 and ConnectX-5 HCAs enables end-to-end congestion notification between two endpoints when congestion occurs, and works over Layer 3. To ensure reliable communication, ECN must be enabled on both endpoints and on all intermediate devices (switches, routers, etc.) in the path between them. ECN handling is supported only for RoCEv2 packets.

To enable ECN on the hosts:

  1. Load mlx5ib(4) and mlx5en(4):

    # kldload mlx5ib mlx5en

  2. Query the relevant attributes:

    # sysctl -a sys.class.infiniband.mlx5_<devno>.cong.conf

  3. Modify the attributes:

    # sysctl sys.class.infiniband.mlx5_<devno>.cong.conf.<attr>=<value>
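The three steps above can be collected into one dry-run sketch that prints the commands for a given device number. The helper name `ecn_cmds` is hypothetical. Attribute names are deliberately not hard-coded here, because the valid names should be taken from the output of the query in step 2.

```shell
# Print (do not run) the commands for the ECN steps above.
# ecn_cmds is a hypothetical helper name used only in this sketch.
# Usage: ecn_cmds <devno> [<attr>=<value> ...]
ecn_cmds() {
    [ $# -ge 1 ] || { echo "usage: ecn_cmds <devno> [<attr>=<value> ...]" >&2; return 1; }
    devno=$1
    shift
    base="sys.class.infiniband.mlx5_${devno}.cong.conf"
    echo "kldload mlx5ib mlx5en"      # step 1: load the modules
    echo "sysctl -a ${base}"          # step 2: query the attributes
    # step 3: one assignment per <attr>=<value> argument
    for pair in "$@"; do
        case $pair in
            ?*=?*) echo "sysctl ${base}.${pair}" ;;
            *) echo "skipping malformed pair: ${pair}" >&2; return 1 ;;
        esac
    done
}

# Example: load and query commands for device mlx5_0
ecn_cmds 0
```

To include step 3, append one or more `<attr>=<value>` arguments after the device number, using attribute names reported by the query.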

ECN supports the following algorithms:

  • r_roce_ecn_rp - Reaction point

  • r_roce_ecn_np - Notification point

Each algorithm has a set of relevant parameters and statistics, which are defined per device. ECN and QCN are not compatible.

© Copyright 2023, NVIDIA. Last updated on May 24, 2023.