Out-of-Order (OOO) Data Placement Experimental Verbs
This feature is only supported on:
ConnectX-5 adapter cards and above
RC and XRC QPs
DC transport
In certain fabric configurations, InfiniBand packets for a given QP may take different paths through the network from source to destination. As a result, packets may be received out of order. These packets can now be handled instead of being dropped, which avoids retransmission and thereby:
Achieves better network utilization
Decreases latency
Data will be placed into host memory in an out-of-order manner when out-of-order messages are received.
User Space Application QPs
OOO Data Placement for user space application QPs can be enabled using the Linux environment variable MLX5_RELAXED_PACKET_ORDERING_ON.
To enable for all devices, set the variable to "all". Example:
$ export MLX5_RELAXED_PACKET_ORDERING_ON="all"
To enable for a specific device (in this example, mlx5_0):
$ export MLX5_RELAXED_PACKET_ORDERING_ON="mlx5_0"
To enable for multiple devices:
$ export MLX5_RELAXED_PACKET_ORDERING_ON="mlx5_0 mlx5_3 mlx5_4"
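The export commands above affect every process started later from the same shell. A minimal sketch of scoping the setting to a single command instead is shown here. The wrapper name run_with_ooo is hypothetical, and the sh -c child is only a stand-in for a real RDMA application; it prints the value it inherits and does not touch any device.

```shell
# Hypothetical wrapper: run one command with the variable set only in that
# command's environment, leaving the calling shell unchanged.
run_with_ooo() {
    devices="$1"   # "all" or a space-separated device list, e.g. "mlx5_0 mlx5_3"
    shift
    env MLX5_RELAXED_PACKET_ORDERING_ON="$devices" "$@"
}

# Stand-in for an application: show what the child process sees.
run_with_ooo "mlx5_0 mlx5_3" sh -c 'echo "child sees: $MLX5_RELAXED_PACKET_ORDERING_ON"'
```

Because the variable is passed through env rather than exported, other applications launched from the same session are not affected by it.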
Kernel ULP QPs
OOO Data Placement for kernel QPs can be enabled by writing to the driver's debugfs entry. Example:
$ echo <0|1> > /sys/kernel/debug/mlx5/<pci-bus>/ooo/enable
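As a rough sketch, the per-device write in the example can be wrapped in a loop that applies the same value to every matching entry. The function name set_ooo_all and its optional second argument are illustrative assumptions; the second argument exists only so the loop can be tried against a scratch directory. The default root and the ooo/enable layout are taken from the example path. Writing to the real entries requires root privileges and a mounted debugfs.

```shell
# Hypothetical helper: write 0 or 1 to every <root>/<pci-bus>/ooo/enable entry
# and print how many entries were updated.
set_ooo_all() {
    value="$1"
    root="${2:-/sys/kernel/debug/mlx5}"   # default taken from the example path
    case "$value" in
        0|1) ;;
        *) echo "usage: set_ooo_all <0|1> [root]" >&2; return 2 ;;
    esac
    count=0
    for entry in "$root"/*/ooo/enable; do
        [ -e "$entry" ] || continue       # skip when the glob matched nothing
        echo "$value" > "$entry" && count=$((count + 1))
    done
    echo "$count"
}

# Usage (as root):
#   set_ooo_all 1    # enable on every device that exposes the entry
#   set_ooo_all 0    # disable again
```

The existence check keeps the loop from creating stray files on systems where no device exposes the entry.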
To make this configuration persistent, update the configuration file /etc/infiniband/mlx5.conf as follows:
# Enable all mlx5 devices.
MLX5_RELAXED_PACKET_ORDERING_ON="all"
# Enable mlx5_0 only
MLX5_RELAXED_PACKET_ORDERING_ON="mlx5_0"
# Enable mlx5_0, mlx5_3 and mlx5_4
MLX5_RELAXED_PACKET_ORDERING_ON="mlx5_0 mlx5_3 mlx5_4"
# To disable the feature, use the MLX5_RELAXED_PACKET_ORDERING_OFF variable,
# which supports the same syntax as MLX5_RELAXED_PACKET_ORDERING_ON:
MLX5_RELAXED_PACKET_ORDERING_OFF="mlx5_1"
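The listing shows alternative values for the same variable, so in practice only one assignment per variable would normally be kept. The following is a minimal sketch of updating such a file without leaving duplicates behind. The function name set_conf_var is hypothetical, and it assumes the file holds one KEY="value" assignment per line, as in the listing. When the driver reads the file is not covered here.

```shell
# Hypothetical helper: set KEY="value" in a config file, replacing any
# existing assignment of the same key and keeping all other lines.
set_conf_var() {
    file="$1"; key="$2"; value="$3"
    tmp="$file.tmp.$$"
    if [ -f "$file" ]; then
        # grep -v exits non-zero when no lines remain; that is not an error here
        grep -v "^${key}=" "$file" > "$tmp" || true
    else
        : > "$tmp"
    fi
    printf '%s="%s"\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"    # overwrite in place so existing permissions are kept
    rm -f "$tmp"
}

# Usage (as root):
#   set_conf_var /etc/infiniband/mlx5.conf MLX5_RELAXED_PACKET_ORDERING_ON "mlx5_0 mlx5_3"
```

Comment lines and assignments of other variables, including MLX5_RELAXED_PACKET_ORDERING_OFF, are left untouched because the match is anchored on the exact key followed by "=".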
Notes
On the responder side, contents of the RDMA write buffer are guaranteed to be fully received only if one of the following events takes place:
Completion of the RDMA Write with immediate data
Arrival and completion of the subsequent Send message
Update of a memory element by a subsequent RDMA Atomic operation
On the requester side, contents of the RDMA read buffer are guaranteed to be fully received only if one of the following events takes place:
Completion of the RDMA Read Work Request (if completion is requested)
Completion of the subsequent Work Request