You can control the behavior of XLIO by configuring:

  • The libxlio.conf file
  • XLIO configuration parameters, which are Linux OS environment variables
  • XLIO extra API

Configuring libxlio.conf

The installation process creates a default configuration file, /etc/libxlio.conf, in which you can define and change the following settings:

  • The target applications or processes to which the configured control settings apply. By default, XLIO control settings are applied to all applications.
  • The transport to be used for the created sockets.
  • The IP addresses and ports for which you want offloading.

By default, the configuration file allows XLIO to offload everything except for the DNS server-side protocol (UDP, port 53) which will be handled by the OS.

In the libxlio.conf file:

  • You can define different XLIO control statements for different processes in a single configuration file. Control statements are always applied to the preceding target process statement in the configuration file.
  • Comments start with # and cause the rest of the line to be ignored.
  • Leading whitespace is skipped.
  • Any line that is empty is skipped.
  • It is recommended to add comments when making configuration changes.

The following sections describe configuration options in libxlio.conf. For a sample libxlio.conf file, see Example of XLIO Configuration.

Configuring Target Application or Process

The target process statement specifies the process to which all control statements that appear between this statement and the next target process statement apply.

Each statement specifies a matching rule that all its sub-expressions must evaluate as true (logical and) to apply.

If not provided (default), the statement matches all programs.

The format of the target process statement is: 

application-id <program-name|*> <user-defined-id|*>
Option    Description
<program-name|*>

Define the program name (not including the path) to which the control statements appearing below this statement apply.
Wildcards with the same semantics as "ls" are supported (* and ?).
For example:

  • db2* matches any program with a name starting with db2.
  • t?cp matches ttcp, etc.
<user-defined-id|*>

Specify the user-defined ID of the process to which the control statements appearing below this statement apply.

You must also set the XLIO_APPLICATION_ID environment variable to the same value as user-defined-id.
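
As a rough illustration of the wildcard semantics described above, Python's standard fnmatch module implements the same shell-style * and ? patterns. The helper name matches_program and the sample path are hypothetical; this is only a sketch of the matching rule, not XLIO's own matcher.

```python
import os
from fnmatch import fnmatchcase


def matches_program(pattern: str, program_path: str) -> bool:
    """Return True if the program's base name matches the shell-style pattern."""
    # The target process statement matches on the name without the path.
    name = os.path.basename(program_path)
    return fnmatchcase(name, pattern)


print(matches_program("db2*", "/opt/example/bin/db2start"))  # True
print(matches_program("t?cp", "ttcp"))                       # True
print(matches_program("t?cp", "tcp"))                        # False
```

Note that ? stands for exactly one character, so t?cp does not match tcp.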

Configuring Socket Transport Control

Use socket control statements to specify when libxlio will offload AF_INET/SOCK_STREAM or AF_INET/SOCK_DATAGRAM sockets (currently SOCK_RAW is not supported).

Each control statement specifies a matching rule that all its sub-expressions must evaluate as true (logical and) to apply. Statements are evaluated in order of definition according to "first-match".

Socket control statements use the following format: 

use <transport> <role> <address|*>:<port range|*>

Where:

Option    Description

transport

Define the mode of transport:

  • xlio – XLIO should be used.
  • os – the socket should be handled by the OS network stack. In this mode, the sockets are not offloaded.

The default is xlio.

role

Specify one of the following roles:

  • tcp_server – for listen sockets. Accepted sockets follow listen sockets. Defined by local_ip:local_port.
  • tcp_client – for connected sockets. Defined by remote_ip:remote_port:local_ip:local_port
  • udp_sender – for TX flows. Defined by remote_ip:remote_port
  • udp_receiver – for RX flows. Defined by local_ip:local_port
  • udp_connect – for UDP connected sockets. Defined by remote_ip:remote_port:local_ip:local_port

address

You can specify the local address the server is bound to, or the remote server address the client connects to.

The syntax for address matching is:

<IPv4 address>[/<prefix_length>]|*

  • IPv4 address: [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+, where each sub-number is <= 255
  • prefix_length: [0-9]+, with a value <= 32. A prefix_length of 24 matches the subnet mask 255.255.255.0. A prefix_length of 32 requires an exact match of the IP address.

port range

Define the port range as:

start-port[-end-port]

Port range: 0-65535
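
The address and port syntax above can be sketched with Python's standard ipaddress module. The function names address_matches and port_matches are hypothetical, and the sketch only illustrates the documented syntax; it does not reproduce XLIO's parser.

```python
import ipaddress


def address_matches(rule: str, ip: str) -> bool:
    """Match an IPv4 address against '<IPv4 address>[/<prefix_length>]' or '*'."""
    if rule == "*":
        return True
    # strict=False tolerates host bits set in the rule, e.g. 192.168.1.5/24.
    # A rule without a prefix length is treated as /32 (exact match).
    network = ipaddress.ip_network(rule, strict=False)
    return ipaddress.ip_address(ip) in network


def port_matches(rule: str, port: int) -> bool:
    """Match a port against 'start-port[-end-port]' or '*'."""
    if rule == "*":
        return True
    start, _, end = rule.partition("-")
    low = int(start)
    high = int(end) if end else low
    return low <= port <= high


print(address_matches("192.168.1.0/24", "192.168.1.200"))  # True
print(address_matches("192.168.1.0/24", "192.168.2.1"))    # False
print(port_matches("5000-5010", 5005))                     # True
```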

Example of XLIO Configuration

To set the following:

  • Apply the rules to program tcp_lat with ID B1
  • Use XLIO for TCP clients connecting to machines that belong to subnet 192.168.1.*
  • Use OS when a TCP server listens on port 5001 of any machine
  • Use OS for connected UDP sockets to port 53 of any machine

In libxlio.conf, configure: 

application-id tcp_lat B1
use xlio tcp_client 192.168.1.0/24:*:*:*
use os  tcp_server   *:5001
use os  udp_connect  *:53

You must also set the XLIO parameter:

XLIO_APPLICATION_ID=B1
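
The "first-match" evaluation of these control statements can be sketched as follows. This is a simplified model under stated assumptions: each rule is reduced to a single address:port pair (the four-part tcp_client and udp_connect forms are not modeled), and the names RULES and select_transport are hypothetical.

```python
import ipaddress

# (transport, role, address rule, port rule), in order of definition.
RULES = [
    ("xlio", "tcp_client", "192.168.1.0/24", "*"),
    ("os", "tcp_server", "*", "5001"),
    ("os", "udp_connect", "*", "53"),
]


def _addr_ok(rule: str, ip: str) -> bool:
    if rule == "*":
        return True
    return ipaddress.ip_address(ip) in ipaddress.ip_network(rule, strict=False)


def _port_ok(rule: str, port: int) -> bool:
    if rule == "*":
        return True
    start, _, end = rule.partition("-")
    return int(start) <= port <= int(end or start)


def select_transport(role, ip, port, rules=RULES, default="xlio"):
    """Return the transport of the first rule whose sub-expressions all match."""
    for transport, rule_role, addr_rule, port_rule in rules:
        if rule_role == role and _addr_ok(addr_rule, ip) and _port_ok(port_rule, port):
            return transport
    return default  # no rule matched: fall back to the default transport


print(select_transport("tcp_client", "192.168.1.20", 8080))  # xlio
print(select_transport("tcp_server", "0.0.0.0", 5001))       # os
print(select_transport("udp_connect", "8.8.8.8", 53))        # os
```

Because evaluation stops at the first matching statement, more specific rules should appear before broader ones.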

XLIO Configuration Parameters

XLIO configuration parameters are set through Linux OS environment variables.

It is recommended that you set these parameters prior to loading the application with XLIO. You can set the parameters in a system file, which can be run manually or automatically.

All the parameters have defaults that can be modified.
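
A minimal sketch of preparing the environment before launching an application is shown below. It assumes that XLIO is loaded by preloading its shared library through LD_PRELOAD and that the library is named libxlio.so; both are assumptions not stated in this section, and the helper build_xlio_env is hypothetical.

```python
import os


def build_xlio_env(params, base_env=None, preload="libxlio.so"):
    """Return a copy of the environment with XLIO parameters added."""
    env = dict(os.environ if base_env is None else base_env)
    for name, value in params.items():
        if not name.startswith("XLIO_"):
            raise ValueError(f"not an XLIO parameter: {name}")
        env[name] = str(value)
    # Assumption: the library is injected with LD_PRELOAD.
    env["LD_PRELOAD"] = preload
    return env


env = build_xlio_env(
    {"XLIO_TRACELEVEL": "DEBUG", "XLIO_LOG_FILE": "/tmp/xlio_log.txt"},
    base_env={},
)
print(sorted(env))
# The environment could then be passed to a launcher, for example:
# subprocess.run(["sockperf", "server"], env=env)
```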

On default startup, the XLIO library prints the XLIO version information, as well as the configuration parameters being used and their values to stderr.

XLIO always logs the values of the following parameters, even when they are equal to the default value:

  • XLIO_TRACELEVEL
  • XLIO_LOG_FILE

For all other parameters, XLIO logs the parameter values only when they are not equal to the default value.

The XLIO version information, parameters, and values are subject to change.

For example: 

XLIO INFO   : XLIO_VERSION: X.Y.Z-R Release built on MM DD YYYY HH:mm:ss
XLIO INFO   : Cmd Line: sockperf server -i 11.138.2.230
XLIO INFO   : ---------------------------------------------------------------------------
XLIO DEBUG  : Current Time: DayMonth DD HH:mm:ss YYYY
XLIO DEBUG  : Pid: 3945469
XLIO INFO   : OFED Version: MLNX_OFED_LINUX-X.X-X.X.X.X:
XLIO DEBUG  : System: 5.4.0-121-generic
XLIO DEBUG  : Architecture: x86_64
XLIO DEBUG  : Node: luna02
XLIO INFO   : ---------------------------------------------------------------------------

XLIO INFO   : Log Level                      DETAILS                    [XLIO_TRACELEVEL]
XLIO DETAILS: Log Details                    0                          [XLIO_LOG_DETAILS]
XLIO DETAILS: Log Colors                     Enabled                    [XLIO_LOG_COLORS]
XLIO DETAILS: Log File                                                  [XLIO_LOG_FILE]
XLIO DETAILS: Stats File                                                [XLIO_STATS_FILE]
XLIO DETAILS: Stats shared memory directory  /tmp/xlio                  [XLIO_STATS_SHMEM_DIR]
XLIO DETAILS: SERVICE output directory       /tmp/xlio                  [XLIO_SERVICE_NOTIFY_DIR]
XLIO DETAILS: Stats FD Num (max)             100                        [XLIO_STATS_FD_NUM]
XLIO DETAILS: Conf File                      /etc/libxlio.conf          [XLIO_CONFIG_FILE]
XLIO DETAILS: Application ID                 XLIO_DEFAULT_APPLICATION_ID [XLIO_APPLICATION_ID]
XLIO DETAILS: Polling CPU idle usage         Disabled                   [XLIO_CPU_USAGE_STATS]
XLIO DETAILS: SigIntr Ctrl-C Handle          Enabled                    [XLIO_HANDLE_SIGINTR]
XLIO DETAILS: SegFault Backtrace             Disabled                   [XLIO_HANDLE_SIGSEGV]
XLIO INFO   : Ring allocation logic TX       20(Ring per thread)        [XLIO_RING_ALLOCATION_LOGIC_TX]
XLIO INFO   : Ring allocation logic RX       20(Ring per thread)        [XLIO_RING_ALLOCATION_LOGIC_RX]
XLIO DETAILS: Ring migration ratio TX        -1                         [XLIO_RING_MIGRATION_RATIO_TX]
XLIO DETAILS: Ring migration ratio RX        100                        [XLIO_RING_MIGRATION_RATIO_RX]
XLIO DETAILS: Ring limit per interface       0 (no limit)               [XLIO_RING_LIMIT_PER_INTERFACE]
XLIO DETAILS: Ring On Device Memory TX       0                          [XLIO_RING_DEV_MEM_TX]
XLIO DETAILS: TCP max syn rate               0 (no limit)               [XLIO_TCP_MAX_SYN_RATE]
XLIO DETAILS: Zerocopy Mem Bufs              200000                     [XLIO_ZC_BUFS]
XLIO DETAILS: Zerocopy Cache Threshold       10240                      [XLIO_ZC_CACHE_THRESHOLD]
XLIO DETAILS: Tx Mem Segs TCP                1000000                    [XLIO_TX_SEGS_TCP]
XLIO DETAILS: Tx Mem Bufs                    200000                     [XLIO_TX_BUFS]
XLIO DETAILS: Tx Mem Buf size                0                          [XLIO_TX_BUF_SIZE]
XLIO DETAILS: ZC TX size                     32768                      [XLIO_ZC_TX_SIZE]
XLIO DETAILS: Tx QP WRE                      32768                      [XLIO_TX_WRE]
XLIO DETAILS: Tx QP WRE Batching             64                         [XLIO_TX_WRE_BATCHING]
XLIO DETAILS: Tx Max QP INLINE               204                        [XLIO_TX_MAX_INLINE]
XLIO DETAILS: Tx MC Loopback                 Enabled                    [XLIO_TX_MC_LOOPBACK]
XLIO DETAILS: Tx non-blocked eagains         Disabled                   [XLIO_TX_NONBLOCKED_EAGAINS]
XLIO DETAILS: Tx Prefetch Bytes              256                        [XLIO_TX_PREFETCH_BYTES]
XLIO INFO   : Tx Bufs Batch TCP              1                          [XLIO_TX_BUFS_BATCH_TCP]
XLIO DETAILS: Tx Segs Batch TCP              64                         [XLIO_TX_SEGS_BATCH_TCP]
XLIO DETAILS: TCP Send Buffer size           1000000                    [XLIO_TCP_SEND_BUFFER_SIZE]
XLIO INFO   : Rx Mem Bufs                    32384                      [XLIO_RX_BUFS]
XLIO INFO   : Rx QP WRE                      1024                       [XLIO_RX_WRE]
XLIO DETAILS: Rx QP WRE Batching             1                          [XLIO_RX_WRE_BATCHING]
XLIO DETAILS: Rx Byte Min Limit              65536                      [XLIO_RX_BYTES_MIN]
XLIO DETAILS: Rx Poll Loops                  100000                     [XLIO_RX_POLL]
XLIO DETAILS: Rx Poll Init Loops             0                          [XLIO_RX_POLL_INIT]
XLIO DETAILS: Rx UDP Poll OS Ratio           100                        [XLIO_RX_UDP_POLL_OS_RATIO]
XLIO DETAILS: HW TS Conversion               3                          [XLIO_HW_TS_CONVERSION]
XLIO DETAILS: Rx Poll Yield                  Disabled                   [XLIO_RX_POLL_YIELD]
XLIO DETAILS: Rx Prefetch Bytes              256                        [XLIO_RX_PREFETCH_BYTES]
XLIO DETAILS: Rx Prefetch Bytes Before Poll  0                          [XLIO_RX_PREFETCH_BYTES_BEFORE_POLL]
XLIO DETAILS: Rx CQ Drain Rate               Disabled                   [XLIO_RX_CQ_DRAIN_RATE_NSEC]
XLIO INFO   : GRO max streams                8192                       [XLIO_GRO_STREAMS_MAX]
XLIO DETAILS: Disable flow tag               0                          [XLIO_DISABLE_FLOW_TAG]
XLIO DETAILS: TCP 3T rules                   Disabled                   [XLIO_TCP_3T_RULES]
XLIO DETAILS: UDP 3T rules                   Enabled                    [XLIO_UDP_3T_RULES]
XLIO DETAILS: ETH MC L2 only rules           Disabled                   [XLIO_ETH_MC_L2_ONLY_RULES]
XLIO DETAILS: Force Flowtag for MC           Disabled                   [XLIO_MC_FORCE_FLOWTAG]
XLIO DETAILS: Striding RQ                    Enabled                    [XLIO_STRQ]
XLIO INFO   : STRQ Strides per RWQE          4096                       [XLIO_STRQ_NUM_STRIDES]
XLIO INFO   : STRQ Stride Size (Bytes)       128                        [XLIO_STRQ_STRIDE_SIZE_BYTES]
XLIO DETAILS: STRQ Initial Strides Per Ring  262144                     [XLIO_STRQ_STRIDES_NUM_BUFS]
XLIO INFO   : STRQ Strides Compensation Level 131072                     [XLIO_STRQ_STRIDES_COMPENSATION_LEVEL]
XLIO DETAILS: Select Poll (usec)             100000                     [XLIO_SELECT_POLL]
XLIO DETAILS: Select Poll OS Force           Disabled                   [XLIO_SELECT_POLL_OS_FORCE]
XLIO DETAILS: Select Poll OS Ratio           10                         [XLIO_SELECT_POLL_OS_RATIO]
XLIO DETAILS: Select Skip OS                 4                          [XLIO_SELECT_SKIP_OS]
XLIO INFO   : CQ Drain Thread                Disabled                   [XLIO_PROGRESS_ENGINE_INTERVAL]
XLIO DETAILS: CQ Interrupts Moderation       Enabled                    [XLIO_CQ_MODERATION_ENABLE]
XLIO DETAILS: CQ Moderation Count            48                         [XLIO_CQ_MODERATION_COUNT]
XLIO DETAILS: CQ Moderation Period (usec)    50                         [XLIO_CQ_MODERATION_PERIOD_USEC]
XLIO DETAILS: CQ AIM Max Count               560                        [XLIO_CQ_AIM_MAX_COUNT]
XLIO DETAILS: CQ AIM Max Period (usec)       250                        [XLIO_CQ_AIM_MAX_PERIOD_USEC]
XLIO DETAILS: CQ AIM Interval (msec)         250                        [XLIO_CQ_AIM_INTERVAL_MSEC]
XLIO DETAILS: CQ AIM Interrupts Rate (per sec) 5000                       [XLIO_CQ_AIM_INTERRUPTS_RATE_PER_SEC]
XLIO DETAILS: CQ Poll Batch (max)            16                         [XLIO_CQ_POLL_BATCH_MAX]
XLIO DETAILS: CQ Keeps QP Full               Enabled                    [XLIO_CQ_KEEP_QP_FULL]
XLIO INFO   : QP Compensation Level          256                        [XLIO_QP_COMPENSATION_LEVEL]
XLIO DETAILS: Offloaded Sockets              Enabled                    [XLIO_OFFLOADED_SOCKETS]
XLIO DETAILS: Timer Resolution (msec)        10                         [XLIO_TIMER_RESOLUTION_MSEC]
XLIO DETAILS: TCP Timer Resolution (msec)    100                        [XLIO_TCP_TIMER_RESOLUTION_MSEC]
XLIO DETAILS: TCP control thread             0 (Disabled)               [XLIO_TCP_CTL_THREAD]
XLIO DETAILS: TCP timestamp option           0                          [XLIO_TCP_TIMESTAMP_OPTION]
XLIO DETAILS: TCP nodelay                    0                          [XLIO_TCP_NODELAY]
XLIO DETAILS: TCP quickack                   0                          [XLIO_TCP_QUICKACK]
XLIO DETAILS: Exception handling mode        -1(just log debug message) [XLIO_EXCEPTION_HANDLING]
XLIO DETAILS: Avoid sys-calls on tcp fd      Disabled                   [XLIO_AVOID_SYS_CALLS_ON_TCP_FD]
XLIO DETAILS: Allow privileged sock opt      Enabled                    [XLIO_ALLOW_PRIVILEGED_SOCK_OPT]
XLIO DETAILS: Delay after join (msec)        0                          [XLIO_WAIT_AFTER_JOIN_MSEC]
XLIO DETAILS: Internal Thread Affinity       -1                         [XLIO_INTERNAL_THREAD_AFFINITY]
XLIO DETAILS: Internal Thread Cpuset                                    [XLIO_INTERNAL_THREAD_CPUSET]
XLIO DETAILS: Internal Thread Arm CQ         Disabled                   [XLIO_INTERNAL_THREAD_ARM_CQ]
XLIO DETAILS: Internal Thread TCP Handling   0 (deferred)               [XLIO_INTERNAL_THREAD_TCP_TIMER_HANDLING]
XLIO DETAILS: Thread mode                    Multi spin lock            [XLIO_THREAD_MODE]
XLIO INFO   : Buffer batching mode           0 (No batching buffers)    [XLIO_BUFFER_BATCHING_MODE]
XLIO DETAILS: Mem Allocate type              2 (Huge Pages)             [XLIO_MEM_ALLOC_TYPE]
XLIO DETAILS: Num of UC ARPs                 3                          [XLIO_NEIGH_UC_ARP_QUATA]
XLIO DETAILS: UC ARP delay (msec)            10000                      [XLIO_NEIGH_UC_ARP_DELAY_MSEC]
XLIO DETAILS: Num of neigh restart retries   1                          [XLIO_NEIGH_NUM_ERR_RETRIES]
XLIO DETAILS: TSO support                    Auto                       [XLIO_TSO]
XLIO INFO   : LRO support                    Disabled                   [XLIO_LRO]
XLIO DETAILS: BF (Blue Flame)                Enabled                    [XLIO_BF]
XLIO DETAILS: UTLS RX support                Disabled                   [XLIO_UTLS_RX]
XLIO DETAILS: UTLS TX support                Enabled                    [XLIO_UTLS_TX]
XLIO DETAILS: UTLS high watermark DEK cache size 1024                       [XLIO_UTLS_HIGH_WMARK_DEK_CACHE_SIZE]
XLIO DETAILS: UTLS low watermark DEK cache size 512                        [XLIO_UTLS_LOW_WMARK_DEK_CACHE_SIZE]
XLIO DETAILS: Src port stirde                2                          [XLIO_SRC_PORT_STRIDE]
XLIO DETAILS: Number of Nginx workers        0                          [XLIO_NGINX_WORKERS_NUM]
XLIO DETAILS: Size of UDP socket pool        0                          [XLIO_NGINX_UDP_POOL_SIZE]
XLIO DETAILS: Max RX reuse buffs UDP pool    0                          [XLIO_NGINX_UDP_POOL_REUSE_BUFFS]
XLIO DETAILS: fork() support                 Enabled                    [XLIO_FORK]
XLIO DETAILS: close on dup2()                Enabled                    [XLIO_CLOSE_ON_DUP2]
XLIO DETAILS: MTU                            0 (follow actual MTU)      [XLIO_MTU]
XLIO DETAILS: MSS                            0 (follow XLIO_MTU)        [XLIO_MSS]
XLIO DETAILS: TCP CC Algorithm               0 (LWIP)                   [XLIO_TCP_CC_ALGO]
XLIO DETAILS: TCP abort on close             Disabled                   [XLIO_TCP_ABORT_ON_CLOSE]
XLIO DETAILS: Polling Rx on Tx TCP           Disabled                   [XLIO_RX_POLL_ON_TX_TCP]
XLIO DETAILS: RX CQ wait control             Disabled                   [XLIO_RX_CQ_WAIT_CTRL]
XLIO DETAILS: Trig dummy send getsockname()  Disabled                   [XLIO_TRIGGER_DUMMY_SEND_GETSOCKNAME]
XLIO INFO   : Skip CQ polling in rx          Epoll Only                 [XLIO_SKIP_POLL_IN_RX]
XLIO INFO   : ---------------------------------------------------------------------------  
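
To inspect which parameters a process reported at startup, the log lines above can be parsed with a small script. This is a sketch that relies only on the line layout shown in the example (log level, free text, then the parameter name in square brackets); since the output is subject to change, the pattern may need adjusting. The function name parse_startup_log is hypothetical.

```python
import re

LINE_RE = re.compile(
    r"^XLIO\s+(?P<level>\w+)\s*:\s+(?P<body>.*?)\s*\[(?P<param>XLIO_\w+)\]\s*$"
)


def parse_startup_log(text: str) -> dict:
    """Map each parameter name to (log level, description-and-value text)."""
    params = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # banner and separator lines have no [XLIO_...] tag
            params[match.group("param")] = (match.group("level"), match.group("body"))
    return params


sample = """\
XLIO INFO   : ---------------------------------------------------------------------------
XLIO DETAILS: Tx QP WRE                      32768                      [XLIO_TX_WRE]
XLIO INFO   : Rx QP WRE                      1024                       [XLIO_RX_WRE]
XLIO DETAILS: Log File                                                  [XLIO_LOG_FILE]
"""
for name, (level, body) in parse_startup_log(sample).items():
    print(name, level, body.split()[-1])
```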

Configuration Parameters Values

The following table lists the XLIO configuration parameters and their possible values.

XLIO Configuration Parameter    Description and Examples

XLIO_TRACELEVEL

PANIC = 0 – Panic level logging.
This trace level causes fatal behavior and halts the application, typically caused by memory allocation problems. PANIC level is rarely used.

ERROR = 1 – Runtime errors in XLIO.
Typically, this trace level assists you to identify internal logic errors, such as errors from underlying OS or InfiniBand verb calls, and internal double mapping/unmapping of objects.

WARN = WARNING = 2 – Runtime warning that does not disrupt the application workflow.
A warning may indicate problems in the setup or in the overall configuration. For example, address resolution failures (due to an incorrect routing setup configuration), corrupted IP packets in the receive path, or unsupported functions requested by the user application.

INFO = INFORMATION = 3 – General information passed to the application user.
This trace level includes configuration logging or general information to assist you with better use of the XLIO library.

DETAILS – More detailed general information passed to the application user.
This trace level includes printing of all environment variables of XLIO at start up.

DEBUG = 4 – High-level insight to the operations performed in XLIO.
All socket API calls are logged in this logging level, and internal high-level control channels log their activity.

FINE = FUNC = 5 – Low-level runtime logging of activity.
This logging level includes basic Tx and Rx logging in the fast path. Note that using this setting lowers application performance. We recommend that you use this level with the XLIO_LOG_FILE parameter.

FINER = FUNC_ALL = 6 – Very low-level runtime logging of activity. This logging level drastically lowers application performance. We recommend that you use this level with the XLIO_LOG_FILE parameter.

XLIO_LOG_DETAILS

Provides additional logging details on each log line.
0 = Basic log line
1 = With ThreadId
2 = With ProcessId and ThreadId
3 = With Time, ProcessId, and ThreadId (Time is the number of milliseconds from the start of the process)
Default: 0
For XLIO_TRACELEVEL >= 4, this value defaults to 2.

XLIO_LOG_FILE

Redirects all XLIO logging to a specific user-defined file.
This is very useful when raising the XLIO_TRACELEVEL.
XLIO replaces a single '%d' appearing in the log file name with the PID of the process loaded with XLIO. This helps when running multiple instances of XLIO, each with its own log file name.
Example: XLIO_LOG_FILE=/tmp/xlio_log.txt
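
The '%d' substitution described above can be pictured with the following sketch. The helper expand_log_file_name and the file name template are hypothetical and only model the documented behavior of replacing a single '%d' with the process ID.

```python
import os
from typing import Optional


def expand_log_file_name(template: str, pid: Optional[int] = None) -> str:
    """Replace the first '%d' in the file name with the process ID."""
    if pid is None:
        pid = os.getpid()
    return template.replace("%d", str(pid), 1)


print(expand_log_file_name("/tmp/xlio_log_%d.txt", pid=3945469))
# /tmp/xlio_log_3945469.txt
```

With a template such as XLIO_LOG_FILE=/tmp/xlio_log_%d.txt, each process would therefore write to its own file.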

XLIO_CONFIG_FILE

Sets the full path to the XLIO configuration file.
Example: XLIO_CONFIG_FILE=/tmp/libxlio.conf
Default: /etc/libxlio.conf

XLIO_LOG_COLORS

Uses a color scheme when logging: red for errors and warnings, and dim for very low-level debug messages.
XLIO_LOG_COLORS is automatically disabled when logging is done directly to a non-terminal device (for example, when XLIO_LOG_FILE is configured).
Default: 1 (Enabled)

XLIO_CPU_USAGE_STATS

Calculates the XLIO CPU usage during polling hardware loops. This information is available through XLIO stats utility.
Default: 0 (Disabled)

XLIO_APPLICATION_ID

Specifies a group of rules from libxlio.conf for XLIO to apply.
Example: XLIO_APPLICATION_ID=iperf_server
Default: XLIO_DEFAULT_APPLICATION_ID (match only the '*' group rule)

XLIO_HANDLE_SIGINTR

When enabled, the XLIO handler is called when an interrupt signal is sent to the process.
XLIO also calls the application's handler, if it exists.
Range: 0 to 1
Default: 1 (Enabled)

XLIO_HANDLE_SIGSEGV

When enabled, XLIO prints a backtrace if a segmentation fault occurs.
Range: 0 to 1
Default: 0 (Disabled)

XLIO_STATS_FD_NUM

Maximum number of sockets monitored by the XLIO statistics mechanism.
Range: 0 to 1024
Default: 100

XLIO_STATS_FILE

Redirects socket statistics to a specific user-defined file.
XLIO dumps each socket's statistics into a file when closing the socket.
Example:  XLIO_STATS_FILE=/tmp/stats

XLIO_STATS_SHMEM_DIR

Sets the directory path for XLIO to create the shared memory files for xlio_stats.
If this value is set to an empty string (""), no shared memory files are created.
Default: /tmp/

XLIO_XLIOD_NOTIFY_DIR

Sets the directory path for XLIO to write files used by xliod.
Default value is /tmp/xlio
Note: When this parameter is used, xliod must be run with --notify-dir pointing to the same directory.

XLIO_TCP_MAX_SYN_RATE

Limits the number of TCP SYN packets that XLIO handles per second for each listen socket.
Example: by setting this value to 10, the maximum number of TCP connections accepted by XLIO per second for each listen socket will be 10.
Set this value to 0 for XLIO to handle an unlimited number of TCP SYN packets per second for each listen socket.
Value range is 0 to 100000.
Default value is 0 (no limit)

XLIO_TX_SEGS_TCP

Number of TCP LWIP segments allocated for each XLIO process.
Default: 1000000

XLIO_TX_BUFS

Number of global Tx data buffer elements allocated.
Default: 200000

XLIO_TX_WRE

Number of Work Request Elements allocated in all transmit QPs. The number of QPs can change according to the number of network offloaded interfaces.
Default: 3000
The size of the Tx buffers is determined by the XLIO_MTU parameter value (see below).
If this value is raised, peaks in the packet rate can be better sustained; however, this increases memory usage. A smaller number of data buffers gives a smaller memory footprint, but may not sustain peaks in the data rate.

XLIO_TX_WRE_BATCHING

Controls the number of aggregated Work Request Elements before receiving a completion signal (CQ entry) from the hardware. Previously this number was hard coded as 64.
Making it configurable allows better control of the jitter encountered in the Tx completion handling.
Valid value range: 1-64
Default: 64

XLIO_TX_MAX_INLINE

Maximum send inline data size set for the QP.
Data copied into the INLINE space is at least 32 bytes of headers and the rest can be user datagram payload.
XLIO_TX_MAX_INLINE=0 disables inlining on the Tx path. In older releases this parameter was called XLIO_MAX_INLINE.
Default: 220

XLIO_TX_MC_LOOPBACK

Sets the initial value used internally by the XLIO to control multicast loopback packet behavior during transmission. An application that calls setsockopt() with IP_MULTICAST_LOOP overwrites the initial value set by this parameter.
Range:  0 - Disabled, 1 - Enabled
Default: 1

XLIO_TX_NONBLOCKED_EAGAINS

Controls the return value of send operations performed on a non-blocked UDP socket when the datagram cannot be sent.
When set to Disabled (0), XLIO returns 'OK' on all send operations, which is the OS default behavior; a datagram that could not be sent is silently dropped inside XLIO or the network stack.
When set to Enabled (1), XLIO returns the error EAGAIN if it could not perform the send operation and the datagram was dropped.
In both cases, a dropped Tx statistical counter is incremented.
Default: 0 (Disabled)
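
When this parameter is enabled, the application sees EAGAIN on a non-blocking UDP send that could not be performed. The sketch below shows one way an application might react using Python's standard socket API, where EAGAIN surfaces as BlockingIOError. The helper names are hypothetical and the retry policy is left to the caller.

```python
import socket


def make_nonblocking_udp_socket() -> socket.socket:
    """Create a UDP socket in non-blocking mode."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    return sock


def try_send(sock, payload: bytes, addr) -> bool:
    """Send one datagram; return False if the send reported EAGAIN."""
    try:
        sock.sendto(payload, addr)
        return True
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: the datagram was not sent; the caller may retry.
        return False
```

With the parameter disabled, try_send would always return True for such sockets, and drops would only be visible through the dropped Tx statistical counter.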

XLIO_TX_PREFETCH_BYTES

Accelerates an offloaded send operation by optimizing the cache. Different values give an optimized send rate on different machines. We recommend that you adjust this parameter to your specific hardware.
Range: 0 to MTU size
Disable with a value of 0
Default: 256 bytes

XLIO_TX_BUFS_BATCH_TCP

The number of buffers fetched from the ring pool by a socket at once. A higher number means fewer ring accesses to fetch buffers; a lower number means less memory consumption by a socket.

Min value: 1
Default value: 16

XLIO_RX_BUFS

The number of Rx data buffer elements allocated for the processes. These data buffers are used by all QPs on all HCAs, as determined by the XLIO_QP_LOGIC.
Default: 200000

XLIO_RX_WRE 

The number of Work Request Elements allocated in all receive QPs.
Default: 16000 

XLIO_RX_WRE_BATCHING

Number of Work Request Elements and RX buffers to batch before recycling.
Batching decreases the mean latency, but might increase the latency standard deviation.
Valid value range: 1-1024
Default: 1024

XLIO_RX_BYTES_MIN

The minimum value in bytes used per socket by XLIO when applications call setsockopt(SO_RCVBUF).
If the application tries to set a smaller value than configured in XLIO_RX_BYTES_MIN, XLIO forces this minimum limit value on the socket.
XLIO offloaded sockets receive the maximum amount of ready bytes. If the application does not drain sockets and the byte limit is reached, newly received datagrams are dropped.
The application's socket usage of current, max, and dropped bytes and packet counters can be monitored using xlio_stats.
Default: 65536

XLIO_RX_POLL

The number of times to unsuccessfully poll Rx for XLIO packets before going to sleep.
Range:   -1, 0 … 100,000,000
Default: 100,000
This value can be reduced to lower the load on the CPU. However, the price paid for this is that the Rx latency is expected to increase.
Recommended values:

  • 10000 – when CPU usage is not critical and Rx path latency is critical.
  • 0 – when CPU usage is critical, while Rx path latency is not.
  • -1 – causes infinite polling.

Once XLIO has gone to sleep, if it is in blocked mode, it waits for an interrupt; if it is in non-blocked mode, it returns -1.
This Rx polling is performed when the application is working with direct blocked calls to read(), recv(), recvfrom(), and recvmsg().
When the Rx path has successful poll hits, the latency improves dramatically. However, this causes increased CPU utilization. For more information, see Debugging, Troubleshooting, and Monitoring.

XLIO_RX_POLL_INIT

XLIO maps all UDP sockets as potentially offload-capable. Only after ADD_MEMBERSHIP is set does the offload start working and the CQ polling start.
This parameter controls the polling count during this transition phase where the socket is a UDP unicast socket and no multicast addresses were added to it.
Once the first ADD_MEMBERSHIP is called, the XLIO_RX_POLL (above) takes effect.
Value range is similar to the XLIO_RX_POLL (above).
Default: 0

XLIO_RX_UDP_POLL_OS_RATIO

Defines the ratio between XLIO CQ poll and OS FD poll.
This results in a single poll of the non-offloaded sockets every XLIO_RX_UDP_POLL_OS_RATIO offloaded socket (CQ) polls, regardless of whether the CQ poll was a hit or a miss and whether the socket is blocking or non-blocking.
When disabled, only offloaded sockets are polled.
This parameter replaces the two old parameters:

  • XLIO_RX_POLL_OS_RATIO and
  • XLIO_RX_SKIP_OS

Disable with 0
Default: 10
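
The ratio semantics described above can be pictured with a simple counter. This is only an illustration of "one OS poll every N CQ polls", not XLIO's polling loop; the function name count_os_polls is hypothetical.

```python
def count_os_polls(cq_polls: int, os_ratio: int) -> int:
    """Count OS polls performed during a given number of CQ polls."""
    if os_ratio == 0:
        return 0  # disabled: only offloaded sockets are polled
    os_polls = 0
    for i in range(1, cq_polls + 1):
        # ... poll the CQ here; a hit or a miss makes no difference ...
        if i % os_ratio == 0:
            os_polls += 1  # one poll of the non-offloaded sockets
    return os_polls


print(count_os_polls(100, 10))  # 10
print(count_os_polls(100, 0))   # 0
```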

XLIO_NGINX_UDP_POOL_SIZE

Defines the size of the UDP socket pool for NGINX.
For any value other than 0, close() does not destroy the socket but places it in a pool for the next UDP socket creation.
Disable with 0
Default: 0

XLIO_HW_TS_CONVERSION

Defines timestamp conversion method.
The value of XLIO_HW_TS_CONVERSION is determined by all devices, that is, if the hardware of one device does not support the conversion, then it will be disabled for the other devices.

Currently only UDP RX flow is supported.
Options = [0,1,2,3,4,5]:

  • 0 Disabled
  • 1 Raw-HW time
    Only convert the timestamp to seconds.nano_seconds time units (or disable if the hardware does not support it).
  • 2 Best possible Raw-HW or system time
    Sync to system time, then Raw hardware time. Disable if neither is supported by the hardware.
  • 3 Sync to system time
    Convert the timestamp to seconds.nano_seconds time units. Comparable to UDP receive software timestamp. Disable if the hardware does not support it.
  • 4 PTP Sync
    Convert the timestamp to seconds.nano_seconds time units. If it is not supported, option 3 is applied (or disabled if the hardware does not support it).
  • 5 RTC Sync
    Convert the timestamp to seconds.nano_seconds time units. If it is not supported, option 3 is applied (or disabled if the hardware does not support it).

Default value: 3 (Sync to system time)

XLIO_RX_POLL_YIELD

When an application is running with multiple threads on a limited number of cores, there is a need for each thread polling inside XLIO (read, readv, recv, and recvfrom) to yield the CPU to another polling thread so as not to starve them from processing incoming packets.
Default: 0 (Disabled)

XLIO_RX_PREFETCH_BYTES

The size of the receive buffer to prefetch into the cache while processing ingress packets.
The value should be at least 32 bytes to cover the IP+UDP headers and a small part of the user payload; a single cache line is typically 64 bytes.
Increasing this size can help improve performance for larger user payloads.
Range: 32 bytes to MTU size
Default: 256 bytes

XLIO_RX_CQ_DRAIN_RATE_NSEC

Socket's receive path CQ drain logic rate control.
When disabled (default), the socket's receive path attempts to return a ready packet from the socket's receive ready packet queue. If the ready receive packet queue is empty, the socket checks the CQ for ready completions for processing.
When enabled, even if the socket's receive ready packet queue is not empty, the socket checks the CQ for ready completions for processing. This CQ polling rate is controlled in nanosecond resolution to prevent CPU consumption due to excessive CQ polling. This enables improved 'real-time' monitoring of the socket ready packet queue.
Recommended value is 100-5000 (nsec)
Default: 0 (Disabled)

XLIO_RX_POLL_ON_TX_TCP

Enables TCP RX polling during TCP TX operation for faster TCP ACK reception.
Default: 0 (Disabled)

XLIO_GRO_STREAMS_MAX

Controls the number of TCP streams to perform GRO (generic receive offload) simultaneously.
Disable GRO with a value of 0.
Default: 32

XLIO_TCP_3T_RULES


Uses only 3 tuple rules for TCP, instead of using 5 tuple rules.
This can improve performance for a server with a listen socket which accepts many connections from the same source IP.
Enable with a value of 1.
Default: 0 (Disabled)

XLIO_UDP_3T_RULES

This parameter is relevant when the application uses connected UDP sockets. When the parameter is enabled, hardware flow steering uses 3 tuple rules; when it is disabled, it uses 5 tuple rules. Enabling this option can reduce hardware flow steering resources. However, when it is disabled, the application might see benefits in latency and cycles per packet.
Default: 1 (Enabled)
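
To see why 3 tuple rules can reduce steering resources, the sketch below counts the distinct rule keys needed for a set of connections. It assumes the common definition in which a 3 tuple is (protocol, local IP, local port) and a 5 tuple adds the remote IP and remote port; the exact fields XLIO uses are not stated in this section, and the function steering_rules is hypothetical.

```python
def steering_rules(connections, three_tuple: bool) -> set:
    """Return the set of distinct flow-steering keys for the connections."""
    rules = set()
    for proto, local_ip, local_port, remote_ip, remote_port in connections:
        if three_tuple:
            rules.add((proto, local_ip, local_port))
        else:
            rules.add((proto, local_ip, local_port, remote_ip, remote_port))
    return rules


# 100 flows to the same local address and port, differing in remote port.
conns = [("udp", "10.0.0.1", 7000, "10.0.0.2", 40000 + i) for i in range(100)]
print(len(steering_rules(conns, three_tuple=True)))   # 1
print(len(steering_rules(conns, three_tuple=False)))  # 100
```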

XLIO_ETH_MC_L2_ONLY_RULES

Uses only L2 rules for Ethernet Multicast.
All loopback traffic will be handled by XLIO instead of OS.
Enable with a value of 1.
Default: 0 (Disabled)

XLIO_STRQ

Enables or disables Striding Receive Queues.

Each WQE in a Striding RQ may receive several packets. Thus, the WQE buffer size is controlled by XLIO_STRQ_NUM_STRIDES x XLIO_STRQ_STRIDE_SIZE_BYTES.

Values: on, off

Default: on (Enabled)

XLIO_STRQ_NUM_STRIDES


The number of strides in each receive WQE. Must be a power of two and in the range [512 - 65536].

Default: 16384

XLIO_STRQ_STRIDE_SIZE_BYTES

The size, in bytes, of each stride in a receive WQE. Must be a power of two and in the range [64 - 8192].

Default: 512
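As a rough illustration of how the two stride parameters combine, the following C sketch checks the constraints quoted above and computes the resulting WQE buffer size. The helper names (is_power_of_two, strq_wqe_buffer_bytes) are hypothetical and are not part of XLIO.

```c
#include <stdbool.h>
#include <stddef.h>

/* Ranges quoted in the XLIO_STRQ_NUM_STRIDES and
 * XLIO_STRQ_STRIDE_SIZE_BYTES descriptions. */
enum {
    STRQ_NUM_STRIDES_MIN = 512,
    STRQ_NUM_STRIDES_MAX = 65536,
    STRQ_STRIDE_SIZE_MIN = 64,
    STRQ_STRIDE_SIZE_MAX = 8192
};

/* True when v is non-zero and has exactly one bit set. */
bool is_power_of_two(unsigned long v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical helper: returns the receive WQE buffer size in bytes,
 * or 0 when either value violates the documented constraints.
 */
size_t strq_wqe_buffer_bytes(unsigned long num_strides,
                             unsigned long stride_size_bytes)
{
    if (!is_power_of_two(num_strides) ||
        num_strides < STRQ_NUM_STRIDES_MIN ||
        num_strides > STRQ_NUM_STRIDES_MAX)
        return 0;
    if (!is_power_of_two(stride_size_bytes) ||
        stride_size_bytes < STRQ_STRIDE_SIZE_MIN ||
        stride_size_bytes > STRQ_STRIDE_SIZE_MAX)
        return 0;
    return (size_t)num_strides * (size_t)stride_size_bytes;
}
```

With the default values, strq_wqe_buffer_bytes(16384, 512) yields 8388608 bytes (8 MiB) per receive WQE.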

XLIO_STRQ_STRIDES_NUM_BUFS

The initial number of stride objects in the strides pool. Each received packet is represented by a stride object. Each stride object points to a portion of a buffer allocated for a receive WQE. When the pool runs out of stride objects it expands by another portion of this value.

Default: 262144

XLIO_STRQ_STRIDES_COMPENSATION_LEVEL

Number of spare stride objects a CQ holds to allow faster allocation of a stride object when a packet arrives.

Default: 16384

XLIO_SELECT_POLL

The duration in micro-seconds (usec) in which to poll the hardware on Rx path before blocking for an interrupt (when waiting and also when calling select(), poll(), or epoll_wait()).
Range:  -1, 0 … 100,000,000
Default: 100,000
When the selected path has successfully received poll hits, the latency improves dramatically. However, this comes at the expense of CPU utilization. For more information, see Debugging, Troubleshooting, and Monitoring.

XLIO_SELECT_POLL_OS_RATIO

This enables polling the OS file descriptors while the user thread calls select(), poll(), or epoll_wait(), and XLIO is busy in the offloaded socket polling loop. This results in a single poll of the non-offloaded sockets every XLIO_SELECT_POLL_OS_RATIO offloaded socket (CQ) polls.
When disabled, only offloaded sockets are polled.
(See XLIO_SELECT_POLL for more information.)
Disable with 0
Default: 10
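The ratio can be pictured as a simple counter in the polling loop. The sketch below is only an illustration of that idea and not XLIO's implementation; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical state for one select()/poll()/epoll_wait() polling loop. */
struct poll_loop {
    unsigned os_ratio;  /* XLIO_SELECT_POLL_OS_RATIO; 0 disables OS polling */
    unsigned cq_polls;  /* offloaded (CQ) polls since the last OS poll */
};

/*
 * Call once per offloaded (CQ) poll. Returns true when this iteration
 * should also poll the non-offloaded (OS) file descriptors.
 */
bool should_poll_os(struct poll_loop *loop)
{
    if (loop->os_ratio == 0)
        return false;               /* only offloaded sockets are polled */
    if (++loop->cq_polls < loop->os_ratio)
        return false;
    loop->cq_polls = 0;             /* start counting toward the next OS poll */
    return true;
}
```

With os_ratio set to 10, every tenth call returns true, so 100 CQ polls result in 10 OS polls.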

XLIO_SELECT_SKIP_OS

In select(), poll(), or epoll_wait(), forces XLIO to check the non-offloaded sockets even though an offloaded socket has a ready packet that was found while polling.
Range:  0 … 10,000
Default: 4

XLIO_CQ_POLL_BATCH_MAX

The maximum size of the array while polling the CQs in the XLIO.
Default: 16

XLIO_PROGRESS_ENGINE_INTERVAL

An internal XLIO thread safety check which verifies that the CQ is drained at least once every N milliseconds. This mechanism allows XLIO to progress the TCP stack even when the application does not access its socket (and so does not provide a context to XLIO). If the CQ was already drained by the application's receive socket API calls, this thread goes back to sleep without any processing.
Disable with 0
Default: 10 milliseconds

XLIO_PROGRESS_ENGINE_WCE_MAX

Each time the XLIO internal thread starts its CQ draining, it stops when it reaches this maximum value.
The application is not limited by this value in the number of CQ elements that it can process from calling any of the receive path socket APIs.
Default: 2048

XLIO_CQ_MODERATION_ENABLE

Enable CQ interrupt moderation.
Default: 1 (Enabled)

XLIO_CQ_MODERATION_COUNT

Number of packets to hold before generating interrupt.
Default: 48

XLIO_CQ_MODERATION_PERIOD_USEC

Period in microseconds for holding the packet before generating interrupt.
Default: 50

XLIO_CQ_AIM_MAX_COUNT

Maximum count value to use in the adaptive interrupt moderation algorithm.
Default: 560

XLIO_CQ_AIM_MAX_PERIOD_USEC

Maximum period value to use in the adaptive interrupt moderation algorithm.
Default: 250

XLIO_CQ_AIM_INTERVAL_MSEC

Frequency of interrupt moderation adaptation.
Interval in milliseconds between adaptation attempts.
Use value of 0 to disable adaptive interrupt moderation.
Default: 250

XLIO_CQ_AIM_INTERRUPTS_RATE_PER_SEC


Desired interrupts rate per second for each ring (CQ).
The count and period parameters for CQ moderation will change automatically to achieve the desired interrupt rate for the current traffic rate.
Default: 5000

XLIO_CQ_KEEP_QP_FULL

If disabled, the CQ does not try to compensate for each poll on the receive path. It uses a "debt" to remember how many WREs are missing from each QP, so that it can fill them when buffers become available.
If enabled, the CQ tries to compensate the QP for each polled receive completion. If there is a shortage of buffers, it reposts a recently completed buffer. This causes a packet drop, which is monitored in xlio_stats.
Default: 1 (Enabled)
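The difference between the two XLIO_CQ_KEEP_QP_FULL modes can be sketched with simple counters. This is a simplified model built only from the description above, not XLIO's code; the rx_qp struct and both functions are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one receive QP. */
struct rx_qp {
    unsigned posted;     /* receive buffers currently posted to the QP */
    unsigned free_bufs;  /* spare buffers available for posting */
    unsigned debt;       /* WREs missing from the QP, to be refilled later */
    unsigned dropped;    /* packets lost because their buffer was reposted */
};

/* Called for each polled receive completion. */
void on_rx_completion(struct rx_qp *qp, bool keep_qp_full)
{
    qp->posted--;                /* the completed buffer left the QP */
    if (!keep_qp_full) {
        qp->debt++;              /* remember the missing WRE, refill later */
        return;
    }
    if (qp->free_bufs > 0)
        qp->free_bufs--;         /* compensate with a spare buffer */
    else
        qp->dropped++;           /* shortage: repost the completed buffer */
    qp->posted++;
}

/* Called when n buffers become available; pays off outstanding debt first. */
void on_buffers_available(struct rx_qp *qp, unsigned n)
{
    qp->free_bufs += n;
    while (qp->debt > 0 && qp->free_bufs > 0) {
        qp->debt--;
        qp->free_bufs--;
        qp->posted++;
    }
}
```

In this model the enabled mode keeps the posted count constant at the cost of a possible drop, while the disabled mode lets the posted count dip until buffers return.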

XLIO_QP_COMPENSATION_LEVEL

The number of spare receive buffers that the CQ holds to allow refilling the QP while full receive buffers are being processed inside XLIO.
Default: 256 buffers

XLIO_OFFLOADED_SOCKETS

Creates all sockets as offloaded/not-offloaded by default.

  • 1 is used for offloaded
  • 0 is used for not-offloaded

Default: 1 (Enabled)

XLIO_TIMER_RESOLUTION_MSEC

Controls the XLIO internal thread wakeup timer resolution (in milliseconds).
Default: 10 (milliseconds)

XLIO_TCP_TIMER_RESOLUTION_MSEC

Controls XLIO internal TCP timer resolution (fast timer) (in milliseconds). Minimum value is the internal thread wakeup timer resolution (XLIO_TIMER_RESOLUTION_MSEC).
Default: 100 (milliseconds)

XLIO_TCP_CTL_THREAD

Does all TCP control flows in the internal thread.
This feature should be disabled if using blocking poll/select (epoll is OK).

  • Use value of 0 to disable
  • Use value of 1 to wake up the thread when there is work to be done
  • Use value of 2 to wait for thread timer to expire

Default: 0 (disabled)

XLIO_TCP_TIMESTAMP_OPTION

Currently, LWIP does not support the RTTM and PAWS mechanisms.
See RFC1323 for info.

  • Use value of 0 to disable (enabling causes a slight performance degradation of ~50-100 nanoseconds per half round trip).
  • Use value of 1 to enable.
  • Use value of 2 to follow the OS setting.

Default: 0 (disabled)

XLIO_TCP_NODELAY

If set, disables the Nagle algorithm option for each TCP socket during initialization. This means that TCP segments are always sent as soon as possible, even if there is only a small amount of data.
For more information on the TCP_NODELAY flag, refer to the TCP manual page.
Valid Values are:

  • 0 to disable.
  • 1 to enable (default)

XLIO_TCP_QUICKACK

If set, disables the delayed acknowledgment ability. This means that TCP responds after every packet.
For more information on the TCP_QUICKACK flag, refer to the TCP manual page.
Valid Values are:

  • 0 to disable.
  • 1 to enable (default)

XLIO_EXCEPTION_HANDLING

Handles missing support or error cases in Socket API or functionality by XLIO.
It quickly identifies XLIO unsupported Socket API or features.

  • Use value of -1 to handle DEBUG severity
  • Use value of 0 to log DEBUG message and try recovering via Kernel network stack (un-offloading the socket)
  • Use value of 1 to log ERROR message and try recovering via Kernel network stack (un-offloading the socket)
  • Use value of 2 to log ERROR message and return the API's respective error code
  • Use value of 3 to log ERROR message and abort application (throw xlio_error exception).

Default: -1

XLIO_AVOID_SYS_CALLS_ON_TCP_FD

For TCP fd, avoid system calls for the supported options of: ioctl, fcntl, getsockopt, setsockopt.
Non-supported options will go to OS.
To activate, use XLIO_AVOID_SYS_CALLS_ON_TCP_FD=1.
Default: 0 (disabled)

XLIO_THREAD_MODE

By default XLIO is ready for multi-threaded applications, meaning it is thread-safe.
If the user application is single threaded, use this configuration parameter to help eliminate XLIO locks and improve performance.
Values:

  • 0 Single-threaded application
  • 1 Multi-threaded application with spin lock
  • 2 Multi-threaded application with mutex lock
  • 3 Multi-threaded application with more threads than cores using spin lock

Default: 1 (Multi with spin lock)

XLIO_MEM_ALLOC_TYPE

This replaces the XLIO_HUGETBL parameter logic.
XLIO will try to allocate data buffers as configured:

  • 0 "ANON" using malloc
  • 1 "CONTIG" using contiguous pages
  • 2 "HUGEPAGES" using huge pages.

OFED will also try to allocate QP & CQ memory accordingly:

  • 0 "ANON" (default): use current pages (ANON small ones).
  • "HUGE": force huge pages.
  • "CONTIG": force contig pages.
  • 1 "PREFER_CONTIG": try contig, fall back to ANON small pages.
  • "PREFER_HUGE": try huge, fall back to ANON small pages.
  • 2 "ALL": try huge, fall back to contig, and if that fails fall back to ANON small pages.

To override OFED use: (MLX_QP_ALLOC_TYPE, MLX_CQ_ALLOC_TYPE).
Default: 1 (Contiguous pages)

XLIO_FORK

Controls XLIO fork support. Setting this flag on will cause XLIO to call ibv_fork_init() function. ibv_fork_init() initializes libibverbs's data structures to handle fork() function calls correctly and avoid data corruption.
If ibv_fork_init() is not called or returns a non-zero status, then libibverbs data structures are not fork()-safe and the effect of an application calling fork() is undefined.
ibv_fork_init() works on Linux kernels 2.6.17 and later, which support the MADV_DONTFORK flag for madvise().
XLIO allocates huge pages (XLIO_HUGETBL) by default.
For limitations of using fork() with XLIO, please refer to the Release Notes.
Default: 1 (Enabled)

XLIO_MTU

Size of each Rx and Tx data buffer (Maximum Transfer Unit).
This value sets the fragmentation size of the packets sent by the XLIO library.

  • If XLIO_MTU is 0, then for each interface XLIO will follow the actual MTU
  • If XLIO_MTU is greater than 0, then this MTU value is applicable to all interfaces regardless of their actual MTU

Default: 0 (following interface actual MTU)

XLIO_MSS

Defines the max TCP payload size that can be sent without IP fragmentation.
Value of 0 will set XLIO's TCP MSS to be aligned with XLIO_MTU configuration (leaving 40 bytes of room for IP + TCP headers; "TCP MSS = XLIO_MTU - 40").
Other XLIO_MSS values will force XLIO's TCP MSS to that specific value.
Default: 0 (following XLIO_MTU)
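The relationship between XLIO_MTU, XLIO_MSS, and the interface MTU described above can be written out as a small calculation. The function names below are hypothetical and serve only to illustrate the "TCP MSS = XLIO_MTU - 40" rule.

```c
/* IPv4 (20 bytes) + TCP (20 bytes) headers without options,
 * matching the 40 bytes in "TCP MSS = XLIO_MTU - 40". */
#define IP_TCP_HEADER_BYTES 40u

/* XLIO_MTU of 0 means "follow the interface's actual MTU". */
unsigned effective_mtu(unsigned xlio_mtu, unsigned interface_mtu)
{
    return xlio_mtu != 0 ? xlio_mtu : interface_mtu;
}

/* XLIO_MSS of 0 means "derive the MSS from the effective MTU". */
unsigned effective_tcp_mss(unsigned xlio_mss, unsigned mtu)
{
    if (xlio_mss != 0)
        return xlio_mss;            /* forced to the configured value */
    return mtu > IP_TCP_HEADER_BYTES ? mtu - IP_TCP_HEADER_BYTES : 0;
}
```

For example, with both parameters left at 0 on an interface whose MTU is 1500, the resulting MSS is 1460 bytes.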

XLIO_CLOSE_ON_DUP2

When this parameter is enabled, XLIO handles the duplicated file descriptor (oldfd), as if it is closed (clear internal data structures) and only then forwards the call to the OS.
This is, in effect, a very rudimentary dup2 support. It supports only the case where dup2 is used to close file descriptors.
Default: 1 (Enabled)

XLIO_INTERNAL_THREAD_AFFINITY

Controls which CPU core(s) the XLIO internal thread is serviced on. The CPU set should be provided either as a hexadecimal value that represents a bitmask or as a comma-delimited list of values (ranges are OK). Both the bitmask and comma-delimited list methods are identical to what is supported by the taskset command. See the taskset man page for additional information.

The -1 value disables the Internal Thread Affinity setting by XLIO.

Bitmask examples:

0x00000001 – Run on processor 0
0x00000007 – Run on processors 0, 1, and 2

Comma delimited examples:

0,4,8 – Run on processors 0,4, and 8
0,1,7-10 – Run on processors 0,1,7,8,9 and 10

Default: -1.
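To make the two notations concrete, the sketch below converts either form into a 64-bit CPU mask. It is a hypothetical parser, not the one XLIO uses: it assumes that a value starting with "0x" is a bitmask and that anything else is a comma-delimited list, and it supports CPU numbers 0 through 63 only.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser (not XLIO's own): converts an affinity string into
 * a 64-bit CPU mask where bit N stands for processor N.
 * Returns 0 on malformed input or CPU numbers above 63.
 */
uint64_t affinity_to_mask(const char *s)
{
    char *end;

    /* Bitmask form, e.g. "0x00000001". */
    if (strncmp(s, "0x", 2) == 0 || strncmp(s, "0X", 2) == 0) {
        uint64_t hex_mask = strtoull(s, &end, 16);
        return (*end == '\0') ? hex_mask : 0;
    }

    /* List form, e.g. "0,1,7-10". */
    uint64_t mask = 0;
    while (*s != '\0') {
        unsigned long first = strtoul(s, &end, 10);
        if (end == s)
            return 0;            /* no digits where a CPU number was expected */
        unsigned long last = first;
        if (*end == '-') {       /* range such as 7-10 */
            s = end + 1;
            last = strtoul(s, &end, 10);
            if (end == s)
                return 0;
        }
        if (first > last || last > 63)
            return 0;
        for (unsigned long cpu = first; cpu <= last; cpu++)
            mask |= UINT64_C(1) << cpu;
        if (*end == ',')
            end++;               /* skip the delimiter */
        else if (*end != '\0')
            return 0;            /* unexpected character */
        s = end;
    }
    return mask;
}
```

Under these assumptions, affinity_to_mask("0,4,8") returns 0x111 and affinity_to_mask("0,1,7-10") returns 0x783.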

XLIO_INTERNAL_THREAD_CPUSET

Selects a CPUSET for XLIO internal thread (For further information, see man page of cpuset).
The value is either the path to the CPUSET (for example: /dev/cpuset/my_set), or an empty string to run it on the same CPUSET the process runs on.

XLIO_INTERNAL_THREAD_ARM_CQ

Wakes up the internal thread for each packet that the CQ receives.
Polls and processes the packet and brings it to the socket layer.
This can minimize latency for a busy application that is not available to receive the packet when it arrives.
However, this might decrease performance for high pps rate applications.  
Default: 0 (Disabled)

XLIO_INTERNAL_THREAD_TCP_TIMER_HANDLING

Selects the internal thread policy when handling TCP timers.
Use value of 0 for deferred handling. The internal thread will not handle TCP timers upon timer expiration (once every 100ms) in order to let application threads handle them first.
Use value of 1 for immediate handling. The internal thread will try locking and handling TCP timers upon timer expiration (once every 100ms). Application threads may be blocked until the internal thread finishes handling TCP timers.
Default value is 0 (deferred handling)

XLIO_WAIT_AFTER_JOIN_MSEC

This parameter indicates the delay, in milliseconds, before the first packet is sent after receiving the multicast JOINED event from the SM.
This helps to overcome the loss of the first few packets of an outgoing stream due to the SM's lengthy handling of MFT configuration on the switch chips.
Default: 0 (milli-sec)

XLIO_NEIGH_UC_ARP_QUATA

XLIO sends a UC ARP when the neigh state is NUD_STALE.
If the neigh state is still NUD_STALE, XLIO retries sending the UC ARP up to XLIO_NEIGH_UC_ARP_QUATA times and then sends a BC ARP.
Default: 3

XLIO_NEIGH_UC_ARP_DELAY_MSEC

This parameter indicates number of msec to wait between every UC ARP.
Default: 10000

XLIO_NEIGH_NUM_ERR_RETRIES

Indicates number of retries to restart NEIGH state machine if NEIGH receives ERROR event.
Default: 1

XLIO_BF

Enables/disables BlueFlame usage of the card.
Default: 1 (Enabled)

XLIO_TSO

With Segmentation Offload, or TCP Large Send, TCP can pass a buffer to be transmitted that is bigger than the maximum transmission unit (MTU) supported by the medium. Intelligent adapters implement large sends by using the prototype TCP and IP headers of the incoming send buffer to carve out segments of required size. Copying the prototype header and options, then calculating the sequence number and checksum fields creates TCP segment headers.

Expected benefits: Throughput increase and CPU unload.

Default value: auto

  • auto – Depends on ethtool setting and adapter ability. See ethtool -k <eth0> | grep tcp-segmentation-offload
  • on – Enabled if the adapter supports it
  • off – Disabled

XLIO_LRO

Large receive offload (LRO) is a technique for increasing inbound throughput of high-bandwidth network connections by reducing central processing unit (CPU) overhead. It works by aggregating multiple incoming packets from a single stream into a larger buffer before they are passed higher up the networking stack, thus reducing the number of packets that must be processed.

Default value: auto 

  • auto – Depends on ethtool setting and adapter ability. See ethtool -k <eth0> | grep large-receive-offload
  • on – Enabled if the adapter supports it
  • off – Disabled

XLIO_TRIGGER_DUMMY_SEND_GETSOCKNAME

This parameter triggers a dummy packet send from getsockname() to warm up the caches.
For more information see section "Dummy Send" to Improve Low Message Rate Latency.
Default: 0 (Disable)

XLIO_UTLS_TX

When this parameter is enabled, XLIO offloads TLS TX path through kTLS API if possible.

Default: 1 (Enabled)

XLIO_UTLS_RX

When this parameter is enabled, XLIO offloads TLS RX path through kTLS API if possible.

Default: 1 (Enabled)

XLIO_SPEC

XLIO_SPEC sets all the required configuration parameters of XLIO. Usually, no additional configuration is required.

Example #1: XLIO_SPEC=latency (XLIO predefined specification profile for latency: Latency profile spec – optimized latency on all use cases. System is tuned to keep balance between Kernel and XLIO. Note: It may not reach the maximum bandwidth)

Example #2: XLIO_SPEC=multi_ring_latency (XLIO predefined specification profile for Multi ring latency – optimized for use cases that are keen on latency where two applications communicate using send-only and receive-only TCP sockets)

XLIO_NGINX_WORKER_NUM

This number must be equal to the ‘worker_processes’ attribute of the nginx configuration file.
Default: 0 (Disable)

XLIO_RING_ALLOCATION_LOGIC_TX

XLIO_RING_ALLOCATION_LOGIC_RX

Ring allocation logic is used to separate the traffic into different rings.
By default, all sockets use the same ring for both RX and TX over the same interface. For different interfaces, different rings are used, even when specifying the logic to be per socket or thread.
Using different rings is useful when tuning for a multi-threaded application and aiming for HW resource separation.

This feature might decrease performance for applications whose main processing loop is based on select() and/or poll().

The logic options are:

  • 0 – Ring per interface
  • 1 – Ring per IP address (using IP address)
  • 10 – Ring per socket (using socket ID as separator)
  • 20 – Ring per thread (using the ID of the thread in which the socket was created)
  • 30 – Ring per core (using CPU ID)
  • 31 – Ring per core - attach threads: attach each thread to a CPU core

Default: 0
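One way to picture the logic options is as a choice of separator key: sockets on the same interface that produce the same key share a ring. The sketch below models only that idea; the enum, struct, and function names are hypothetical and do not reflect XLIO's internal data structures.

```c
#include <stdint.h>

/* Values of XLIO_RING_ALLOCATION_LOGIC_TX / _RX as listed above. */
enum ring_logic {
    RING_LOGIC_PER_INTERFACE = 0,
    RING_LOGIC_PER_IP = 1,
    RING_LOGIC_PER_SOCKET = 10,
    RING_LOGIC_PER_THREAD = 20,
    RING_LOGIC_PER_CORE = 30,
    RING_LOGIC_PER_CORE_ATTACH_THREADS = 31
};

/* Hypothetical description of the socket that asks for a ring. */
struct ring_request {
    uint32_t ip_addr;    /* local IP address */
    int socket_fd;       /* socket ID */
    uint64_t thread_id;  /* thread in which the socket was created */
    int cpu_id;          /* CPU the creating thread runs on */
};

/*
 * Returns the separator key for the given logic. Requests on the same
 * interface with equal keys would share one ring in this model.
 * Option 31 additionally attaches the thread to the core, which this
 * sketch does not model.
 */
uint64_t ring_key(enum ring_logic logic, const struct ring_request *req)
{
    switch (logic) {
    case RING_LOGIC_PER_IP:
        return req->ip_addr;
    case RING_LOGIC_PER_SOCKET:
        return (uint64_t)req->socket_fd;
    case RING_LOGIC_PER_THREAD:
        return req->thread_id;
    case RING_LOGIC_PER_CORE:
    case RING_LOGIC_PER_CORE_ATTACH_THREADS:
        return (uint64_t)req->cpu_id;
    case RING_LOGIC_PER_INTERFACE:
    default:
        return 0;        /* a single ring per interface */
    }
}
```

Two sockets created by the same thread get different keys under logic 10 but the same key under logic 20, which is why per-thread logic suits applications where each thread owns its sockets.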

XLIO_DISABLE_FLOW_TAG

Enable/Disable flow-tag. Flow Tags improve RX performance for 5T rules.

Default: 0 (Flow tag enabled)

XLIO_SKIP_POLL_IN_RX

Allow TCP socket to skip CQ polling in Rx socket call.

0 - Disabled

1 - Skip always

2 - Skip only if this socket was added to epoll before

Default: 0 (Disabled)

XLIO_SOCKETXTREME

Enables the SocketXtreme API. The potential benefits of SocketXtreme are elimination of the copy operations, higher throughput, and lower latency.

XLIO_TCP_NODELAY_TRESHOLD

When this threshold is set, TCP_NODELAY takes effect only if the first segment length is larger than the threshold.

This skips the TCP_NODELAY for small segments and allows some aggregation.

Only the first segment is checked because the current behavior triggers output regardless of TCP_NODELAY when an unsent list contains at least two segments.

The value is in bytes.

Default: 0
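The threshold rule can be expressed as a small decision function. This is a simplified sketch based only on the description above: it ignores every other condition of the Nagle algorithm, and the function name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical decision helper: should the unsent data be pushed out now?
 *   nodelay         - TCP_NODELAY is set on the socket
 *   unsent_segments - number of segments in the unsent list
 *   first_seg_len   - length of the first unsent segment, in bytes
 *   threshold       - value of XLIO_TCP_NODELAY_TRESHOLD, in bytes
 */
bool should_output_now(bool nodelay, size_t unsent_segments,
                       size_t first_seg_len, size_t threshold)
{
    if (unsent_segments == 0)
        return false;            /* nothing to send */
    if (unsent_segments >= 2)
        return true;             /* output is triggered regardless of TCP_NODELAY */
    /* Single segment: TCP_NODELAY applies only above the threshold. */
    return nodelay && first_seg_len > threshold;
}
```

With the default threshold of 0, any non-empty first segment is sent immediately when TCP_NODELAY is set; a larger threshold lets small segments wait for aggregation.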

XLIO_DEFERRED_CLOSE

Postpone the close(2) syscall to the socket destructor to prevent the kernel from freeing the resources before RFS destruction. This variable may lead to unexpected EADDRINUSE and excessive fd consumption in the application.

0 - Disabled

1 - Enabled

Default: 0 (Disabled)

XLIO_TX_SEGS_RING_BATCH_TCP

The number of TCP segments fetched from the segments pool by a ring at once.

Min value: 1

Default: 1024

XLIO_MULTILOCK

Control locking type mechanism for some specific flows.

Note that usage of Mutex might increase latency.

0 - Spin

1 - Mutex

Default: 0 (Spin)


Beta Level Features Configuration Parameters

The following table lists configuration parameters and their possible values for new XLIO Beta level features. The parameters below are disabled by default.

These XLIO features are still experimental and subject to changes. They can help improve performance of multithread applications.

We recommend altering these parameters in a controlled environment until reaching the best performance tuning.

XLIO Configuration Parameter | Description and Examples

XLIO_RING_MIGRATION_RATIO_TX 

XLIO_RING_MIGRATION_RATIO_RX

Ring migration ratio is used with the "ring per thread" logic in order to decide when it is beneficial to replace the socket's ring with the ring allocated for the current thread.
Every XLIO_RING_MIGRATION_RATIO iterations (of accessing the ring), the current thread ID is checked to see whether the ring matches the current thread.
If not, ring migration is considered. If the ring continues to be accessed from the same thread for a certain number of iterations, the socket is migrated to this thread's ring.
Use a value of -1 in order to disable migration.
Default: 100
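The migration check can be modeled as two nested counters. The sketch below is an assumption-laden illustration rather than XLIO's algorithm: the struct, the function, and in particular the checks_to_migrate field (standing in for the unspecified number of iterations after which migration happens) are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-socket migration state. */
struct ring_migration {
    int ratio;              /* XLIO_RING_MIGRATION_RATIO_TX/_RX; -1 disables */
    int accesses;           /* ring accesses since the last thread check */
    uint64_t ring_thread;   /* thread that owns the socket's current ring */
    uint64_t candidate;     /* thread currently considered for migration */
    int candidate_checks;   /* consecutive checks that saw the candidate */
    int checks_to_migrate;  /* assumed number of checks before migrating */
};

/*
 * Call on every ring access. Returns true when the socket should be
 * migrated to the ring of current_thread.
 */
bool ring_access(struct ring_migration *m, uint64_t current_thread)
{
    if (m->ratio < 0)
        return false;                     /* migration disabled */
    if (++m->accesses < m->ratio)
        return false;                     /* not yet time to check */
    m->accesses = 0;

    if (current_thread == m->ring_thread) {
        m->candidate_checks = 0;          /* ring already matches the thread */
        return false;
    }
    if (current_thread != m->candidate) {
        m->candidate = current_thread;    /* start considering a new thread */
        m->candidate_checks = 1;
    } else {
        m->candidate_checks++;
    }
    if (m->candidate_checks < m->checks_to_migrate)
        return false;

    m->ring_thread = current_thread;      /* migrate to this thread's ring */
    m->candidate_checks = 0;
    return true;
}
```

A lower ratio makes the check more frequent, so sockets follow their threads sooner at the cost of more checks on the data path.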

XLIO_RING_LIMIT_PER_INTERFACE

Limits the number of rings that can be allocated per interface.
For example, in ring allocation per socket logic, if the number of sockets using the same interface is larger than the limit, several sockets will share the same ring.

XLIO_RX_BUFS might need to be adjusted in order to have enough buffers for all rings in the system. Each ring consumes XLIO_RX_WRE buffers.

Use a value of 0 for an unlimited number of rings.
Default: 0 (no limit)

XLIO_RING_DEV_MEM_TX

XLIO can use the on-device-memory to store the egress packet if it does not fit into the BF inline buffer. This improves application egress latency by reducing the PCI transactions.
XLIO_RING_DEV_MEM_TX enables the user to set the amount of on-device-memory buffer allocated for each TX ring.
The total size of the on-device-memory is limited to 256k for a single port HCA and to 128k for dual port HCA.
Default value is 0

XLIO_TCP_CC_ALGO

TCP congestion control algorithm.
The default algorithm that comes with LWIP is a variation of Reno/New-Reno.
The new Cubic algorithm was adapted from the FreeBSD implementation.
Use value of 0 for LWIP algorithm.
Use value of 1 for the Cubic algorithm.
Use value of 2 in order to disable the congestion algorithm.
Default: 0 (LWIP).

XLIO_NGINX_WORKER_NUM

This number must be equal to ‘worker_processes’ attribute of nginx configuration file.

Default: 0 (Disable)

Loading XLIO Dynamically

XLIO can be loaded using Dynamically Loaded (DL) libraries. These libraries are not automatically loaded at program link time or start-up as with LD_PRELOAD. Instead, there is an API for opening a library, looking up symbols, handling errors, and closing the library.

The example below demonstrates how to load the socket() function. Similarly, users should load all other network-related functions as declared in sock-redirect.h.

#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>     /* close() */

typedef int (*socket_fptr_t) (int __domain, int __type, int __protocol);

int main(int argc, const char** argv)
{
        void* lib_handle;
        socket_fptr_t xlio_socket;
        int fd;

        /* Open the XLIO library at run time instead of using LD_PRELOAD. */
        lib_handle = dlopen("libxlio.so", RTLD_LAZY);
        if (!lib_handle) {
                printf("FAILED to load libxlio.so: %s\n", dlerror());
                exit(1);
        }

        /* Look up the socket() implementation exported by the library. */
        xlio_socket = (socket_fptr_t)dlsym(lib_handle, "socket");
        if (xlio_socket == NULL)  {
                printf("FAILED to load socket()\n");
                exit(1);
        }

        /* Create a UDP socket through the loaded function. */
        fd = xlio_socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (fd < 0) {
                printf("FAILED open socket()\n");
                exit(1);
        }

        printf("socket creation succeeded fd = %d\n", fd);
        close(fd);
        dlclose(lib_handle);
        return 0;
}

For more information, please refer to dlopen man page.

For a complete example that includes all the necessary functions, see sockperf’s xlio-redirect.h and xlio_socket-redirect.cpp files.