XLIO Environment Variables
XLIO is configured using environment variables. For the full list of XLIO parameters, see the libxlio README file.
XLIO parameters must be set prior to loading the application with XLIO. You can set the parameters in a system file, which can be run manually or automatically.
All the parameters have defaults that can be modified.
On default startup, the XLIO library prints the XLIO version information, as well as the configuration parameters being used and their values to stderr.
XLIO always logs the value of the XLIO_TRACELEVEL parameter, even when it matches the default setting.
For all other parameters, XLIO logs the parameter values only when they are not equal to the default value.
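Because the parameters must be in place before the application is loaded with XLIO, a launcher script can prepare the environment first. The following is a minimal sketch, not an official tool. It assumes that XLIO is loaded through LD_PRELOAD and that the library is reachable as libxlio.so; the helper name build_xlio_env is hypothetical.

```python
import os


def build_xlio_env(params, preload="libxlio.so", base_env=None):
    """Return a copy of the environment with XLIO parameters set.

    Assumption: XLIO is loaded via LD_PRELOAD and the library can be
    found under the name given in `preload`.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name, value in params.items():
        env[name] = str(value)  # environment values are always strings
    env["LD_PRELOAD"] = preload
    return env


# Example: request TSO and LRO, starting from an empty base environment.
env = build_xlio_env({"XLIO_TSO": 1, "XLIO_LRO": 1}, base_env={})
```

The resulting dictionary can be passed as the env argument of subprocess.run() when starting the application.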
The following table lists configuration parameters which can be used to tune XLIO performance for more specific use cases.
Configures the execution model of XLIO. See XLIO Library Architecture for more information about execution models.
0 - Run to completion execution model (Default)
Positive Number - Worker Threads execution model - The value determines the number of XLIO worker threads.
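The mapping from the execution model value to its meaning can be sketched as follows. This is only an illustration of the two cases listed above; the function name is hypothetical, and rejecting negative values is an assumption, since the text does not say how XLIO treats them.

```python
def execution_model(value):
    """Interpret the execution model value described above.

    Returns a (model, worker_threads) tuple.
    """
    n = int(value)
    if n < 0:
        # Assumption: negative values are not meaningful here.
        raise ValueError("execution model value must be 0 or positive")
    if n == 0:
        return ("run_to_completion", 0)
    return ("worker_threads", n)
```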
With Segmentation Offload, or TCP Large Send, TCP can pass a buffer to be transmitted that is bigger than the maximum transmission unit (MTU) supported by the medium. Intelligent adapters implement large sends by using the prototype TCP and IP headers of the incoming send buffer to carve out segments of required size. Copying the prototype header and options, then calculating the sequence number and checksum fields creates TCP segment headers.
Expected benefits: Throughput increase and CPU unload.
Default value: auto (Depends on ethtool setting and adapter ability. See ethtool -k <eth0> | grep tcp-segmentation-offload)
Set XLIO_TSO=1 to ensure Segmentation Offload is on for TX throughput oriented applications.
Large receive offload (LRO) is a technique for increasing inbound throughput of high-bandwidth network connections by reducing central processing unit (CPU) overhead. It works by aggregating multiple incoming packets from a single stream into a larger buffer before they are passed higher up the networking stack, thus reducing the number of packets that must be processed.
Default value: auto (Depends on ethtool setting and adapter ability. See ethtool -k <eth0> | grep large-receive-offload)
Set XLIO_LRO=1 to ensure LRO is turned on for RX throughput oriented applications.
Max size of the array while polling the RX CQs in XLIO.
Default value is 16
Control the number of TCP streams to perform GRO (generic receive offload) simultaneously.
Disable GRO with a value of 0.
A GRO session is flushed after each RX poll cycle. See XLIO_CQ_POLL_BATCH_MAX.
Default value is 32
The duration in microseconds (usec) in which to poll the hardware on the Rx path before going to sleep (pending an interrupt, blocking on OS select(), poll() or epoll_wait()).
The max polling duration will be limited by the timeout the user is using when calling select(), poll() or epoll_wait().
When the select(), poll() or epoll_wait() path has successful receive poll hits (see performance monitoring), latency improves dramatically. This comes at the expense of CPU utilization.
Value range is -1, 0 to 100,000,000
Where value of -1 is used for infinite polling
Where value of 0 is used for no polling (interrupt driven)
Default value is 100000
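The interaction between the polling duration and the caller's timeout can be sketched as below. This is an illustration of the rules stated above, not XLIO code. The function name is hypothetical, and None is used here to mean "no upper bound".

```python
def effective_poll_usec(poll_usec, call_timeout_usec=None):
    """Return the longest time (usec) spent polling before sleeping.

    poll_usec: the configured duration (-1, or 0 to 100,000,000).
    call_timeout_usec: timeout passed to select()/poll()/epoll_wait(),
        or None if the call blocks indefinitely.
    Returns None when polling is unbounded.
    """
    if poll_usec != -1 and not 0 <= poll_usec <= 100_000_000:
        raise ValueError("value must be -1 or within 0..100,000,000")
    limit = None if poll_usec == -1 else poll_usec  # -1: infinite polling
    if call_timeout_usec is None:
        return limit
    if limit is None:
        return call_timeout_usec
    return min(limit, call_timeout_usec)  # the call timeout caps polling
```

For example, with the default of 100000 usec and a 500 usec epoll_wait() timeout, polling lasts at most 500 usec; with a value of 0 there is no polling at all.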
This will enable polling of the OS file descriptors while user thread calls select() or poll() and the XLIO is busy in the offloaded sockets polling loop. This will result in a single poll of the not-offloaded sockets every XLIO_SELECT_POLL offloaded sockets (CQ) polls.
When disabled, only offloaded sockets are polled.
(See XLIO_SELECT_POLL for more info)
Disable with 0
Default value is 10
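The interleaving described above can be pictured with a small simulation. It is a sketch under the assumption that the configured value acts as a ratio: one poll of the not-offloaded (OS) sockets after every N polls of the offloaded sockets. The function name is hypothetical.

```python
def polling_schedule(cq_polls, ratio):
    """List the order of polls for a given number of offloaded (CQ) polls.

    ratio: OS sockets are polled once every `ratio` CQ polls; 0 disables
    OS polling, so only offloaded sockets are polled.
    """
    schedule = []
    for i in range(1, cq_polls + 1):
        schedule.append("cq")
        if ratio > 0 and i % ratio == 0:
            schedule.append("os")
    return schedule
```

With the default value of 10, twenty CQ polls are interleaved with two OS polls; with 0, the schedule contains CQ polls only.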
If set, disable delayed acknowledge ability.
This means that TCP responds after every packet.
For more information on the TCP_QUICKACK flag, refer to the TCP manual page.
Valid values are:
Use value of 0 to disable.
Use value of 1 to enable.
Default value is Disabled.
TCP send buffer size.
Default value is 1MB.
Size of Tx data buffer elements allocation.
Cannot be less than the MTU (Maximum Transfer Unit) or greater than 0xFF00.
The default value is calculated based on XLIO_MTU and XLIO_MSS.
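The size bounds above amount to a simple range check, sketched here for illustration. The function name is hypothetical, and whether XLIO rejects or adjusts an out-of-range value is not stated in this document.

```python
MAX_TX_BUF_SIZE = 0xFF00  # upper bound from the text (65280 bytes)


def is_valid_tx_buf_size(size, mtu):
    """Check that a Tx buffer element size lies within [mtu, 0xFF00]."""
    return mtu <= size <= MAX_TX_BUF_SIZE
```

For an MTU of 1500 bytes (an example value), any size from 1500 up to 65280 passes the check.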
This parameter enables/disables TCP RX polling during TCP TX operation for faster TCP ACK reception.
Default: 0 (Disabled)
Allow TCP socket to skip CQ polling in rx socket call.
0 - Disabled
1 - Skip always
2 - Skip only if this socket was added to epoll before
Default: 0 (Disabled)
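The three values above can be read as the following decision. This is a sketch of the documented semantics only; the function and argument names are hypothetical.

```python
def should_skip_cq_poll(mode, added_to_epoll):
    """Decide whether an rx socket call skips CQ polling.

    mode: 0 (disabled), 1 (skip always), 2 (skip only if the socket
        was added to epoll before).
    added_to_epoll: True if this socket was previously added to epoll.
    """
    if mode == 0:
        return False
    if mode == 1:
        return True
    if mode == 2:
        return bool(added_to_epoll)
    raise ValueError("mode must be 0, 1 or 2")
```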
XLIO predefined specification profiles.
latency
Optimized for use cases that are keen on latency.
Example: XLIO_SPEC=latency
Control locking type mechanism for some specific flows.
Note that usage of Mutex might increase latency.
0 - Spin
1 - Mutex
Default: 0 (Spin)
Use only 2 tuple rules for TCP connections, instead of using 5 tuple rules.
This can help to overcome steering limitations for outgoing TCP connections. However, this option requires a unique local IP address per XLIO ring. In the default ring per thread configuration, this means that each thread must bind its sockets to a thread local IP address.
Default: 0 (Disabled)
In scenarios with a high scale of non-blocking sockets in event-driven usage, such as epoll/poll/select, turning on this parameter (XLIO_RX_CQ_WAIT_CTRL=1) avoids high CPU usage inside the kernel while processing thread wakeup.
Desired interrupts rate per second for each ring (CQ).
The count and period parameters for CQ moderation will change automatically
to achieve the desired interrupt rate for the current traffic rate.
Default value is 10000
Note: Adjusting the settings of this parameter can address different CPU utilization issues - see CPU Utilization Tuning.
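As a rough illustration of how a desired interrupt rate relates to moderation count and period, consider the arithmetic below. It is not XLIO's actual adaptation algorithm, which this document does not describe; the function name and the packets_per_sec input are assumptions made for the example.

```python
def moderation_targets(desired_irq_per_sec, packets_per_sec):
    """Estimate a moderation period and packet count for a target rate.

    period_usec: time between interrupts at the desired rate.
    count: packets accumulated per interrupt at the current traffic rate
        (at least 1).
    """
    if desired_irq_per_sec <= 0:
        raise ValueError("desired interrupt rate must be positive")
    period_usec = 1_000_000 / desired_irq_per_sec
    count = max(1, round(packets_per_sec / desired_irq_per_sec))
    return period_usec, count
```

At the default of 10000 interrupts per second, the period works out to 100 usec; at one million packets per second, that corresponds to roughly 100 packets per interrupt.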
Control XLIO internal thread wakeup timer resolution (in milliseconds).
Default value is 10
Note: Adjusting the settings of this parameter can address different CPU utilization issues - see CPU Utilization Tuning.
Control internal TCP timer resolution (fast timer) in milliseconds.
Minimum value is the thread wakeup timer resolution configured in
performance.threading.internal_handler.timer_msec.
Default value is 100
Note: Adjusting the settings of this parameter can address different CPU utilization issues - see CPU Utilization Tuning.
The following table lists configuration parameters and their possible values for new XLIO Beta-level features. The parameters below are disabled by default.
These XLIO features are still experimental and subject to change. They can help improve the performance of multithreaded applications.
We recommend altering these parameters in a controlled environment until the best performance tuning is reached.
XLIO_RING_MIGRATION_RATIO_TX
XLIO_RING_MIGRATION_RATIO_RX
Ring migration ratio is used with the "ring per thread" logic in order to decide when it is beneficial to replace the socket's ring with the ring allocated for the current thread.
Each XLIO_RING_MIGRATION_RATIO iteration (of accessing the ring), the current thread ID is checked to see whether the ring matches the current thread.
If not, ring migration is considered. If the ring continues to be accessed from the same thread for a certain number of iterations, the socket is migrated to this thread's ring.
Use a value of -1 in order to disable migration.
Default: -1
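The migration decision described above can be modelled with a small state machine. This is a simplified sketch, not the XLIO implementation. The class name and the stable_checks parameter (how many consecutive checks must see the same foreign thread before migrating) are hypothetical, because the text does not give the exact threshold. The sketch also treats any non-positive ratio as "migration disabled", while the text only specifies -1.

```python
class RingMigrationTracker:
    """Track ring accesses and decide when a socket would migrate."""

    def __init__(self, owner, ratio, stable_checks=2):
        self.owner = owner                  # thread that owns the ring
        self.ratio = ratio                  # check every `ratio` accesses
        self.stable_checks = stable_checks  # hypothetical threshold
        self._accesses = 0
        self._candidate = None
        self._streak = 0

    def access(self, thread_id):
        """Record one ring access; return True if a migration happens."""
        if self.ratio <= 0:                 # -1 disables migration
            return False
        self._accesses += 1
        if self._accesses % self.ratio != 0:
            return False                    # not a check iteration
        if thread_id == self.owner:
            self._candidate, self._streak = None, 0
            return False
        if thread_id == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = thread_id, 1
        if self._streak >= self.stable_checks:
            self.owner = thread_id          # migrate to this thread's ring
            self._candidate, self._streak = None, 0
            return True
        return False
```

With ratio=2 and stable_checks=2, a socket whose ring belongs to thread 1 migrates on the fourth consecutive access from thread 2.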
XLIO_RING_DEV_MEM_TX
XLIO can use the on-device-memory to store the egress packet if it does not fit into the BF inline buffer. This improves application egress latency by reducing the PCI transactions.
XLIO_RING_DEV_MEM_TX enables the user to set the amount of on-device memory allocated for each TX ring.
The total size of the on-device memory is limited to 256k for a single-port HCA and to 128k for a dual-port HCA.
Default value is 0
XLIO_TCP_CC_ALGO
TCP congestion control algorithm.
The default algorithm coming with LWIP is a variation of Reno/New-Reno.
The new Cubic algorithm was adapted from FreeBSD implementation.
Use value of 0 for LWIP algorithm.
Use value of 1 for the Cubic algorithm.
Use value of 2 in order to disable the congestion algorithm.
Default: 0 (LWIP).
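For scripts that generate or validate configurations, the XLIO_TCP_CC_ALGO values can be mapped to readable names as in the sketch below. The helper and the label strings are hypothetical and exist only for this example.

```python
TCP_CC_ALGOS = {
    0: "lwip (Reno/New-Reno variation)",
    1: "cubic",
    2: "disabled",
}


def cc_algo_name(value="0"):
    """Translate an XLIO_TCP_CC_ALGO value into a descriptive label."""
    key = int(value)
    if key not in TCP_CC_ALGOS:
        raise ValueError("XLIO_TCP_CC_ALGO must be 0, 1 or 2")
    return TCP_CC_ALGOS[key]
```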