XLIO - Configuration Reference
This document is a reference for all XLIO configuration parameters, organized by functional category.
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
acceleration_control.app_id | Specify a group of rules from |
|
| Example: |
acceleration_control.default_acceleration | Create all sockets as offloaded or not offloaded by default |
|
| Values:
|
acceleration_control.rules | Defines transport protocol and offload settings for specific applications or processes. Maps to configuration in | - |
| Note: Example: |
acceleration_control.rules[].id | Unique identifier for this transport control rule | - | - | - |
acceleration_control.rules[].name | Name of the application this rule applies to | - | - | - |
acceleration_control.rules[].actions | Action directives that modify transport behavior | - | - | Format: use <transport> <role> <address| >:<port range| > See Descriptions table. |
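
The dotted parameter names used throughout this reference can be read as paths into a nested structure. The sketch below illustrates that reading only; it assumes a nested layout, which the table itself does not state. `expand_dotted` is a hypothetical helper, and the values shown are placeholders rather than documented defaults.

```python
def expand_dotted(flat):
    """Turn {"a.b.c": value} into {"a": {"b": {"c": value}}}."""
    nested = {}
    for dotted_key, value in flat.items():
        node = nested
        parts = dotted_key.split(".")
        # Walk (and create) every intermediate level of the path.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


# Placeholder values, chosen for illustration only.
settings = expand_dotted({
    "acceleration_control.default_acceleration": True,
    "acceleration_control.app_id": "example_app",
})
print(settings)
```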
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
applications.nginx.distribute_cq | Distributes completion queue (CQ) processing across NGINX worker processes to improve performance |
|
| Helps balance CQ handling among worker processes for higher throughput |
applications.nginx.src_port_stride | Controls how source ports are distributed across NGINX worker processes |
|
| Determines port stepping between workers; useful for load balancing |
applications.nginx.udp_pool_size | Defines the size of the UDP socket pool for NGINX. When set >0, a closed UDP socket is returned to the pool instead of being destroyed |
|
| Enables reuse of UDP sockets to reduce allocation overhead |
applications.nginx.udp_socket_pool_reuse | Controls reuse of UDP socket pools for NGINX deployments |
|
| Improves efficiency in UDP-heavy traffic patterns |
applications.nginx.workers_num | Number of NGINX worker processes to optimize for. Must be set to offload NGINX successfully |
|
| Required for enabling NGINX offload support |
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
core.daemon.dir | Directory path for XLIO to write files used by |
|
| When used, |
core.daemon.enable | Enable the XLIO daemon service for additional monitoring capabilities |
|
| - |
core.exception_handling.mode | Mode for handling missing support or error cases in the Socket API or other XLIO functionality |
|
|
|
core.quick_init | Avoid extra checks to reduce initialization time (may fail under system misconfiguration) |
|
| Note: If enabled and hugepages are requested beyond the cgroup limit, XLIO may crash |
core.resources.external_memory_limit | Memory limit for external user allocator ( |
|
| Supports suffixes: B, KB, MB, GB |
core.resources.heap_metadata_block_size | Size of metadata block added to every heap allocation |
|
| Supports suffixes: B, KB, MB, GB |
core.resources.hugepages.enable | Use huge pages for data buffers to improve performance by reducing TLB misses; overrides rdma-core parameters |
|
|
|
core.resources.hugepages.size | Force specific hugepage size for internal allocations; |
|
| Must be power of 2 or |
core.resources.memory_limit | Pre-allocated memory limit for buffers. Dynamic allocations may exceed this. |
|
| Supports suffixes: B, KB, MB, GB |
core.signals.sigint.exit | Call XLIO handler on SIGINT and then application's handler (if exists) |
|
| - |
core.signals.sigsegv.backtrace | Print backtrace when a segmentation fault occurs |
|
| - |
core.syscall.allow_privileged_sockopt | Permit use of privileged socket options that may require special permissions |
|
| - |
core.syscall.avoid_ctl_syscalls | For TCP FDs, avoid system calls for supported options ( |
|
| Unsupported options fallback to OS |
core.syscall.deferred_close | Defer closing file descriptors until the socket is actually closed (useful in multithreaded apps) |
|
| - |
core.syscall.dup2_close_fd | Handle |
|
| Rudimentary |
core.syscall.fork_support | Enable |
|
| - |
core.syscall.getsockname_dummy_send | Trigger dummy packet send from |
|
| - |
core.syscall.sendfile_cache_limit | Memory limit for mapping cache used by |
|
| Supports suffixes: B, KB, MB, GB |
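
Several parameters above accept sizes with the suffixes B, KB, MB, and GB. The following sketch shows one way such strings could be converted to bytes. It assumes binary multiples (1 KB = 1024 bytes) and that a bare number means bytes; neither is stated in the table, and `parse_size` is a hypothetical helper, not part of XLIO.

```python
import re

# Assumption: binary multiples. The reference does not say whether KB is 1000 or 1024.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text):
    """Convert strings such as "512", "64KB" or "2GB" into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(B|KB|MB|GB)?\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    # Assumption: a missing suffix is treated as bytes.
    return int(number) * _UNITS[(unit or "B").upper()]


print(parse_size("2GB"))
```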
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
hardware_features.striding_rq.enable | Enable or disable Striding Receive Queues (each WQE in a Striding RQ can receive multiple packets) |
|
| The WQE buffer size is determined by |
hardware_features.striding_rq.stride_size | Size in bytes of each stride in a receive WQE; must be a power of two and within [64–8192] |
|
| Range: |
hardware_features.striding_rq.strides_num | Number of strides in each receive WQE; must be a power of two and within [512–65536] |
|
| Range: |
hardware_features.tcp.lro | Large Receive Offload (LRO): increases inbound throughput by reducing CPU overhead via packet aggregation |
|
|
|
hardware_features.tcp.tls_offload.dek_cache_max_size | Maximum Data Encryption Key (DEK) cache size for TLS offload |
|
| - |
hardware_features.tcp.tls_offload.dek_cache_min_size | Minimum DEK cache size for TLS offload |
|
| - |
hardware_features.tcp.tls_offload.rx_enable | Offload TLS RX path through kTLS API if possible (uses UTLS for acceleration) |
|
| - |
hardware_features.tcp.tls_offload.tx_enable | Offload TLS TX path through kTLS API if possible (uses UTLS for acceleration) |
|
| - |
hardware_features.tcp.tso.enable | TCP Segmentation Offload (TSO): allows TCP to transmit buffers larger than the MTU using adapter segmentation |
|
|
|
hardware_features.tcp.tso.max_size | Maximum TCP segment size (in bytes) allowed with TSO |
|
| Supports suffixes: B, KB, MB, GB |
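
The Striding RQ constraints above (power of two, bounded range) are easy to check before a configuration is deployed. The sketch below encodes only the limits quoted in the table; `check_striding_rq` is a hypothetical helper and does not reflect how XLIO itself validates these values.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... using the single-set-bit trick."""
    return n > 0 and (n & (n - 1)) == 0


def check_striding_rq(stride_size, strides_num):
    """Return a list of problems; an empty list means both values are acceptable."""
    errors = []
    if not (is_power_of_two(stride_size) and 64 <= stride_size <= 8192):
        errors.append("stride_size must be a power of two within [64, 8192]")
    if not (is_power_of_two(strides_num) and 512 <= strides_num <= 65536):
        errors.append("strides_num must be a power of two within [512, 65536]")
    return errors


print(check_striding_rq(stride_size=64, strides_num=2048))
```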
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
monitor.exit_report | Print a human-readable report of resource usage at process exit. Printed during termination and may be missed if process ends with |
|
|
|
monitor.log.colors | Use color scheme when logging: red for errors, purple for warnings, dim for low-level debug. Automatically disabled when logging to non-terminal devices. |
|
| - |
monitor.log.details | Add details to each log line. |
|
|
|
monitor.log.file_path | Redirect all logging to a user-defined file. Library replaces a single |
|
| Example: |
monitor.log.level | Logging verbosity level used by the library. |
|
|
|
monitor.stats.cpu_usage | Calculate XLIO CPU usage during polling hardware loops. Results accessible via the XLIO stats utility. |
|
| - |
monitor.stats.fd_num | Maximum number of sockets monitored by XLIO statistics mechanism. Affects how many sockets |
|
| Range: |
monitor.stats.file_path | Redirect socket statistics to a specific file. Each socket’s stats are dumped on close. |
|
| Example: |
monitor.stats.shmem_dir | Directory path for creating shared-memory files for |
|
| - |
Buffers
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
performance.buffers.batching_mode | Controls batching of returning Rx buffers and pulling Tx buffers per socket |
|
|
|
performance.buffers.rx.buf_size | Size of Rx buffer allocation; must be ≥ MTU and ≤ 0xFF00. Default based on max MTU. |
|
| Range: Supports suffixes: B, KB, MB, GB |
performance.buffers.rx.prefetch_before_poll | Prefetch before polling for packets, improves latency in low PPS traffic |
|
| - |
performance.buffers.rx.prefetch_size | Bytes prefetched into cache during ingress packet processing |
|
| Range: |
performance.buffers.tcp_segments.pool_batch_size | TCP segments batched when fetched from the segment pool |
|
| Minimum: 1 |
performance.buffers.tcp_segments.ring_batch_size | TCP segments fetched per ring from the segment pool |
|
| Minimum: 1 |
performance.buffers.tcp_segments.socket_batch_size | TCP segments fetched per socket from the segment pool |
|
| Minimum: 1 |
performance.buffers.tx.buf_size | Size of Tx buffer allocation; must be ≥ MTU and ≤ 0xFF00. Default based on MTU/MSS. |
|
| Range: Supports suffixes: B, KB, MB, GB |
performance.buffers.tx.global_array_size | Number of global zero-copy Tx buffers preallocated |
|
| - |
performance.buffers.tx.prefetch_size | Cache prefetch size for Tx path to optimize send rate |
|
| Range: |
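
The bounds on `performance.buffers.rx.buf_size` and `performance.buffers.tx.buf_size` (at least the MTU, at most 0xFF00) can be expressed as a one-line check. This is a sketch of the rule as written in the table; `buf_size_ok` is a hypothetical name, and the MTU values in the usage lines are examples only.

```python
MAX_BUF_SIZE = 0xFF00  # upper bound quoted in the table (65280 bytes)


def buf_size_ok(buf_size, mtu):
    """Check that a buffer size is >= MTU and <= 0xFF00."""
    return mtu <= buf_size <= MAX_BUF_SIZE


print(buf_size_ok(2048, mtu=1500))
print(buf_size_ok(1024, mtu=1500))
```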
Completion Queue
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
performance.completion_queue.interrupt_moderation.adaptive_change_frequency_msec | Frequency of interrupt moderation adaptation. Interval in milliseconds between adaptation attempts. Use 0 to disable adaptive interrupt moderation |
|
| - |
performance.completion_queue.interrupt_moderation.adaptive_count | Maximum count value to use in the adaptive interrupt moderation algorithm |
|
| - |
performance.completion_queue.interrupt_moderation.adaptive_interrupt_per_sec | Desired interrupt rate per second for each ring (CQ). The count and period parameters change automatically to achieve the desired rate |
|
| - |
performance.completion_queue.interrupt_moderation.adaptive_period_usec | Maximum period value to use in the adaptive interrupt moderation algorithm |
|
| - |
performance.completion_queue.interrupt_moderation.enable | Enable CQ interrupt moderation. When enabled, hardware only generates an interrupt after some packets are received or after a packet was held for some time |
|
| - |
performance.completion_queue.interrupt_moderation.packet_count | Number of packets to hold before generating interrupt |
|
| - |
performance.completion_queue.interrupt_moderation.period_usec | Period in microseconds for holding the packet before generating interrupt |
|
| - |
performance.completion_queue.keep_full | If disabled, the CQ does not try to compensate for each poll on the receive path; instead it uses a "debt" to track missing WREs. If enabled, the CQ tries to compensate the QP for each polled receive completion |
|
| - |
performance.completion_queue.periodic_drain_max_cqes | Each time XLIO's internal thread starts CQ draining, it will stop when it reaches this max value. Applications are not limited by this value |
|
| - |
performance.completion_queue.periodic_drain_msec | Safety check performed by the XLIO internal thread to ensure that the CQ is drained at least once every N milliseconds. Allows the library to progress the TCP stack when the application does not access the socket |
|
| - |
performance.completion_queue.rx_drain_rate_nsec | Socket's receive path CQ drain logic rate control. When enabled, socket will check CQ for ready completions even if receive ready packet queue is not empty |
|
| Recommended: |
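
The interaction between `interrupt_moderation.packet_count` and `interrupt_moderation.period_usec` can be illustrated with a toy model: an interrupt fires once enough packets are pending, or once the oldest pending packet has waited for the full period. The sketch below is a simplified simulation under that reading of the table, not a description of the actual hardware behavior; `interrupt_times` and the arrival timestamps are hypothetical.

```python
def interrupt_times(arrivals_usec, packet_count, period_usec):
    """Return the times (in usec) at which the toy model raises an interrupt.

    arrivals_usec must be sorted; packet_count must be at least 1.
    """
    fired = []
    pending = 0
    first_pending = None
    for t in arrivals_usec:
        # The hold timer of the oldest pending packet expired before this arrival.
        if pending and t - first_pending >= period_usec:
            fired.append(first_pending + period_usec)
            pending = 0
        if pending == 0:
            first_pending = t
        pending += 1
        # Enough packets have accumulated to raise an interrupt immediately.
        if pending >= packet_count:
            fired.append(t)
            pending = 0
    # Leftover packets are flushed when their hold timer expires.
    if pending:
        fired.append(first_pending + period_usec)
    return fired


print(interrupt_times([0, 1, 2, 3, 100], packet_count=4, period_usec=50))
```

In this model a burst of four packets triggers one interrupt on the fourth arrival, while a lone packet is reported only after the hold period ends.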
Polling
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
performance.polling.blocking_rx_poll_usec | Number of times to poll on Rx path for ready packets before going to sleep or returning -1. Done when application uses direct blocked calls to read(), recv(), etc. |
|
| Range: |
performance.polling.iomux.poll_os_ratio | Enables polling of OS file descriptors while the user thread calls select() or poll(). Results in a single poll of the non-offloaded sockets for every N polls of the offloaded sockets |
|
| - |
performance.polling.iomux.poll_usec | Duration in microseconds to poll the hardware on Rx path before going to sleep. Max polling duration limited by timeout used in select(), poll() or epoll_wait() |
|
| Range: |
performance.polling.iomux.skip_os | For select() or poll() this forces XLIO to check the non-offloaded fd even though an offloaded socket has ready packets found while polling |
|
| - |
performance.polling.kernel_fd_attention_level | Controls threshold for checking kernel file descriptors during polling. 0 means never check. Affects how often XLIO checks for activity on non-offloaded kernel file descriptors |
|
| - |
performance.polling.max_rx_poll_batch | Maximum number of receive buffers processed in a single poll operation. Max size of array while polling the CQs |
|
| - |
performance.polling.nonblocking_eagain | When disabled, all send operations on non-blocking UDP sockets return 'OK' (OS default). When enabled, the library returns the error EAGAIN if it is unable to complete the send operation |
|
| - |
performance.polling.offload_transition_poll_count | Controls polling count during transition phase where socket is UDP unicast and no multicast addresses were added. Once first ADD_MEMBERSHIP is called, RX poll duration setting takes effect |
|
| Range: |
performance.polling.rx_cq_wait_ctrl | Ensures FDs are added only to sleeping sockets' epoll descriptors, reducing kernel scan overhead |
|
| - |
performance.polling.rx_kernel_fd_attention_level | Ratio between XLIO CQ poll and OS FD poll. 0 means only poll offloaded sockets. Results in single poll of not-offloaded sockets every N offloaded socket polls |
|
| - |
performance.polling.rx_poll_on_tx_tcp | Enables/disables TCP RX polling during TCP TX operation for faster TCP ACK reception |
|
| - |
performance.polling.skip_cq_on_rx | Allow TCP socket to skip CQ polling in rx socket call |
|
|
|
performance.polling.yield_on_poll | When application runs with multiple threads on limited cores, each thread polling inside XLIO needs to yield CPU to other polling threads to prevent starvation. The value is the number of iterations before yielding the CPU |
|
| - |
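
The ratio described for `performance.polling.rx_kernel_fd_attention_level` (one poll of the non-offloaded sockets per N polls of the offloaded sockets, with 0 meaning offloaded only) can be pictured as a simple schedule. The sketch below only illustrates that ratio; `poll_schedule` is a hypothetical function, and the real polling loop is certainly more involved.

```python
def poll_schedule(iterations, attention_level):
    """List the order of polls for a given number of offloaded poll iterations."""
    schedule = []
    since_os_poll = 0
    for _ in range(iterations):
        schedule.append("offloaded")
        since_os_poll += 1
        # A level of 0 means the OS file descriptors are never polled.
        if attention_level > 0 and since_os_poll >= attention_level:
            schedule.append("os")
            since_os_poll = 0
    return schedule


print(poll_schedule(6, attention_level=3))
```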
Rings
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
performance.rings.max_per_interface | Limit on rings per interface. If number of sockets using same interface is larger than limit, several sockets will share the same ring. Use 0 for unlimited |
|
| - |
performance.rings.rx.allocation_logic | Controls how reception rings are allocated and separated. By default all sockets use the same ring for both RX and TX over the same interface |
|
|
|
performance.rings.rx.migration_ratio | Controls when to replace a socket's ring with the current thread's ring. Used with "ring per thread" logic to decide when ring migration is beneficial |
|
| - |
performance.rings.rx.post_batch_size | Number of Work Request Elements and RX buffers to batch before recycling. Batching lowers mean latency but might increase the standard deviation of latency |
|
| Range: |
performance.rings.rx.ring_elements_count | Number of Work Request Elements allocated in all RQs. Default value is 128 for hardware_features.striding_rq.enable=true (default) or 32768 for hardware_features.striding_rq.enable=false |
|
| - |
performance.rings.rx.spare_buffers | Number of spare receive buffers a ring holds to allow for filling up QP while full receive buffers are being processed. Default value is 128 for hardware_features.striding_rq.enable=true (default) or 32768 for hardware_features.striding_rq.enable=false |
|
| - |
performance.rings.rx.spare_strides | Number of spare stride objects a ring holds to allow faster allocation of a stride object when a packet arrives |
|
| - |
performance.rings.tx.allocation_logic | Ring allocation logic is used to separate traffic to different rings. By default all sockets use the same ring for both RX and TX over the same interface |
|
|
|
performance.rings.tx.completion_batch_size | Number of TX WREs used until a completion signal is requested. Allows better control of jitter from Tx CQE handling |
|
| Range: |
performance.rings.tx.max_inline_size | Maximum data size sent inline. Setting to 0 disables inlining. Data copied into INLINE space is at least 32 bytes of headers plus user datagram payload |
|
| Range: |
performance.rings.tx.max_on_device_memory | Maximum On Device Memory buffer size for each TX ring. 0 means unlimited. XLIO can use the On Device Memory to store the egress packet if it does not fit into the BF inline buffer |
|
| Range: Note: Total On Device Memory limited to 256k for single-port HCA and 128k for dual-port HCA |
performance.rings.tx.migration_ratio | Controls when to replace a socket's ring with the current thread's ring. Used with "ring per thread" logic to decide when ring migration is beneficial |
|
| - |
performance.rings.tx.ring_elements_count | Number of Work Request Elements allocated in all transmit QPs. Number of QPs can change according to number of network offloaded interfaces |
|
| - |
performance.rings.tx.tcp_buffer_batch | Number of TX buffers fetched by a TCP socket at once. A higher value means fewer ring accesses to fetch buffers; a lower value reduces memory consumption |
|
| Minimum: |
performance.rings.tx.udp_buffer_batch | Number of TX buffers fetched by a UDP socket at once |
|
| Minimum: |
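
The effect of `performance.rings.max_per_interface` can be sketched as follows: once the number of sockets on an interface exceeds the limit, sockets start sharing rings, and 0 removes the limit. The round-robin assignment below is an assumption made for illustration, as is the premise that the allocation logic would otherwise give each socket its own ring; the table does not describe how XLIO actually picks the shared ring. `assign_rings` is a hypothetical helper.

```python
def assign_rings(socket_count, max_per_interface):
    """Return the ring index used by each socket on one interface."""
    if max_per_interface == 0:
        # 0 means unlimited: every socket keeps its own ring.
        return list(range(socket_count))
    # Assumption: sockets beyond the limit wrap around and share existing rings.
    return [index % max_per_interface for index in range(socket_count)]


print(assign_rings(5, max_per_interface=2))
```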
Steering Rules
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
performance.steering_rules.disable_flowtag | Disables flow tag functionality |
|
| - |
performance.steering_rules.tcp.2t_rules | Use only 2-tuple rules for TCP connections instead of 5-tuple rules. Can help overcome steering limitations for outgoing TCP connections but requires unique local IP address per XLIO ring |
|
| - |
performance.steering_rules.tcp.3t_rules | Use only 3-tuple rules for incoming TCP connections instead of 5-tuple rules. Can improve performance for servers with listen sockets accepting many connections |
|
| - |
performance.steering_rules.udp.3t_rules | Relevant for connected UDP sockets. 3-tuple rules are used in hardware flow steering when enabled; 5-tuple when disabled. Enabling can reduce hardware flow steering resources |
|
| - |
performance.steering_rules.udp.only_mc_l2_rules | Use only L2 rules for Ethernet Multicast. All loopback traffic will be handled by XLIO instead of OS |
|
| - |
Threading
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
performance.threading.cpu_affinity | Control which CPU core(s) the XLIO internal thread is serviced on. Can be provided as hexadecimal bitmask or comma-delimited values/ranges |
|
| Examples:
Note: Only hexadecimal values are supported for this parameter in |
performance.threading.cpuset | Select a cpuset for XLIO internal thread. Value is path to cpuset or empty string to run on same cpuset as process |
|
| Example: |
performance.threading.internal_handler.behavior | Select which TCP control flows are done in the internal thread. Should be kept disabled if using blocking poll/select (epoll is OK) |
|
|
|
performance.threading.internal_handler.timer_msec | Control XLIO internal thread wakeup timer resolution (in milliseconds) |
|
| - |
performance.threading.internal_handler.wakeup_per_packet | Wake up the internal thread for each packet that the CQ receives. Can minimize latency for busy applications but might decrease performance for high PPS applications |
|
| - |
performance.threading.mutex_over_spinlock | Controls the type of locking mechanism used for some specific flows. Note that using a mutex might increase latency |
|
| - |
performance.threading.worker_threads | Controls which execution model and number of worker threads are used to handle networking and progress sockets. Two modes: Run to Completion (0) and Worker Threads (>0) |
|
| Range: |
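
`performance.threading.cpu_affinity` accepts either a hexadecimal bitmask or comma-delimited values and ranges. The sketch below shows how both notations could map to the same set of CPU cores. It assumes the hexadecimal form carries a `0x` prefix and that ranges are written as `low-high`; the original examples did not survive in the table above, so both are guesses. `parse_cpu_affinity` is a hypothetical helper.

```python
def parse_cpu_affinity(text):
    """Return the set of CPU core numbers described by the setting."""
    text = text.strip()
    # Assumption: a bitmask is recognized by its "0x" prefix.
    if text.lower().startswith("0x"):
        mask = int(text, 16)
        return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}
    cpus = set()
    for item in text.split(","):
        if "-" in item:
            # Assumption: ranges are inclusive, written as low-high.
            low, high = item.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(item))
    return cpus


print(parse_cpu_affinity("0x5"))
print(parse_cpu_affinity("0,2-4"))
```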
Profiles
Parameter | Description | Deprecated Environment Variable | Default | Values/Examples/Notes |
profiles.spec | XLIO predefined specification profiles |
|
|
Examples:
|