NVIDIA Accelerated IO (XLIO) Documentation Rev 3.60

XLIO - Configuration Reference

This document provides a comprehensive reference for all XLIO configuration parameters organized by functional categories.

Acceleration Control

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
acceleration_control.app_id	Specify a group of rules from `libxlio.conf` for XLIO to apply	`XLIO_APPLICATION_ID`	`XLIO_DEFAULT_APPLICATION_ID` (matches only the `*` group rule)	Example: `acceleration_control.app_id=iperf_server`
acceleration_control.default_acceleration	Create all sockets as offloaded or not offloaded by default	`XLIO_OFFLOADED_SOCKETS`	`true` (Enabled)	Values: `true` = offloaded, `false` = not offloaded
acceleration_control.rules	Defines transport protocol and offload settings for specific applications or processes. Maps to configuration in `libxlio.conf`.	-	`[]`	Note: `rules` is an array of objects with `id`, `name`, and `actions`. Example: `{ "acceleration_control": { "rules": [{"id": "A1","name": "nginx","actions": ["use xlio tcp_server *:8080"]}] } }`
acceleration_control.rules[].id	Unique identifier for this transport control rule	-	-	-
acceleration_control.rules[].name	Name of the application this rule applies to	-	-	-
acceleration_control.rules[].actions	Action directives that modify transport behavior	-	-	Format: `use <transport> <role> <address|*>:<port range|*>`. See Descriptions table.
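
As a rough illustration of the rule structure described above, the following Python sketch assembles one `acceleration_control.rules` entry and serializes it to JSON. The helper names `make_action` and `make_rule` are hypothetical and not part of XLIO. The action string follows the documented `use <transport> <role> <address>:<port range>` format, and `*` is assumed to act as a wildcard, as in the `*:8080` example.

```python
import json

def make_action(transport, role, address="*", port_range="*"):
    """Build one action directive: use <transport> <role> <address>:<port range>."""
    return f"use {transport} {role} {address}:{port_range}"

def make_rule(rule_id, name, actions):
    """Build one rule object with the three documented fields."""
    return {"id": rule_id, "name": name, "actions": list(actions)}

# Reproduces the nginx example from the table above.
config = {
    "acceleration_control": {
        "rules": [
            make_rule("A1", "nginx", [make_action("xlio", "tcp_server", "*", "8080")])
        ]
    }
}

print(json.dumps(config, indent=2))
```

Generating the file from code like this is only one option; the same JSON can be written by hand.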

Applications

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
applications.nginx.distribute_cq	Distributes completion queue (CQ) processing across NGINX worker processes to improve performance	`XLIO_DISTRIBUTE_CQ`	`false` (Disabled)	Helps balance CQ handling among worker processes for higher throughput
applications.nginx.src_port_stride	Controls how source ports are distributed across NGINX worker processes	`XLIO_NGINX_SRC_PORT_STRIDE`	`2`	Determines port stepping between workers; useful for load balancing
applications.nginx.udp_pool_size	Defines the size of the UDP socket pool for NGINX. When set >0, a closed UDP socket is returned to the pool instead of being destroyed	`XLIO_NGINX_UDP_POOL_SIZE`	`0` (Disabled)	Enables reuse of UDP sockets to reduce allocation overhead
applications.nginx.udp_socket_pool_reuse	Controls reuse of UDP socket pools for NGINX deployments	`XLIO_NGINX_UDP_POOL_RX_NUM_BUFFS_REUSE`	`0` (Disabled)	Improves efficiency in UDP-heavy traffic patterns
applications.nginx.workers_num	Number of NGINX worker processes to optimize for. Must be set to offload NGINX successfully	`XLIO_NGINX_WORKERS_NUM`	`0`	Required for enabling NGINX offload support

Core

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
core.daemon.dir	Directory path for XLIO to write files used by `xliod`	`XLIO_SERVICE_NOTIFY_DIR`	`/tmp/xlio`	When used, `xliod` must be run with `--notify-dir` pointing to the same folder
core.daemon.enable	Enable the XLIO daemon service for additional monitoring capabilities	`XLIO_SERVICE_ENABLE`	`false` (Disabled)	-
core.exception_handling.mode	Mode for handling missing support or error cases in the Socket API or other XLIO functionality	`XLIO_EXCEPTION_HANDLING`	`-1` (default; future default `0`)	`-2/exit` – exit on startup failure `-1/handle_debug` – handle at DEBUG level `0/log_debug_undo_offload` – log DEBUG and recover via kernel stack `1/log_error_undo_offload` – log ERROR and recover via kernel stack `2/log_error_return_error` – log ERROR and return error code `3/log_error_abort` – log ERROR and abort (throw `xlio_error`)
core.quick_init	Avoid extra checks to reduce initialization time (may fail under system misconfiguration)	`XLIO_QUICK_START`	`false` (Disabled)	Note: If enabled and hugepages are requested beyond the cgroup limit, XLIO may crash
core.resources.external_memory_limit	Memory limit for external user allocator (`0` uses `core.resources.memory_limit` value)	`XLIO_MEMORY_LIMIT_USER`	`0`	Supports suffixes: B, KB, MB, GB
core.resources.heap_metadata_block_size	Size of metadata block added to every heap allocation	`XLIO_HEAP_METADATA_BLOCK`	`32 MB`	Supports suffixes: B, KB, MB, GB
core.resources.hugepages.enable	Use huge pages for data buffers to improve performance by reducing TLB misses; overrides rdma-core parameters `MLX_QP_ALLOC_TYPE` and `MLX_CQ_ALLOC_TYPE`	`XLIO_MEM_ALLOC_TYPE`	`true` (Enabled)	`false` = malloc `true` = huge pages
core.resources.hugepages.size	Force specific hugepage size for internal allocations; `0` allows any supported	`XLIO_HUGEPAGE_SIZE`	`0`	Must be power of 2 or `0`. Suffixes allowed: KB, MB, GB
core.resources.memory_limit	Pre-allocated memory limit for buffers. Dynamic allocations may exceed this. `0` = unlimited	`XLIO_MEMORY_LIMIT`	`2048 MB` (2 GB)	Supports suffixes: B, KB, MB, GB
core.signals.sigint.exit	Call XLIO handler on SIGINT and then application's handler (if exists)	`XLIO_HANDLE_SIGINTR`	`true` (Enabled)	-
core.signals.sigsegv.backtrace	Print backtrace when a segmentation fault occurs	`XLIO_HANDLE_SIGSEGV`	`false` (Disabled)	-
core.syscall.allow_privileged_sockopt	Permit use of privileged socket options that may require special permissions	`XLIO_ALLOW_PRIVILEGED_SOCK_OPT`	`true` (Enabled)	-
core.syscall.avoid_ctl_syscalls	For TCP FDs, avoid system calls for supported options (`ioctl`, `fcntl`, `getsockopt`, `setsockopt`)	`XLIO_AVOID_SYS_CALLS_ON_TCP_FD`	`false` (Disabled)	Unsupported options fall back to the OS
core.syscall.deferred_close	Defer closing file descriptors until the socket is actually closed (useful in multithreaded apps)	`XLIO_DEFERRED_CLOSE`	`false` (Disabled)	-
core.syscall.dup2_close_fd	Handle `dup2()` by treating the old FD as closed before forwarding call to OS	`XLIO_CLOSE_ON_DUP2`	`true` (Enabled)	Rudimentary `dup2` support (for FD replacement only)
core.syscall.fork_support	Enable `ibv_fork_init()` to correctly handle `fork()`	`XLIO_FORK`	`true` (Enabled)	-
core.syscall.getsockname_dummy_send	Trigger dummy packet send from `getsockname()` to warm caches	`XLIO_TRIGGER_DUMMY_SEND_GETSOCKNAME`	`false` (Disabled)	-
core.syscall.sendfile_cache_limit	Memory limit for mapping cache used by `sendfile()`	`XLIO_ZC_CACHE_THRESHOLD`	`10 GB`	Supports suffixes: B, KB, MB, GB
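
Several parameters above accept size values with the suffixes B, KB, MB, and GB. The Python sketch below shows one way such a value could be converted to bytes when preparing or checking a configuration. It is not XLIO's parser: the function name `parse_size` is hypothetical, binary multiples are assumed (consistent with `2048 MB` being listed as 2 GB), and the tolerance for whitespace and lowercase suffixes is a choice made for this example only.

```python
import re

# Binary multiples are an assumption, consistent with "2048 MB (2 GB)" above.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}

def parse_size(text):
    """Convert a string such as '2048 MB' or '10GB' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(B|KB|MB|GB)?\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    # A bare number is treated as bytes in this sketch.
    return int(number) * _UNITS[(unit or "B").upper()]

print(parse_size("2048 MB"))                        # 2147483648
print(parse_size("2048 MB") == parse_size("2 GB"))  # True
```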

Hardware Features

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
hardware_features.striding_rq.enable	Enable or disable Striding Receive Queues (each WQE in a Striding RQ can receive multiple packets)	`XLIO_STRQ`	`true` (Enabled)	The WQE buffer size is determined by `hardware_features.striding_rq.strides_num × hardware_features.striding_rq.stride_size`
hardware_features.striding_rq.stride_size	Size in bytes of each stride in a receive WQE; must be a power of two and within [64–8192]	`XLIO_STRQ_STRIDE_SIZE_BYTES`	`64`	Range: `64–8192` (power of 2)
hardware_features.striding_rq.strides_num	Number of strides in each receive WQE; must be a power of two and within [512–65536]	`XLIO_STRQ_NUM_STRIDES`	`2048`	Range: `512–65536` (power of 2)
hardware_features.tcp.lro	Large Receive Offload (LRO): increases inbound throughput by reducing CPU overhead via packet aggregation	`XLIO_LRO`	`auto (-1)`	`auto/-1` – depends on `ethtool` & adapter `enable/1` – enabled if adapter supports it `disable/0` – disabled
hardware_features.tcp.tls_offload.dek_cache_max_size	Maximum Data Encryption Key (DEK) cache size for TLS offload	`XLIO_HIGH_WMARK_DEK_CACHE_SIZE`	`1024`	-
hardware_features.tcp.tls_offload.dek_cache_min_size	Minimum DEK cache size for TLS offload	`XLIO_LOW_WMARK_DEK_CACHE_SIZE`	`512`	-
hardware_features.tcp.tls_offload.rx_enable	Offload TLS RX path through kTLS API if possible (uses UTLS for acceleration)	`XLIO_UTLS_RX`	`false` (Disabled)	-
hardware_features.tcp.tls_offload.tx_enable	Offload TLS TX path through kTLS API if possible (uses UTLS for acceleration)	`XLIO_UTLS_TX`	`true` (Enabled)	-
hardware_features.tcp.tso.enable	TCP Segmentation Offload (TSO): allows TCP to transmit buffers larger than the MTU using adapter segmentation	`XLIO_TSO`	`auto (-1)`	`auto/-1` – depends on `ethtool` & adapter `enable/1` – enabled if supported `disable/0` – disabled
hardware_features.tcp.tso.max_size	Maximum TCP segment size (in bytes) allowed with TSO	`XLIO_TSO_MAX_SIZE`	`262144` (256 KB)	Supports suffixes: B, KB, MB, GB
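
The Striding RQ rows above state that the WQE buffer size is `strides_num × stride_size`, with both values constrained to powers of two within fixed ranges. The following Python sketch works through that arithmetic and the documented range checks. The function names are hypothetical and exist only for this illustration.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (a single bit set)."""
    return n > 0 and (n & (n - 1)) == 0

def strq_wqe_buffer_size(strides_num=2048, stride_size=64):
    """Return the receive WQE buffer size in bytes for a Striding RQ."""
    if not (is_power_of_two(stride_size) and 64 <= stride_size <= 8192):
        raise ValueError("stride_size must be a power of two in [64, 8192]")
    if not (is_power_of_two(strides_num) and 512 <= strides_num <= 65536):
        raise ValueError("strides_num must be a power of two in [512, 65536]")
    return strides_num * stride_size

# Documented defaults: 2048 strides of 64 bytes each.
print(strq_wqe_buffer_size())  # 131072 (128 KiB)
```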

Monitor

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
monitor.exit_report	Print a human-readable report of resource usage at process exit. Printed during termination and may be missed if process ends with `SIGKILL`.	`XLIO_PRINT_REPORT`	`auto (-1)`	`auto/-1` – print report only if anomaly detected `on/1` – always print `off/0` – never print
monitor.log.colors	Use color scheme when logging: red for errors, purple for warnings, dim for low-level debug. Automatically disabled when logging to non-terminal devices.	`XLIO_LOG_COLORS`	`true` (Enabled)	-
monitor.log.details	Add details to each log line.	`XLIO_LOG_DETAILS`	`0`	`0` – Basic `1` – ThreadId `2` – ProcessId + ThreadId `3` – Time + ProcessId + ThreadId (time in ms from process start)
monitor.log.file_path	Redirect all logging to a user-defined file. Library replaces a single `%d` with process PID for multiple instances.	`XLIO_LOG_FILE`	`""` (empty)	Example: `/tmp/xlio_log.txt`
monitor.log.level	Logging verbosity level used by the library.	`XLIO_TRACELEVEL`	`info (3)`	`init/-2` or `none/-2` – no logs `panic/-1` – fatal errors `error/0` – runtime errors `warn/1` – warnings `info/3` – general information `details/4` – configuration info `debug/5` – high-level debug (logs all socket API calls) `fine/6` – low-level runtime logging `finer/7` or `all/8` – very detailed logging (significant performance cost)
monitor.stats.cpu_usage	Calculate XLIO CPU usage during polling hardware loops. Results accessible via the XLIO stats utility.	`XLIO_CPU_USAGE_STATS`	`false` (Disabled)	-
monitor.stats.fd_num	Maximum number of sockets monitored by XLIO statistics mechanism. Affects how many sockets `xlio_stats` and `XLIO_STATS_FILE` can report.	`XLIO_STATS_FD_NUM`	`0`	Range: `0–1024`. Tool limited to 1024 sockets
monitor.stats.file_path	Redirect socket statistics to a specific file. Each socket’s stats are dumped on close.	`XLIO_STATS_FILE`	`""` (empty)	Example: `/tmp/stats`
monitor.stats.shmem_dir	Directory path for creating shared-memory files for `xlio_stats`. No files created if empty string.	`XLIO_STATS_SHMEM_DIR`	`/tmp/xlio`	-
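
The `monitor.log.file_path` row notes that a single `%d` in the path is replaced with the process PID. The Python sketch below mimics that substitution so the resulting file name can be predicted, for example when collecting logs from several instances. It is an approximation of the described behavior, not XLIO code, and both the function name `expand_log_path` and the sample path are hypothetical.

```python
import os

def expand_log_path(template, pid=None):
    """Replace the first '%d' in the path template with the process ID."""
    if pid is None:
        pid = os.getpid()
    # Only one occurrence is substituted, matching "a single %d" in the table.
    return template.replace("%d", str(pid), 1)

print(expand_log_path("/tmp/xlio_log_%d.txt", pid=4242))  # /tmp/xlio_log_4242.txt
```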

Performance

Buffers

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
performance.buffers.batching_mode	Controls batching of returning Rx buffers and pulling Tx buffers per socket	`XLIO_BUFFER_BATCHING_MODE`	`enable_and_reuse (1)`	`disable/0` – no batching `enable_and_reuse/1` – batching with periodic reclaim of unused buffers `enable/2` – batching without reclaim
performance.buffers.rx.buf_size	Size of Rx buffer allocation; must be ≥ MTU and ≤ 0xFF00. Default based on max MTU.	`XLIO_RX_BUF_SIZE`	`0`	Range: `0–65280` Supports suffixes: B, KB, MB, GB
performance.buffers.rx.prefetch_before_poll	Prefetch before polling for packets, improves latency in low PPS traffic	`XLIO_RX_PREFETCH_BYTES_BEFORE_POLL`	`0`	-
performance.buffers.rx.prefetch_size	Bytes prefetched into cache during ingress packet processing	`XLIO_RX_PREFETCH_BYTES`	`256`	Range: `32–MTU`
performance.buffers.tcp_segments.pool_batch_size	TCP segments batched when fetched from the segment pool	`XLIO_TX_SEGS_POOL_BATCH_TCP`	`16384`	Minimum: 1
performance.buffers.tcp_segments.ring_batch_size	TCP segments fetched per ring from the segment pool	`XLIO_TX_SEGS_RING_BATCH_TCP`	`1024`	Minimum: 1
performance.buffers.tcp_segments.socket_batch_size	TCP segments fetched per socket from the segment pool	`XLIO_TX_SEGS_BATCH_TCP`	`64`	Minimum: 1
performance.buffers.tx.buf_size	Size of Tx buffer allocation; must be ≥ MTU and ≤ 0xFF00. Default based on MTU/MSS.	`XLIO_TX_BUF_SIZE`	`0`	Range: `0–262144` Supports suffixes: B, KB, MB, GB
performance.buffers.tx.global_array_size	Number of global zero-copy Tx buffers preallocated	`XLIO_TX_BUFS`	`200000`	-
performance.buffers.tx.prefetch_size	Cache prefetch size for Tx path to optimize send rate	`XLIO_TX_PREFETCH_BYTES`	`256`	Range: `0–MTU`

Completion Queue

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
performance.completion_queue.interrupt_moderation.adaptive_change_frequency_msec	Frequency of interrupt moderation adaptation. Interval in milliseconds between adaptation attempts. Use 0 to disable adaptive interrupt moderation	`XLIO_CQ_AIM_INTERVAL_MSEC`	`1000`	-
performance.completion_queue.interrupt_moderation.adaptive_count	Maximum count value to use in the adaptive interrupt moderation algorithm	`XLIO_CQ_AIM_MAX_COUNT`	`500`	-
performance.completion_queue.interrupt_moderation.adaptive_interrupt_per_sec	Desired interrupts rate per second for each ring (CQ). Count and period parameters will change automatically to achieve the desired rate	`XLIO_CQ_AIM_INTERRUPTS_RATE_PER_SEC`	`10000`	-
performance.completion_queue.interrupt_moderation.adaptive_period_usec	Maximum period value to use in the adaptive interrupt moderation algorithm	`XLIO_CQ_AIM_MAX_PERIOD_USEC`	`1000`	-
performance.completion_queue.interrupt_moderation.enable	Enable CQ interrupt moderation. When enabled, hardware only generates an interrupt after some packets are received or after a packet was held for some time	`XLIO_CQ_MODERATION_ENABLE`	`true` (Enabled)	-
performance.completion_queue.interrupt_moderation.packet_count	Number of packets to hold before generating interrupt	`XLIO_CQ_MODERATION_COUNT`	`48`	-
performance.completion_queue.interrupt_moderation.period_usec	Period in microseconds for holding the packet before generating interrupt	`XLIO_CQ_MODERATION_PERIOD_USEC`	`50`	-
performance.completion_queue.keep_full	If disabled, CQ will not try to compensate for each poll on the receive path. Uses a "debt" to remember missing WREs. If enabled, CQ will try to compensate QP for each polled receive completion	`XLIO_CQ_KEEP_QP_FULL`	`true` (Enabled)	-
performance.completion_queue.periodic_drain_max_cqes	Each time XLIO's internal thread starts CQ draining, it will stop when it reaches this max value. Applications are not limited by this value	`XLIO_PROGRESS_ENGINE_WCE_MAX`	`10000`	-
performance.completion_queue.periodic_drain_msec	XLIO's internal thread performs a safety check that the CQ is drained at least once every N milliseconds. This allows the library to progress the TCP stack when the application does not access the socket	`XLIO_PROGRESS_ENGINE_INTERVAL`	`10`	-
performance.completion_queue.rx_drain_rate_nsec	Socket's receive path CQ drain logic rate control. When enabled, socket will check CQ for ready completions even if receive ready packet queue is not empty	`XLIO_RX_CQ_DRAIN_RATE_NSEC`	`0`	Recommended: `100–5000` (nsec)
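
To make the interplay between `interrupt_moderation.packet_count` and `interrupt_moderation.period_usec` more concrete, the Python sketch below estimates which of the two limits is reached first at a given packet rate. This is an idealized model, not the hardware algorithm: it assumes evenly spaced packets and a hold timer that starts when the first packet of a batch arrives. The function name `first_trigger` is hypothetical.

```python
def first_trigger(packet_rate_pps, packet_count=48, period_usec=50):
    """Return (limit_name, delay_usec) for the limit that is reached first.

    Idealized model: packets arrive evenly spaced and the hold timer starts
    when the first packet of a batch arrives.
    """
    if packet_rate_pps <= 0:
        raise ValueError("packet_rate_pps must be positive")
    gap_usec = 1_000_000 / packet_rate_pps
    # The first packet arrives at t=0, so the Nth packet arrives after N-1 gaps.
    fill_usec = (packet_count - 1) * gap_usec
    if fill_usec <= period_usec:
        return ("packet_count", fill_usec)
    return ("period_usec", float(period_usec))

print(first_trigger(1_000_000))  # ('packet_count', 47.0)
print(first_trigger(100_000))    # ('period_usec', 50.0)
```

Under these assumptions and the default values, the packet count limit dominates only at rates close to one million packets per second or higher; at lower rates the hold period bounds the delay.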

Polling

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
performance.polling.blocking_rx_poll_usec	Number of times to poll on Rx path for ready packets before going to sleep or returning -1. Done when application uses direct blocked calls to read(), recv(), etc.	`XLIO_RX_POLL`	`100000`	Range: `-1` (infinite), `0` (interrupt-driven), `1–100000000`
performance.polling.iomux.poll_os_ratio	Enables polling of OS file descriptors while user thread calls select() or poll(). Results in single poll of not-offloaded sockets every N offloaded sockets polls	`XLIO_SELECT_POLL_OS_RATIO`	`10`	-
performance.polling.iomux.poll_usec	Duration in microseconds to poll the hardware on Rx path before going to sleep. Max polling duration limited by timeout used in select(), poll() or epoll_wait()	`XLIO_SELECT_POLL`	`100000`	Range: `-1` (infinite), `0` (interrupt-driven), `1–100000000`
performance.polling.iomux.skip_os	For select() or poll() this forces XLIO to check the non-offloaded fd even though an offloaded socket has ready packets found while polling	`XLIO_SELECT_SKIP_OS`	`4`	-
performance.polling.kernel_fd_attention_level	Controls threshold for checking kernel file descriptors during polling. 0 means never check. Affects how often XLIO checks for activity on non-offloaded kernel file descriptors	`XLIO_RING_KERNEL_FD_ATTENTION_LEVEL`	`10`	-
performance.polling.max_rx_poll_batch	Maximum number of receive buffers processed in a single poll operation. Max size of array while polling the CQs	`XLIO_CQ_POLL_BATCH_MAX`	`16`	-
performance.polling.nonblocking_eagain	When disabled (OS default behavior), all send operations on non-blocking UDP sockets return 'OK'. When enabled, the library returns error EAGAIN if it is unable to complete the send operation	`XLIO_TX_NONBLOCKED_EAGAINS`	`false` (Disabled)	-
performance.polling.offload_transition_poll_count	Controls polling count during transition phase where socket is UDP unicast and no multicast addresses were added. Once first ADD_MEMBERSHIP is called, RX poll duration setting takes effect	`XLIO_RX_POLL_INIT`	`0`	Range: `-1` (infinite), `0` (disabled), `1–100000000`
performance.polling.rx_cq_wait_ctrl	Ensures FDs are added only to sleeping sockets' epoll descriptors, reducing kernel scan overhead	`XLIO_RX_CQ_WAIT_CTRL`	`false` (Disabled)	-
performance.polling.rx_kernel_fd_attention_level	Ratio between XLIO CQ poll and OS FD poll. 0 means only poll offloaded sockets. Results in single poll of not-offloaded sockets every N offloaded socket polls	`XLIO_RX_UDP_POLL_OS_RATIO`	`100`	-
performance.polling.rx_poll_on_tx_tcp	Enables/disables TCP RX polling during TCP TX operation for faster TCP ACK reception	`XLIO_RX_POLL_ON_TX_TCP`	`false` (Disabled)	-
performance.polling.skip_cq_on_rx	Allow TCP socket to skip CQ polling in rx socket call	`XLIO_SKIP_POLL_IN_RX`	`0`	`0` – Disabled `1` – Skip always `2` – Skip only if socket was added to epoll before
performance.polling.yield_on_poll	When application runs with multiple threads on limited cores, each thread polling inside XLIO needs to yield CPU to other polling threads to prevent starvation. The value is the number of iterations before yielding the CPU	`XLIO_RX_POLL_YIELD`	`0` (Disabled)	-
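
The ratio parameters above (`iomux.poll_os_ratio` and `rx_kernel_fd_attention_level`) are described as producing a single poll of non-offloaded sockets for every N polls of offloaded sockets. The Python sketch below simulates that interleaving to show the resulting proportion. It is a simplified model of the description, not XLIO's polling loop; the function name `poll_sequence` is hypothetical, and treating a ratio of `0` as "never poll the OS" follows the note given for `rx_kernel_fd_attention_level` and is assumed for the other parameter.

```python
def poll_sequence(offloaded_polls, ratio=10):
    """Return the order of 'offloaded' and 'os' polls for a given ratio."""
    sequence = []
    since_os_poll = 0
    for _ in range(offloaded_polls):
        sequence.append("offloaded")
        since_os_poll += 1
        # After every `ratio` offloaded polls, poll the OS descriptors once.
        if ratio > 0 and since_os_poll == ratio:
            sequence.append("os")
            since_os_poll = 0
    return sequence

seq = poll_sequence(30, ratio=10)
print(seq.count("offloaded"), seq.count("os"))  # 30 3
```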

Rings

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
performance.rings.max_per_interface	Limit on rings per interface. If number of sockets using same interface is larger than limit, several sockets will share the same ring. Use 0 for unlimited	`XLIO_RING_LIMIT_PER_INTERFACE`	`0`	-
performance.rings.rx.allocation_logic	Controls how reception rings are allocated and separated. By default all sockets use the same ring for both RX and TX over the same interface	`XLIO_RING_ALLOCATION_LOGIC_RX`	`per_thread (20)`	`per_interface/0` – Ring per interface `per_ip_address/1` – Ring per IP address `per_socket/10` – Ring per socket `per_thread/20` – Ring per thread `per_cpuid/30` – Ring per core (using cpu id) `per_core/31` – Ring per core – attach threads
performance.rings.rx.migration_ratio	Controls when to replace a socket's ring with the current thread's ring. Used with "ring per thread" logic to decide when ring migration is beneficial	`XLIO_RING_MIGRATION_RATIO_RX`	`-1` (disabled)	-
performance.rings.rx.post_batch_size	Number of Work Request Elements and RX buffers to batch before recycling. Batching decreases latency mean but might increase latency STD	`XLIO_RX_WRE_BATCHING`	`1024`	Range: `1–1024`
performance.rings.rx.ring_elements_count	Number of Work Request Elements allocated in all RQs. Default value is 128 for hardware_features.striding_rq.enable=true (default) or 32768 for hardware_features.striding_rq.enable=false	`XLIO_RX_WRE`	`32768`	-
performance.rings.rx.spare_buffers	Number of spare receive buffers a ring holds to allow for filling up QP while full receive buffers are being processed. Default value is 128 for hardware_features.striding_rq.enable=true (default) or 32768 for hardware_features.striding_rq.enable=false	`XLIO_QP_COMPENSATION_LEVEL`	`32768`	-
performance.rings.rx.spare_strides	Number of spare stride objects a ring holds to allow faster allocation of a stride object when a packet arrives	`XLIO_STRQ_STRIDES_COMPENSATION_LEVEL`	`32768`	-
performance.rings.tx.allocation_logic	Ring allocation logic is used to separate traffic to different rings. By default all sockets use the same ring for both RX and TX over the same interface	`XLIO_RING_ALLOCATION_LOGIC_TX`	`per_thread (20)`	`per_interface/0` – Ring per interface `per_ip_address/1` – Ring per IP address `per_socket/10` – Ring per socket `per_thread/20` – Ring per thread `per_cpuid/30` – Ring per core (using cpu id) `per_core/31` – Ring per core – attach threads
performance.rings.tx.completion_batch_size	Number of TX WREs used until a completion signal is requested. Allows better control of jitter from Tx CQE handling	`XLIO_TX_WRE_BATCHING`	`64`	Range: `1–64`
performance.rings.tx.max_inline_size	Maximum data size sent inline. Setting to 0 disables inlining. Data copied into INLINE space is at least 32 bytes of headers plus user datagram payload	`XLIO_TX_MAX_INLINE`	`204`	Range: `0–884`
performance.rings.tx.max_on_device_memory	Maximum On Device Memory buffer size for each TX ring. 0 means unlimited. XLIO can use the On Device Memory to store the egress packet if it does not fit into the BF inline buffer	`XLIO_RING_DEV_MEM_TX`	`0`	Range: `0–262144 KB` Note: Total On Device Memory limited to 256k for single-port HCA and 128k for dual-port HCA
performance.rings.tx.migration_ratio	Controls when to replace a socket's ring with the current thread's ring. Used with "ring per thread" logic to decide when ring migration is beneficial	`XLIO_RING_MIGRATION_RATIO_TX`	`-1` (disabled)	-
performance.rings.tx.ring_elements_count	Number of Work Request Elements allocated in all transmit QPs. Number of QPs can change according to number of network offloaded interfaces	`XLIO_TX_WRE`	`32768`	-
performance.rings.tx.tcp_buffer_batch	Number of TX buffers fetched by a TCP socket at once. Higher number for less ring accesses to fetch buffers. Lower number for less memory consumption	`XLIO_TX_BUFS_BATCH_TCP`	`16`	Minimum: `1`
performance.rings.tx.udp_buffer_batch	Number of TX buffers fetched by a UDP socket at once	`XLIO_TX_BUFS_BATCH_UDP`	`8`	Minimum: `1`
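
The `allocation_logic` rows list each option as a name paired with a number, such as `per_thread/20`. The Python sketch below shows one way a deployment script might normalize either form before writing a configuration. The mapping is copied from the table above; the function name `resolve_allocation_logic` is hypothetical and not part of XLIO.

```python
# Name/number pairs as listed for rx.allocation_logic and tx.allocation_logic.
RING_ALLOCATION_LOGIC = {
    "per_interface": 0,
    "per_ip_address": 1,
    "per_socket": 10,
    "per_thread": 20,
    "per_cpuid": 30,
    "per_core": 31,
}

def resolve_allocation_logic(value):
    """Accept a documented name or its number and return (name, number)."""
    by_number = {number: name for name, number in RING_ALLOCATION_LOGIC.items()}
    text = str(value).strip()
    if text in RING_ALLOCATION_LOGIC:
        return (text, RING_ALLOCATION_LOGIC[text])
    if text.isdigit() and int(text) in by_number:
        return (by_number[int(text)], int(text))
    raise ValueError(f"unknown ring allocation logic: {value!r}")

print(resolve_allocation_logic("per_thread"))  # ('per_thread', 20)
print(resolve_allocation_logic(31))            # ('per_core', 31)
```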

Steering Rules

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
performance.steering_rules.disable_flowtag	Disables flow tag functionality	`XLIO_DISABLE_FLOW_TAG`	`false` (Disabled)	-
performance.steering_rules.tcp.2t_rules	Use only 2-tuple rules for TCP connections instead of 5-tuple rules. Can help overcome steering limitations for outgoing TCP connections but requires unique local IP address per XLIO ring	`XLIO_TCP_2T_RULES`	`false` (Disabled)	-
performance.steering_rules.tcp.3t_rules	Use only 3-tuple rules for incoming TCP connections instead of 5-tuple rules. Can improve performance for servers with listen sockets accepting many connections	`XLIO_TCP_3T_RULES`	`false` (Disabled)	-
performance.steering_rules.udp.3t_rules	Relevant for connected UDP sockets. 3-tuple rules are used in hardware flow steering when enabled; 5-tuple when disabled. Enabling can reduce hardware flow steering resources	`XLIO_UDP_3T_RULES`	`true` (Enabled)	-
performance.steering_rules.udp.only_mc_l2_rules	Use only L2 rules for Ethernet Multicast. All loopback traffic will be handled by XLIO instead of OS	`XLIO_ETH_MC_L2_ONLY_RULES`	`false` (Disabled)	-
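
The rows above state that 3-tuple rules can reduce hardware flow steering resources compared with 5-tuple rules. The Python sketch below illustrates why by counting how many distinct rule keys a set of connections to one listening socket would need under each scheme. The choice of fields is an assumption made for this example (5-tuple: protocol plus local and remote address and port; 3-tuple: protocol, local address, local port), and the names `Flow` and `rule_key` as well as the sample addresses are hypothetical.

```python
from collections import namedtuple

Flow = namedtuple("Flow", "proto local_ip local_port remote_ip remote_port")

def rule_key(flow, tuple_size):
    """Return the fields a steering rule would match on (assumed field sets)."""
    if tuple_size == 5:
        return tuple(flow)
    if tuple_size == 3:
        return (flow.proto, flow.local_ip, flow.local_port)
    raise ValueError("this sketch covers only 3-tuple and 5-tuple rules")

# 1000 client connections to one listening socket (documentation-range addresses).
flows = [Flow("tcp", "192.0.2.10", 8080, "198.51.100.7", 40000 + i) for i in range(1000)]

print(len({rule_key(f, 5) for f in flows}))  # 1000
print(len({rule_key(f, 3) for f in flows}))  # 1
```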

Threading

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
performance.threading.cpu_affinity	Control which CPU core(s) the XLIO internal thread is serviced on. Can be provided as hexadecimal bitmask or comma-delimited values/ranges	`XLIO_INTERNAL_THREAD_AFFINITY`	`"-1"` (disabled)	Examples: `0x00000001` – Run on processor 0 `0x00000007` – Run on processors 0, 1, and 2 `0,4,8` – Run on processors 0, 4, and 8 `0,1,7-10` – Run on processors 0, 1, 7, 8, 9, and 10 Note: Only hexadecimal values are supported for this parameter in `XLIO_INLINE_CONFIG`
performance.threading.cpuset	Select a cpuset for XLIO internal thread. Value is path to cpuset or empty string to run on same cpuset as process	`XLIO_INTERNAL_THREAD_CPUSET`	`""` (empty string)	Example: `/dev/cpuset/my_set`
performance.threading.internal_handler.behavior	Select which TCP control flows are done in the internal thread. Should be kept disabled if using blocking poll/select (epoll is OK)	`XLIO_TCP_CTL_THREAD`	`disable (0)`	`disable/0` – Disable `delegate/1` – Handle TCP timers in application context threads
performance.threading.internal_handler.timer_msec	Control XLIO internal thread wakeup timer resolution (in milliseconds)	`XLIO_TIMER_RESOLUTION_MSEC`	`10`	-
performance.threading.internal_handler.wakeup_per_packet	Wake up the internal thread for each packet that the CQ receives. Can minimize latency for busy applications but might decrease performance for high PPS applications	`XLIO_INTERNAL_THREAD_ARM_CQ`	`0` (Disabled)	-
performance.threading.mutex_over_spinlock	Control locking type mechanism for some specific flows. Note that usage of Mutex might increase latency	`XLIO_MULTILOCK`	`false` (Spin)	-
performance.threading.worker_threads	Controls which execution model and number of worker threads are used to handle networking and progress sockets. Two modes: Run to Completion (0) and Worker Threads (>0)	`XLIO_WORKER_THREADS`	`0` (Run to Completion execution model)	Range: `0–512`
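
The `performance.threading.cpu_affinity` row accepts either a hexadecimal bitmask or a comma-delimited list of values and ranges. The Python sketch below expands both forms into a set of processor numbers, which can help when checking a value before applying it. It is not XLIO's parser: the function name `parse_cpu_affinity` is hypothetical, bit 0 of the mask is taken to mean processor 0 (as in the `0x00000001` example), and `-1` is treated as "no affinity set".

```python
def parse_cpu_affinity(value):
    """Expand a hex bitmask or a list such as '0,1,7-10' into a set of CPU numbers."""
    text = value.strip()
    if text == "-1":
        return set()  # documented default: affinity disabled
    if text.lower().startswith("0x"):
        mask = int(text, 16)
        # Bit N set in the mask selects processor N.
        return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}
    cpus = set()
    for part in text.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus

print(sorted(parse_cpu_affinity("0x00000001")))  # [0]
print(sorted(parse_cpu_affinity("0,1,7-10")))    # [0, 1, 7, 8, 9, 10]
```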

Profiles

Parameter	Description	Deprecated Environment Variable	Default	Values/Examples/Notes
profiles.spec	XLIO predefined specification profiles	`XLIO_SPEC`	`none (0)`	`none/0` – No profile applied `latency/1` – Optimized for latency-sensitive use cases `ultra_latency/2` – Optimized for ultra-low latency using single-threaded model; avoids OS polling and progress engine `nginx/3` – Optimized for nginx (must be used to offload nginx). This profile is turned on indirectly by setting `applications.nginx.workers_num=n` `nginx_dpu/4` – Optimized for nginx running inside NVIDIA DPU `nvme_bf3/5` – Optimized for SPDK solution over NVIDIA DPU BF3 `all/6` – Reserved. Examples: `profiles.spec=latency` `profiles.spec=ultra_latency` `profiles.spec=nginx_dpu applications.nginx.workers_num=<N>` `profiles.spec=nvme_bf3`
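
Parameters in this reference are written in dotted form, such as `profiles.spec`, while the JSON example for `acceleration_control.rules` shows the same hierarchy as nested objects. Assuming the other parameters nest the same way, the Python sketch below converts dotted `key=value` settings into that nested JSON shape. The function name `to_nested` and the worker count of `4` are hypothetical values chosen for this example.

```python
import json

def to_nested(settings):
    """Turn {'a.b.c': value} into {'a': {'b': {'c': value}}}."""
    root = {}
    for dotted, value in settings.items():
        node = root
        *parents, leaf = dotted.split(".")
        for key in parents:
            # Create intermediate objects on demand and descend into them.
            node = node.setdefault(key, {})
        node[leaf] = value
    return root

config = to_nested({
    "profiles.spec": "nginx_dpu",
    "applications.nginx.workers_num": 4,
})
print(json.dumps(config, sort_keys=True))
# {"applications": {"nginx": {"workers_num": 4}}, "profiles": {"spec": "nginx_dpu"}}
```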
