The Accelerated IO (XLIO) library is a dynamically linked, user-space Linux library that offloads network traffic and transparently improves the performance of socket-based, networking-heavy applications over an Ethernet network. XLIO exposes the standard socket API on top of a kernel-bypass architecture, enabling a hardware-based direct copy between an application’s user-space memory and the network interface.
The XLIO library accelerates TCP and UDP socket applications by offloading traffic from user space directly to the network interface card (NIC), without going through the kernel and the standard IP stack (kernel bypass). XLIO increases the overall packet rate and improves latency and CPU utilization for NGINX-based and SPDK-based applications.
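To make "standard socket API" concrete, here is a minimal sketch of the kind of program the paragraph above describes: a TCP echo round trip written only with ordinary socket calls. Nothing in it is XLIO-specific. The function names `echo_once` and `round_trip` are hypothetical, and loopback is used only so the sketch runs anywhere. As an assumption about deployment, a dynamically linked library of this kind is typically injected at launch through the dynamic loader's `LD_PRELOAD` variable; the exact library name, and whether a given flow is actually offloaded, should be taken from the XLIO documentation for your installation.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and echo back whatever arrives until the peer closes."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # peer finished sending
                break
            conn.sendall(data)


def round_trip(payload, host="127.0.0.1"):
    """Send payload to a local echo server and return the echoed bytes."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
    worker.start()

    # Plain connect/send/recv: the only API the application needs to know.
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)  # signal end of request
        chunks = []
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)

    worker.join(timeout=5)
    server.close()
    return b"".join(chunks)
```

The point of the sketch is what is absent: no special headers, handles, or initialization calls. Any acceleration would come from how the process is launched, not from changes to this code.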
Coupling XLIO with the acceleration capabilities of NVIDIA ConnectX-6 Dx or the NVIDIA BlueField-2 data processing unit (DPU) provides breakthrough performance for Transport Layer Security (TLS) encryption through the standard socket API, without application code changes.
The XLIO library uses the direct hardware access and advanced polling techniques of RDMA-capable NVIDIA network cards. Direct hardware access over Ethernet is what enables the kernel bypass: XLIO bypasses the kernel’s network stack for all IP traffic on transmit and receive socket API calls. As a result, applications using the XLIO library gain many benefits, including:
- Reduced context switches and interrupts, which result in:
  - Higher throughput
  - More requests per second
  - More new connections per second
  - Improved CPU utilization
  - Lower latency
- Minimal buffer copies between user data and hardware – XLIO needs a single copy to transfer a unicast or multicast offloaded packet between hardware and the application’s data buffers.
Good application candidates for XLIO include, but are not limited to:
- Applications that require NGINX TCP/IP performance acceleration for HTTP/HTTPS traffic, DNS over HTTPS (DoH), and content delivery networks (CDN), as well as hardware offload of TLS encryption.
- Applications that require accelerated SPDK (Storage Performance Development Kit) performance for NVMe over TCP.
Advanced XLIO Features
The XLIO library provides several significant advantages:
- The underlying wire protocol for both unicast and multicast traffic is standard TCP and UDP over IPv4, which is interoperable with any TCP/UDP/IP networking stack. The opposite side of the communication can therefore be any machine running any OS, located anywhere on an Ethernet network
- Because the wire protocol is standard, an application can use XLIO for asymmetric acceleration. A “TCP server-side” only, “multicast consuming” only, or “multicast publishing” only application can be accelerated while remaining compatible with unaccelerated Ethernet peers
- Kernel bypass for unicast and multicast transmit and receive operations. This delivers much lower CPU overhead since TCP/IP stack overhead is not incurred
- Better CPU utilization by eliminating the context switch costs using kernel bypass
- Reduced number of context switches. All XLIO software is implemented in user space in the user application’s context. This allows the server to process a significantly higher packet rate than would otherwise be possible
- Minimal buffer copies. Data is transferred from the hardware NIC straight to the application buffer in user space, with only a single intermediate user-space buffer and zero kernel IO buffers
- Fewer hardware interrupts for received/transmitted packets
- Fewer queue congestion problems witnessed in standard TCP/IP applications
- Supports legacy socket applications – no need for application code rewrite
- Maximizes messages per second (MPS) rates
- Minimizes message latency
- Reduces latency spikes (outliers)
- Lowers the CPU usage required to handle traffic
- Hardware offload of TLS encryption for Tx and Rx paths
- Reduces data copies by providing an extra zero-copy API for Tx and Rx
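The asymmetric "multicast publishing only" case mentioned above can be sketched with ordinary UDP socket calls. The example below is illustrative only: the group address `239.1.1.1`, port `5000`, and the helper names `make_publisher` and `publish` are hypothetical, and the code contains nothing XLIO-specific. Because the wire format is plain UDP multicast, the consumers of these datagrams could run on any stack, accelerated or not.

```python
import socket

GROUP = "239.1.1.1"  # hypothetical multicast group (administratively scoped range)
PORT = 5000          # hypothetical UDP port


def make_publisher(ttl=1):
    """Create a UDP socket configured for multicast sending.

    ttl limits how many router hops the datagrams may cross;
    1 keeps them on the local network segment.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def publish(sock, message, group=GROUP, port=PORT):
    """Send one datagram to the multicast group; returns the number of bytes sent."""
    return sock.sendto(message, (group, port))
```

A publisher would call `make_publisher()` once and then `publish(sock, b"tick")` for each message. Only the publishing host would need the library loaded for its transmit path to be a candidate for offload; subscribers join the group with the usual `IP_ADD_MEMBERSHIP` socket option on whatever networking stack they run.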