NVIDIA Accelerated IO (XLIO) Documentation Rev 2.0.6

Introduction to XLIO

The Accelerated IO (XLIO) library is a dynamically linked, user-space Linux library that offloads network traffic and transparently improves the performance of socket-based, networking-heavy applications over an Ethernet network. XLIO exposes the standard socket API on top of a kernel-bypass architecture, enabling a hardware-based direct copy between an application’s user-space memory and the network interface.

The XLIO library accelerates TCP and UDP socket applications by offloading traffic from user space directly to the network interface card (NIC), without passing through the kernel and its standard IP stack (kernel bypass). XLIO increases the overall packet rate and improves latency and CPU utilization for NGINX-based and SPDK-based applications.
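
Because acceleration happens behind the standard socket API, the code being accelerated is ordinary socket code. As a minimal sketch, the TCP echo exchange below uses only Python's standard `socket` module and contains nothing XLIO-specific. The helper names `start_echo_server` and `echo_once` are illustrative, and the launch command in the comment assumes the library is preloaded as `libxlio.so`; check the installation guide for your release before relying on it. The example runs over loopback only to stay self-contained.

```python
import socket
import threading

# Assumption: an unmodified script like this would be launched with the
# library preloaded, e.g.  LD_PRELOAD=libxlio.so python3 echo.py
# Nothing below depends on XLIO being present.


def start_echo_server(host="127.0.0.1", port=0):
    """Start a one-shot TCP echo server; return (thread, bound_port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))  # port 0 lets the OS pick a free port
    srv.listen(1)
    bound_port = srv.getsockname()[1]

    def serve():
        # Accept one connection and echo bytes back until the peer closes.
        with srv:
            conn, _ = srv.accept()
            with conn:
                while True:
                    data = conn.recv(4096)
                    if not data:
                        break
                    conn.sendall(data)

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return thread, bound_port


def echo_once(port, payload, host="127.0.0.1"):
    """Send payload over TCP and return everything the server echoes back."""
    with socket.create_connection((host, port), timeout=5) as cli:
        cli.sendall(payload)
        cli.shutdown(socket.SHUT_WR)  # tell the server the request is complete
        chunks = []
        while True:
            chunk = cli.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    _, port = start_echo_server()
    print(echo_once(port, b"hello"))
```

The same `socket`, `bind`, `recv`, and `sendall` calls are what a preloaded user-space library would intercept, which is why no source changes are needed.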

Coupling XLIO with NVIDIA ConnectX-6 Dx, NVIDIA ConnectX-7 or NVIDIA BlueField-2 data processing unit (DPU) acceleration capabilities provides breakthrough performance of Transport Layer Security (TLS) encryption, without application code changes, using a standard socket API.

The XLIO library uses the direct hardware access and advanced polling techniques of RDMA-capable NVIDIA network cards. Direct hardware access over Ethernet is what enables kernel bypass: XLIO bypasses the kernel’s network stack for all IP traffic sent or received through socket API calls. Applications using the XLIO library therefore gain many benefits, including:

  • Reduced context switches and interrupts, which result in:

    • Higher throughput

    • More requests per second

    • More new connections per second

    • Improved CPU utilization

    • Lower latency

  • Minimal buffer copies between user data and hardware – XLIO needs a single copy to transfer a unicast or multicast offloaded packet between hardware and the application’s data buffers.
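
The benefits listed above are typically quantified as requests per second and round-trip latency percentiles. The sketch below shows one way to compute them from per-request timings using only the standard library. The helper names `time_calls`, `percentile`, and `summarize` are hypothetical, the percentile uses the nearest-rank method, and the rate calculation assumes a single client issuing requests one after another. It implies no particular result for XLIO.

```python
import math
import time


def time_calls(fn, count):
    """Call fn() count times and return each call's duration in seconds."""
    latencies = []
    for _ in range(count):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an ascending list (0 < pct <= 100)."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_samples)))
    return sorted_samples[rank - 1]


def summarize(latencies_s):
    """Summarize per-request latencies (seconds) from a sequential client."""
    ordered = sorted(latencies_s)
    total = sum(ordered)
    return {
        "requests": len(ordered),
        # Sequential requests: rate is the count divided by total time spent.
        "requests_per_second": len(ordered) / total if total > 0 else float("inf"),
        "p50_s": percentile(ordered, 50),
        "p99_s": percentile(ordered, 99),  # the tail shows latency spikes
        "max_s": ordered[-1],
    }
```

Running the same measurement against an application with and without the library loaded gives a like-for-like comparison of throughput and tail latency.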

Good application candidates for XLIO include, but are not limited to:

  • Applications that require NGINX TCP/IP performance acceleration for HTTP/HTTPS traffic, DNS over HTTPS (DoH), and CDN workloads, optionally combined with hardware offload of TLS encryption.

  • Applications that require accelerated SPDK (Storage Performance Development Kit) performance over NVMe-over-TCP.

The XLIO library provides several significant advantages:

  • The wire protocol for both unicast and multicast traffic is standard TCP and UDP over IPv4/IPv6, which interoperates with any TCP/UDP/IP networking stack. The peer can therefore be any machine running any OS, located anywhere on the Ethernet network.

    Warning

    Because XLIO uses a standard protocol, an application can use XLIO for asymmetric acceleration. A “TCP server-side” application, a “multicast consuming” only application, or a “multicast publishing” only application can be accelerated on its own while remaining compatible with its Ethernet peers.

  • Kernel bypass for unicast and multicast transmit and receive operations. This delivers much lower CPU overhead because the kernel TCP/IP stack overhead is not incurred.

  • Better CPU utilization by eliminating the context switch costs using kernel bypass

  • Reduced number of context switches. All XLIO software is implemented in user space, in the user application’s context. This allows the server to process a significantly higher packet rate than would otherwise be possible.

  • No buffer copies in the kernel

  • Fewer hardware interrupts for received/transmitted packets

  • Fewer of the queue congestion problems seen in standard TCP/IP applications

  • Supports legacy socket applications – no need for application code rewrite

  • Maximizes messages per second (MPS) rates

  • Minimizes message latency

  • Reduces latency spikes (outliers)

  • Lowers the CPU usage required to handle traffic

  • Hardware offload of TLS encryption for Tx and Rx paths

  • Zero-copy extra API for Tx and Rx
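
To illustrate the “multicast consuming” only case mentioned in the list above, here is a minimal sketch of a UDP multicast receiver written against the standard socket API. The helper names `membership_request` and `open_multicast_receiver` are illustrative. Any group address or port passed in (for example `239.1.1.1` and `5000`) is an arbitrary placeholder. Nothing in the code is XLIO-specific, and the publisher on the other side can be any standard UDP sender.

```python
import socket
import struct


def membership_request(group, interface_ip="0.0.0.0"):
    """Pack an ip_mreq structure: group address, then local interface address."""
    return struct.pack(
        "4s4s", socket.inet_aton(group), socket.inet_aton(interface_ip)
    )


def open_multicast_receiver(group, port, interface_ip="0.0.0.0"):
    """Return a UDP socket bound to port and joined to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several receivers on the same host to share the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Join the group; 0.0.0.0 lets the OS choose the interface.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        membership_request(group, interface_ip),
    )
    return sock
```

A consumer would then call `recvfrom` on the returned socket in a loop. Only the receiving process needs to run with the library for its side to be accelerated.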

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.