NVIDIA Accelerated IO (XLIO) Documentation Rev 3.50.3

NGINX

Introduction

This guide covers deploying NGINX over NVIDIA Accelerated IO (XLIO), focusing on best practices and recommended conventions.

NGINX is a web server that is simple to configure for serving static web content and can also be deployed to deliver dynamic content across networks. For more information, visit the official NGINX website.

NGINX & OpenSSL Build Instructions

We recommend using stable, official versions of NGINX and OpenSSL.

# mkdir /opt/nginx_xlio
# cd /opt/nginx_xlio
# git clone https://github.com/openssl/openssl.git -b openssl-3.0.2 
# git clone https://github.com/nginx/nginx.git -b release-1.21.6
# cd nginx
# auto/configure --prefix=/opt/nginx_xlio/install --with-openssl=/opt/nginx_xlio/openssl --with-http_ssl_module --with-http_v2_module --with-openssl-opt="enable-ktls -DOPENSSL_LINUX_TLS"
# make -j && make install
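
After the build, it can be worth confirming that the binary was configured with the options above. The following is a minimal sketch, assuming the default layout under the --prefix used above (binary at /opt/nginx_xlio/install/sbin/nginx). build_has_flag is a hypothetical helper name, not part of NGINX.

```shell
# Succeeds when the configure arguments read from stdin contain the given flag.
# build_has_flag is a hypothetical helper name used only for this sketch.
build_has_flag() {
    grep -qF -- "$1"
}

# Usage (nginx -V prints its configure arguments to stderr):
#   /opt/nginx_xlio/install/sbin/nginx -V 2>&1 | build_has_flag --with-http_ssl_module
#   /opt/nginx_xlio/install/sbin/nginx -V 2>&1 | build_has_flag enable-ktls
```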

NGINX Configuration File Example

This example can be adapted to your specific requirements.

Pay particular attention to the directives marked (1) through (5) in the comments.

# (1) This value must match XLIO_NGINX_WORKERS_NUM
worker_processes 16;
 
# (2) Run NGINX in the foreground - daemon mode is currently not supported by XLIO.
daemon off;
 
user root root;
worker_rlimit_nofile 1048575;
worker_priority -20;
error_log /dev/stdout info;
pid logs/nginx.pid;
 
events {
    worker_connections 200000;
    use epoll;
    multi_accept off;
    accept_mutex off;
}
 
http {
    # (3) Adjust to XLIO logic
    ssl_buffer_size 16128;

    # (4) Zero Copy optimization for files (sendfile API)
    sendfile on;

    include mime.types;
    default_type application/octet-stream;
    access_log off;
    client_body_timeout 1800s;
    client_header_timeout 1800s;
    send_timeout 1800s;
    keepalive_timeout 1h;
    keepalive_requests 100000000;
    # File caching optimizations:
    open_file_cache max=1000 inactive=20s;
    open_file_cache_valid 3600s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;

    server {
        # (5) Enable KTLS usage with NGINX. Comment out this directive to disable KTLS.
        ssl_conf_command Options KTLS;

        listen [SPECIFIC_IPV4_ADDR]:443 ssl default_server backlog=65535;
        listen [SPECIFIC_IPV6_ADDR]:443 ssl default_server backlog=65535;
        server_name localhost;
        ssl_certificate /etc/ssl/certs/nginx-rsa-selfsigned.crt;
        ssl_certificate_key /etc/ssl/private/nginx-rsa-selfsigned.key;
        # Please see NVIDIA TLS offload supported ciphers
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers "ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:AES256-GCM-SHA384:AES128-GCM-SHA256";
        ssl_conf_command Ciphersuites "TLS_AES_256_GCM_SHA384:TLS_AES_128_GCM_SHA256";
        ssl_prefer_server_ciphers on;

        location / {
            root html;
            index index.html index.htm;
        }
    }
}
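
Because directive (1) ties worker_processes to XLIO_NGINX_WORKERS_NUM, a small pre-launch check can catch a mismatch early. This is a sketch only: nginx_worker_count and workers_match are hypothetical helper names, and the parsing assumes a single worker_processes directive with a numeric value written on one line, as in the example above.

```shell
# Print the value of the first worker_processes directive found in an nginx.conf.
nginx_worker_count() {
    awk '$1 == "worker_processes" { sub(/;$/, "", $2); print $2; exit }' "$1"
}

# Succeed when the worker count in the config file ($1) equals the XLIO worker count ($2).
workers_match() {
    conf_workers=$(nginx_worker_count "$1")
    [ -n "$conf_workers" ] && [ "$conf_workers" = "$2" ]
}

# Usage (the path is a placeholder):
#   workers_match /path/to/nginx.conf "$XLIO_NGINX_WORKERS_NUM" \
#       || echo "worker_processes does not match XLIO_NGINX_WORKERS_NUM" >&2
```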

TLS HW Offload Ciphers

To utilize TLS HW offload, please refer to TLS HW Offload for requirements and supported ciphers.

Tunings - Best Practices

NUMA Considerations

Running the application on the same NUMA node as the NVIDIA network adapter avoids cross-node memory access and the latency it adds.

Check the NUMA node of a network interface:

#<host> ip addr show
#<host> cat /sys/class/net/<interface_name>/device/numa_node

Bind application to NUMA node:

#<host> numactl --cpunodebind=<NUMA_NODE> <your_application>
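
The two steps above can be combined in a small script. This is a sketch under stated assumptions: iface_numa_node is a hypothetical helper name, and falling back to node 0 when the kernel reports -1 (no NUMA affinity) is a choice made for this example, not an XLIO requirement.

```shell
# Print the NUMA node of a network interface.
# $1: interface name; $2: optional sysfs root (defaults to /sys, overridable for testing).
iface_numa_node() {
    node_file="${2:-/sys}/class/net/$1/device/numa_node"
    [ -r "$node_file" ] || return 1
    node=$(cat "$node_file")
    # The kernel reports -1 when the device has no NUMA affinity;
    # falling back to node 0 is an assumption of this sketch.
    if [ "$node" -lt 0 ]; then
        node=0
    fi
    echo "$node"
}

# Usage (interface and application names are placeholders):
#   node=$(iface_numa_node <interface_name>) && numactl --cpunodebind="$node" <your_application>
```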

Huge Pages Configuration

XLIO can leverage Huge Pages to reduce TLB misses and improve memory allocation efficiency.

Check supported Huge Page sizes:

#<host> ls /sys/kernel/mm/hugepages/

Allocate Huge Pages:

#<host> echo <number_of_hugepages> | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
OR
#<host> echo <number_of_hugepages> | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
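
The value of <number_of_hugepages> depends on how much memory should be backed by Huge Pages and on the page size. The arithmetic can be sketched as follows. hugepages_needed is a hypothetical helper name, and the 4096 MB target in the usage line is purely illustrative, not an XLIO recommendation.

```shell
# Print how many huge pages are needed to back a memory target, rounding up.
# $1: target size in MB; $2: huge page size in kB (for example 2048 or 1048576).
hugepages_needed() {
    target_mb=$1
    page_kb=$2
    echo $(( (target_mb * 1024 + page_kb - 1) / page_kb ))
}

# Usage (4096 MB is an illustrative target only):
#   hugepages_needed 4096 2048 | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
```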

NUMA-aware Huge Pages allocation (recommended):

We recommend that Huge Pages be allocated on the same NUMA node as the application to reduce memory access latency and optimize performance.

#<host> echo <number_of_hugepages> | sudo tee /sys/devices/system/node/node<NUMA_NODE>/hugepages/hugepages-2048kB/nr_hugepages
OR
#<host> echo <number_of_hugepages> | sudo tee /sys/devices/system/node/node<NUMA_NODE>/hugepages/hugepages-1048576kB/nr_hugepages
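
The kernel may allocate fewer pages than requested when memory on the node is fragmented, so reading the count back after writing it is a reasonable precaution. A minimal sketch, assuming the standard sysfs layout in which per-node directories are named node0, node1, and so on. Both function names are hypothetical.

```shell
# Print the per-node sysfs file that controls huge pages of a given size.
# $1: NUMA node number; $2: huge page size in kB.
node_hugepages_file() {
    echo "/sys/devices/system/node/node${1}/hugepages/hugepages-${2}kB/nr_hugepages"
}

# Succeed when the count stored in the file ($1) is at least the requested count ($2).
hugepages_allocated() {
    actual=$(cat "$1") || return 1
    [ "$actual" -ge "$2" ]
}

# Usage (node 0 and 1024 pages are illustrative values):
#   f=$(node_hugepages_file 0 2048)
#   echo 1024 | sudo tee "$f" > /dev/null
#   hugepages_allocated "$f" 1024 || echo "fewer huge pages allocated than requested" >&2
```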

Running NGINX over XLIO

Set the locked-memory limit to unlimited:

#<host> ulimit -l unlimited

XLIO configuration example for x86:

#<host> export XLIO_SPEC=nginx
#<host> export XLIO_NGINX_WORKERS_NUM=16
#<host> export XLIO_TX_BUF_SIZE=16384
#<host> LD_PRELOAD=/path/to/libxlio.so /path/to/nginx -c /path/to/nginx.conf

Equivalent single-command form:

#<host> XLIO_SPEC=nginx XLIO_NGINX_WORKERS_NUM=16 XLIO_TX_BUF_SIZE=16384 <MORE_XLIO_PARAMS> LD_PRELOAD=/path/to/libxlio.so /path/to/nginx -c /path/to/nginx.conf

XLIO configuration example for aarch64 (BlueField):

#<host> export XLIO_SPEC=nginx_dpu
#<host> export XLIO_NGINX_WORKERS_NUM=16
#<host> export XLIO_TX_BUF_SIZE=16384
#<host> LD_PRELOAD=/path/to/libxlio.so /path/to/nginx -c /path/to/nginx.conf
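
The x86 and aarch64 examples differ only in the XLIO_SPEC value, so a launch script shared between hosts could pick it from the machine architecture. This is a sketch: xlio_spec_for_arch is a hypothetical helper name, and mapping every aarch64 machine to nginx_dpu assumes that the aarch64 host is a BlueField, as in the example above.

```shell
# Print the XLIO_SPEC value used in the examples above for a given machine architecture.
# $1: architecture string as printed by `uname -m`.
xlio_spec_for_arch() {
    case "$1" in
        x86_64)  echo nginx ;;
        aarch64) echo nginx_dpu ;;  # assumes the aarch64 host is a BlueField
        *)       return 1 ;;
    esac
}

# Usage:
#   XLIO_SPEC=$(xlio_spec_for_arch "$(uname -m)") || exit 1
#   export XLIO_SPEC
```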

Notes

XLIO offers additional optimization parameters that can be tuned for specific workloads.
For monitoring XLIO performance counters, please see Monitoring, Debugging, and Troubleshooting.

Troubleshooting

Client-Side Checks:

Verify that the client:

Is not CPU- or memory-bound
Can receive at full rate without packet drops
Does not trigger excessive retransmissions
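
For the retransmission check, one option is to sample the kernel TCP counters on the client before and after a test run. The sketch below assumes the client uses the Linux kernel TCP stack, whose counters are exposed in /proc/net/snmp. The function names are hypothetical, and what counts as "excessive" is left to the operator.

```shell
# Print "OutSegs RetransSegs" from an SNMP-style counters file.
# $1: optional path (defaults to /proc/net/snmp, overridable for testing).
tcp_out_and_retrans() {
    awk '$1 == "Tcp:" {
        if (!have_header) {
            # The first Tcp: line holds the column names.
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        # The second Tcp: line holds the values.
        print $(col["OutSegs"]), $(col["RetransSegs"])
    }' "${1:-/proc/net/snmp}"
}

# Print retransmitted segments ($2) as a percentage of segments sent ($1).
retrans_percent() {
    awk -v out="$1" -v re="$2" \
        'BEGIN { if (out > 0) printf "%.2f\n", 100 * re / out; else print "0.00" }'
}
```

The counters are cumulative since boot, so the useful figure is the difference between two samples taken around the test run.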
