NVIDIA DOCA IP Fragmentation Application Guide
This document describes an IP Fragmentation implementation on top of the NVIDIA® BlueField® DPU.
This IP Fragmentation application is designed to handle IP fragmentation and reassembly efficiently, ensuring minimal processing overhead for non-fragmented packets while maintaining high performance for fragmented packets.
The application operates on a multi-core architecture, uses Receive Side Scaling (RSS) to distribute traffic, and supports configurable modes for flexible port configurations.
Key Features:
IP Reassembly:
Functionality: The application assembles fragmented packets received on input ports based on their fragmentation headers.
Workflow: Upon successful reassembly, the complete packets are forwarded to their destination port.
IP Fragmentation:
Functionality: Packets exceeding a configurable Maximum Transmission Unit (MTU) are fragmented into smaller packets.
Workflow: Fragments are generated with correct headers and forwarded while maintaining efficient resource utilization.
Transparent Forwarding: Packets that are neither fragmented nor require reassembly are forwarded directly without additional processing overhead.
Inner and Outer Fragmentation Handling: The application supports handling fragmentation at both inner (e.g., encapsulated traffic like GRE, VXLAN) and outer IP layers.
Performance Optimization:
Designed for high throughput using multi-core processing.
Utilizes RSS to distribute traffic across multiple cores, ensuring efficient CPU utilization and scalability.
Debuggability with Counters.
Dual Operating Modes:
Mode 1 (Two Ports): Forwarding between two ports (e.g., Port 0 ↔ Port 1).
Mode 2 (Four Ports): Forwarding from Port A to Port B and Port C to Port D, enabling simultaneous independent operations on two traffic streams.
The IP Fragmentation application runs on the DPU, serving as an underlying service for host applications.
Supported Modes:
Dual Port Mode: Traffic flows bidirectionally between two ports.
Quad Port Mode: Independent unidirectional forwarding from Port A → Port B and Port C → Port D.
Notes:
Both diagrams illustrate the flow for a single direction; however, the application operates bidirectionally.
In both modes, non-fragmented or valid-sized packets follow the same flow path without additional actions.
The IP Fragmentation application runs on top of the DOCA API to send and receive packets.
Operational Workflow
Packet Reception and Classification:
Traffic is received on the input ports, with RSS distributing flows to available cores.
Packets are classified into three categories:
Fragmented (Needs Reassembly)
Too Large (Needs Fragmentation)
Standard Packets (Direct Forwarding)
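The three-way classification above can be sketched as follows. This is an illustrative Python model, not the application's DOCA code. It assumes IPv4, where a packet is a fragment if its "More Fragments" bit is set or its fragment offset is non-zero. The names `PacketClass` and `classify` are hypothetical.

```python
from enum import Enum

MF_FLAG = 0x2000      # "More Fragments" bit in the IPv4 flags/offset field
OFFSET_MASK = 0x1FFF  # lower 13 bits: fragment offset in 8-byte units


class PacketClass(Enum):
    NEEDS_REASSEMBLY = "fragmented"
    NEEDS_FRAGMENTATION = "too large"
    DIRECT_FORWARD = "standard"


def classify(flags_and_offset: int, total_length: int, mtu: int) -> PacketClass:
    """Classify an IPv4 packet from its flags/offset field and total length."""
    # A fragment either announces more fragments or sits at a non-zero offset.
    is_fragment = bool(flags_and_offset & MF_FLAG) or bool(flags_and_offset & OFFSET_MASK)
    if is_fragment:
        return PacketClass.NEEDS_REASSEMBLY
    if total_length > mtu:
        return PacketClass.NEEDS_FRAGMENTATION
    return PacketClass.DIRECT_FORWARD
```

The sketch checks the fragment condition first, so only complete packets are ever compared against the MTU. It deliberately ignores the "Don't Fragment" bit.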
Reassembly:
Fragments are buffered and reassembled using a configurable timeout.
Once reassembled, the full packet is validated and forwarded.
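A minimal sketch of timeout-based reassembly, in Python for brevity, is shown below. It is not the application's implementation. It assumes fragments are keyed by a flow tuple (for example source, destination, protocol, and IP identification), that offsets are given in bytes, and that overlapping fragments are simply never completed. `Reassembler` and `ReassemblyEntry` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReassemblyEntry:
    first_seen: float
    parts: dict = field(default_factory=dict)  # byte offset -> payload bytes
    total_len: Optional[int] = None            # known once the last fragment arrives


class Reassembler:
    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.table = {}  # flow key -> ReassemblyEntry

    def add(self, key, offset, more_fragments, payload, now):
        """Buffer one fragment; return the full payload once complete, else None."""
        entry = self.table.get(key)
        if entry is not None and now - entry.first_seen > self.timeout_s:
            del self.table[key]  # stale entry: discard and start over
            entry = None
        if entry is None:
            entry = self.table[key] = ReassemblyEntry(first_seen=now)
        entry.parts[offset] = payload
        if not more_fragments:
            entry.total_len = offset + len(payload)
        if entry.total_len is None:
            return None
        # Walk the offsets in order and check that they tile the packet without gaps.
        data = bytearray()
        for off in sorted(entry.parts):
            if off != len(data):
                return None  # gap or overlap: keep waiting
            data += entry.parts[off]
        if len(data) != entry.total_len:
            return None
        del self.table[key]
        return bytes(data)

    def expire(self, now):
        """Drop incomplete entries older than the timeout; return how many were dropped."""
        stale = [k for k, e in self.table.items() if now - e.first_seen > self.timeout_s]
        for k in stale:
            del self.table[k]
        return len(stale)
```

Fragments may arrive in any order: the entry completes only once the last fragment has been seen and the buffered pieces cover the packet from offset 0. A periodic call to `expire` models the configurable timeout for incomplete packets.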
Fragmentation:
Large packets exceeding the MTU are fragmented.
Fragments are prepared with correct headers, sequence numbers, and size.
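The splitting step can be sketched as below, in Python and independent of the DOCA API. It assumes IPv4 rules: fragment offsets are expressed in 8-byte units, every fragment except the last sets the "More Fragments" flag, and non-final fragments carry a payload that is a multiple of 8 bytes. The function name `fragment_payload` and the default 20-byte header length are assumptions for illustration.

```python
def fragment_payload(payload: bytes, mtu: int, header_len: int = 20):
    """Split an IP payload into (offset_in_8_byte_units, more_fragments, chunk) tuples."""
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_chunk = (mtu - header_len) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    for start in range(0, len(payload), max_chunk):
        chunk = payload[start:start + max_chunk]
        more = start + max_chunk < len(payload)  # False only for the last fragment
        fragments.append((start // 8, more, chunk))
    return fragments
```

For a 3000-byte payload and an MTU of 1500, this yields three fragments of 1480, 1480, and 40 bytes at offsets 0, 185, and 370 (in 8-byte units).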
Direct Forwarding:
Standard packets are forwarded with minimal processing.
Performance and Scalability
Multi-Core Processing:
The application scales horizontally with the number of CPU cores, with each core handling a subset of traffic flows.
RSS Traffic Distribution:
Receive Side Scaling ensures optimal load balancing across cores.
Minimal Overhead:
Processing logic is optimized for low-latency handling of standard packets while ensuring efficient fragmentation and reassembly operations.
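The idea behind RSS-based distribution can be illustrated with a short Python sketch. It is a simplified stand-in: real RSS is computed by the NIC (typically with a Toeplitz hash), whereas this sketch uses CRC32 purely for illustration. The function name `pick_core` and the choice of hash fields are assumptions.

```python
import zlib


def pick_core(src_ip: str, dst_ip: str, protocol: int, num_cores: int) -> int:
    """Map a flow to a core index; packets of the same flow always land on the same core."""
    key = f"{src_ip}|{dst_ip}|{protocol}".encode()
    return zlib.crc32(key) % num_cores
```

The sketch hashes only on L3 fields. Since non-first IP fragments do not carry the L4 header, leaving ports out of the key keeps all fragments of a packet on the same core in this model.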
Debugging and Monitoring
The application provides real-time counters that give insight into:
Packets processed.
Fragments reassembled or fragmented.
Errors such as timeout on incomplete fragments.
This application leverages the following DOCA libraries:
TODO(Vlad)
TODO(Vlad)
Please refer to the NVIDIA DOCA Installation Guide for Linux for details on how to install BlueField-related software.
The installation of DOCA's reference applications contains the sources of the applications, alongside the matching compilation instructions. This allows for compiling the applications "as-is" and provides the ability to modify the sources, then compile a new version of the application.
For more information about the applications as well as development and compilation tips, refer to the DOCA Reference Applications page.
The sources of the application can be found under the application's directory: /opt/mellanox/doca/applications/file_compression/.
Compiling All Applications
All DOCA applications are defined under a single meson project. So, by default, the compilation includes all of them.
To build all the applications together, run:
cd /opt/mellanox/doca/applications/
meson /tmp/build
ninja -C /tmp/build
doca_file_compression is created under /tmp/build/file_compression/.
Compiling Only the Current Application
To directly build only the file compression application: (REPLACE WITH FLAG FOR YOUR APP)
cd /opt/mellanox/doca/applications/
meson /tmp/build -Denable_all_applications=false -Denable_file_compression=true
ninja -C /tmp/build
Info: doca_file_compression is created under /tmp/build/file_compression/.
Alternatively, one can set the desired flags in the meson_options.txt file instead of providing them in the compilation command line:
Edit the following flags in /opt/mellanox/doca/applications/meson_options.txt:
Set enable_all_applications to false
Set enable_file_compression to true
The same compilation commands should be used, as were shown in the previous section:
cd /opt/mellanox/doca/applications/
meson /tmp/build
ninja -C /tmp/build
Info: doca_file_compression is created under /tmp/build/file_compression/.
Troubleshooting
Please refer to the NVIDIA DOCA Troubleshooting for any issue you may encounter with the compilation of the DOCA applications.
Prerequisites
The IP Fragmentation application is based on DOCA Flow. Therefore, the user is required to allocate huge pages.
$ echo '1024' | sudo tee -a /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
$ sudo mkdir /mnt/huge
$ sudo mount -t hugetlbfs -o pagesize=2M nodev /mnt/huge
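To confirm that the allocation took effect, the huge page counters reported in /proc/meminfo can be inspected. The following is a small Python sketch for that check. The helper name `parse_hugepages` is hypothetical and is not part of DOCA.

```python
def parse_hugepages(meminfo_text: str) -> dict:
    """Extract HugePages_Total, HugePages_Free and Hugepagesize (kB) from /proc/meminfo text."""
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    result = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name in wanted:
            # The first token after the colon is the numeric value.
            result[name] = int(rest.split()[0])
    return result
```

On the target system, calling `parse_hugepages(open("/proc/meminfo").read())` after the commands above should report a `HugePages_Total` of at least 1024 if the allocation succeeded.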
Application Execution
The IP Fragmentation application is provided in source form, hence a compilation is required before the application can be executed.
Application usage instructions: (REPLACE WITH REAL HELP AS PRINTED FROM YOUR APP)
Usage: doca_file_compression [DOCA Flags] [Program Flags]

DOCA Flags:
  -h, --help              Print a help synopsis
  -v, --version           Print program version information
  -l, --log-level         Set the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
  --sdk-log-level         Set the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
  -j, --json <path>       Parse all command flags from an input json file

Program Flags:
  -p, --pci-addr          DOCA Comm Channel device PCI address
  -r, --rep-pci           DOCA Comm Channel device representor PCI address
  -f, --file              File to send by the client / File to write by the server
  -t, --timeout           Application timeout for receiving file content messages, default is 5 sec

For additional information, please refer to the Command Line Flags section below.
Note: The above usage printout can be printed to the command line using the -h (or --help) options: (REPLACE WITH REAL PATH OF YOUR APP)
./doca_file_compression -h
CLI example for running the application on BlueField: (REPLACE WITH REAL PATH & COMMAND OF YOUR APP)
./doca_file_compression -p 03:00.0 -r 3b:00.0 -f received.txt
Note: Both the DOCA Comm Channel device PCI address (03:00.0) and the DOCA Comm Channel device representor PCI address (3b:00.0) should match the addresses of the desired PCI devices.
CLI example for running the application on the host: (REPLACE WITH REAL PATH & COMMAND OF YOUR APP)
./doca_file_compression -p 3b:00.0 -f send.txt
Note: The DOCA Comm Channel device PCI address (3b:00.0) should match the address of the desired PCI device.
The application also supports a JSON-based deployment mode, in which all command-line arguments are provided through a JSON file:
./doca_file_compression --json [json_file]
For example: (REPLACE WITH REAL PATH & COMMAND OF YOUR APP)
./doca_file_compression --json ./file_compression_params.json
Note: Before execution, please ensure that the JSON file in use contains the correct configuration parameters, especially the PCI addresses needed for the deployment.
Command Line Flags
Flag Type | Short Flag | Long Flag/JSON Key | Description | JSON Content
General flags | h | help | Prints a help synopsis | N/A
General flags | v | version | Prints program version information | N/A
General flags | l | log-level | Set the log level for the application: 10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE |
General flags | N/A | sdk-log-level | Sets the log level for the program: 10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE |
General flags | j | json | Parse all command flags from an input json file | N/A
Program flags | f | file | For client: path to the file to be sent. For server: path to write the file into. Note: This is a mandatory flag. |
Program flags | p | pci-addr | Comm Channel DOCA device PCIe address. Note: This is a mandatory flag. |
Program flags | r | rep-pci | Comm Channel DOCA device representor PCIe address. Note: This flag is mandatory only on the DPU. |
Refer to DOCA Arg Parser for more information regarding the supported flags and execution modes.
Troubleshooting
Please refer to the NVIDIA DOCA Troubleshooting for any issue you may encounter with the installation or execution of the DOCA applications.
1. Parse application arguments.
   a. Initialize arg parser resources and register DOCA general parameters.
      doca_argp_init();
   b. Register file compression application parameters.
      register_file_compression_params();
   c. Parse the arguments.
      doca_argp_start();
      - Parse app parameters.
2. Set endpoint attributes.
   set_endpoint_properties();
   - Set maximum message size of 4080 bytes.
   - Set maximum number of messages allowed.
3. Create comm channel endpoint.
   doca_comm_channel_ep_create();
   - Create endpoint for client/server.
4. Run client/server main logic.
   file_compression_client/server();
5. Clean up the file compression application.
   file_compression_cleanup();
   - Free all application resources.
6. Arg parser destroy.
   doca_argp_destroy();
/opt/mellanox/doca/applications/file_compression/
/opt/mellanox/doca/applications/file_compression/file_compression_params.json