DOCA Pipeline Language Runtime Controller Gateway SHM Application Guide
This document describes the usage of the NVIDIA DOCA Pipeline Language (DPL) Runtime Controller Gateway SHM sample application.
This sample application combines the DOCA Pipeline Language (DPL) services with the DPL Runtime Controller SDK to implement a VXLAN gateway application. It provides encapsulation/decapsulation logic and interactive packet/counter management.
The following diagram illustrates the connections between the DPL Runtime Controller-based application and the DPL Runtime Service (DPL daemon). The DPL daemon is loaded with a DPL gateway_shm.p4 program that implements the gateway logic.
The controller controls the DPL daemon through gRPC or Shared Memory (SHM) interfaces to insert, query, and delete HW steering rules. Arriving traffic from the wire and the VFs is processed according to the gateway program and passed to the relevant port based on the loaded program.
The application is based on two major components that work together to define and manage the forwarding state:
DPL Program (gateway_shm.p4)
A DPL program loaded onto the DPL Runtime Service:
gateway_shm.p4
/*
* SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: LicenseRef-NvidiaProprietary
*
* NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
* property and proprietary rights in and to this material, related
* documentation and any modifications thereto. Any use, reproduction,
* disclosure or distribution of this material and related documentation
* without an express license agreement from NVIDIA CORPORATION or
* its affiliates is strictly prohibited.
*/
#include <doca_model.p4>
#include <doca_headers.p4>
#include <doca_externs.p4>
#include <doca_parser.p4>
/*
* VxLAN tunnel gateway
* This application allows the user to have a customized tunnel gateway that can stitch VxLAN
* packets across tenant domains. End point traffic destined to local bare metal hosts can
* be decapsulated and forwarded, while gateway traffic can be decapsulated, then
* encapsulated back to the wire. This program can be easily extended to be a gateway across
* different tunnel types as well.
*/
/*
* Table sizes.
*/
const bit<32> DECAP_TABLE_SIZE = 32768;
const bit<32> ENCAP_TABLE_SIZE = 32768;
/* The directionality is based on network to host
* The user will configure the DPL port IDs in the DPL RT configuration
*/
const bit<32> WIRE_PORT = 32w0;
struct headers_t {
    NV_FIXED_HEADERS
}
parser packet_parser(packet_in packet, out headers_t headers) {
    NV_FIXED_PARSER(packet, headers)
}
/**
* This control performs the overlay policy including L2 encap with VxLAN
*/
control overlay_encap(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    NvDirectCounter(NvCounterType.PACKETS_AND_BYTES) encap_counter;
    action deny() {
        encap_counter.count();
        nv_drop();
    }
    action to_port(nv_logical_port_t port) {
        encap_counter.count();
        nv_send_to_port(port);
    }
    action vxlan_v4_encap(nv_mac_addr_t underlay_src_mac, nv_mac_addr_t underlay_dst_mac,
                          nv_ipv4_addr_t underlay_sip, nv_ipv4_addr_t underlay_dip, bit<24> vni, nv_logical_port_t port) {
        nv_set_vxlan_v4_underlay(headers, underlay_dst_mac, underlay_src_mac, underlay_sip, underlay_dip, vni);
        encap_counter.count();
        nv_send_to_port(port);
    }
    table encap_v4_table {
        key = {
            headers.ipv4.dst_addr : exact;
        }
        actions = {
            vxlan_v4_encap;
            to_port;
            deny;
        }
        size = ENCAP_TABLE_SIZE;
        default_action = deny;
        direct_counter = encap_counter;
        nv_high_update_rate = true;
    }
    apply {
        if (headers.ipv4.isValid()) {
            encap_v4_table.apply();
        }
    }
}
/**
* This control is for packets from wire to host (RX)
* and includes policy for L2 decap
*/
control underlay_decap(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    NvDirectCounter(NvCounterType.PACKETS_AND_BYTES) decap_counter;
    action deny() {
        decap_counter.count();
        nv_drop();
    }
    action decap() {
        decap_counter.count();
        nv_l2_decap(headers);
    }
    action to_port(nv_logical_port_t port) {
        nv_send_to_port(port);
    }
    action decap_to_port(nv_logical_port_t port) {
        decap();
        to_port(port);
    }
    table decap_v4_table {
        key = {
            headers.vxlan.vni : exact;
        }
        actions = {
            decap;
            to_port;
            decap_to_port;
            deny;
            NoAction;
        }
        size = DECAP_TABLE_SIZE;
        direct_counter = decap_counter;
        default_action = NoAction;
        nv_high_update_rate = true;
    }
    apply {
        if (headers.vxlan.isValid()) {
            decap_v4_table.apply();
        }
    }
}
control gateway(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    overlay_encap() over;
    underlay_decap() under;
    /* user should add entries that correspond to the wire ports
     * A hit means this is an RX packet, miss means a TX packet
     */
    table direction_table {
        key = {
            std_meta.ingress_port : exact;
        }
        actions = {
            NoAction;
        }
        default_action = NoAction;
        const entries = {
            (WIRE_PORT) : NoAction();
        }
    }
    apply {
        if (direction_table.apply().hit) {
            under.apply(headers, std_meta, user_meta, pkt_out_meta);
        }
        else {
            over.apply(headers, std_meta, user_meta, pkt_out_meta);
        }
    }
}
NvDocaPipeline(
    packet_parser(),
    gateway()
) main;
This P4 application implements a basic VXLAN termination and origination function for IPv4 traffic. Its primary goal is to differentiate between incoming packets from the underlay network (Rx/Decapsulation) and packets originating from a host (Tx/Encapsulation), applying the necessary L2 overlay policies in each direction.
The program logic is separated into three distinct control blocks: gateway, underlay_decap, and overlay_encap.
- gateway: Responsible for directing packets into the relevant control block (underlay_decap or overlay_encap) by matching on the ingress port.
- underlay_decap: Responsible for L2 decapsulation of packets from wire to host (Rx).
- overlay_encap: Responsible for the overlay policy, including L2 VXLAN encapsulation of packets from host to wire (Tx).
P4 Tables
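| Table Name | Control Block | Match Field | Actions | Purpose |
| --- | --- | --- | --- | --- |
| direction_table | gateway | ingress_port | NoAction | Determines the processing direction (Rx or Tx) |
| decap_v4_table | underlay_decap | VxLAN.vni | decap, to_port, decap_to_port, deny, NoAction | Core policy table for decapsulation; identifies the tenant context |
| encap_v4_table | overlay_encap | IPv4.dst_addr | vxlan_v4_encap, to_port, deny | Core policy table for encapsulation; determines the tunnel endpoint and VNI |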
Direct Counters
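| Counter Name | Tied to Table | Actions Tracked | Function |
| --- | --- | --- | --- |
| decap_counter | decap_v4_table | decap, decap_to_port (through decap), deny | Counts successfully decapsulated packets and denied packets |
| encap_counter | encap_v4_table | vxlan_v4_encap, to_port, deny | Counts successfully encapsulated packets and denied packets |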
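The way the three tables cooperate can be illustrated with a small software model. The sketch below is not part of the sample application and does not use any DPL API; the `Packet` and `Table` classes and the `process` function are hypothetical names introduced for illustration. It mirrors the P4 logic under two simplifying assumptions: a packet is reduced to the three fields the tables match on, and counters are tracked only for installed entries whose action calls `count()` in gateway_shm.p4.

```python
from dataclasses import dataclass, field
from typing import Optional

WIRE_PORT = 0  # mirrors `const bit<32> WIRE_PORT = 32w0` in gateway_shm.p4

# Actions that call count() in each control block of gateway_shm.p4.
DECAP_COUNTING = {"decap", "decap_to_port", "deny"}
ENCAP_COUNTING = {"vxlan_v4_encap", "to_port", "deny"}


@dataclass
class Packet:
    ingress_port: int
    ipv4_dst: Optional[str] = None  # None models an invalid IPv4 header
    vni: Optional[int] = None       # None models an invalid VXLAN header


@dataclass
class Table:
    default_action: str
    entries: dict = field(default_factory=dict)   # key -> (action, params)
    counters: dict = field(default_factory=dict)  # key -> packets counted

    def apply(self, key, counting_actions):
        """Exact-match lookup; returns (action, params)."""
        if key not in self.entries:
            return self.default_action, {}
        action, params = self.entries[key]
        if action in counting_actions:
            self.counters[key] = self.counters.get(key, 0) + 1
        return action, params


def process(pkt, decap_table, encap_table):
    """Model of the `gateway` control block's apply section."""
    if pkt.ingress_port == WIRE_PORT:      # direction_table hit: Rx path
        if pkt.vni is None:                # headers.vxlan.isValid() is false
            return "no_table_applied", {}
        return decap_table.apply(pkt.vni, DECAP_COUNTING)
    if pkt.ipv4_dst is None:               # headers.ipv4.isValid() is false
        return "no_table_applied", {}
    return encap_table.apply(pkt.ipv4_dst, ENCAP_COUNTING)  # Tx path


# Entries modeled on gateway.entries.json.
decap = Table("NoAction", {1: ("decap_to_port", {"port": 1}), 2: ("deny", {})})
encap = Table("deny", {"6.6.6.4": ("vxlan_v4_encap", {"vni": 1, "port": 0})})

print(process(Packet(ingress_port=0, vni=1), decap, encap))
print(process(Packet(ingress_port=1, ipv4_dst="6.6.6.4"), decap, encap))
print(decap.counters, encap.counters)
```

The model makes the direction decision explicit: only packets arriving on the wire port are considered for decapsulation, and everything else is treated as host traffic to be encapsulated or dropped.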
Control Application (gateway.entries.json)
A control application manages the daemon's HW steering rules by receiving a JSON file that describes the desired rules.
gateway.entries.json
{
    "doctype": "gateway_shm.p4",
    "tables": {
        "encap_v4_table": {
            "entries": [
                {
                    "match": {
                        "headers.ipv4.dst_addr": "6.6.6.4"
                    },
                    "action": "vxlan_v4_encap_encap_v4_table",
                    "params": {
                        "underlay_src_mac": "3C:6D:66:11:11:11",
                        "underlay_dst_mac": "ff:ff:ff:ff:ff:ff",
                        "underlay_sip": "6.6.6.3",
                        "underlay_dip": "6.6.6.2",
                        "vni": "1",
                        "port": "0"
                    }
                },
                {
                    "match": {
                        "headers.ipv4.dst_addr": "6.6.6.5"
                    },
                    "action": "vxlan_v4_encap_encap_v4_table",
                    "params": {
                        "underlay_src_mac": "3C:6D:66:11:11:11",
                        "underlay_dst_mac": "ff:ff:ff:ff:ff:ff",
                        "underlay_sip": "6.6.6.3",
                        "underlay_dip": "6.6.6.2",
                        "vni": "1",
                        "port": "0"
                    }
                }
            ]
        },
        "decap_v4_table": {
            "entries": [
                {
                    "match": {
                        "headers.vxlan.vni": "1"
                    },
                    "action": "decap_to_port_decap_v4_table",
                    "params": {
                        "port": "1"
                    }
                },
                {
                    "match": {
                        "headers.vxlan.vni": "2"
                    },
                    "action": "deny_decap_v4_table"
                }
            ]
        }
    }
}
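Before handing an entries file to the controller, it can be useful to check it against the tables and actions declared in gateway_shm.p4. The following is a minimal sketch of such a check, not something shipped with the sample: `check_entries` and `TABLE_ACTIONS` are hypothetical names, and the rule that a JSON action name is the P4 action name followed by `_<table name>` is an assumption inferred from the file above rather than a documented convention.

```python
import json

# Actions each table accepts, copied from the `actions = {...}` lists in gateway_shm.p4.
TABLE_ACTIONS = {
    "encap_v4_table": {"vxlan_v4_encap", "to_port", "deny"},
    "decap_v4_table": {"decap", "to_port", "decap_to_port", "deny", "NoAction"},
}


def check_entries(doc):
    """Return a list of human-readable problems found in a parsed entries document."""
    problems = []
    for table, body in doc.get("tables", {}).items():
        allowed = TABLE_ACTIONS.get(table)
        if allowed is None:
            problems.append(f"unknown table: {table}")
            continue
        suffix = "_" + table
        for i, entry in enumerate(body.get("entries", [])):
            action = entry.get("action", "")
            # Assumed convention: "<p4 action>_<table name>".
            base = action[: -len(suffix)] if action.endswith(suffix) else None
            if base not in allowed:
                problems.append(f"{table}[{i}]: unexpected action {action!r}")
            if not entry.get("match"):
                problems.append(f"{table}[{i}]: missing match")
    return problems


sample = json.loads("""
{
  "tables": {
    "decap_v4_table": {
      "entries": [
        {"match": {"headers.vxlan.vni": "1"},
         "action": "decap_to_port_decap_v4_table",
         "params": {"port": "1"}},
        {"match": {"headers.vxlan.vni": "2"},
         "action": "deny_decap_v4_table"}
      ]
    }
  }
}
""")
print(check_entries(sample))
```

An empty list means every entry names a known table, a permitted action, and at least one match field. The check deliberately ignores parameter values.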
Pipeline
The following diagram shows a schematic of the gateway_shm.p4 program. The program defines the dynamic tables decap_v4_table and encap_v4_table with no predefined rules. The DPL Runtime Controller uses gateway.entries.json to insert custom rules with the desired match values and actions into these tables.
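The match values and action parameters in gateway.entries.json are strings, whereas the P4 actions take fixed-width values. As an illustration of that conversion step, the sketch below turns the parameters of a vxlan_v4_encap entry into integers. It is not the sample's parsing code; the function names are hypothetical, and the 48-bit and 32-bit widths for nv_mac_addr_t and nv_ipv4_addr_t are assumed from the usual MAC and IPv4 sizes (only the 24-bit VNI width is stated in gateway_shm.p4).

```python
import ipaddress


def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into a 48-bit integer."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 octets in MAC address, got {mac!r}")
    value = 0
    for part in parts:
        octet = int(part, 16)
        if not 0 <= octet <= 0xFF:
            raise ValueError(f"octet out of range in MAC address {mac!r}")
        value = (value << 8) | octet
    return value


def ipv4_to_int(addr):
    """Convert dotted-quad IPv4 text into a 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


def vni_to_int(vni):
    """Convert a decimal VNI string into an integer that fits in bit<24>."""
    value = int(vni)
    if not 0 <= value < (1 << 24):
        raise ValueError(f"VNI {vni!r} does not fit in 24 bits")
    return value


def convert_encap_params(params):
    """Convert the string parameters of a vxlan_v4_encap entry to integers."""
    return {
        "underlay_src_mac": mac_to_int(params["underlay_src_mac"]),
        "underlay_dst_mac": mac_to_int(params["underlay_dst_mac"]),
        "underlay_sip": ipv4_to_int(params["underlay_sip"]),
        "underlay_dip": ipv4_to_int(params["underlay_dip"]),
        "vni": vni_to_int(params["vni"]),
        "port": int(params["port"]),
    }


converted = convert_encap_params({
    "underlay_src_mac": "3C:6D:66:11:11:11",
    "underlay_dst_mac": "ff:ff:ff:ff:ff:ff",
    "underlay_sip": "6.6.6.3",
    "underlay_dip": "6.6.6.2",
    "vni": "1",
    "port": "0",
})
print({name: hex(value) for name, value in converted.items()})
```

Rejecting out-of-range values at this stage surfaces a malformed entries file before any rule is inserted.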
This application leverages the following DOCA libraries:
Refer to their respective programming guides for more information.
Please refer to the DOCA Installation Guide for Linux for details on how to install BlueField-related software.
The installation of DPL Runtime Controller's sample applications contains the sources of the applications, alongside the matching compilation instructions. This allows for compiling the applications "as-is" and provides the ability to modify the sources, then compile a new version of the application.
For more information about the applications and development tips, refer to the DOCA Reference Applications page.
The application sources are located in /opt/dpl_rt_controller/samples/gateway_shm/
Prerequisites
- The application relies on the dpl_rt_controller library.
- The DPL Development Container must be installed on the host. See the DPL Installation Guide for more details.
Compiling the Application
To build the Gateway application:
Compile the gateway_shm.p4 program:
    dplp4c.sh --target doca --odir /tmp/gateway_shm_out gateway_shm.p4
Info: Because this application relies on the Shared Memory (SHM) interface, it must run directly on the DPU. Therefore, the compiler's output must be copied into the DPU's file system (e.g., /tmp/gateway_shm_out) before compiling the application.
Compile the Gateway application:
    cd /opt/dpl_rt_controller/samples/gateway_shm
    meson /tmp/gateway_shm -Dsample_programs_out=/tmp/gateway_shm_out
    ninja -C /tmp/gateway_shm
The dpl_sample_gateway_shm application is created in /tmp/gateway_shm.
The application is provided in source form and must be compiled before execution. For details, refer to section "Compiling the Application".
Prerequisites
- The DPL Runtime Service must be ready to run on the BlueField (Arm side)
- The application relies on the json-c open-source library, which requires the following package to be installed:
    sudo apt install libjson-c-dev
Application Execution
- Start the DPL Runtime Service as detailed in the DPL Container Deployment.
Run the gateway application. Usage:
Application usage instructions
./dpl_sample_gateway_shm <device_id> <p4info path> <program path> <json_entries_path>
Note: The device ID (first argument) must match the ID of the DPL device as configured in /etc/dpl_rt_service/devices.d/.
Example:
sudo /tmp/gateway_shm/dpl_sample_gateway_shm 1000 /tmp/gateway_shm_out/gateway_shm.p4info.txt /tmp/gateway_shm_out/gateway_shm.dplconfig gateway.entries.json
gateway_main.cc:
Parses the received arguments and executes doca_error_t gateway(uint32_t device_id, const char *p4info_path, const char *blob_path, const char *json_entries_path), which is responsible for the main logic.
gateway_sample.cc:
Implements doca_error_t gateway() and runs the main logic:
Load the compiled program and p4info using gRPC:
    DPL_P4RT_Controller::Controller ctrl = DPL_P4RT_Controller::Controller(device_id, "localhost:9559");
    ctrl.LoadProgram(p4info_path, blob_path);
Connect to the DPL device:
dpl_rt_controller_connect(&connect_attr, &device);
Insert the entries from gateway.entries.json using the SHM interface:
    doca_error_t insert_entries_from_json(device, json_entries_path, entries) {
        ...
        // Process encap_v4_table/decap_v4_table entries
        for (int i = 0; i < entries_count; i++) {
            json_object *entry_obj = json_object_array_get_idx(entries_array, i);
            struct dpl_rt_controller_entry *dpl_entry = nullptr;
            dpl_shm_gateway_shm_gateway_over_encap_v4_table_entry_t *table_entry = nullptr;
            dpl_shm_gateway_shm_gateway_over_encap_v4_table_alloc_entry_mem(device, &dpl_entry, &table_entry);
            ...
            /* Parse match fields, actions and parameters to construct the table_entry */
            ...
            // Insert the entry using the SHM API
            dpl_shm_gateway_shm_insert_entry(dpl_entry, nullptr);
            ...
        }
    }
The program enters an interactive loop:
    while (!quit) {
        std::cout << "Press a key (e=entries, c=counters, q=quit): " << std::flush;
        ...
        // Display all entries on 'E'/'e' key press
        display_all_entries(entries);
        ...
        // Read counter data on 'C'/'c' key press
        read_entry_counters(device, entries);
        ...
    }
The system is now configured. Test packets can be sent (e.g., via scapy):
    sendp(Ether(src="00:11:11:11:11:11", dst="00:22:22:22:22:22") / IP(src="192.168.1.1", dst="192.168.1.2"), iface="eth2", count=50)
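To exercise decap_v4_table, a test packet must carry a VXLAN header whose VNI matches an installed entry (VNI 1 or 2 in gateway.entries.json). The sketch below builds and parses the 8-byte VXLAN header defined in RFC 7348 using only the Python standard library, so the byte layout can be checked independently of any packet tool. The helper names are hypothetical and are not part of the sample application.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned VXLAN destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < (1 << 24):
        raise ValueError(f"VNI {vni} does not fit in 24 bits")
    # Word 0: flags (8 bits) + reserved (24 bits).
    # Word 1: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def vni_from_header(header):
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VXLAN header does not have the VNI-valid flag set")
    return vni_word >> 8


header = vxlan_header(1)
print(header.hex(), vni_from_header(header))
```

The resulting bytes form the payload prefix of a UDP datagram sent to port 4789, followed by the inner Ethernet frame.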
On 'E'/'e' key press, all table entries are displayed:
    void display_all_entries(entries) {
        ...
        // Iterate over entries per table ("encap_v4_table"/"decap_v4_table")
        for (const auto& table_pair : entries) {
            const std::string& table_name = table_pair.first;
            const std::vector<struct dpl_rt_controller_entry*>& table_entries = table_pair.second;
            ...
            // Iterate over entry handles
            for (size_t i = 0; i < table_entries.size(); i++) {
                void *shm_entry = dpl_rt_controller_table_entry_get_shm(table_entries[i]);
                ...
                /* Extract & display entry data using the controller interface */
                ...
            }
        }
    }
On 'C'/'c' key press, counter data for each entry is read:
Info: The application displays counter values for each entry because gateway_shm.p4 uses direct counters. However, a counter is updated only by actions that call count(): deny, decap (and therefore decap_to_port), vxlan_v4_encap, and the to_port action of encap_v4_table. Entries in decap_v4_table that use to_port or NoAction do not show updated values.
On quit:
Cleanup entries:
    doca_error_t cleanup_entries(entries) {
        ...
        // Iterate over entries per table
        for (const auto& table_pair : entries) {
            ...
            // Iterate over entry handles
            for (size_t i = 0; i < table_entries.size(); i++) {
                // Delete entry using the controller interface
                dpl_rt_controller_table_entry_delete(table_entries[i]);
                ...
            }
        }
    }
Disconnect from device and detach:
    dpl_rt_controller_disconnect(device);
    dpl_rt_controller_detach();