DOCA Documentation v3.2.0

DOCA Pipeline Language Runtime Controller Gateway SHM Application Guide

This document describes the usage of the NVIDIA DOCA Pipeline Language (DPL) Runtime Controller Gateway SHM sample application.

This sample application leverages the capabilities of the DOCA Pipeline Language (DPL) Services combined with the DPL Runtime Controller SDK to implement a VXLAN gateway application. It provides encapsulation/decapsulation logic and interactive packet/counter management.

The following diagram illustrates the connections between the DPL Runtime Controller-based application and the DPL Runtime Service (DPL daemon). The DPL daemon is loaded with a DPL gateway_shm.p4 program that implements the gateway logic.

The controller controls the DPL daemon through gRPC or Shared Memory (SHM) interfaces to insert, query, and delete HW steering rules. Arriving traffic from the wire and the VFs is processed according to the gateway program and passed to the relevant port based on the loaded program.

[Diagram: connections between the DPL Runtime Controller-based application and the DPL Runtime Service (DPL daemon)]

The application is based on two major components that work together to define and manage the forwarding state:

DPL Program (gateway_shm.p4)

A DPL program loaded onto the DPL Runtime Service:

gateway_shm.p4


/*
 * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: LicenseRef-NvidiaProprietary
 *
 * NVIDIA CORPORATION, its affiliates and licensors retain all intellectual
 * property and proprietary rights in and to this material, related
 * documentation and any modifications thereto. Any use, reproduction,
 * disclosure or distribution of this material and related documentation
 * without an express license agreement from NVIDIA CORPORATION or
 * its affiliates is strictly prohibited.
 */

#include <doca_model.p4>
#include <doca_headers.p4>
#include <doca_externs.p4>
#include <doca_parser.p4>

/*
 * VxLAN tunnel gateway
 * This application allows the user to have a customized tunnel gateway that can stitch VxLAN
 * packets across tenant domains. End point traffic destined to local bare metal hosts can
 * be decapsulated and forwarded, while gateway traffic can be decapsulated, then
 * encapsulated back to the wire. This program can be easily extended to be a gateway across
 * different tunnel types as well.
 */

/*
 * Table sizes.
 */
const bit<32> DECAP_TABLE_SIZE = 32768;
const bit<32> ENCAP_TABLE_SIZE = 32768;

/* The directionality is based on network to host
 * The user will configure the DPL port IDs in the DPL RT configuration
 */
const bit<32> WIRE_PORT = 32w0;

struct headers_t {
    NV_FIXED_HEADERS
}

parser packet_parser(packet_in packet, out headers_t headers) {
    NV_FIXED_PARSER(packet, headers)
}

/**
 * This control performs the overlay policy including L2 encap with VxLAN
 */
control overlay_encap(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    NvDirectCounter(NvCounterType.PACKETS_AND_BYTES) encap_counter;

    action deny() {
        encap_counter.count();
        nv_drop();
    }

    action to_port(nv_logical_port_t port) {
        encap_counter.count();
        nv_send_to_port(port);
    }

    action vxlan_v4_encap(nv_mac_addr_t underlay_src_mac,
                          nv_mac_addr_t underlay_dst_mac,
                          nv_ipv4_addr_t underlay_sip,
                          nv_ipv4_addr_t underlay_dip,
                          bit<24> vni,
                          nv_logical_port_t port) {
        nv_set_vxlan_v4_underlay(headers, underlay_dst_mac, underlay_src_mac, underlay_sip, underlay_dip, vni);
        encap_counter.count();
        nv_send_to_port(port);
    }

    table encap_v4_table {
        key = {
            headers.ipv4.dst_addr : exact;
        }
        actions = {
            vxlan_v4_encap;
            to_port;
            deny;
        }
        size = ENCAP_TABLE_SIZE;
        default_action = deny;
        direct_counter = encap_counter;
        nv_high_update_rate = true;
    }

    apply {
        if (headers.ipv4.isValid()) {
            encap_v4_table.apply();
        }
    }
}

/**
 * This control is for packets from wire to host (RX)
 * and includes policy for L2 decap
 */
control underlay_decap(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    NvDirectCounter(NvCounterType.PACKETS_AND_BYTES) decap_counter;

    action deny() {
        decap_counter.count();
        nv_drop();
    }

    action decap() {
        decap_counter.count();
        nv_l2_decap(headers);
    }

    action to_port(nv_logical_port_t port) {
        nv_send_to_port(port);
    }

    action decap_to_port(nv_logical_port_t port) {
        decap();
        to_port(port);
    }

    table decap_v4_table {
        key = {
            headers.vxlan.vni : exact;
        }
        actions = {
            decap;
            to_port;
            decap_to_port;
            deny;
            NoAction;
        }
        size = DECAP_TABLE_SIZE;
        direct_counter = decap_counter;
        default_action = NoAction;
        nv_high_update_rate = true;
    }

    apply {
        if (headers.vxlan.isValid()) {
            decap_v4_table.apply();
        }
    }
}

control gateway(
    inout headers_t headers,
    in nv_standard_metadata_t std_meta,
    inout nv_empty_metadata_t user_meta,
    inout nv_empty_metadata_t pkt_out_meta
) {
    overlay_encap() over;
    underlay_decap() under;

    /* user should add entries that correspond to the wire ports
     * A hit means this is an RX packet, miss means a TX packet
     */
    table direction_table {
        key = {
            std_meta.ingress_port : exact;
        }
        actions = {
            NoAction;
        }
        default_action = NoAction;
        const entries = {
            (WIRE_PORT) : NoAction();
        }
    }

    apply {
        if (direction_table.apply().hit) {
            under.apply(headers, std_meta, user_meta, pkt_out_meta);
        } else {
            over.apply(headers, std_meta, user_meta, pkt_out_meta);
        }
    }
}

NvDocaPipeline(
    packet_parser(),
    gateway()
) main;

This P4 application implements a basic VXLAN termination and origination function for IPv4 traffic. Its primary goal is to differentiate between incoming packets from the underlay network (Rx/Decapsulation) and packets originating from a host (Tx/Encapsulation), applying the necessary L2 overlay policies in each direction.

The program logic is separated into three distinct control blocks: gateway, underlay_decap, and overlay_encap.

  • gateway: Responsible for directing packets into the relevant control block (underlay_decap or overlay_encap) by matching on the ingress port.

  • underlay_decap: Responsible for L2 decapsulation of packets from wire to host (Rx).

  • overlay_encap: Responsible for the overlay policy, including L2 VXLAN encapsulation of packets from host to wire (Tx).
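The dispatch performed by these three control blocks can be modeled in a few lines of Python. This is only an illustrative sketch, not DPL code: the Packet class, the dictionary-based tables, and the returned strings are hypothetical stand-ins, and the table contents are example values chosen for the illustration.

```python
from dataclasses import dataclass
from typing import Optional

WIRE_PORT = 0  # same value as WIRE_PORT in gateway_shm.p4


@dataclass
class Packet:
    """Hypothetical, simplified view of a parsed packet."""
    ingress_port: int
    ipv4_dst: Optional[str] = None   # None models an invalid IPv4 header
    vxlan_vni: Optional[int] = None  # None models an invalid VXLAN header


# Hypothetical table contents: match key -> (action name, action parameters)
decap_v4_table = {1: ("decap_to_port", {"port": 1}), 2: ("deny", {})}
encap_v4_table = {"6.6.6.4": ("vxlan_v4_encap", {"vni": 1, "port": 0})}


def gateway(pkt: Packet) -> str:
    """Return a description of what the modeled pipeline does with pkt."""
    if pkt.ingress_port == WIRE_PORT:
        # direction_table hit: Rx path, handled by underlay_decap
        if pkt.vxlan_vni is None:
            return "rx: no VXLAN header, decap_v4_table not applied"
        # A miss falls back to the table's default action (NoAction)
        action, params = decap_v4_table.get(pkt.vxlan_vni, ("NoAction", {}))
        return f"rx: {action} {params}"
    # direction_table miss: Tx path, handled by overlay_encap
    if pkt.ipv4_dst is None:
        return "tx: no IPv4 header, encap_v4_table not applied"
    # A miss falls back to the table's default action (deny)
    action, params = encap_v4_table.get(pkt.ipv4_dst, ("deny", {}))
    return f"tx: {action} {params}"


print(gateway(Packet(ingress_port=0, vxlan_vni=1)))
print(gateway(Packet(ingress_port=1, ipv4_dst="6.6.6.4")))
print(gateway(Packet(ingress_port=1, ipv4_dst="10.0.0.1")))
```

The model makes the asymmetry between the two default actions visible: an unmatched VNI on the Rx path results in NoAction, while an unmatched destination address on the Tx path is denied.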

P4 Tables

| Table Name | Control Block | Match Field | Actions | Purpose |
|---|---|---|---|---|
| direction_table | gateway | ingress_port | NoAction (default) | Determines the processing direction (Rx or Tx) |
| decap_v4_table (1) | underlay_decap | VxLAN.vni | decap, to_port, decap_to_port, deny, NoAction (default) | Core policy table for decapsulation; identifies the tenant context |
| encap_v4_table (1) | overlay_encap | IPv4.dst_addr | vxlan_v4_encap, to_port, deny (default) | Core policy table for encapsulation; determines the tunnel endpoint and VNI |

(1) High update rate table

Direct Counters

| Counter Name | Tied to Table | Actions Tracked | Function |
|---|---|---|---|
| decap_counter | decap_v4_table | decap, decap_to_port (through decap), deny | Counts decapsulated packets and denied packets |
| encap_counter | encap_v4_table | vxlan_v4_encap, to_port, deny | Counts encapsulated packets, packets forwarded by to_port, and denied packets |
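A direct counter keeps one packets/bytes pair per table entry, and in the P4 listing it advances only when the action executed for that entry calls count(). The Python sketch below models this behavior for the underlay_decap actions. The DirectCounter class and the Python functions are hypothetical stand-ins written for illustration, not part of the DPL SDK.

```python
from collections import defaultdict


class DirectCounter:
    """Hypothetical model of a PACKETS_AND_BYTES direct counter."""

    def __init__(self):
        # One [packets, bytes] pair per table entry key
        self.values = defaultdict(lambda: [0, 0])

    def count(self, entry_key, pkt_len):
        self.values[entry_key][0] += 1
        self.values[entry_key][1] += pkt_len


decap_counter = DirectCounter()


def decap(entry_key, pkt_len):
    # Mirrors decap(): counts the packet (decapsulation itself is not modeled)
    decap_counter.count(entry_key, pkt_len)


def to_port(entry_key, pkt_len):
    # Mirrors to_port() in underlay_decap: forwards without counting
    pass


def decap_to_port(entry_key, pkt_len):
    # Mirrors decap_to_port(): calls decap(), then to_port()
    decap(entry_key, pkt_len)
    to_port(entry_key, pkt_len)


# Three 100-byte packets hit an entry keyed by VNI 1 that uses decap_to_port
for _ in range(3):
    decap_to_port(1, 100)

# One packet hits an entry keyed by VNI 5 that uses to_port only
to_port(5, 100)

print(decap_counter.values[1])  # [3, 300]
print(decap_counter.values[5])  # [0, 0]
```

In this model, an entry whose action never calls count() still has a counter, but its values stay at zero.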

Control Application (gateway.entries.json)

A control application manages the daemon's HW steering rules by receiving a JSON file that describes the desired rules.

gateway.entries.json


{
    "doctype": "gateway_shm.p4",
    "tables": {
        "encap_v4_table": {
            "entries": [
                {
                    "match": {
                        "headers.ipv4.dst_addr": "6.6.6.4"
                    },
                    "action": "vxlan_v4_encap_encap_v4_table",
                    "params": {
                        "underlay_src_mac": "3C:6D:66:11:11:11",
                        "underlay_dst_mac": "ff:ff:ff:ff:ff:ff",
                        "underlay_sip": "6.6.6.3",
                        "underlay_dip": "6.6.6.2",
                        "vni": "1",
                        "port": "0"
                    }
                },
                {
                    "match": {
                        "headers.ipv4.dst_addr": "6.6.6.5"
                    },
                    "action": "vxlan_v4_encap_encap_v4_table",
                    "params": {
                        "underlay_src_mac": "3C:6D:66:11:11:11",
                        "underlay_dst_mac": "ff:ff:ff:ff:ff:ff",
                        "underlay_sip": "6.6.6.3",
                        "underlay_dip": "6.6.6.2",
                        "vni": "1",
                        "port": "0"
                    }
                }
            ]
        },
        "decap_v4_table": {
            "entries": [
                {
                    "match": {
                        "headers.vxlan.vni": "1"
                    },
                    "action": "decap_to_port_decap_v4_table",
                    "params": {
                        "port": "1"
                    }
                },
                {
                    "match": {
                        "headers.vxlan.vni": "2"
                    },
                    "action": "deny_decap_v4_table"
                }
            ]
        }
    }
}
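Before handing a rules file to the controller, it can be useful to sanity-check it. The following Python sketch is not part of the sample application. It embeds a shortened copy of the file above and checks only conventions visible in that file: each action name ends with the name of its table, MAC and IPv4 parameters parse, and the VNI fits in 24 bits. The check_entries helper and its messages are hypothetical.

```python
import ipaddress
import json
import re

# Shortened copy of gateway.entries.json, embedded to keep the sketch self-contained
SAMPLE = """
{
  "doctype": "gateway_shm.p4",
  "tables": {
    "encap_v4_table": {
      "entries": [
        {
          "match": { "headers.ipv4.dst_addr": "6.6.6.4" },
          "action": "vxlan_v4_encap_encap_v4_table",
          "params": {
            "underlay_src_mac": "3C:6D:66:11:11:11",
            "underlay_dst_mac": "ff:ff:ff:ff:ff:ff",
            "underlay_sip": "6.6.6.3",
            "underlay_dip": "6.6.6.2",
            "vni": "1",
            "port": "0"
          }
        }
      ]
    },
    "decap_v4_table": {
      "entries": [
        { "match": { "headers.vxlan.vni": "2" }, "action": "deny_decap_v4_table" }
      ]
    }
  }
}
"""

MAC_RE = re.compile(r"^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def check_entries(doc):
    """Return a list of human-readable problems found in the rules document."""
    problems = []
    for table, body in doc["tables"].items():
        for idx, entry in enumerate(body["entries"]):
            where = f"{table}[{idx}]"
            # Convention observed in the sample file: "<action>_<table>"
            if not entry["action"].endswith("_" + table):
                problems.append(f"{where}: action does not end with table name")
            for name, value in entry.get("params", {}).items():
                if name.endswith("_mac") and not MAC_RE.match(value):
                    problems.append(f"{where}: bad MAC in {name}")
                if name in ("underlay_sip", "underlay_dip"):
                    try:
                        ipaddress.IPv4Address(value)
                    except ValueError:
                        problems.append(f"{where}: bad IPv4 address in {name}")
                if name == "vni" and not 0 <= int(value) < 2**24:
                    problems.append(f"{where}: VNI does not fit in 24 bits")
    return problems


problems = check_entries(json.loads(SAMPLE))
print(problems or "no problems found")
```

The checks are deliberately limited to what the sample file shows. The format accepted by the application may allow more than this sketch assumes.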


Pipeline

The following diagram shows a schematic of the gateway_shm.p4 program. The program defines dynamic decap_v4_table and encap_v4_table tables with no pre-defined rules. The DPL Runtime Controller uses gateway.entries.json to insert custom rules with the desired match values and actions into these tables.

[Diagram: schematic of the gateway_shm.p4 pipeline]

This application leverages the following DOCA libraries:

Refer to their respective programming guides for more information.

Info

Please refer to the DOCA Installation Guide for Linux for details on how to install BlueField-related software.

The installation of DPL Runtime Controller's sample applications contains the sources of the applications, alongside the matching compilation instructions. This allows for compiling the applications "as-is" and provides the ability to modify the sources, then compile a new version of the application.

Tip

For more information about the applications and development tips, refer to the DOCA Reference Applications page.

The application sources are located in /opt/dpl_rt_controller/samples/gateway_shm/

Prerequisites

Compiling the Application

To build the Gateway application:

  1. Compile the gateway_shm.p4 program:


    dplp4c.sh --target doca --odir /tmp/gateway_shm_out gateway_shm.p4

    Info

    Because this application relies on the Shared Memory (SHM) interface, it must run directly on the DPU. Therefore, the compiler's output must be copied to the DPU's file system (e.g., /tmp/gateway_shm_out) before compiling the application.

  2. Compile the Gateway application:


    cd /opt/dpl_rt_controller/samples/gateway_shm
    meson /tmp/gateway_shm -Dsample_programs_out=/tmp/gateway_shm_out
    ninja -C /tmp/gateway_shm

Info

The dpl_sample_gateway_shm application is created in /tmp/gateway_shm.


Note

The application is provided in source form and requires compilation before execution. For details, refer to section "Compiling the Application".

Prerequisites

  • The DPL Runtime Service must be ready to run on the BlueField (Arm side)
  • The application relies on the open-source json-c library, which requires the following package to be installed:

    sudo apt install libjson-c-dev

Application Execution

  1. Start the DPL Runtime Service as detailed in the DPL Container Deployment.
  2. Run the gateway application. Usage:

    Application usage instructions


    ./dpl_sample_gateway_shm <device_id> <p4info path> <program path> <json_entries_path>

    Note

    The device ID (first argument) must match the ID of the DPL device as configured in /etc/dpl_rt_service/devices.d/.

    Example:


    sudo /tmp/gateway_shm/dpl_sample_gateway_shm 1000 /tmp/gateway_shm_out/gateway_shm.p4info.txt /tmp/gateway_shm_out/gateway_shm.dplconfig gateway.entries.json
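When the compiler output directory is known, the two program file arguments can be derived from it. The helper below is a hypothetical convenience, not part of the sample. It assumes the output file names shown in the example above (gateway_shm.p4info.txt and gateway_shm.dplconfig) and only assembles the command line without running it.

```python
import shlex
from pathlib import PurePosixPath


def gateway_command(device_id, out_dir, entries_json,
                    binary="/tmp/gateway_shm/dpl_sample_gateway_shm"):
    """Build the argument list for the sample; nothing is executed here."""
    out = PurePosixPath(out_dir)
    return [
        "sudo",
        binary,
        str(device_id),                        # <device_id>
        str(out / "gateway_shm.p4info.txt"),   # <p4info path>
        str(out / "gateway_shm.dplconfig"),    # <program path>
        entries_json,                          # <json_entries_path>
    ]


cmd = gateway_command(1000, "/tmp/gateway_shm_out", "gateway.entries.json")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run() as is, which avoids shell quoting issues if any of the paths contain spaces.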

  1. gateway_main.cc:

    Parses the received arguments and executes doca_error_t gateway(uint32_t device_id, const char *p4info_path, const char *blob_path, const char *json_entries_path), responsible for the main logic.

  2. gateway_sample.cc:

    Implements doca_error_t gateway() and runs the main logic:

    1. Load the compiled program and p4info using gRPC:


      DPL_P4RT_Controller::Controller ctrl = DPL_P4RT_Controller::Controller(device_id, "localhost:9559");
      ctrl.LoadProgram(p4info_path, blob_path);

    2. Connect to the DPL device:


      dpl_rt_controller_connect(&connect_attr, &device);

    3. Insert the entries from gateway.entries.json using the SHM interface:


      doca_error_t insert_entries_from_json(device, json_entries_path, entries) {
          ...
          // Process encap_v4_table/decap_v4_table entries
          for (int i = 0; i < entries_count; i++) {
              json_object *entry_obj = json_object_array_get_idx(entries_array, i);
              struct dpl_rt_controller_entry *dpl_entry = nullptr;
              dpl_shm_gateway_shm_gateway_over_encap_v4_table_entry_t *table_entry = nullptr;
              dpl_shm_gateway_shm_gateway_over_encap_v4_table_alloc_entry_mem(device, &dpl_entry, &table_entry);
              ...
              /* Parse match fields, actions and parameters to construct the table_entry */
              ...
              // Insert the entry using the SHM API
              dpl_shm_gateway_shm_insert_entry(dpl_entry, nullptr);
              ...
          }
      }

    4. The program enters an interactive loop:


      while (!quit) {
          std::cout << "Press a key (e=entries, c=counters, q=quit): " << std::flush;
          ...
          // Display all entries on 'E'/'e' key press
          display_all_entries(entries);
          ...
          // Read counter data on 'C'/'c' key press
          read_entry_counters(device, entries);
          ...
      }

    5. The system is now configured. Test packets can be sent (e.g., via scapy):


      sendp(Ether(src="00:11:11:11:11:11", dst="00:22:22:22:22:22") / IP(src="192.168.1.1", dst="192.168.1.2"), iface="eth2", count=50)

    6. On 'E'/'e' key press, all table entries are displayed:


      void display_all_entries(entries) {
          ...
          // Iterate over entries per table ("encap_v4_table"/"decap_v4_table")
          for (const auto& table_pair : entries) {
              const std::string& table_name = table_pair.first;
              const std::vector<struct dpl_rt_controller_entry*>& table_entries = table_pair.second;
              ...
              // Iterate over entry handles
              for (size_t i = 0; i < table_entries.size(); i++) {
                  void *shm_entry = dpl_rt_controller_table_entry_get_shm(table_entries[i]);
                  ...
                  /* Extract & display entry data using the controller interface */
                  ...
              }
          }
      }

    7. On 'C'/'c' key press, counter data for each entry is read:


      Info

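      The application displays counter values for each entry because gateway_shm.p4 uses direct counters. However, a counter is updated only when the executed action calls count(). In the gateway_shm.p4 listing, these actions are decap (and decap_to_port, which calls decap) and deny in underlay_decap, and vxlan_v4_encap, to_port, and deny in overlay_encap. Entries that use other actions, such as to_port in decap_v4_table, do not show updated values.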

    8. On quit:

      1. Cleanup entries:


        doca_error_t cleanup_entries(entries) {
            ...
            // Iterate over entries per table
            for (const auto& table_pair : entries) {
                ...
                // Iterate over entry handles
                for (size_t i = 0; i < table_entries.size(); i++) {
                    // Delete entry using the controller interface
                    dpl_rt_controller_table_entry_delete(table_entries[i]);
                    ...
                }
            }
        }

      2. Disconnect from device and detach:


        dpl_rt_controller_disconnect(device);
        dpl_rt_controller_detach();
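Regarding the test traffic sent in step 5: to exercise decap_v4_table, a packet arriving from the wire must carry a VXLAN header whose VNI matches an inserted entry. As a rough illustration, the 8-byte VXLAN header defined in RFC 7348 can be packed and parsed with the Python standard library alone. The vxlan_header and parse_vni helpers are hypothetical names, and the sketch covers only the VXLAN header itself, not the outer Ethernet/IP/UDP headers.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid (RFC 7348)


def vxlan_header(vni: int) -> bytes:
    """Pack an 8-byte VXLAN header.

    Layout: 8 flag bits, 24 reserved bits, 24-bit VNI, 8 reserved bits.
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    _, word = struct.unpack("!II", header[:8])
    return word >> 8


hdr = vxlan_header(1)
print(hdr.hex())       # 0800000000000100
print(parse_vni(hdr))  # 1
```

The 24-bit limit enforced here corresponds to the bit<24> vni parameter of vxlan_v4_encap in gateway_shm.p4.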

© Copyright 2025, NVIDIA. Last updated on Nov 20, 2025