NVIDIA DOCA Simple Forward VNF Application Guide

This guide provides a Simple Forward implementation on top of NVIDIA® BlueField® DPU.

Simple forward is a forwarding application that leverages the DOCA Flow API to take VXLAN, GRE, or GTP traffic from a single RX port and transmit it on a single TX port.

For every packet received on an RX queue on a given port, DOCA Simple Forward checks the packet's key, which consists of a 5-tuple. If it finds that the packet matches an existing flow, then it does not create a new one. Otherwise, a new flow is created with a FORWARDING component. Finally, the packet is forwarded to the TX queue of the egress port if the "rx-only" mode is not set.

The FORWARDING component type depends on the flags delivered when running the application. For example, if the hairpinq flag is provided, then the FORWARDING component would be hairpin. Otherwise, it would be RSS'd to software, and hence every VXLAN, GTP, or GRE packet would be received on RX queues.
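
As a minimal sketch of the selection described above, the function below maps the hairpinq flag to a forwarding type. The names fwd_type, app_flags, and choose_fwd_type are hypothetical; they are not taken from the DOCA Flow API or from the application sources, which express forwarding through DOCA Flow structures.

```c
#include <stdbool.h>

/* Hypothetical forwarding kinds used only for this illustration. */
enum fwd_type {
    FWD_RSS,     /* deliver matching packets to the software RX queues */
    FWD_HAIRPIN  /* forward matching packets through hairpin queues */
};

/* Hypothetical subset of the program flags. */
struct app_flags {
    bool hairpinq; /* set by -hq / --hairpinq */
    bool rx_only;  /* set by -r / --rx-only */
};

/* Pick the FORWARDING component used for newly created flows. */
enum fwd_type choose_fwd_type(const struct app_flags *flags)
{
    return flags->hairpinq ? FWD_HAIRPIN : FWD_RSS;
}
```

With hairpinq unset, the sketch falls back to RSS, which matches the default behavior described in the text.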

Simple forward should be run with dual ports. When a traffic generator is used, the RX port receives the VXLAN, GRE, or GTP packets and the application forwards them back to the traffic generator.

The following diagram illustrates simple forward's packet flows. It receives traffic coming from the wire and passes it to the other port.

[Figure: system design diagram]

Simple forward first initializes DPDK, after which the application handles the incoming packets.

The following diagram illustrates the initialization process.

[Figure: initialization process illustration]

  1. Init_DPDK – EAL init, parse arguments from the command line, and register the signal handler.

  2. Start port – mbuf_create, dev_configure, rx/tx/hairpin queue setup and start the port.

  3. Simple_fwd INIT – create flow tables, build default forward pipes.

The following diagram illustrates how to process the packet.

[Figure: packet processing illustration]

  1. Based on the packet's info, find the key values (e.g. src/dst IP, src/dst port, etc).

  2. Traverse the inner flow tables, check if the keys exist or not.

    • If yes, update inner counter

    • If no, a new flow entry is added to the DPU

  3. Forward the packet to the other port.
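
The lookup in steps 1 and 2 can be sketched as a small hash table keyed by the 5-tuple. This is an illustration under stated assumptions, not the application's actual flow table: the struct layout, the table capacity, the FNV-1a hash, and the function name flow_lookup_or_add are all hypothetical, and the point where a rule would be offloaded to the DPU is only marked by a comment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLOW_TABLE_SIZE 1024u /* hypothetical capacity; must be a power of two */

/* 5-tuple used as the flow key (field names are illustrative). */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  protocol;
};

struct flow_entry {
    struct flow_key key;
    uint64_t packets; /* per-flow packet counter */
    bool in_use;
};

struct flow_table {
    struct flow_entry slots[FLOW_TABLE_SIZE];
};

/* Fold the four bytes of value into an FNV-1a hash state. */
static uint32_t fnv1a_u32(uint32_t h, uint32_t value)
{
    for (int i = 0; i < 4; i++) {
        h ^= (value >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

static uint32_t flow_key_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u; /* FNV-1a offset basis */
    h = fnv1a_u32(h, k->src_ip);
    h = fnv1a_u32(h, k->dst_ip);
    h = fnv1a_u32(h, ((uint32_t)k->src_port << 16) | k->dst_port);
    h = fnv1a_u32(h, k->protocol);
    return h;
}

/* Compare field by field so struct padding never affects the result. */
static bool flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->protocol == b->protocol;
}

/* Find the flow for key, creating it on a miss (linear probing).
 * Sets *is_new accordingly; returns NULL only when the table is full. */
struct flow_entry *flow_lookup_or_add(struct flow_table *t,
                                      const struct flow_key *key,
                                      bool *is_new)
{
    uint32_t start = flow_key_hash(key) & (FLOW_TABLE_SIZE - 1u);

    *is_new = false;
    for (uint32_t i = 0; i < FLOW_TABLE_SIZE; i++) {
        struct flow_entry *e = &t->slots[(start + i) & (FLOW_TABLE_SIZE - 1u)];

        if (!e->in_use) {
            /* Miss: the application would also offload a flow here. */
            e->key = *key;
            e->packets = 1; /* counting the first packet is a choice of this sketch */
            e->in_use = true;
            *is_new = true;
            return e;
        }
        if (flow_key_equal(&e->key, key)) {
            e->packets++; /* hit: update the counter only */
            return e;
        }
    }
    return NULL;
}
```

Because the sketch never deletes entries, reaching an empty slot while probing is enough to conclude that the key is absent. A table that supports aging would need tombstones or a different collision scheme.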

This application leverages the following DOCA library:

  • DOCA Flow

Refer to its respective programming guide for more information.

Installation

Refer to the NVIDIA DOCA Installation Guide for Linux for details on how to install BlueField-related software.

Prerequisites

  1. A FLEX profile number should be manually set to 3 on the system for the application to build the GTP, standard VXLAN, and GRE pipes.

    1. Set FLEX profile number to 3 from the DPU.

      sudo mlxconfig -d <pcie_address> s FLEX_PARSER_PROFILE_ENABLE=3

    2. Reset the firmware from the host side by performing graceful shutdown and power cycling.

      ipmitool power cycle

      Note

      Resetting the firmware can be done from the DPU as well. For more information, refer to step 3.b of the "Upgrading Firmware" section of the NVIDIA DOCA Installation Guide for Linux.

  2. The Simple Forward application is based on DOCA Flow. Therefore, the user is required to allocate huge pages.

    echo '2048' | sudo tee -a /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

    On some operating systems (RockyLinux, OpenEuler, CentOS 8.2) the default huge page size on the DPU (and Arm hosts) is larger than 2MB, and is often 512MB instead. One can find out the size of the huge pages using the following command:

    $ grep -i huge /proc/meminfo

    AnonHugePages: 0 kB
    ShmemHugePages: 0 kB
    FileHugePages: 0 kB
    HugePages_Total: 4
    HugePages_Free: 4
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    Hugepagesize: 524288 kB
    Hugetlb: 6291456 kB

    Given that the guiding principle is to allocate 4GB of RAM, in such cases instead of allocating 2048 pages, one should allocate the matching amount (8 pages):

    echo '8' | sudo tee -a /sys/kernel/mm/hugepages/hugepages-524288kB/nr_hugepages
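
The arithmetic behind the two allocation commands above can be sketched as follows: read the Hugepagesize value and divide the 4GB target by it. The function names hugepage_size_kb and hugepages_needed are hypothetical helpers written for this illustration, and the sketch only computes the page count; it does not write to nr_hugepages.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Find the "Hugepagesize:" line in /proc/meminfo-style text.
 * Returns the page size in kB, or 0 if the line is missing. */
uint64_t hugepage_size_kb(const char *meminfo)
{
    const char *line = strstr(meminfo, "Hugepagesize:");
    unsigned long long kb = 0;

    if (line == NULL || sscanf(line, "Hugepagesize: %llu kB", &kb) != 1)
        return 0;
    return (uint64_t)kb;
}

/* Number of huge pages needed to cover target_kb, rounded up.
 * Returns 0 when the page size is unknown. */
uint64_t hugepages_needed(uint64_t target_kb, uint64_t page_kb)
{
    if (page_kb == 0)
        return 0;
    return (target_kb + page_kb - 1) / page_kb;
}
```

For a 4GB target (4194304 kB), a 2048 kB page size gives 2048 pages and a 524288 kB page size gives 8 pages, which matches the values used in the commands above.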

Application Execution

The simple forward application is provided in both source and binary forms. The binary is located under /opt/mellanox/doca/applications/simple_fwd_vnf/bin/doca_simple_fwd_vnf.

  1. Application usage instructions:

    Usage: doca_simple_forward_vnf [DPDK Flags] -- [DOCA Flags] [Program Flags]

    DOCA Flags:
      -h, --help                  Print a help synopsis
      -v, --version               Print program version information
      -l, --log-level             Set the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      --sdk-log-level             Set the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      -j, --json <path>           Parse all command flags from an input json file

    Program Flags:
      -t, --stats-timer <time>    Set interval to dump stats information
      -q, --nr-queues <num>       Set queues number
      -r, --rx-only               Set rx only
      -o, --hw-offload            Set PCI address of the RXP engine to use
      -hq, --hairpinq             Set forwarding to hairpin queue
      -a, --age-thread            Start thread do aging

    Note

    This usage printout can be printed to the command line using the -h (or --help) options:

    /opt/mellanox/doca/applications/simple_fwd_vnf/bin/doca_simple_fwd_vnf -- -h

    Note

    For additional information, refer to section "Command Line Flags".

  2. CLI example for running the application on the BlueField:

    /opt/mellanox/doca/applications/simple_fwd_vnf/bin/doca_simple_fwd_vnf -a auxiliary:mlx5_core.sf.4 -a auxiliary:mlx5_core.sf.5 -- -l 60

    Warning

    SFs must be enabled according to the NVIDIA BlueField DPU Scalable Function User Guide.

    Before creating SFs on a specific physical port, it is important to verify the encap mode on the respective PF FDB. The default mode is basic. To check the encap mode, run:

    cat /sys/class/net/p0/compat/devlink/encap

    If the encap mode is basic, disable encap on the PF FDB before creating the SFs by running:

    /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/0000:03:00.0 mode legacy
    /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/0000:03:00.1 mode legacy
    echo none > /sys/class/net/p0/compat/devlink/encap
    echo none > /sys/class/net/p1/compat/devlink/encap
    /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/0000:03:00.0 mode switchdev
    /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/0000:03:00.1 mode switchdev

    If the encap mode is set to basic then the application fails upon initialization.

    Warning

    The flags -a auxiliary:mlx5_core.sf.4 -a auxiliary:mlx5_core.sf.5 are mandatory for proper usage of the application.

    1. Modifying these flags results in unexpected behavior, as only 2 ports are supported.

    2. The SF number is arbitrary and configurable.

    Warning

    The SF numbers must match the desired SF devices.

  3. CLI example for running the application on the host:

    /opt/mellanox/doca/applications/simple_fwd_vnf/bin/doca_simple_fwd_vnf -a 04:00.3 -a 04:00.4 -- -l 60

    Warning

    The device identifiers must match the desired network devices.

    Note

    For more information, refer to section "Running DOCA Application on Host" in NVIDIA DOCA Virtual Functions User Guide.

  4. The application also supports a JSON-based deployment mode, in which all command-line arguments are provided through a JSON file:

    doca_simple_fwd_vnf --json [json_file]

    For example:

    cd /opt/mellanox/doca/applications/simple_fwd_vnf/bin
    ./doca_simple_fwd_vnf --json ./simple_fwd_params.json

    Warning

    Before execution, ensure that the used JSON file contains the correct configuration parameters, and especially the PCIe addresses necessary for the deployment.

Command Line Flags

Flag Type | Short Flag | Long Flag/JSON Key | Description | JSON Content
DPDK Flags | a | devices | Add a PCIe device into the list of devices to probe. | "devices": [ {"device": "sf", "id": "4", "sft": true}, {"device": "sf", "id": "5", "sft": true} ]
General flags | h | help | Prints a help synopsis | N/A
General flags | v | version | Prints program version information | N/A
General flags | l | log-level | Set the log level for the application: DISABLE=10, CRITICAL=20, ERROR=30, WARNING=40, INFO=50, DEBUG=60, TRACE=70 (requires compilation with TRACE log level support) | "log-level": 60
General flags | N/A | sdk-log-level | Set the SDK log level for the program: DISABLE=10, CRITICAL=20, ERROR=30, WARNING=40, INFO=50, DEBUG=60, TRACE=70 | "sdk-log-level": 40
General flags | j | json | Parse all command flags from an input JSON file | N/A
Program flags | t | stats-timer | Set interval to dump stats information. | "stats-timer": 2
Program flags | q | nr-queues | Set queues number. | "nr-queues": 4
Program flags | r | rx-only | Set RX only. When set, the packets will not be sent to the TX queues. | "rx-only": false
Program flags | o | hw-offload | Set HW offload of the RXP engine to use. | "hw-offload": false
Program flags | hq | hairpinq | Set forwarding to hairpin queue. | "hairpinq": false
Program flags | a | age-thread | Start a dedicated thread that handles the aged flows. | "age-thread": false

Note

Refer to DOCA Arg Parser for more information regarding the supported flags and execution modes.
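
The numeric log levels accepted by log-level and sdk-log-level can be captured in a small lookup, shown here as a sketch. The function names log_level_name and log_level_is_valid are hypothetical and are not part of DOCA Arg Parser; only the numeric values and level names come from the flag descriptions above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Map a numeric log level to its documented name.
 * Returns NULL for values that are not documented levels. */
const char *log_level_name(int level)
{
    switch (level) {
    case 10: return "DISABLE";
    case 20: return "CRITICAL";
    case 30: return "ERROR";
    case 40: return "WARNING";
    case 50: return "INFO";
    case 60: return "DEBUG";
    case 70: return "TRACE";
    default: return NULL;
    }
}

/* True when level is one of the documented numeric values. */
bool log_level_is_valid(int level)
{
    return log_level_name(level) != NULL;
}
```

A check of this kind could be used to validate a "log-level" value before writing it to the JSON file, for example rejecting 65 while accepting 60.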


Troubleshooting

Refer to the NVIDIA DOCA Troubleshooting Guide for any issue encountered with the installation or execution of the DOCA applications.

In addition to providing the application in binary form, the installation also includes all of the application sources and compilation instructions so as to allow modifying the sources and recompiling the application. For more information about the applications, as well as development and compilation tips, refer to the DOCA Applications page.

The sources of the application can be found under the /opt/mellanox/doca/applications/simple_fwd_vnf/src directory.

Recompiling All Applications

The applications are all defined under a single meson project, so the default compilation recompiles all the DOCA applications.

To build all the applications together, run:

cd /opt/mellanox/doca/applications/
meson /tmp/build
ninja -C /tmp/build

Note

doca_simple_fwd_vnf is created under /tmp/build/simple_fwd_vnf/src/.


Recompiling Simple Forward Application Only

To directly build only the simple forward application:

cd /opt/mellanox/doca/applications/
meson /tmp/build -Denable_all_applications=false -Denable_simple_fwd_vnf=true
ninja -C /tmp/build

Note

doca_simple_fwd_vnf is created under /tmp/build/simple_fwd_vnf/src/.

Alternatively, users can set the desired flags in the meson_options.txt file instead of providing them in the compilation command line:

  1. Edit the following flags in /opt/mellanox/doca/applications/meson_options.txt:

    • Set enable_all_applications to false

    • Set enable_simple_fwd_vnf to true

  2. Run the following compilation commands:

    cd /opt/mellanox/doca/applications/
    meson /tmp/build
    ninja -C /tmp/build

    Note

    doca_simple_fwd_vnf is created under /tmp/build/simple_fwd_vnf/src/.

Troubleshooting

Refer to the NVIDIA DOCA Troubleshooting Guide for any issue encountered with the compilation of the application.

  1. Parse application arguments.

    1. Initialize arg parser resources and register DOCA general parameters.

      doca_argp_init();

    2. Register application parameters.

      register_simple_fwd_params();

    3. Parse the arguments.

      doca_argp_start();

      1. Parse DPDK flags and invoke handler for calling the rte_eal_init() function.

      2. Parse app parameters.

  2. DPDK initialization.

    dpdk_init();

    Calls rte_eal_init() to initialize EAL resources with the provided EAL flags.

  3. DPDK port initialization and start.

    dpdk_queues_and_ports_init();

    1. Initialize DPDK ports.

    2. Create mbuf pool using rte_pktmbuf_pool_create.

    3. Driver initialization – use rte_eth_dev_configure to configure the number of queues.

    4. Rx/Tx queue initialization – use rte_eth_rx_queue_setup and rte_eth_tx_queue_setup to initialize the queues.

    5. Rx hairpin queue initialization – use rte_eth_rx_hairpin_queue_setup to initialize the queues.

    6. Start the port using rte_eth_dev_start.

  4. Simple forward initialization.

    simple_fwd_init();

    1. simple_fwd_create_ins – create flow tables using simple_fwd_ft_create.

    2. simple_fwd_init_ports_and_pipes – initialize DOCA port using simple_fwd_init_doca_port and build default pipes for each port.

  5. Main loop.

    simple_fwd_process_pkts();

    1. Receive packets using rte_eth_rx_burst in a loop.

    2. Process packets using simple_fwd_process_offload.

    3. Transmit the packets on the other port by calling rte_eth_tx_burst, or free the packet mbufs if rx_only is set to true.

  6. Process packets.

    simple_fwd_process_offload();

    1. Parse the packet's rte_mbuf using simple_fwd_pkt_info.

    2. Handle the packet using simple_fwd_handle_packet. If the packet's key does not match an existing flow entry, create a new flow entry and pipe using simple_fwd_handle_new_flow. Otherwise, increment the total packet counter.

  7. Simple forward destroy.

    simple_fwd_destroy();

    Closes the ports and cleans up the flow resources.

  8. DPDK ports and queues destruction.

    dpdk_queues_and_ports_fini();

  9. DPDK finish.

    dpdk_fini();

    Calls rte_eal_cleanup() to release the initialized EAL resources.

  10. Arg parser destroy.

    doca_argp_destroy();

    • Frees DPDK resources by calling the rte_eal_cleanup() function.
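
The control flow of the main loop in step 5 can be sketched with simulated ports. This is not DPDK code: the sim_port structure and the sim_rx_burst and sim_tx_burst functions are hypothetical stand-ins for the real queues, rte_eth_rx_burst, and rte_eth_tx_burst, and packets are reduced to integer IDs. Counting packets that the TX queue does not accept as dropped is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_CAP 64u  /* hypothetical queue depth */
#define BURST_SIZE 32u /* hypothetical burst size */

/* Simulated port: plain arrays stand in for the NIC RX/TX queues.
 * The RX queue is one-shot (no wraparound) to keep the sketch short. */
struct sim_port {
    uint32_t rx[QUEUE_CAP]; /* packet IDs waiting to be received */
    uint16_t rx_head;
    uint16_t rx_count;
    uint32_t tx[QUEUE_CAP]; /* packet IDs handed to the wire */
    uint16_t tx_count;
    uint64_t processed;     /* packets passed to the processing step */
    uint64_t dropped;       /* packets received but not transmitted */
};

/* Stand-in for rte_eth_rx_burst(): pop up to n packets. */
static uint16_t sim_rx_burst(struct sim_port *p, uint32_t *pkts, uint16_t n)
{
    uint16_t got = 0;

    while (got < n && p->rx_count > 0) {
        pkts[got++] = p->rx[p->rx_head++];
        p->rx_count--;
    }
    return got;
}

/* Stand-in for rte_eth_tx_burst(): may accept fewer than n packets. */
static uint16_t sim_tx_burst(struct sim_port *p, const uint32_t *pkts, uint16_t n)
{
    uint16_t sent = 0;

    while (sent < n && p->tx_count < QUEUE_CAP)
        p->tx[p->tx_count++] = pkts[sent++];
    return sent;
}

/* One iteration of the main loop over both ports. */
void forward_once(struct sim_port ports[2], bool rx_only)
{
    uint32_t pkts[BURST_SIZE];

    for (uint16_t in = 0; in < 2; in++) {
        struct sim_port *src = &ports[in];
        struct sim_port *dst = &ports[in ^ 1]; /* the other port */
        uint16_t nb_rx = sim_rx_burst(src, pkts, BURST_SIZE);
        uint16_t nb_tx = 0;

        /* Where simple_fwd_process_offload would handle each packet. */
        src->processed += nb_rx;

        if (!rx_only)
            nb_tx = sim_tx_burst(dst, pkts, nb_rx);

        /* Packets not handed to TX (all of them in rx-only mode) would
         * be freed at this point; the simulation only counts them. */
        src->dropped += (uint64_t)(nb_rx - nb_tx);
    }
}
```

Calling forward_once repeatedly until a stop signal arrives mirrors the loop structure described in step 5, with rx_only selecting between transmitting and freeing the received burst.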

  • /opt/mellanox/doca/applications/simple_fwd_vnf/src

  • /opt/mellanox/doca/applications/simple_fwd_vnf/bin/simple_fwd_params.json

© Copyright 2023, NVIDIA. Last updated on Feb 9, 2024.