NVIDIA BlueField Platform Software Troubleshooting Guide

DOCA Flow

This page offers troubleshooting information for DOCA Flow users and customers.

DOCA Flow operates as an independent project but depends on DPDK. Consequently, some issues may originate in DPDK and be accompanied by DPDK-specific error messages. To troubleshoot such errors, consult the MLNX_DPDK Troubleshooting Guide.

To maximize the benefits of DOCA Flow, the following resources are recommended:

  • NVIDIA DOCA Flow Programming Guide

    1. Start with the introduction, then proceed to the "Steering Domains" section for a foundational understanding of DOCA Flow's core concepts.

    2. Review the "Flow Life Cycle" section to gain insights into key operational stages, including:

      • Initialization

      • Pipe creation and entry insertion

      • Teardown

  • NVIDIA DOCA Library APIs – consult the API reference to identify and implement the appropriate DOCA Flow functions for specific use cases

  • Sample code – NVIDIA provides sample code covering all released features. The samples are useful both for understanding a feature and as a practical starting point for related projects.

  • NVIDIA DOCA Flow Connection Tracking Programming Guide – this guide explains the Connection Tracking functionality integrated within DOCA Flow, which is particularly useful for applications requiring advanced session management

Commands and descriptions:

  • ibdev2netdev -v – part of the OFED package; displays the associations between network devices and RDMA adapter ports

  • lspci – Linux command that provides information about each PCI bus on the system

  • ethtool – Linux command used to query or control network driver and hardware settings

  • ip – Linux command used to assign addresses to network interfaces and configure network parameters; it replaces the deprecated ifconfig command on modern Linux distributions

  • devlink – API used to expose device information and resources not directly associated with a specific device class, such as chip-wide or switch-ASIC-wide configuration

  • The following sequence sets switchdev mode with two VFs:

      echo -n $vf0_pci > /sys/bus/pci/drivers/mlx5_core/unbind
      echo -n $vf1_pci > /sys/bus/pci/drivers/mlx5_core/unbind
      devlink dev eswitch set pci/${pci_addr} mode switchdev
      echo $vf_num > /sys/bus/pci/devices/${pci_addr}/sriov_numvfs

    Or, in place of the sriov_numvfs line:

      echo $vf_num > /sys/bus/pci/devices/${pci_addr}/mlx5_num_vfs

    Then:

      echo -n $vf0_pci > /sys/bus/pci/drivers/mlx5_core/bind
      echo -n $vf1_pci > /sys/bus/pci/drivers/mlx5_core/bind

    Note: The mlx5_num_vfs parameter is always available, regardless of whether the OS has loaded the virtualization module (e.g., after adding intel_iommu support to the GRUB file). In contrast, the sriov_numvfs parameter is available only if intel_iommu has been added to the GRUB file. If the sriov_numvfs file is not visible, verify that intel_iommu has been correctly included in the GRUB configuration.

  • <doca_flow_sample> --help – displays help and options related to EAL, including guidance on how to pass device parameters

  • <doca_flow_sample> -- --help – displays the options available for the DOCA Flow application

  • <doca_flow_sample> -- --log-level <N> – sets the log level for the sample application (see the log level values below). DOCA provides fine-grained control by offering separate logging paths for the sample and the SDK.

  • <doca_flow_sample> -- --sdk-log-level <N> – sets the log level for the SDK (see the log level values below), independently of the sample application's log level

  Log level values:

    • 10 – DISABLE

    • 20 – CRITICAL

    • 30 – ERROR

    • 40 – WARNING

    • 50 – INFO

    • 60 – DEBUG

    • 70 – TRACE
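When a test script launches a sample repeatedly, it can help to keep these numeric values in one place. The following is a minimal Python sketch: the helper name build_log_args is hypothetical, and only the --log-level and --sdk-log-level options and the numeric values listed above come from this guide.

```python
# Numeric log levels as listed in this guide.
LOG_LEVELS = {
    "DISABLE": 10,
    "CRITICAL": 20,
    "ERROR": 30,
    "WARNING": 40,
    "INFO": 50,
    "DEBUG": 60,
    "TRACE": 70,
}


def build_log_args(app_level="INFO", sdk_level="WARNING"):
    """Return the program arguments that follow the "--" separator.

    Hypothetical helper: it only formats the two logging options.
    """
    try:
        app = LOG_LEVELS[app_level.upper()]
        sdk = LOG_LEVELS[sdk_level.upper()]
    except KeyError as exc:
        raise ValueError(f"unknown log level: {exc.args[0]}") from None
    return ["--", "--log-level", str(app), "--sdk-log-level", str(sdk)]


# Example: arguments to append after the EAL/device parameters of a sample.
print(" ".join(build_log_args("debug", "trace")))
```

Raising the SDK level separately from the sample level keeps application logs readable while still exposing library-side detail.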

               

DOCA Flow is an integral component of the DOCA ecosystem, providing application-level and SDK-level logging capabilities. Detailed information on logging can be found in the Debuggability section of the DOCA documentation.

DOCA Flow also supports a comprehensive set of counters that can be attached to pipes or entries. For more details, refer to the "Setting Pipe Monitoring" section in the NVIDIA DOCA Flow Programming Guide.

Counters are particularly useful for troubleshooting scenarios where packets are being dropped. To diagnose such issues, set up non-shared counters along the expected flow route to identify where the drops are occurring.

Debug & Trace Features

The DOCA SDK development packages (doca-devel) include developer-oriented packages that provide additional trace and debug capabilities not included in the production libraries:

  • For .deb-based systems – libdoca-sdk-flow-trace

  • For .rpm-based systems – doca-sdk-flow-trace

These packages install the trace-enabled versions of the libraries in the following directories:

  • .deb-based systems – /opt/mellanox/doca/lib/<arch>/trace

  • .rpm-based systems – /opt/mellanox/doca/lib64/trace

For detailed information on the capabilities provided by these trace libraries and best practices for their use, refer to the corresponding section in the DOCA Flow Programming Guide. Links to the guide can be found in the Preface.

Using a Custom DPDK

To use a custom DPDK version, follow the DPDK troubleshooting guidelines referenced in the Preface to compile the project and install it either locally or system-wide.

After compilation, ensure the PKG_CONFIG_PATH and LD_LIBRARY_PATH environment variables are configured to point exclusively to the newly compiled DPDK. For example, on Ubuntu 22.04, these variables can be set as follows:

ARCH=$(uname -m)
export PKG_CONFIG_PATH=<DPDK_INSTALL_PATH>/lib/$ARCH-linux-gnu/pkgconfig
export LD_LIBRARY_PATH=<DPDK_INSTALL_PATH>/lib/$ARCH-linux-gnu/

Once the environment variables are set, reconfigure and recompile any DOCA Flow-related applications or samples to link them with the custom DPDK.
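Before rebuilding, it may be worth confirming that both variables really point only at the custom DPDK. The following Python sketch is one way to do that under stated assumptions: the helper names are hypothetical, the check is purely textual (it does not resolve symbolic links or ".." components), and the install prefix is whatever was used for <DPDK_INSTALL_PATH>.

```python
import os
from pathlib import PurePosixPath


def entries_outside_prefix(path_value, install_prefix):
    """Return the entries of a colon-separated path list that are not
    located under install_prefix."""
    prefix = PurePosixPath(install_prefix)
    outside = []
    for entry in path_value.split(":"):
        if not entry:
            continue  # skip empty items such as a trailing ':'
        path = PurePosixPath(entry)
        if path != prefix and prefix not in path.parents:
            outside.append(entry)
    return outside


def check_environment(install_prefix, environ=os.environ):
    """Map each variable name to the entries that point elsewhere."""
    return {
        name: entries_outside_prefix(environ.get(name, ""), install_prefix)
        for name in ("PKG_CONFIG_PATH", "LD_LIBRARY_PATH")
    }
```

An empty list for both variables suggests that the build and the dynamic loader will not fall back to a different DPDK through these variables. System default search paths are outside the scope of this check.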

Functional Debugging with Scapy and Monitor

When packets hit the wrong pipe entries, debugging can be performed with a minimal traffic load. Follow these steps to identify and resolve the issue:

  1. Use Scapy to construct and send packets from the desired port toward the host or device where DOCA Flow is expected to be listening. For detailed instructions on building and sending packets, refer to the Scapy Documentation.

  2. Use Scapy's sniffing functionality to capture and analyze the returning packets. This will help determine whether the expected changes were applied. Refer to the Scapy Sniffing Documentation for usage details.

  3. Leverage the monitor feature to set up counters for tracing packets through the system. After sending the traffic, query these counters and print the statistics to identify which pipes and pipe entries were traversed. Ensure that a monitor is included on the default entry to obtain a comprehensive view of the packet's path.
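Step 3 comes down to comparing counter values taken before and after the probe traffic. The following Python sketch shows only that comparison. It assumes the application already exports per-entry packet counts (for example, by printing the results of its own counter queries); the entry labels and numbers are made up for illustration.

```python
def traversed_entries(before, after):
    """Return {label: packets_added} for every counter that increased.

    before and after map a pipe-entry label to its packet count.
    """
    hits = {}
    for label, count in after.items():
        delta = count - before.get(label, 0)
        if delta > 0:
            hits[label] = delta
    return hits


# Hypothetical snapshots taken around a burst of 10 probe packets.
before = {"root/ipv4": 100, "root/default": 7, "fwd/tcp_80": 40}
after = {"root/ipv4": 110, "root/default": 7, "fwd/tcp_80": 40, "fwd/default": 10}

for label, delta in sorted(traversed_entries(before, after).items()):
    print(f"{label}: +{delta} packets")
```

In this made-up case the probe packets passed the IPv4 entry of the first pipe but landed on the default entry of the second pipe rather than on the intended entry, which narrows the problem to that pipe's match configuration.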

Performance Testing with TRex

TRex can be used as a traffic generator for performance testing with DOCA Flow. Install the appropriate version for your system from the TRex website.

Refer to the TRex Documentation for instructions on configuring the server and client, and generating traffic based on specific test requirements.

To measure packet rates or bandwidth, perform the following:

  • Retrieve metrics such as packets per second or bandwidth directly from the TRex server.

  • Alternatively, use the mlnx_perf tool to view statistics at the physical interface level. To check packets per second, filter the output by searching for rx_packets_phy or tx_packets_phy.
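As a rough cross-check of such figures, the same physical-port counters can be sampled twice and divided by the interval. The Python sketch below assumes the counters are read with ethtool -S and appear in the usual "name: value" layout; the function names are hypothetical, and the measured interval ignores the time spent running ethtool itself.

```python
import re
import subprocess
import time

COUNTERS = ("rx_packets_phy", "tx_packets_phy")
_STAT_LINE = re.compile(r"^\s*(\S+):\s*(\d+)\s*$")


def parse_ethtool_stats(text, wanted=COUNTERS):
    """Extract the wanted counters from 'ethtool -S' style output."""
    stats = {}
    for line in text.splitlines():
        match = _STAT_LINE.match(line)
        if match and match.group(1) in wanted:
            stats[match.group(1)] = int(match.group(2))
    return stats


def packets_per_second(first, second, interval_s):
    """Per-counter rate between two snapshots taken interval_s apart."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return {
        name: (second[name] - first[name]) / interval_s
        for name in first
        if name in second
    }


def sample_interface(iface, interval_s=1.0):
    """Read the counters of iface twice and return the packet rates."""
    def read():
        result = subprocess.run(
            ["ethtool", "-S", iface],
            check=True, capture_output=True, text=True,
        )
        return parse_ethtool_stats(result.stdout)

    first = read()
    time.sleep(interval_s)
    return packets_per_second(first, read(), interval_s)
```

Calling sample_interface with the name of the physical port returns a dictionary of packets-per-second values, which can be compared with the rate reported by the traffic generator.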

This setup enables comprehensive performance testing and accurate evaluation of DOCA Flow's capabilities.

Steering Dump Tool

The mlx_steering_dump tool can be used to parse and analyze hardware configurations, providing insights into the hardware structures created by DOCA Flow and their relationships.

Unclear How to Use a Feature

For every feature, small sample applications are available to demonstrate functionality and usage. The DOCA Flow documentation includes a dedicated "Samples" section, which serves as a valuable reference. These samples can be used as a starting point for implementation or as a guide for creating custom solutions.

It is recommended to avoid using the samples without understanding their context. Refer to the accompanying sample documentation, where key details about feature usage are explained, to ensure proper implementation and avoid potential issues.

DOCA Flow Error When Adding New Entry to Pipe

An error occurs when calling the function that adds a new entry to a pipe.

The error message may resemble the following:

mlx5_common: Failed to create TIR using DevX
mlx5_net: Port 0 cannot create DevX TIR.
[10:26:39:622581][DOCA][ERR][dpdk_engine]: create pipe entry fail on index:1, error=Port 0 create flow fail, type 1 message: cannot get hash queue, type=8

This issue is likely caused by an incorrect SF/ports configuration.

To resolve the issue, execute the following commands on the BlueField device:

dpu# /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/0000:03:00.0 mode legacy
dpu# /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/0000:03:00.1 mode legacy
dpu# echo none > /sys/class/net/p0/compat/devlink/encap
dpu# echo none > /sys/class/net/p1/compat/devlink/encap
dpu# /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/0000:03:00.0 mode switchdev
dpu# /opt/mellanox/iproute2/sbin/devlink dev eswitch set pci/0000:03:00.1 mode switchdev

After applying these commands, the configuration should allow successful addition of new pipe entries.

Match is Not Working - All Packets are Matched

Check the pipe’s match configuration. If both match and match_mask are provided, ensure that match_mask reflects the intended criteria. A typical mistake is passing a match_mask structure that was never filled in: the all-zero mask produces a hardware rule that ignores every field, so all packets match. Either set match_mask to NULL or fill it in properly.

This behavior involves implicit and explicit types of matches, described in the "Setting Pipe Match or Action" section of the NVIDIA DOCA Flow Programming Guide.
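The effect of a zeroed mask can be seen in a conceptual model of a masked compare. The Python sketch below is not DOCA Flow code and does not use its API; the function name and the port values are illustrative only.

```python
def rule_matches(packet_value, match_value, match_mask):
    """Masked comparison: only bits set in match_mask take part."""
    return (packet_value & match_mask) == (match_value & match_mask)


FULL_MASK = 0xFFFF   # every bit of a 16-bit port is compared
ZERO_MASK = 0x0000   # no bit is compared

for port in (80, 443, 53):
    print(
        f"port {port}: full mask -> {rule_matches(port, 80, FULL_MASK)}, "
        f"zero mask -> {rule_matches(port, 80, ZERO_MASK)}"
    )
```

With the full mask only port 80 satisfies the rule, while with the zero mask both sides of the comparison reduce to 0 and every port passes.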

Both UDP and TCP Packets are Matched, but Only TCP was Intended

DOCA Flow operates in Relaxed Match mode (refer to the "Setting Pipe Match or Action" section). In this mode, only doca_flow_parser_meta controls header-type matching. The doca_flow_match enums, such as match.outer.l4_type_ext, do not influence header-type matching. Instead, these enums act as selectors, specifying how DOCA Flow handles unions for multiple headers, such as TCP and UDP.

For guidance on configuring the correct header-type match, consult the "Flow Parser Meta" section in the documentation.

Match Structure is Configured to Match on Type, but Runtime Error Occurs

As previously noted, header-type matching is controlled by doca_flow_parser_meta. Verify that the application code is correctly utilizing this mechanism.

If the configuration includes only type selectors, such as match.outer.l4_type_ext within outer, inner, or tun, without selecting a specific field, it may result in a runtime error due to the absence of a defined match field.

Control Pipe is Configured with Monitor, but Querying the Counter Returns an Error

For control pipes, counters are configured on a per-entry basis. Create the control pipe empty, then configure the counter on each entry when it is inserted.

© Copyright 2025, NVIDIA. Last updated on Jul 17, 2025.