DOCA Flow Connection Tracking

This guide provides an overview and configuration instructions for the DOCA Flow CT API.

DOCA Flow Connection Tracking (CT) is a 5-tuple table which supports the following:

  • Track 5-tuple sessions (or 6-tuple when a zone is available)

  • Zone based – virtual tables

  • Aging (i.e., removes idle connections)

  • Sets metadata for a connection

  • Bidirectional packet handling

  • High rate of connections per second (CPS)

The CT module makes it simple and efficient to track connections by leveraging hardware resources. The module supports both autonomous and managed modes.

DOCA Flow CT pipe handles non-encapsulated TCP and UDP packets. The CT pipe only supports forward next pipe or miss next pipe actions:

  • All packets matching known connection 6-tuples are forwarded to the CT's forward pipe

  • Non-matching packets are forwarded to the miss pipe

The user application must handle packets accordingly.

The DOCA Flow CT API is built around four major parts:

  • CT module manipulation – configuring CT module resources

  • CT connection entry manipulation – adding, removing, or updating connection entries

  • Callbacks – handling asynchronous entry processing results

  • Pipe and entry statistics

[Figure: DOCA Flow CT architecture diagram]

Aging

Aging time is the maximum time, in seconds, that a session is maintained without any packet being seen. If that time elapses with no packet detected, the session is terminated.

To support aging, a dedicated aging thread is started to poll and check counters for all connections.
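The aging check described above can be pictured with a small sketch. It is not the DOCA aging thread: the `ex_session` record, the per-session hit counter, and the `ex_aging_sweep()` helper are hypothetical names introduced only for illustration, under the assumption that a session counts as idle when its counter has not changed between polls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-session record; not a DOCA type. */
struct ex_session {
	uint64_t last_hits;    /* counter value seen on the previous poll */
	uint64_t last_seen_s;  /* time (seconds) when the counter last changed */
	uint32_t aging_time_s; /* maximum idle time allowed for this session */
	bool alive;
};

/* Refresh one session from its current counter value and report whether
 * it has been idle for at least its aging time. */
bool ex_session_poll(struct ex_session *s, uint64_t hits_now, uint64_t now_s)
{
	if (hits_now != s->last_hits) {
		/* A packet was seen since the last poll: restart the idle timer. */
		s->last_hits = hits_now;
		s->last_seen_s = now_s;
		return false;
	}
	return (now_s - s->last_seen_s) >= s->aging_time_s;
}

/* One pass of the polling loop: terminate every idle session and
 * return how many sessions aged out in this pass. */
size_t ex_aging_sweep(struct ex_session *tbl, const uint64_t *hits, size_t n, uint64_t now_s)
{
	size_t aged = 0;

	assert(tbl != NULL && hits != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].alive && ex_session_poll(&tbl[i], hits[i], now_s)) {
			tbl[i].alive = false;
			aged++;
		}
	}
	return aged;
}
```

Polling counters rather than timestamping every packet keeps the per-packet path untouched, at the cost of a detection delay of up to one polling interval.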

Autonomous Mode

In this mode, DOCA runs multiple CT workers internally, to handle connections in parallel.

The connection's lifecycle is controlled by the connection state encapsulated in the packet and time-based aging.

CT workers establish and close connections automatically based on the connection's state stored in packet meta.

Packet meta is defined as follows:

uint32_t src : 1;      /**< Source port in multi-port E-Switch mode */
uint32_t hairpin : 1;  /**< Subject to forward using hairpin. */
uint32_t type : 2;     /**< CT packet type: New, End or Update */
uint32_t data : 28;    /**< Zone set by user or reserved after CT pipe. */

  • data – CT table matches packet meta (zone) and 5-tuples

  • type – can have the following values:

    • NONE – (known) if packet hit any connection rule

    • NEW – if new TCP or UDP connection

    • END – if TCP connection closed

  • src and hairpin – used for forwarding pipe and worker to deliver packet
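As a rough illustration of how these fields might be used, the sketch below wraps the bit-fields in a local struct and sets a zone before a packet reaches the CT pipe. The struct name `ex_ct_pkt_meta`, the numeric values chosen for the packet types, and both helper functions are assumptions made for this example and are not taken from the DOCA headers.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the packet meta layout quoted above. */
struct ex_ct_pkt_meta {
	uint32_t src : 1;     /* source port in multi-port E-Switch mode */
	uint32_t hairpin : 1; /* subject to forward using hairpin */
	uint32_t type : 2;    /* CT packet type */
	uint32_t data : 28;   /* zone set by user */
};

/* Assumed numeric encoding of the type field; the real values may differ. */
enum ex_ct_pkt_type { EX_CT_PKT_NONE = 0, EX_CT_PKT_NEW = 1, EX_CT_PKT_END = 2 };

#define EX_CT_ZONE_MAX ((1u << 28) - 1u) /* largest value that fits in 28 bits */

/* Build the meta for a packet that is about to be matched by the CT table. */
struct ex_ct_pkt_meta ex_meta_for_zone(uint32_t zone)
{
	struct ex_ct_pkt_meta m = {0};

	assert(zone <= EX_CT_ZONE_MAX);
	m.data = zone;
	return m;
}

/* NEW and END packets change connection state, so a worker must see them. */
int ex_meta_needs_worker(struct ex_ct_pkt_meta m)
{
	return m.type == EX_CT_PKT_NEW || m.type == EX_CT_PKT_END;
}
```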

[Figure: autonomous mode diagram]

Managed Mode

In this mode, the application is responsible for managing the worker threads, parsing packets, and handling the connection's lifecycle.

Managed mode uses DOCA Flow CT management APIs to create or destroy the connections.

The CT aging module notifies on aged out connections by calling callbacks.

Users can create connection rules with a different pattern, meta, or counter for each packet direction.

Note

Users are responsible for defining meta and mask to match and modify.

Users can create one rule of a connection first, then create the other rule using the doca_flow_ct_entry_add_dir() API.

[Figure: managed mode diagram]

To enable DOCA Flow CT on the DPU, perform the following:

  1. Enable iommu.passthrough in Linux boot commands (or disable SMMU from the DPU BIOS):

    1. Run:

      sudo vim /etc/default/grub

    2. Set GRUB_CMDLINE_LINUX="iommu.passthrough=1".

    3. Run:

      sudo update-grub
      sudo reboot

  2. Define huge pages (see prerequisites of DOCA Flow).

  3. Configure DPU firmware with LAG_RESOURCE_ALLOCATION=1:

    sudo mlxconfig -d <device-id> s LAG_RESOURCE_ALLOCATION=1

    Note

    Obtain device-id from the output of the mst status -v command.

  4. Perform power cycle on the host and Arm sides.

  5. If working with a single port, set the DPU into e-switch mode:

    sudo devlink dev eswitch set pci/<pcie-address> mode switchdev
    sudo devlink dev param set pci/<pcie-address> name esw_multiport value false cmode runtime

    Note

    Retrieve pcie-address from the output of the mst status -v command.

  6. If working with two PF ports, set the DPU into multi-port e-switch mode (for the 2 PCIe devices):

    sudo devlink dev param set pci/<pcie-address> name esw_multiport value true cmode runtime

    Note

    Retrieve pcie-address from the output of the mst status -v command.

DOCA Flow CT supports actions based on meta and NAT operations. Each action can be defined as either shared or non-shared.

Shared Actions

Actions that can be shared between entries. Shared actions are predefined and reused in multiple entries.

The user gets a handle per shared action created and uses this handle as a reference to the action where required.

Note

It is the user's responsibility to track shared actions and to remove them when they become irrelevant.

Shared actions are defined using a control queue (see struct doca_flow_ct_cfg).

Non-shared Actions

Actions provided with their data during entry create/update.

These actions are completely managed by DOCA Flow CT and cannot be reused in multiple flows (e.g., NAT operations).

Action Sets in Pipe Creation

Users must define action sets during DOCA Flow CT pipe creation (as with any other pipe).

Note

Only actions for meta and NAT are accepted (according to struct doca_flow_ct_actions).

During entry create/update, different actions can be provided per direction (different action content and/or different type).

Feature Enable

To enable user actions, configure the following parameters:

  • User action templates during DOCA Flow CT pipe creation

  • Maximum number of user actions (nb_user_actions on DOCA Flow CT init)

Using Actions in Autonomous Mode

Init

Configure the following parameters on doca_flow_ct_init():

  • nb_ctrl_queues – number of control queues for defining shared actions

  • nb_user_actions – maximum number of shared actions

  • worker_cb – callbacks required to communicate with the user

Create DOCA Flow CT Pipe

Configure action sets on doca_flow_pipe_create().

Create Shared Actions

Use doca_flow_ct_actions_add_shared() with one of the control queues.

Shared actions can be added at any time before use.

Implement Worker Callbacks

Callbacks are called from each worker thread to acquire synchronization with the user code and on the first packet of a flow.

On doca_flow_ct_rule_pkt_cb:

  • Determine how the packet should be treated

  • If rules are required, return the action handles to use

Using Actions in Managed Mode

Init

Configure the following parameters on doca_flow_ct_init():

  • nb_ctrl_queues – number of control queues for defining shared actions

  • nb_user_actions – maximum number of shared actions

Create DOCA Flow CT Pipe

Configure action sets on doca_flow_pipe_create().

Create Shared Actions

Use doca_flow_ct_actions_add_shared() with one of the control queues.

Shared actions can be added at any time before use.

Add Entry

Entry can be created in one of the following ways:

  • Using an action handle of a predefined shared action

  • Using action data which is specific to the flow, not sharable (e.g., for NAT operations)

The entry can have different actions and/or different action types per direction.

Remove Entry

Non-shared actions associated with an entry are implicitly destroyed by DOCA Flow CT.

Shared actions are not destroyed. They can be used by the user until they decide to remove them.

Update Entry

Entry actions can be updated per direction. All combinations of shared/non-shared actions are applicable (e.g., update from shared to non-shared).

For the library API reference, refer to DOCA Flow and CT API documentation in the NVIDIA DOCA Library APIs.

Warning

The pkg-config (*.pc file) for the Flow CT library is included in DOCA's regular definitions: doca.

The following sections provide additional details about the library API.

enum doca_flow_ct_flags

DOCA Flow CT configuration optional flags.

Flag

Description

DOCA_FLOW_CT_FLAG_STATS = 1u << 0

Enable internal pipe counters for packet tracking purposes. Call doca_flow_pipe_dump(<ct_pipe>) to dump counter values. Each call dumps only the values that have changed.

DOCA_FLOW_CT_FLAG_WORKER_STATS = 1u << 1,

Enable worker thread internal debug counter periodical dump. Autonomous mode only.

DOCA_FLOW_CT_FLAG_NO_AGING = 1u << 2,

Disable aging

DOCA_FLOW_CT_FLAG_SW_PKT_PARSING = 1u << 3,

Enable CT worker software packet parsing to support VLAN, IPv6 options, or special tunnel types

DOCA_FLOW_CT_FLAG_MANAGED = 1u << 4,

Enable managed mode in which user application is responsible for managing packet handling, and calling the CT API to manipulate CT connection entries

DOCA_FLOW_CT_FLAG_ASYMMETRIC = 1u << 5,

Allows different 6-tuple table definitions for the origin and reply directions. Defaults to symmetric mode, which uses the same meta and the reversed 5-tuple for the reply direction. Managed mode only.

DOCA_FLOW_CT_FLAG_ASYMMETRIC_COUNTER = 1u << 6,

Enable different counters for the origin and reply directions. Managed mode only.

DOCA_FLOW_CT_FLAG_NO_COUNTER = 1u << 7,

Disable counter and aging to save aging thread CPU cycles

DOCA_FLOW_CT_FLAG_DEFAULT_MISS = 1u << 8,

Check TCP SYN flags and UDP in CT miss flow to identify ADD type packets.

DOCA_FLOW_CT_FLAG_WIRE_TO_WIRE = 1u << 9,

Hints that traffic comes from the uplink wire and is forwarded to the uplink wire.
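Several flags in the table are restricted to one mode. The sketch below combines flags into a single mask and checks those restrictions before the value would be stored in the flags field of the configuration. The flag names and values are copied from the table and redefined locally so the example stands alone (a real application would take them from the DOCA header); `ex_ct_flags_valid()` is a hypothetical helper, not a library call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Subset of the flags listed above, redefined locally for this sketch. */
enum {
	DOCA_FLOW_CT_FLAG_STATS = 1u << 0,
	DOCA_FLOW_CT_FLAG_WORKER_STATS = 1u << 1,      /* autonomous mode only */
	DOCA_FLOW_CT_FLAG_MANAGED = 1u << 4,
	DOCA_FLOW_CT_FLAG_ASYMMETRIC = 1u << 5,        /* managed mode only */
	DOCA_FLOW_CT_FLAG_ASYMMETRIC_COUNTER = 1u << 6 /* managed mode only */
};

/* Reject flag combinations that contradict the mode notes in the table. */
bool ex_ct_flags_valid(uint32_t flags)
{
	const uint32_t managed_only =
		DOCA_FLOW_CT_FLAG_ASYMMETRIC | DOCA_FLOW_CT_FLAG_ASYMMETRIC_COUNTER;
	const bool managed = (flags & DOCA_FLOW_CT_FLAG_MANAGED) != 0;

	if (!managed && (flags & managed_only) != 0)
		return false; /* asymmetric options require managed mode */
	if (managed && (flags & DOCA_FLOW_CT_FLAG_WORKER_STATS) != 0)
		return false; /* worker statistics exist only in autonomous mode */
	return true;
}
```

Checking the mask in the application gives an early, readable failure; whether the library itself rejects such combinations is not stated here.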


enum doca_flow_ct_rule_opr

Options for handling flows in autonomous mode with shared actions. The decision is taken on the first flow packet.

Operation

Description

DOCA_FLOW_CT_RULE_OK

Flow should be defined in the CT pipe using the required shared actions handles

DOCA_FLOW_CT_RULE_DROP

Flow should not be defined in the CT pipe. The packet should be dropped.

DOCA_FLOW_CT_RULE_TX_ONLY

Flow should not be defined in the CT pipe. The packet should be transmitted.


struct direction_cfg

Managed mode configuration for origin or reply direction.

Field

Description

bool match_inner

5-tuple match pattern applies to packet inner layer

struct doca_flow_meta *zone_match_mask

Mask to indicate meta field and bits to match

struct doca_flow_meta *meta_modify_mask

Mask to indicate meta field and bits to modify on connection packet match


struct doca_flow_ct_worker_callbacks

Set of callbacks for using shared actions in autonomous mode.

Field

Description

doca_flow_ct_sync_acquire_cb worker_init

Called at the start of a worker thread to sync with the user context

doca_flow_ct_sync_release_cb worker_release

Called at the end of a worker thread

doca_flow_ct_rule_pkt_cb rule_pkt

Called on the first packet of a flow


struct doca_flow_ct_cfg

DOCA Flow CT configuration.

uint32_t nb_arm_queues;
uint32_t nb_ctrl_queues;
uint32_t nb_user_actions;
uint32_t nb_arm_sessions[DOCA_FLOW_CT_SESSION_MAX];
uint32_t flags;
struct doca_dev *doca_dev;
void *ib_dev;
void *ib_pd;
uint16_t aging_core;
uint16_t aging_query_delay_s;
doca_flow_ct_flow_log_cb flow_log_cb;
struct doca_flow_ct_aging_ops *aging_ops;
uint32_t base_core_id;
union {
        /* Managed mode configuration for origin and reply direction. */
        struct direction_cfg direction[2];
        /* Below fields are dedicated for autonomous mode */
        struct {
                uint16_t tcp_timeout_s;
                uint16_t tcp_session_del_s;
                uint16_t udp_timeout_s;
                enum doca_flow_tun_type tunnel_type;
                uint16_t vxlan_dst_port;
                enum doca_flow_ct_hash_type hash_type;
                uint32_t meta_user_bits;
                uint32_t meta_action_bits;
                struct doca_flow_meta *meta_zone_mask;
                struct doca_flow_meta *connection_id_mask;
                struct doca_flow_ct_worker_callbacks worker_cb;
        };
};

Where:

Field

Description

uint32_t nb_arm_queues

Number of CT queues. In autonomous mode, also the number of worker threads.

uint32_t nb_ctrl_queues

Number of CT control queues used for defining shared actions

uint32_t nb_user_actions

Maximum number of user actions supported (shared and non-shared)

Minimum value is 1K * nb_ctrl_queues

uint32_t nb_arm_sessions[DOCA_FLOW_CT_SESSION_MAX]

Maximum number of IPv4 and IPv6 CT connections

uint32_t flags

CT configuration flags

struct doca_dev *doca_dev

DOCA device

void *ib_dev

Deprecated

void *ib_pd

Deprecated

uint16_t aging_core

CPU core ID for CT aging thread to bind.

uint16_t aging_query_delay_s

CT aging code delay, in seconds.

doca_flow_ct_flow_log_cb flow_log_cb

Flow log callback function, when set

struct doca_flow_ct_aging_ops *aging_ops

User-defined aging logic callback functions. If not set, the default aging logic is used.

uint32_t base_core_id

Base core ID for the workers

struct direction_cfg direction

Managed mode configuration for origin or reply direction

uint16_t tcp_timeout_s

TCP timeout in seconds

uint16_t tcp_session_del_s

Time to delay or kill TCP session after RST/FIN

uint16_t udp_timeout_s

UDP timeout in seconds

enum doca_flow_tun_type tunnel_type

Encapsulation tunnel type

uint16_t vxlan_dst_port

VXLAN outer UDP destination port in big endian

enum doca_flow_ct_hash_type hash_type

Connection hash table type: NONE or SYMMETRIC_HASH

uint32_t meta_user_bits

User packet meta bits to be owned by the user

uint32_t meta_action_bits

User packet meta bits to be carried by identified connection packet

struct doca_flow_meta *meta_zone_mask

Mask to indicate meta field and bits saving zone information

struct doca_flow_meta *connection_id_mask

Mask to indicate meta field and bits for CT internal connection ID

struct doca_flow_ct_worker_callbacks worker_cb

Worker callbacks to use shared actions


struct doca_flow_ct_actions

This structure is used in the following cases:

  • For defining shared actions. In this case, action data is provided by the user. The action handle is returned by DOCA Flow CT.

  • For defining an entry with actions. The structure can be filled with two options:

    • With action handle of a previously created shared action

    • With non-shared action data

DOCA Flow CT action structure.

enum doca_flow_resource_type resource_type;
union {
        /* Used when creating an entry with a shared action. */
        uint32_t action_handle;
        /* Used when creating an entry with non-shared action or when creating a shared action. */
        struct {
                uint32_t action_idx;
                struct doca_flow_meta meta;
                struct doca_flow_header_l4_port l4_port;
                union {
                        struct doca_flow_ct_ip4 ip4;
                        struct doca_flow_ct_ip6 ip6;
                };
        } data;
};

Where:

Field

Description

enum doca_flow_resource_type resource_type

Shared/non-shared action

uint32_t action_handle

Shared action handle

uint32_t action_idx

Actions template index

struct doca_flow_meta meta

Modify meta values

struct doca_flow_header_l4_port l4_port

UDP or TCP source and destination port

struct doca_flow_ct_ip4 ip4

Source and destination IPv4 addresses

struct doca_flow_ct_ip6 ip6

Source and destination IPv6 addresses
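The two ways of filling the structure can be sketched with a simplified local copy. Everything below is a stand-in: `ex_ct_actions` only mirrors the shape of struct doca_flow_ct_actions, the meta, port, and address members are reduced to plain integers, and the enum names and helper functions are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-in for the shared/non-shared resource type. */
enum ex_resource_type { EX_RESOURCE_SHARED, EX_RESOURCE_NON_SHARED };

/* Simplified mirror of the action structure shown above. */
struct ex_ct_actions {
	enum ex_resource_type resource_type;
	union {
		uint32_t action_handle;          /* handle of a shared action */
		struct {
			uint32_t action_idx;     /* actions template index */
			uint32_t meta;           /* stand-in for struct doca_flow_meta */
			uint16_t src_port, dst_port; /* stand-in for l4_port */
			uint32_t src_ip4, dst_ip4;   /* stand-in for ip4 */
		} data;
	};
};

/* Entry action that references a previously created shared action. */
struct ex_ct_actions ex_actions_shared(uint32_t handle)
{
	struct ex_ct_actions a;

	memset(&a, 0, sizeof(a));
	a.resource_type = EX_RESOURCE_SHARED;
	a.action_handle = handle;
	return a;
}

/* Entry action that carries its own data, here a source NAT rewrite. */
struct ex_ct_actions ex_actions_snat4(uint32_t action_idx, uint32_t new_src_ip4,
				      uint16_t new_src_port)
{
	struct ex_ct_actions a;

	memset(&a, 0, sizeof(a));
	a.resource_type = EX_RESOURCE_NON_SHARED;
	a.data.action_idx = action_idx;
	a.data.src_ip4 = new_src_ip4;
	a.data.src_port = new_src_port;
	return a;
}
```

Because the handle and the data share a union, resource_type is the only way to tell which member is valid, so it should always be set together with the payload.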


This section describes DOCA Flow CT samples based on the DOCA Flow CT pipe.

The samples illustrate how to use the library API to manage TCP/UDP connections.

Running the Samples

  1. Refer to the following documents:

  2. To build a given sample:

    cd /opt/mellanox/doca/samples/doca_flow/flow_ct_udp
    meson /tmp/build
    ninja -C /tmp/build

    Note

    The binary doca_flow_ct_udp is created under /tmp/build/.

  3. Sample (e.g., doca_flow_ct_udp) usage:

    Usage: doca_<sample_name> [DOCA Flags] [Program Flags]

    DOCA Flags:
      -h, --help                        Print a help synopsis
      -v, --version                     Print program version information
      -l, --log-level                   Set the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      --sdk-log-level                   Set the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      -j, --json <path>                 Parse all command flags from an input json file

    Program Flags:
      -p, --pci_addr <PCI-ADDRESS>      PCI device address

  4. For additional information per sample, use the -h option:

    /tmp/build/doca_<sample_name> -h

  5. The following is a CLI example for running the samples when port 03:00.0 is configured (multi-port e-switch) as manager port:

    /tmp/build/doca_<sample_name> -- -p 03:00.0 -l 60

Samples

Flow CT UDP

This sample illustrates how to create a simple UDP pipeline with a CT pipe in it.

The sample logic includes:

  1. Initializing DOCA Flow by indicating mode_args="switch,hws" in the doca_flow_cfg struct.

  2. Initializing DOCA Flow CT.

  3. Starting two DOCA Flow uplink representor ports where port 0 has a special role of being a switch manager port.

    Note

    Ports are configured according to the parameters provided to doca_dpdk_port_probe() in the main function.

  4. Creating a pipeline on the main port:

    1. Building a UDP pipe to filter non-UDP packets.

    2. Building a CT pipe to hold UDP session entries.

    3. Building a counter pipe with an example 5-tuple entry to which identified UDP sessions should be sent.

    4. Building a VXLAN encapsulation pipe to encapsulate all identified UDP sessions.

    5. Building an RSS pipe from which all packets are directed to the sample main thread for parsing and processing.

  5. Packet processing:

    1. The first UDP packet triggers the miss flow as the CT pipe is empty.

    2. 5-tuple packet parsing is performed.

    3. doca_flow_ct_add_entry() is called to create a hardware rule according to the parsed 5-tuple info.

    4. A second UDP packet with the same 5-tuple should then be sent. This packet hits the previously inserted HW rule and is directed to port 0 after VXLAN encapsulation.
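The 5-tuple parsing step can be sketched as follows. This is not the sample's code: it is a minimal parser for an untagged Ethernet frame carrying IPv4 and UDP, with the `ex_five_tuple` struct and `ex_parse_udp4()` function invented for illustration. VLAN tags, IPv6, and fragments are deliberately not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-tuple holder; all fields in host byte order. */
struct ex_five_tuple {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t proto;
};

/* Read big-endian (network order) integers from a byte buffer. */
static uint16_t ex_rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t ex_rd32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Extract the 5-tuple from an Ethernet/IPv4/UDP frame.
 * Returns false for anything else or for a truncated frame. */
bool ex_parse_udp4(const uint8_t *frame, size_t len, struct ex_five_tuple *out)
{
	const size_t eth_len = 14; /* dst MAC + src MAC + EtherType */
	const uint8_t *ip, *udp;
	size_t ihl;

	assert(frame != NULL && out != NULL);
	if (len < eth_len + 20 || ex_rd16(frame + 12) != 0x0800)
		return false;                 /* too short or not IPv4 */
	ip = frame + eth_len;
	ihl = (size_t)(ip[0] & 0x0f) * 4;     /* IPv4 header length in bytes */
	if ((ip[0] >> 4) != 4 || ihl < 20 || len < eth_len + ihl + 8)
		return false;
	if (ip[9] != 17)
		return false;                 /* protocol 17 is UDP */
	udp = ip + ihl;
	out->src_ip = ex_rd32(ip + 12);
	out->dst_ip = ex_rd32(ip + 16);
	out->src_port = ex_rd16(udp);
	out->dst_port = ex_rd16(udp + 2);
	out->proto = ip[9];
	return true;
}
```

Reading fields byte by byte avoids alignment and endianness problems that casting the buffer to header structs can introduce.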

Reference:

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp/flow_ct_udp_sample.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp/flow_ct_udp_main.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp/meson.build

Flow CT UDP Query

This sample illustrates how to query a Flow CT UDP session entry. The query can be done according to session direction (origin or reply). The pipeline is identical to that of the Flow CT UDP sample.

This sample adds the following logic:

  1. Dumping port 0 information into a file at ./port_0_info.txt.

  2. Querying UDP session hardware entry created after receiving the first UDP packet:

    • Origin total bytes received

    • Origin total packets received

    • Reply total bytes received

    • Reply total packets received

Reference:

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp_query/flow_ct_udp_query_sample.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp_query/flow_ct_udp_query_main.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp_query/meson.build

Flow CT UDP Update

This sample illustrates how a CT entry can be updated after creation.

The pipeline is identical to that of the Flow CT UDP sample. For non-active UDP sessions, the relevant entry is updated with an aging timeout.

This sample adds the following logic:

  1. Querying all UDP sessions for the total number of packets received in both the origin and reply directions.

  2. Updating entry aging timeout to 2 seconds once a session is not active (i.e., no packets received on either side).

  3. Waiting until all non-active sessions are aged and deleted.

Reference:

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp_update/flow_ct_udp_update_sample.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp_update/flow_ct_udp_update_main.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_udp_update/meson.build

Flow CT UDP Single Match

This sample is based on the Flow CT UDP sample. The sample illustrates that a hardware entry can be created with a single match (matching performed in one direction only) in the API call doca_flow_ct_add_entry().

Flow CT Aging

This sample illustrates the use of the DOCA Flow CT aging functionality. It demonstrates how to build a pipe and add different entries with different aging times and user data.

No packets need to be sent for this sample.

The sample logic includes:

  1. Initializing DOCA Flow by indicating mode_args="switch,hws" in the doca_flow_cfg struct.

  2. Initializing DOCA Flow CT.

  3. Starting two DOCA Flow uplink representor ports where port 0 has a special role of being a switch manager port.

    Note

    Ports are configured according to the parameters provided to doca_dpdk_port_probe() in the main function.

  4. Building a UDP pipe to serve as the root pipe.

  5. Building a counter pipe with an example 5-tuple entry to which CT forwards packets.

  6. Adding 32 entries with a different 5-tuple match, different aging time (3-12 seconds), and setting user data. User data will contain the port ID, entry number, and status.

  7. Handling aging in small intervals and removing each entry after age-out.

  8. Running these commands until all 32 entries age out.

Reference:

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_aging/flow_ct_aging_sample.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_aging/flow_ct_aging_main.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_aging/meson.build

Flow CT TCP

This sample illustrates how to manage TCP flags with CT to achieve better control over TCP sessions.

Note

The sample expects to receive at least SYN and FIN packets.

The sample logic includes:

  1. Initializing DOCA Flow by indicating mode_args="switch,hws" in the doca_flow_cfg struct.

  2. Initializing DOCA Flow CT.

  3. Starting two DOCA Flow uplink representor ports where port 0 has a special role of being a switch manager port.

    Note

    Ports are configured according to the parameters provided to doca_dpdk_port_probe() in the main function.

  4. Creating a pipeline on the main port:

    1. Building a TCP pipe to filter non-TCP packets.

    2. Building a CT pipe to hold TCP session entries.

    3. Building a CT miss pipe which forwards all packets to RSS pipe.

    4. Building an RSS pipe from which all packets are directed to the sample main thread for parsing and processing.

    5. Building a TCP flags filter pipe which identifies the TCP flag inside the packets. SYN, FIN, and RST packets are forwarded to the RSS pipe while all others are forwarded to the EGRESS pipe.

    6. Building an EGRESS pipe to forward packets to uplink representor port 1.

  5. Packet processing:

    1. The first TCP packet triggers the miss flow as the CT pipe is empty.

    2. 5-tuple packet parsing is performed.

    3. TCP flag is examined.

      • In case of a SYN flag, a HW entry is created.

      • For FIN or RST flags, the HW entry is removed and all packets are transferred to uplink representor port 1 using rte_eth_tx_burst().

    4. From this point on, all TCP packets belonging to the above session are offloaded directly to uplink port representor 1.
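The flag examination in the packet processing steps can be sketched as a small classifier. The flag bit values are the standard TCP header bits; the `ex_tcp_step` enum and `ex_tcp_classify()` function are hypothetical names, and giving FIN/RST priority over SYN is a choice made for this example rather than something the sample is known to do.

```c
#include <assert.h>
#include <stdint.h>

/* Standard TCP header flag bits. */
#define EX_TCP_FIN 0x01u
#define EX_TCP_SYN 0x02u
#define EX_TCP_RST 0x04u

/* What the application does with a packet delivered by the RSS pipe. */
enum ex_tcp_step {
	EX_TCP_CREATE_ENTRY, /* SYN: add a HW entry for the session */
	EX_TCP_REMOVE_ENTRY, /* FIN or RST: remove the HW entry */
	EX_TCP_FORWARD       /* anything else: just forward the packet */
};

/* Map the TCP flags byte of a packet to the action to take. */
enum ex_tcp_step ex_tcp_classify(uint8_t tcp_flags)
{
	if ((tcp_flags & (EX_TCP_FIN | EX_TCP_RST)) != 0)
		return EX_TCP_REMOVE_ENTRY;
	if ((tcp_flags & EX_TCP_SYN) != 0)
		return EX_TCP_CREATE_ENTRY;
	return EX_TCP_FORWARD;
}
```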

Reference:

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp/flow_ct_tcp_sample.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp/flow_ct_tcp_main.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp/meson.build

Flow CT TCP Actions

This sample illustrates how to add shared and non-shared actions to CT TCP sessions. The pipeline is identical to that of the Flow CT TCP sample.

Note

The sample expects to receive at least SYN and FIN packets.

This sample adds a shared action on one side of the session that places the value 1 in the packet's metadata, while on the other side of the session a non-shared action is placed. The non-shared action simply flips the order of the source-destination IP addresses and port numbers.
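The effect of the non-shared action on the addresses can be pictured with a small helper. It only models the data transformation on a hypothetical `ex_tuple4` struct; how the sample actually encodes the rewrite in the action data is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the addresses and ports of one direction. */
struct ex_tuple4 {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Swap source and destination for both the IP addresses and the ports. */
struct ex_tuple4 ex_flip(struct ex_tuple4 t)
{
	struct ex_tuple4 r;

	r.src_ip = t.dst_ip;
	r.dst_ip = t.src_ip;
	r.src_port = t.dst_port;
	r.dst_port = t.src_port;
	return r;
}
```

Applying the flip twice returns the original tuple, which makes the transformation easy to verify on captured traffic.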

Reference:

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp_actions/flow_ct_tcp_actions_sample.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp_actions/flow_ct_tcp_actions_main.c

  • /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp_actions/meson.build

© Copyright 2023, NVIDIA. Last updated on Feb 9, 2024.