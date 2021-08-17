



Before using any DOCA FLOW function, it is mandatory to call the DOCA FLOW initialization function.

    int doca_flow_init(const struct doca_flow_cfg *cfg, struct doca_flow_error *error);



The struct doca_flow_cfg contains the following elements:

    struct doca_flow_cfg {
        uint32_t total_sessions; /**< total flows count */
        uint16_t queues;         /**< queue id for each offload thread */
        bool is_hairpin;         /**< when true, the fwd will be hairpin queue */
        bool aging;              /**< when true, aging is handled by doca */
    };



Where:

total_sessions – refers to the estimated scale of HW rules

queues – the number of HW acceleration control queues. It is expected that the same core always uses the same queue_id. In cases where multiple cores access the API with the same queue_id, it is up to the application to lock between cores/threads.
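
As a rough illustration of how these fields might be filled, the sketch below uses a local mirror of the struct shown above (so that it compiles without the DOCA headers) and a hypothetical helper, make_flow_cfg, that sizes the configuration with one queue per worker core. The helper name and the one-queue-per-core policy are assumptions made for this example, not part of the DOCA API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the struct shown above, for illustration only.
 * A real application would include the DOCA FLOW header instead. */
struct doca_flow_cfg {
    uint32_t total_sessions; /* estimated scale of HW rules */
    uint16_t queues;         /* number of HW acceleration control queues */
    bool is_hairpin;
    bool aging;
};

/* Hypothetical helper: one queue per worker core, so that each core can
 * keep using its own queue_id without locking against other cores. */
struct doca_flow_cfg make_flow_cfg(uint16_t nb_worker_cores,
                                   uint32_t expected_rules)
{
    struct doca_flow_cfg cfg = {0};

    cfg.total_sessions = expected_rules;
    cfg.queues = nb_worker_cores;
    cfg.is_hairpin = false;
    cfg.aging = false;
    return cfg;
}
```

The resulting struct would then be passed by pointer to doca_flow_init, together with an error struct.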

DOCA FLOW API serves as an abstraction layer API for network acceleration. Packet processing in a network function is described from ingress to egress, and therefore a PIPE must be attached to the origin port. Once a packet arrives at the origin port, HW execution starts as defined by the DOCA API.

    /**
     * @brief doca flow port struct
     */
    struct doca_flow_port;



doca_flow_port is an opaque object since the DOCA FLOW API is not bound to a specific packet delivery API such as DPDK. The first step is to start the DOCA FLOW port. The purpose of this step is to attach user application ports to the DOCA ports.

    struct doca_flow_port *doca_flow_port_start(struct doca_flow_port_cfg *cfg,
                                                struct doca_flow_error *error);



Port configuration contains the following:

port_id – chosen by the user. IDs must start with 0 and be consecutive.

type – depends on the underlying API

devargs – a string containing the exact configuration needed according to the type

priv_data_size – per port, users may define private data where application-specific info can be stored

    struct doca_flow_port_cfg {
        uint16_t port_id;              /**< dpdk port id */
        enum doca_flow_port_type type; /**< mapping type of port */
        const char *devargs;           /**< specific per port type cfg */
        uint16_t priv_data_size;       /**< user private data */
    };



When DPDK is used, the following configuration must be provided:

    enum doca_flow_port_type type = DOCA_FLOW_PORT_DPDK_BY_ID;
    const char *devargs = "1";





The devargs parameter points to a string that has the numeric value of the DPDK port_id in decimal. The port must be configured and started before calling this API. Mapping the DPDK port to the DOCA port is required to synchronize application ports with HW ports.
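
Since devargs carries the DPDK port_id as a decimal string, an application typically has to format it from the numeric ID. The following is a minimal sketch of that step using only the C standard library; format_port_devargs is a hypothetical helper name, not a DOCA function.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write the DPDK port id as a decimal string,
 * as expected in devargs for DOCA_FLOW_PORT_DPDK_BY_ID.
 * Returns 0 on success, -1 if the buffer is too small. */
int format_port_devargs(char *buf, size_t len, uint16_t dpdk_port_id)
{
    int n = snprintf(buf, len, "%u", (unsigned int)dpdk_port_id);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The buffer must stay valid for as long as the port configuration refers to it, because devargs is only a pointer to the string.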

PIPE is a template that defines packet processing without adding any specific HW rule. A PIPE consists of a template that includes the following elements:

Match

Monitor

Actions

Forward

The following diagram illustrates a PIPE structure.

The creation phase allows the HW to efficiently build the execution PIPE. After the PIPE is created, specific entries can be added. Only a subset of the PIPE can be used (e.g. skipping the monitor completely, or just using the counter, etc).



Match is a mandatory field when creating a PIPE. Using the following struct, users must define the fields that should be matched on the PIPE.

    struct doca_flow_match {
        uint32_t flags;
        uint8_t out_src_mac[DOCA_ETHER_ADDR_LEN]; /**< outer source mac address */
        uint8_t out_dst_mac[DOCA_ETHER_ADDR_LEN]; /**< outer destination mac address */
        doca_be16_t vlan_id;                      /**< outer vlan id */
        struct doca_flow_ip_addr out_src_ip;      /**< outer source ip address */
        struct doca_flow_ip_addr out_dst_ip;      /**< outer destination ip address */
        uint8_t out_l4_type;                      /**< outer layer 4 protocol type */
        doca_be16_t out_src_port;                 /**< outer layer 4 source port */
        doca_be16_t out_dst_port;                 /**< outer layer 4 destination port */
        struct doca_flow_tun tun;                 /**< tunnel info */
        struct doca_flow_ip_addr in_src_ip;       /**< inner source ip address if tunnel is used */
        struct doca_flow_ip_addr in_dst_ip;       /**< inner destination ip address if tunnel is used */
        uint8_t in_l4_type;                       /**< inner layer 4 protocol type if tunnel is used */
        doca_be16_t in_src_port;                  /**< inner layer 4 source port if tunnel is used */
        doca_be16_t in_dst_port;                  /**< inner layer 4 destination port if tunnel is used */
    };



For each field, users choose whether the field may be:

Ignored (wild card) – value of the field is ignored

Constant – all entries in the PIPE must have the same value for this field. Users should not put a value for each entry.

Changeable – per entry, the user must provide the value to match

Note: L4 type, L3 type, and tunnel type cannot be changeable.

The match field type can be defined either implicitly or explicitly.



To match implicitly, the following considerations should be taken into account.



Ignored fields: the field is zeroed, and the pipeline has no comparison on the field.

Constant fields: these are fields that have a constant value. For example, as shown in the following, the tunnel type is VXLAN.

    match.tun.type = DOCA_FLOW_TUN_VXLAN;





These fields only need to be configured once, not once per new pipeline entry.

Changeable fields: these are fields that may change per entry. For example, the following shows a field of the inner 5-tuple (the inner destination IP) set with a full mask.

    match.in_dst_ip.ipv4_addr = 0xffffffff;





If the full-mask value itself is the constant value required by the user, then they should set the field to zero when adding a new entry.
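
The implicit convention described above can be summarized as a small decision function. The sketch below is only a software model of that convention for a 32-bit field: the enum and function names are hypothetical, and it does not reflect how DOCA evaluates the template internally.

```c
#include <stdint.h>

/* Hypothetical names, used only to model the convention in the text. */
enum field_kind { FIELD_IGNORED, FIELD_CONSTANT, FIELD_CHANGEABLE };

/* Classify a 32-bit template field under the implicit convention:
 *   zero            -> ignored (no comparison on the field)
 *   all bits set    -> changeable (value supplied per entry)
 *   any other value -> constant for every entry in the PIPE */
enum field_kind classify_implicit_u32(uint32_t template_value)
{
    if (template_value == 0)
        return FIELD_IGNORED;
    if (template_value == UINT32_MAX)
        return FIELD_CHANGEABLE;
    return FIELD_CONSTANT;
}
```

Under this model, a template value of 0xffffffff can never be read as a constant, which is why the full-mask case gets its own rule in the text.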

The following is an example of a match on the VXLAN tunnel, when for each entry there is a specific IPv4 destination address, and an inner 5-tuple.

    static void build_underlay_overlay_match(struct doca_flow_match *match)
    {
        /* outer */
        match->out_dst_ip.ipv4_addr = 0xffffffff;
        match->out_l4_type = DOCA_PROTO_UDP;
        match->out_dst_port = DOCA_VXLAN_DEFAULT_PORT;
        match->tun.type = DOCA_FLOW_TUN_VXLAN;
        match->tun.vxlan.tun_id = 0xffffffff;
        /* inner */
        match->in_dst_ip.ipv4_addr = 0xffffffff;
        match->in_src_ip.ipv4_addr = 0xffffffff;
        match->in_src_ip.type = DOCA_FLOW_IP4_ADDR;
        match->in_l4_type = DOCA_PROTO_TCP;
        match->in_src_port = 0xffff;
        match->in_dst_port = 0xffff;
    }

Users may provide a mask on match. In this case, there are two doca_flow_match items: The first will contain constant values, and the second will contain masks.



Ignored fields: the field is zeroed, and the pipeline has no comparison on the field.

    match_mask.in_dst_ip.ipv4_addr = 0;

Constant fields: these are fields that have a constant value. For example, as shown in the following, the tunnel type is VXLAN and the mask should be full.

    match.tun.type = DOCA_FLOW_TUN_VXLAN;
    match_mask.tun.type = 0xffffffff;





Once a field is defined as constant, the field’s value cannot be changed per entry. Users must set constant fields to zero when adding entries to avoid ambiguity.

Changeable fields: these are fields that may change per entry (e.g., the inner 5-tuple). Their value should be zero and the mask should be full.

    match.in_dst_ip.ipv4_addr = 0;
    match_mask.in_dst_ip.ipv4_addr = 0xffffffff;





Note that for IPs, the prefix mask can be used as well.
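
A prefix mask for an IPv4 field can be computed from the prefix length. The following is a minimal sketch, assuming the mask is stored in network byte order like the other big-endian fields of the match struct; ipv4_prefix_mask_be is a hypothetical helper name, and htonl comes from the POSIX header arpa/inet.h.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper: build an IPv4 mask for the given prefix length
 * (0..32), returned in network byte order. Lengths above 32 are
 * clamped to 32. */
uint32_t ipv4_prefix_mask_be(unsigned int prefix_len)
{
    if (prefix_len == 0)
        return 0; /* shifting a 32-bit value by 32 bits is undefined in C */
    if (prefix_len > 32)
        prefix_len = 32;
    return htonl(UINT32_MAX << (32 - prefix_len));
}
```

For instance, a /24 match on the inner destination address would place ipv4_prefix_mask_be(24) in match_mask.in_dst_ip.ipv4_addr instead of the full mask.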

Similarly to setting PIPE match, actions also have a template definition with the following data structure:

    struct doca_flow_actions {
        bool decap;                               /**< when true, will do decap */
        uint8_t mod_src_mac[DOCA_ETHER_ADDR_LEN]; /**< modify source mac address */
        uint8_t mod_dst_mac[DOCA_ETHER_ADDR_LEN]; /**< modify destination mac address */
        struct doca_flow_ip_addr mod_src_ip;      /**< modify source ip address */
        struct doca_flow_ip_addr mod_dst_ip;      /**< modify destination ip address */
        doca_be16_t mod_src_port;                 /**< modify layer 4 source port */
        doca_be16_t mod_dst_port;                 /**< modify layer 4 destination port */
        bool dec_ttl;                             /**< decrease TTL value */
        bool has_encap;                           /**< when true, will do encap */
        struct doca_flow_encap_action encap;      /**< encap data information */
    };



Similarly to doca_flow_match in the creation phase, only the subset of actions that should be executed per packet are defined. This is done in a similar way to match, namely by classifying a field to one of the following:

Ignored field – field is zeroed, modify is not used

Constant fields – when a field should be modified per packet, but the value is the same for all packets, a one-time value on action definitions can be used

Changeable fields – fields that may have more than one possible value, and the exact value is set by the user per entry

    match_mask.in_dst_ip.ipv4_addr = 0xffffffff;

Boolean fields – Boolean values, such as encap and decap, are considered constant values. It is not allowed to create actions with encap=true and then add an entry without an encap value.

For example:

    static void create_decap_inner_modify_actions(struct doca_flow_actions *actions)
    {
        actions->decap = true;
        actions->mod_dst_ip.ipv4_addr = 0xffffffff;
    }

If a policer should be used, then it is possible to have the same configuration for all policers on the PIPE or to have a specific configuration per entry.

    struct doca_flow_monitor {
        uint8_t flags;    /**< indicate which actions be included */
        struct {
            uint32_t id;  /**< meter id */
            uint64_t cir; /**< Committed Information Rate (bytes/second). */
            uint64_t cbs; /**< Committed Burst Size (bytes). */
        };                /**< meter action configuration */
        uint32_t aging;   /**< aging time in seconds */
    };



Where:

Committed information rate (CIR) – defines the maximum bandwidth

Committed burst size (CBS) – maximum local burst size

t(c) is the number of available tokens. For each packet of b bytes, if t(c)-b≥0 the packet can continue, and tokens are consumed so that t(c)=t(c)-b. If t(c)-b<0, the packet is dropped.

The t(c) tokens are replenished according to elapsed time, the configured CIR, the configured CBS, and packet arrival. When a packet is received, the t(c) tokens are refilled before anything else is done. The number of tokens added depends on the total time passed since the last update, but the total is limited by the CBS value.
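
The policer behavior described above can be modeled in software as a token bucket. The sketch below is an illustrative model only, not the HW implementation: the struct and function names are hypothetical, time is passed in explicitly as nanoseconds, and the bucket is assumed to start full.

```c
#include <stdbool.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/* Hypothetical software model of the meter described above. */
struct token_bucket {
    uint64_t cir;     /* committed information rate, bytes/second */
    uint64_t cbs;     /* committed burst size, bytes */
    uint64_t tokens;  /* t(c): currently available tokens, bytes */
    uint64_t last_ns; /* time of the last refill */
};

void tb_init(struct token_bucket *tb, uint64_t cir, uint64_t cbs,
             uint64_t now_ns)
{
    tb->cir = cir;
    tb->cbs = cbs;
    tb->tokens = cbs; /* assumption: the bucket starts full */
    tb->last_ns = now_ns;
}

/* Returns true if a packet of pkt_bytes may continue, false if it is
 * dropped. Tokens are refilled first, then consumed on success.
 * The fractional part of the refill can overflow for rates above
 * roughly 18 GB/s; this sketch does not handle that case. */
bool tb_admit(struct token_bucket *tb, uint64_t pkt_bytes, uint64_t now_ns)
{
    uint64_t elapsed_ns = now_ns - tb->last_ns;
    uint64_t refill = (elapsed_ns / NS_PER_SEC) * tb->cir
                    + (elapsed_ns % NS_PER_SEC) * tb->cir / NS_PER_SEC;

    if (refill > 0) {
        /* Cap at CBS; written this way to avoid overflow in the sum. */
        if (refill > tb->cbs - tb->tokens)
            tb->tokens = tb->cbs;
        else
            tb->tokens += refill;
        /* Only advance the clock when tokens were added, so short
         * gaps between packets still accumulate toward a refill. */
        tb->last_ns = now_ns;
    }

    if (tb->tokens < pkt_bytes)
        return false; /* t(c) - b < 0: drop */
    tb->tokens -= pkt_bytes; /* t(c) = t(c) - b */
    return true;
}
```

With a CIR of 1000 bytes/second and a CBS of 1500 bytes, a 1000-byte packet passes immediately, a second one arriving at the same instant is dropped, and half a second later 500 tokens have been added back.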

The last "action" in a PIPE directs a packet on where to go next. Users may configure one of the following:

Send to software (representor)

Send to wire

Jump to next PIPE

The FORWARDING action may be set at PIPE creation, but it can also be unique per entry. A PIPE can be defined with constant forwarding (e.g., always send packets on a specific port). In this case, all entries will have the exact same forwarding. If forwarding is not defined when the PIPE is created, however, users must define forwarding for each entry. In this case, entries in the same PIPE may have different forwarding actions.

    struct doca_flow_fwd {
        enum doca_flow_fwd_type type; /**< indicate the forwarding type */
        union {
            struct {
                uint32_t rss_flags;    /**< rss offload types */
                uint16_t *rss_queues;  /**< rss queues array */
                int num_of_queues;     /**< number of queues */
                uint32_t rss_mark;     /**< markid of each queues */
            }; /**< rss configuration information */
            struct {
                uint16_t port_id;      /**< destination port id */
            }; /**< port configuration information */
            struct {
                struct doca_flow_pipe *next_pipe; /**< next pipe pointer */
            }; /**< next pipe configuration information */
        };
    };



The following is an RSS forwarding example:

    struct {
        uint32_t rss_flags;   /**< rss offload types */
        uint16_t *rss_queues; /**< rss queues array */
        int num_of_queues;    /**< number of queues */
        uint32_t rss_mark;    /**< markid of each queues */
    };



Flags include the RSS fields defined in the following enum:

    enum doca_rss_type {
        DOCA_FLOW_RSS_IP = (1 << 0),  /**< rss by ip head */
        DOCA_FLOW_RSS_UDP = (1 << 1), /**< rss by udp head */
        DOCA_FLOW_RSS_TCP = (1 << 2),
    };





Queues points to the uint16_t array that contains the queue numbers. When a port is started, the number of queues is defined, and the queues are numbered from zero up to the number of queues minus 1. RSS queue numbers may contain any subset of those predefined queue numbers. For a specific match, packets can be directed to a single queue by using RSS forwarding with a single queue.
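
Because the RSS queue list must be a subset of the queues defined when the port was started, an application may want to check the list before building the forwarding struct. The following is a minimal sketch of such a check; rss_queues_valid is a hypothetical helper, and port_nb_queues stands for whatever queue count the application used when starting the port.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true when every RSS queue number lies
 * in the range [0, port_nb_queues), i.e. the list is a subset of the
 * queues defined when the port was started. */
bool rss_queues_valid(const uint16_t *rss_queues, int num_of_queues,
                      uint16_t port_nb_queues)
{
    if (rss_queues == NULL || num_of_queues <= 0)
        return false;
    for (int i = 0; i < num_of_queues; i++) {
        if (rss_queues[i] >= port_nb_queues)
            return false;
    }
    return true;
}
```

A single-element array such as {2} passes the check on a port with four queues and corresponds to the single-queue case mentioned above.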

MARK is an optional parameter that may be communicated to the software. If MARK is set and the packet reaches the software, the value can be examined using the software API. When DPDK is used, MARK is placed on the struct rte_mbuf. (See the "Action: MARK" section in the official DPDK documentation.) When using the Kernel, the MARK value is placed on the struct sk_buff MARK field.

The port_id is given in struct doca_flow_port_cfg .

The packet will be directed to the port. In many instances the complete PIPE is executed in the HW, including the forwarding of the packet back to the wire. The packet never arrives to the SW. Example code for forwarding to port:

    struct doca_flow_fwd *fwd = malloc(sizeof(struct doca_flow_fwd));

    memset(fwd, 0, sizeof(struct doca_flow_fwd));
    fwd->type = DOCA_FLOW_FWD_PORT;
    fwd->port_id = port_cfg->port_id;





The type of forwarding is DOCA_FLOW_FWD_PORT and the only data required is the port_id as defined in DOCA_FLOW_PORT .

Once all parameters are defined, the create function is called.

    struct doca_flow_pipe *
    doca_flow_create_pipe(const struct doca_flow_pipe_cfg *cfg,
                          const struct doca_flow_fwd *fwd,
                          struct doca_flow_error *error);





The return value of the function is a handle to the PIPE. This handle should be given when adding entries to PIPE. If a failure occurs, the function returns NULL , and the error reason and message are put in the error argument if provided by the user.

Not all optional fields have to be provided. For example, fwd is not mandatory, and in the PIPE configuration some of the fields might be zeroed when not used.

Once PIPE is created, a new entry can be added to it. These entries are bound to a PIPE, so when a PIPE is destroyed, all the entries in the PIPE are removed. Please refer to section PIPE Entry for more information.

There is no priority between PIPEs or entries. The way that priority can be implemented is to match the highest priority first, and if a miss occurs, to jump to the next PIPE. There can be more than one PIPE on a root as long as the PIPEs do not overlap. If entries overlap, the priority is set according to the order in which the entries were added. So, if two PIPEs have overlapping matching and PIPE1 has higher priority than PIPE2, users should add an entry to PIPE1 after any entry is added to PIPE2.
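
The "match the highest priority first, then jump on a miss" scheme can be pictured with a small software model. The sketch below is purely illustrative: toy_pipe and toy_lookup are hypothetical names, the match is reduced to a single destination IP, and nothing here corresponds to an actual DOCA call.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ENTRIES 4

/* Hypothetical software model of a PIPE with a miss target. */
struct toy_pipe {
    uint32_t dst_ip[TOY_MAX_ENTRIES]; /* per-entry match value */
    int action[TOY_MAX_ENTRIES];      /* per-entry result, e.g. a port id */
    size_t nb_entries;
    const struct toy_pipe *on_miss;   /* next PIPE to try, or NULL */
};

/* Walk the chain starting at the highest-priority PIPE. The first
 * PIPE that holds a matching entry decides the result. Returns -1
 * when every PIPE in the chain misses. */
int toy_lookup(const struct toy_pipe *pipe, uint32_t dst_ip)
{
    for (; pipe != NULL; pipe = pipe->on_miss) {
        for (size_t i = 0; i < pipe->nb_entries; i++) {
            if (pipe->dst_ip[i] == dst_ip)
                return pipe->action[i];
        }
    }
    return -1;
}
```

In this model, an address present in both PIPEs resolves to the action of the first PIPE in the chain, while an address present only in the second PIPE is reached through the miss path.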

An entry is a specific instance inside of a PIPE. When defining a PIPE, users define match criteria (subset of fields to be matched), the type of actions to be done on matched packets, monitor, and, optionally, the FORWARDING action. When adding an entry, users should define the values that are not constant among all entries in the PIPE, and if FORWARDING is not defined then that is also mandatory.

    struct doca_flow_pipe_entry *
    doca_flow_pipe_add_entry(uint16_t pipe_queue,
                             struct doca_flow_pipe *pipe,
                             const struct doca_flow_match *match,
                             const struct doca_flow_actions *actions,
                             const struct doca_flow_monitor *mon,
                             const struct doca_flow_fwd *fwd,
                             struct doca_flow_error *error);

DOCA FLOW is designed to support concurrency in an efficient way. Since the expected rate is millions of new entries per second, an architecture similar to that of the data path is required. Having a unique queue ID per core saves the DOCA engine from having to lock the data structure and enables the usage of multiple queues when interacting with HW. It is expected that a single thread/core uses a unique pipe_queue and that the pipe_queue is not shared with other threads/cores.
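
One possible way for an application to give each worker thread its own pipe_queue is sketched below, using a C11 atomic counter and thread-local storage. This is an assumption about application-side bookkeeping, not something DOCA provides; worker_pipe_queue is a hypothetical helper, and nb_queues is expected to equal the queues value passed at initialization.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Next unassigned queue id; zero-initialized as a static object. */
static atomic_uint next_queue_id;

/* Queue id owned by the calling thread, or -1 if none assigned yet. */
static _Thread_local int my_pipe_queue = -1;

/* Hypothetical helper: the first call from a thread claims a fresh
 * queue id, and later calls from the same thread return the same id.
 * Returns -1 when all nb_queues ids have already been claimed. */
int worker_pipe_queue(uint16_t nb_queues)
{
    if (my_pipe_queue < 0) {
        unsigned int id = atomic_fetch_add(&next_queue_id, 1u);

        if (id >= nb_queues)
            return -1;
        my_pipe_queue = (int)id;
    }
    return my_pipe_queue;
}
```

Each worker would then pass its own value as the pipe_queue argument when adding or removing entries, so no two threads share a queue.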

    struct doca_flow_pipe_entry;



Upon success, a handle is returned. If a failure occurs, a NULL value is returned, and the error message is filled in. The application can keep this handle and later remove the entry by passing the handle to the remove function.

    int doca_flow_pipe_rm_entry(uint16_t pipe_queue, struct doca_flow_pipe_entry *entry);

By default, no counter is added. If defined in monitor, a unique counter is added per entry.

Note: Having a counter per entry affects performance and should be avoided if it is not required by the application.



When a counter is present, it is possible to query the flow and get the counter’s data.