DOCA Ethernet
This guide provides an overview and configuration instructions for the DOCA ETH API.
The DOCA Ethernet library is supported at alpha level.
DOCA ETH comprises two APIs: DOCA ETH RXQ and DOCA ETH TXQ. The control path is always handled on the host/DPU CPU side by the library. The datapath can be managed either on the CPU by the DOCA ETH library or on the GPU by the GPUNetIO library.
    
DOCA ETH RXQ is an RX queue. It defines a queue for receiving packets. It also supports receiving Ethernet packets on any memory mapped by doca_mmap.
The memory location to which packets are scattered is agnostic to the processor which manages the datapath (CPU/DPU/GPU). For example, the datapath may be managed on the CPU while packets are scattered to GPU memory.
    
DOCA ETH TXQ is a TX queue. It defines a queue for sending packets. It also supports sending Ethernet packets from any memory mapped by doca_mmap.
To free the CPU from managing the datapath, the user can choose to manage the datapath from the GPU. In this mode of operation, the library collects user configurations and creates a receive/send queue object on the GPU memory (using the DOCA GPU sub-device) and coordinates with the network card (NIC) to interact with the GPU processor.
This library follows the architecture of a DOCA Core Context. It is recommended to read the following sections:
- BlueField DPU Scalable Function (for using SF on DPU) 
- DOCA GPUNetIO (for GPU datapath) 
Changes in 2.9.0
    
The following subsection(s) detail the doca_eth library updates in version 2.9.0.    
Added
ETH RXQ
- doca_error_t doca_eth_rxq_set_metadata_num(struct doca_eth_rxq *eth_rxq, uint8_t metadata_num)
- doca_error_t doca_eth_rxq_set_flow_tag(struct doca_eth_rxq *eth_rxq, uint8_t enable_flow_tag)
- doca_error_t doca_eth_rxq_set_rx_hash(struct doca_eth_rxq *eth_rxq, uint8_t enable_rx_hash)
- doca_error_t doca_eth_rxq_cap_get_max_metadata_num(const struct doca_devinfo *devinfo, uint8_t *max_metadata_num)
- doca_error_t doca_eth_rxq_event_batch_managed_recv_get_l3_ok_array(const struct doca_eth_rxq_event_batch_managed_recv *event_batch_managed_recv, const uint8_t **l3_ok_array)
- doca_error_t doca_eth_rxq_event_batch_managed_recv_get_l4_ok_array(const struct doca_eth_rxq_event_batch_managed_recv *event_batch_managed_recv, const uint8_t **l4_ok_array)
- doca_error_t doca_eth_rxq_task_recv_get_metadata_array(const struct doca_eth_rxq_task_recv *task_recv, const uint32_t **metadata_array)
- doca_error_t doca_eth_rxq_event_managed_recv_get_metadata_array(const struct doca_eth_rxq_event_managed_recv *event_managed_recv, const uint32_t **metadata_array)
- doca_error_t doca_eth_rxq_event_batch_managed_recv_get_metadata_array(const struct doca_eth_rxq_event_batch_managed_recv *event_batch_managed_recv, const uint32_t **metadata_array)
- doca_error_t doca_eth_rxq_task_recv_get_flow_tag(const struct doca_eth_rxq_task_recv *task_recv, uint32_t *flow_tag)
- doca_error_t doca_eth_rxq_event_managed_recv_get_flow_tag(const struct doca_eth_rxq_event_managed_recv *event_managed_recv, uint32_t *flow_tag)
- doca_error_t doca_eth_rxq_event_batch_managed_recv_get_flow_tag_array(const struct doca_eth_rxq_event_batch_managed_recv *event_batch_managed_recv, const uint32_t **flow_tag_array)
- doca_error_t doca_eth_rxq_task_recv_get_rx_hash(const struct doca_eth_rxq_task_recv *task_recv, uint32_t *rx_hash)
- doca_error_t doca_eth_rxq_event_managed_recv_get_rx_hash(const struct doca_eth_rxq_event_managed_recv *event_managed_recv, uint32_t *rx_hash)
- doca_error_t doca_eth_rxq_event_batch_managed_recv_get_rx_hash_array(const struct doca_eth_rxq_event_batch_managed_recv *event_batch_managed_recv, const uint32_t **rx_hash_array)
- #define doca_eth_rxq_event_batch_managed_recv_metadata_array_get_metadata(metadata_array, metadata_num, packet_index, metadata_index) metadata_array[packet_index * metadata_num + metadata_index]
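The indexing macro above implies that the batch metadata array is flat: all metadata values of packet 0 come first, then those of packet 1, and so on. The following is a minimal, self-contained sketch of that layout. It does not include the DOCA headers; METADATA_AT is a local stand-in that mirrors the macro's arithmetic (with extra parentheses), and copy_packet_metadata() is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in mirroring the arithmetic of the library macro:
 * metadata values are stored packet by packet in one flat array. */
#define METADATA_AT(metadata_array, metadata_num, packet_index, metadata_index) \
	((metadata_array)[(size_t)(packet_index) * (metadata_num) + (metadata_index)])

/* Hypothetical helper: copy the metadata values of one packet into out[],
 * which must have room for metadata_num entries. */
static void copy_packet_metadata(const uint32_t *metadata_array, uint8_t metadata_num,
				 uint16_t packet_index, uint32_t *out)
{
	assert(metadata_array != NULL && out != NULL);
	for (uint8_t i = 0; i < metadata_num; i++)
		out[i] = METADATA_AT(metadata_array, metadata_num, packet_index, i);
}
```

With two metadata values per packet, the values of packet 1 sit at flat indices 2 and 3.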
ETH TXQ
- doca_error_t doca_eth_txq_set_metadata_num(struct doca_eth_txq *eth_txq, uint8_t metadata_num)
- doca_error_t doca_eth_txq_cap_get_max_metadata_num(const struct doca_devinfo *devinfo, uint8_t *max_metadata_num)
- doca_error_t doca_eth_txq_get_flow_queue_id(struct doca_eth_txq *eth_txq, uint16_t *flow_queue_id)
- doca_error_t doca_eth_txq_task_send_get_metadata_array(struct doca_eth_txq_task_send *task_send, uint32_t **metadata_array)
- doca_error_t doca_eth_txq_task_lso_send_get_metadata_array(struct doca_eth_txq_task_lso_send *task_lso_send, uint32_t **metadata_array)
- doca_error_t doca_eth_txq_task_batch_send_get_metadata_array(struct doca_task_batch *task_batch_send, uint32_t **metadata_array)
- doca_error_t doca_eth_txq_task_batch_lso_send_get_metadata_array(struct doca_task_batch *task_batch_lso_send, uint32_t **metadata_array)
- void doca_eth_txq_task_send_set_ol_flags(struct doca_eth_txq_task_send *task_send, uint16_t ol_flags)
- void doca_eth_txq_task_lso_send_set_ol_flags(struct doca_eth_txq_task_lso_send *task_lso_send, uint16_t ol_flags)
- doca_error_t doca_eth_txq_task_batch_send_get_ol_flags_array(struct doca_task_batch *task_batch_send, uint16_t **ol_flags_array)
- doca_error_t doca_eth_txq_task_batch_lso_send_get_ol_flags_array(struct doca_task_batch *task_batch_lso_send, uint16_t **ol_flags_array)
- void doca_eth_txq_task_lso_send_set_mss(struct doca_eth_txq_task_lso_send *task_lso_send, uint16_t mss)
- doca_error_t doca_eth_txq_task_batch_lso_send_get_mss_array(struct doca_task_batch *task_batch_lso_send, uint16_t **mss_array)
- #define doca_eth_txq_task_batch_metadata_array_get_metadata(metadata_array, metadata_num, packet_index, metadata_index) metadata_array[packet_index * metadata_num + metadata_index]
Modified
ETH RXQ
- typedef void (*doca_eth_rxq_event_batch_managed_recv_handler_cb_t)(uint16_t events_number, union doca_data event_batch_user_data, doca_error_t status, struct doca_buf **pkt_array, uint8_t *l3_ok_array, uint8_t *l4_ok_array) → typedef void (*doca_eth_rxq_event_batch_managed_recv_handler_cb_t)(struct doca_eth_rxq_event_batch_managed_recv *event_batch_managed_recv, uint16_t events_number, union doca_data event_batch_user_data, doca_error_t status, struct doca_buf **pkt_array)
 
ETH TXQ
N/A
DOCA ETH based applications can run either on the Linux host machine or on the NVIDIA® BlueField® DPU target. The following is required:
- Applications should run with root privileges 
- Applications that want to retrieve timestamps in ETH RXQ should enable the REAL_TIME_CLOCK_ENABLE parameter using mlxconfig.
- Applications need to use DOCA Flow to forward incoming traffic to DOCA ETH RXQ's queue. See DOCA Flow and DOCA ETH RXQ samples for reference. 
Make sure the system has free huge pages for DOCA Flow.
DOCA ETH is comprised of two parts: DOCA ETH RXQ and DOCA ETH TXQ.
DOCA ETH RXQ
Operating Modes
DOCA ETH RXQ can operate in three modes, each exposing a slightly different control/datapath.
Regular Receive
This mode is supported only for CPU datapath.
In this mode, the received packet buffers are managed by the user. To receive a packet, the user should submit a receive task containing a doca_buf to write the packet into.
The application uses this mode if it wants to:
- Run on CPU 
- Manage the memory of received packet and the packet's exact place in memory 
- Forward the received packets to other DOCA libraries 
 
Cyclic Receive
This mode is supported only for GPU datapath.
In this mode, the library scatters packets to the packet buffer (supplied by the user as doca_mmap) in a cyclic manner. Packets acquired by the user may be overwritten by the library if not processed fast enough by the application.
In this mode, the user must provide DOCA ETH RXQ with a packet buffer to be managed by the library (see doca_eth_rxq_set_pkt_buf()). The buffer should be large enough to avoid packet loss (see doca_eth_rxq_estimate_packet_buf_size()).
The application uses this mode if:
- It wants to run on GPU 
- It has a deterministic packet processing time, where a packet is guaranteed to be processed before the library overwrites it with a new packet 
- It wants best performance 
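The overwrite behavior of Cyclic Receive can be pictured with a toy ring buffer in which the writer never waits for the reader. This is only a conceptual sketch, not the library's implementation: cyclic_ring, ring_write(), ring_read(), and RING_SLOTS are hypothetical names, and each slot stands in for one packet.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS 4u /* hypothetical number of packet slots in the cyclic buffer */

/* Toy model of a cyclic packet buffer: the writer never waits for the reader. */
struct cyclic_ring {
	uint32_t slot[RING_SLOTS]; /* each slot stands in for one packet */
	uint64_t written;          /* total number of packets written so far */
};

/* Writer side: always succeeds, overwriting the oldest slot once the ring is full. */
static void ring_write(struct cyclic_ring *r, uint32_t pkt)
{
	assert(r != 0);
	r->slot[r->written % RING_SLOTS] = pkt;
	r->written++;
}

/* Reader side: returns 1 if packet number seq is still intact,
 * 0 if it was already overwritten or has not been written yet. */
static int ring_read(const struct cyclic_ring *r, uint64_t seq, uint32_t *pkt)
{
	if (seq >= r->written || r->written - seq > RING_SLOTS)
		return 0;
	*pkt = r->slot[seq % RING_SLOTS];
	return 1;
}
```

A reader that falls more than RING_SLOTS packets behind the writer loses data, which is why this mode suits applications with a bounded processing time per packet.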
 
Managed Memory Pool Receive
This mode is supported only for CPU datapath.
In this mode, the library uses various optimizations to manage the packet buffers. Packets acquired by the user cannot be overwritten by the library unless explicitly freed by the application. Thus, if the application does not release the packet buffers fast enough, the library would run out of memory and packets would start dropping.
Unlike Cyclic Receive mode, the user can pass the packet to other libraries in DOCA with the guarantee that the packet is not overwritten while being processed by those libraries.
In this mode, the user must provide DOCA ETH RXQ with a packet buffer to be managed by the library (see doca_eth_rxq_set_pkt_buf()). The buffer should be large enough to avoid packet loss (see doca_eth_rxq_estimate_packet_buf_size()).
The application uses this mode if:
- It wants to run on CPU 
- It has a deterministic packet processing time, where a packet is guaranteed to be processed before the library runs out of memory and packets start dropping 
- It wants to forward the received packets to other DOCA libraries 
- It wants best performance 
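The difference from Cyclic Receive can be sketched with a toy buffer pool: a buffer handed to the application is never reused until the application frees it, so a slow application causes drops rather than overwrites. The names below (pkt_pool, pool_receive(), pool_free(), POOL_BUFS) are hypothetical and only model the ownership rule described above, not the library's internal optimizations.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_BUFS 3u /* hypothetical pool size, in packet buffers */

/* Toy model of a managed pool: a buffer stays owned by the application until freed. */
struct pkt_pool {
	uint8_t in_use[POOL_BUFS];
	uint32_t dropped; /* packets dropped because no buffer was free */
};

/* Receive side: returns a buffer index, or -1 (and counts a drop) if the pool is exhausted. */
static int pool_receive(struct pkt_pool *p)
{
	for (uint32_t i = 0; i < POOL_BUFS; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = 1;
			return (int)i;
		}
	}
	p->dropped++;
	return -1;
}

/* Application side: hand the buffer back so it can hold a new packet. */
static void pool_free(struct pkt_pool *p, int idx)
{
	assert(idx >= 0 && (uint32_t)idx < POOL_BUFS && p->in_use[idx]);
	p->in_use[idx] = 0;
}
```

Once all buffers are held by the application, the next packet is dropped; freeing any buffer makes reception possible again.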
 
Working with DOCA Flow
To route incoming packets to the desired DOCA ETH RXQ, applications need to use DOCA Flow. Applications need to do the following:
- Create and start DOCA Flow on the appropriate port (device) 
- Create pipes to route packets into 
- Get the queue ID of the queue (inside DOCA ETH RXQ) using doca_eth_rxq_get_flow_queue_id()
- Add an entry to a pipe which routes packets into the RX queue (using the queue ID we obtained) 
 
For more details see DOCA ETH RXQ samples and DOCA Flow.
DOCA ETH TXQ
Operating Modes
DOCA ETH TXQ can only operate in one mode.
Regular Send
For the CPU datapath, the user should submit a send task containing a doca_buf of the packet to send.
For information regarding the datapath on the GPU, see DOCA GPUNetIO.
 
Offloads
- Large Segment Offloading (LSO) – For TXQ, the hardware supports LSO on transmitted TCP packets over IPv4 and IPv6. LSO lets the software prepare a large TCP message for sending together with a header template, which the application provides to the library. The hardware splits the large TCP message into multiple TCP segments and, for each segment, updates the header template accordingly (see LSO Send Task).
- L3/L4 checksum offloading – For TXQ, the hardware supports calculation of checksum on transmitted packets and validation of received packet checksum. Checksum calculation is supported for TCP/UDP running over IPv4 and IPv6. (In case of tunneling, the hardware calculates the checksum of the outer header.) The hardware does not require any pseudo header checksum calculation, and the value placed in the TCP/UDP checksum field is ignored when performing the calculation. See doca_eth_txq_set_l3_chksum_offload()/doca_eth_txq_set_l4_chksum_offload().
- Metadata support – For RXQ, the hardware supports retrieving the metadata value collected during packet flow table processing (see doca_eth_rxq_set_metadata_num()). For TXQ, the hardware supports attaching a metadata value which is carried into transmit flow table processing (see doca_eth_txq_set_metadata_num()).
- Flow tag – For RXQ, the hardware supports retrieving the flow tag value which software (e.g., DOCA Flow) sets when creating a flow entry (see doca_eth_rxq_set_flow_tag()).
- RX hash – For RXQ, the hardware supports retrieving the RX hash value of the packet (see doca_eth_rxq_set_rx_hash()).
- Packet headroom/tailroom – For RXQ in Managed Memory Pool Receive mode, the hardware supports keeping headroom space (in the head section of the doca_buf) and tailroom space (in the tail section of the doca_buf) according to the sizes the user requested (see doca_eth_rxq_set_packet_headroom()/doca_eth_rxq_set_packet_tailroom()).
- Packet timestamp – For RXQ, the hardware supports retrieving the timestamp (time since epoch, in ns) at which the packet was received (see doca_eth_rxq_set_timestamp()).
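The segmentation described in the LSO item above comes down to simple arithmetic on the payload length and the MSS. The sketch below only illustrates that arithmetic, assuming the MSS counts payload bytes per segment. lso_segment_count() and lso_segment_len() are hypothetical helpers, not library functions, and the per-segment header update done by the hardware is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of segments needed to carry payload_len bytes with the given MSS. */
static size_t lso_segment_count(size_t payload_len, uint16_t mss)
{
	assert(mss > 0);
	return (payload_len + mss - 1) / mss; /* ceiling division */
}

/* Payload bytes carried by segment seg_index (0-based); 0 if out of range. */
static size_t lso_segment_len(size_t payload_len, uint16_t mss, size_t seg_index)
{
	size_t count = lso_segment_count(payload_len, mss);

	if (seg_index >= count)
		return 0;
	if (seg_index + 1 < count)
		return mss; /* every segment but the last is full */
	return payload_len - (count - 1) * (size_t)mss; /* last segment takes the remainder */
}
```

For example, a 4000-byte payload with an MSS of 1500 yields three segments carrying 1500, 1500, and 1000 bytes.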
Objects
- doca_mmap – In Cyclic Receive and Managed Memory Pool Receive modes, the user must configure DOCA ETH RXQ with a packet buffer, provided as a doca_mmap, to write the received packets into (see DOCA Core Memory Subsystem).
- doca_buf – In Regular Receive mode, the user must submit receive tasks that include a buffer, provided as a doca_buf, to write the received packet into. Likewise, in Regular Send mode, the user must submit send tasks that include the packet to send as a doca_buf (see DOCA Core Memory Subsystem).
To start using the library, the user must first go through a configuration phase as described in DOCA Core Context Configuration Phase.
This section describes how to configure and start the context to allow execution of tasks and retrieval of events.
DOCA ETH in GPU datapath does not need to be associated with a DOCA PE (since the datapath is not on the CPU).
Configurations
The context can be configured to match the application use case.
To find if a configuration is supported or the min/max value for it, refer to Device Support.
Mandatory Configurations
These configurations are mandatory and must be set by the application before attempting to start the context.
DOCA ETH RXQ
- At least one task/event/event_batch type must be configured. Refer to Tasks/Events/Event Batch for more information. 
- Max packet size (the maximum size of packet that can be received) must be provided at creation time of the DOCA ETH RXQ context 
- Max burst size (the maximum number of packets that the library can handle at the same time) must be provided at creation time of the DOCA ETH RXQ context 
- A device with appropriate support must be provided upon creation 
- When in Cyclic Receive or Managed Memory Pool Receive modes, a doca_mmap must be provided to write the received packets into (see doca_eth_rxq_set_pkt_buf())
- In case of a GPU datapath, a DOCA GPU sub-device must be provided using doca_ctx_set_datapath_on_gpu()
DOCA ETH TXQ
- At least one task/task_batch type must be configured. Refer to Tasks/Task Batch for more information. 
- Max burst size (the maximum number of packets that the library can handle at the same time) must be provided at creation time of the DOCA ETH TXQ context 
- A device with appropriate support must be provided on creation 
- In case of a GPU datapath, a DOCA GPU sub-device must be provided using doca_ctx_set_datapath_on_gpu()
Optional Configurations
The following configurations are optional. If they are not set, then a default value is used.
DOCA ETH RXQ
- RXQ mode – User can set the working mode using doca_eth_rxq_set_type(). Default type is Regular Receive.
- Max receive buffer list length – User can set the maximum length of buffer list/chain as a receive buffer using doca_eth_rxq_set_max_recv_buf_list_len(). Default value is 1.
- Metadata number – User can set the number of retrieved metadata per packet using doca_eth_rxq_set_metadata_num(). Default value is 0 (i.e., metadata is not retrieved).
- Flow tag – User can enable/disable retrieval of flow tag per packet using doca_eth_rxq_set_flow_tag(). Default value is 0 (i.e., flow tag is not retrieved).
- RX hash – User can enable/disable retrieval of RX hash per packet using doca_eth_rxq_set_rx_hash(). Default value is 0 (i.e., RX hash is not retrieved).
- Packet headroom – User can request a headroom size per packet using doca_eth_rxq_set_packet_headroom(). Default value is 0 (i.e., no headroom).
- Packet tailroom – User can request a tailroom size per packet using doca_eth_rxq_set_packet_tailroom(). Default value is 0 (i.e., no tailroom).
- Timestamp – User can enable/disable retrieval of timestamp per packet using doca_eth_rxq_set_timestamp(). Default value is 0 (i.e., timestamp is not retrieved).
DOCA ETH TXQ
- TXQ mode – User can set the working mode using doca_eth_txq_set_type(). The default type is Regular Send.
- Max send buffer list length – User can set the maximum length of buffer list/chain as a send buffer using doca_eth_txq_set_max_send_buf_list_len(). Default value is 1.
- L3/L4 offload checksum – User can enable/disable default L3/L4 checksum offloading using doca_eth_txq_set_l3_chksum_offload()/doca_eth_txq_set_l4_chksum_offload(). Disabled by default.
- MSS – User can set the default MSS (maximum segment size) value for LSO send task/task_batch using doca_eth_txq_set_mss(). Default value is 1500.
- Max LSO headers size – User can set the maximum LSO headers size for LSO send task/task_batch using doca_eth_txq_set_max_lso_header_size(). Default value is 74.
- Metadata number – User can set the number of attached metadata per packet using doca_eth_txq_set_metadata_num(). Default value is 0 (i.e., metadata is not attached).
Device Support
DOCA ETH requires a device to operate. For picking a device, see DOCA Core Device Discovery.
To check if a device supports a specific mode, use the type capabilities functions (see doca_eth_rxq_cap_is_type_supported() and doca_eth_txq_cap_is_type_supported()).
Devices can expose the following capabilities:
- The maximum burst size 
- The maximum buffer chain list (only for Regular Receive/Regular Send modes) 
- The maximum packet size (only for DOCA ETH RXQ) 
- L3/L4 checksum offloading capability (only for DOCA ETH TXQ) 
- Maximum LSO message/header size (only for DOCA ETH TXQ) 
- Wait-on-time offloading capability (only for DOCA ETH TXQ in GPU datapath) 
- Max metadata number capability (only for CPU datapath) 
Buffer Support
DOCA ETH supports buffers (doca_mmap or doca_buf) with the following features:
| Buffer Type | Send Task | LSO Send Task | Receive Task | Managed Receive Event | 
| Local mmap buffer | Yes | Yes | Yes | Yes | 
| Mmap from PCIe export buffer | Yes | Yes | Yes | Yes | 
| Mmap from RDMA export buffer | No | No | No | No | 
| Linked list buffer | Yes | Yes | Yes | No | 
For buffer support in the case of GPU datapath, see DOCA GPUNetIO Programming Guide.
This section describes execution on CPU (unless stated otherwise) using DOCA Core Progress Engine.
For information regarding GPU datapath, see DOCA GPUNetIO.
Tasks
DOCA ETH exposes asynchronous tasks that leverage the DPU hardware according to the DOCA Core architecture. See DOCA Core Task.
DOCA ETH RXQ
Receive Task
This task allows receiving packets from a doca_dev.
Task Configuration
| Description | API to Set the Configuration | API to Query Support | 
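| Enable the task | calling | |
| Number of tasks | | – |
| Max receive buffer list length | doca_eth_rxq_set_max_recv_buf_list_len() | |
| Maximal packet size | | |
| Number of metadata | doca_eth_rxq_set_metadata_num() | doca_eth_rxq_cap_get_max_metadata_num() |
| Enable flow tag | doca_eth_rxq_set_flow_tag() | – |
| Enable RX hash | doca_eth_rxq_set_rx_hash() | – |
| Enable timestamp | doca_eth_rxq_set_timestamp() | – |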
Task Input
Common input as described in DOCA Core Task.
| Name | Description | Notes | 
| Packet buffer | Buffer pointing to the memory where the received packet is to be written | The received packet is written to the tail segment, extending the data segment |
Task Output
Common output as described in DOCA Core Task.
Additionally:
| Name | Description | Notes |
| L3 checksum result | Value indicating whether the L3 checksum of the received packet is valid or not | Can be queried using |
| L4 checksum result | Value indicating whether the L4 checksum of the received packet is valid or not | Can be queried using |
| Metadata array | Array containing metadata values associated with the packet | Can be queried using doca_eth_rxq_task_recv_get_metadata_array() |
| Flow tag | Flow tag value associated with the packet | Can be queried using doca_eth_rxq_task_recv_get_flow_tag() |
| RX hash | RX hash value associated with the packet | Can be queried using doca_eth_rxq_task_recv_get_rx_hash() |
| Timestamp | Timestamp ([ns] since epoch) associated with the packet | Can be queried using |
Task Completion Success
After the task completes successfully, the following happens:
- The received packet is written to the packet buffer 
- The packet buffer data segment is extended to include the received packet 
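The effect of a successful receive on the packet buffer can be sketched with a toy buffer that tracks a data segment inside a larger memory region. This is a conceptual model only: toy_buf and its helpers are hypothetical stand-ins, not the doca_buf API. The packet lands in the free space after the data segment (the tail), and the data segment then grows to cover it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a packet buffer: one memory region with a data segment inside it. */
struct toy_buf {
	unsigned char *head; /* start of the buffer */
	size_t len;          /* total buffer length */
	size_t data_off;     /* offset of the data segment from head */
	size_t data_len;     /* length of the data segment */
};

/* Free space after the data segment (the tail segment). */
static size_t toy_buf_tail_len(const struct toy_buf *b)
{
	assert(b->data_off + b->data_len <= b->len);
	return b->len - b->data_off - b->data_len;
}

/* Write a received packet into the tail and extend the data segment over it.
 * Returns 0 on success, -1 if the packet does not fit (buffer left unmodified). */
static int toy_buf_receive(struct toy_buf *b, const void *pkt, size_t pkt_len)
{
	if (pkt_len > toy_buf_tail_len(b))
		return -1;
	memcpy(b->head + b->data_off + b->data_len, pkt, pkt_len);
	b->data_len += pkt_len;
	return 0;
}
```

Starting with an empty data segment at offset 4 of a 16-byte buffer, receiving a 6-byte packet leaves the data segment covering bytes 4 through 9 and a 6-byte tail.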
Task Completion Failure
If the task fails midway:
- The context enters stopping state 
- The packet buffer doca_buf object is not modified
- The packet buffer contents may be modified 
Task Limitations
All limitations described in DOCA Core Task.
Additionally:
- The operation is not atomic.
- Once the task has been submitted, the packet buffer should not be read from or written to.
DOCA ETH TXQ
Send Task
This task allows sending packets from a doca_dev.
Task Configuration
| Description | API to Set the Configuration | API to Query Support | 
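| Enable the task | calling | |
| Number of tasks | Note: The number of tasks can be expanded using | – |
| Max send buffer list length | doca_eth_txq_set_max_send_buf_list_len() | |
| Default L3/L4 offload checksum | doca_eth_txq_set_l3_chksum_offload()/doca_eth_txq_set_l4_chksum_offload() Info: Disabled by default. | |
| Number of metadata | doca_eth_txq_set_metadata_num() | doca_eth_txq_cap_get_max_metadata_num() |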
Task Input
Common input as described in DOCA Core Task.
| Name | Description | Notes | 
| Packet buffer | Buffer pointing to the packet to send | The sent packet is the memory in the data segment | 
| Metadata values | Metadata array containing metadata values to attach to the packet | Array length is the |
| Offload flags | Offload flags to associate with a specific packet (if the user wants to override default values) | Offload flags is a bit mask of values in |
Task Output
Common output as described in DOCA Core Task.
Task Completion Success
    
The task finishing successfully does not guarantee that the packet has been transmitted onto the wire. It only signifies that the packet has successfully entered the device's TX hardware and that the packet buffer doca_buf is no longer in the library's ownership and it can be reused by the application.    
Task Completion Failure
If the task fails midway:
- The context enters stopping state 
- The packet buffer doca_buf object is not modified
Task Limitations
- The operation is not atomic 
- Once the task has been submitted, the packet buffer should not be written to 
- Other limitations are described in DOCA Core Task 
LSO Send Task
This task allows sending "large" packets (larger than MTU) from a doca_dev (hardware splits the packet into several packets smaller than the MTU and sends them).
Task Configuration
| Description | API to Set the Configuration | API to Query Support | 
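| Enable the task | calling | |
| Number of tasks | Note: The number of tasks can be expanded using | – |
| Max send buffer list length | doca_eth_txq_set_max_send_buf_list_len() | |
| Default L3/L4 offload checksum | doca_eth_txq_set_l3_chksum_offload()/doca_eth_txq_set_l4_chksum_offload() Info: Disabled by default. | |
| Default MSS | doca_eth_txq_set_mss() | – |
| Max LSO headers size | doca_eth_txq_set_max_lso_header_size() | |
| Number of metadata | doca_eth_txq_set_metadata_num() | doca_eth_txq_cap_get_max_metadata_num() |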
Task Input
Common input as described in DOCA Core Task.
| Name | Description | Notes | 
| Packet payload buffer | Buffer that points to the "large" packet's payload (does not include headers) to send | The sent packet is the memory in the data segment | 
| Packet headers buffer | Gather list that, when combined, includes the "large" packet's headers to send | See |
| Metadata values | Metadata array containing metadata values to attach to the packet | Array length is the |
| Offload flags | Offload flags to associate with a specific packet (if the user wants to override default values) | Offload flags is a bit mask of values in |
| MSS | MSS to associate with LSO packet (if users want to override default values) | – | 
Task Output
Common output as described in DOCA Core Task.
Task Completion Success
The task finishing successfully does not guarantee that the packet has been transmitted onto the wire. It only means that the packet has successfully entered the device's TX hardware and that the packet payload buffer and the packet headers buffer are no longer in the library's ownership and can be reused by the application.
Task Completion Failure
If the task fails midway:
- The context enters stopping state 
- The packet payload buffer doca_buf object and the packet headers buffer doca_gather_list are not modified
Task Limitations
- The operation is not atomic 
- Once the task has been submitted, the packet payload buffer and the packet headers buffer should not be written to 
- Other limitations are described in DOCA Core Task 
Events
DOCA ETH exposes asynchronous events to notify about changes that happen asynchronously, according to the DOCA Core architecture. See DOCA Core Event.
In addition to the common events described in DOCA Core Event, DOCA ETH exposes the following events:
DOCA ETH RXQ
Managed Receive Event
This event allows receiving packets from a doca_dev (without requiring the application to manage the memory the packets are written to).
Event Configuration
| Description | API to Set the Configuration | API to Query Support | 
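| Register to the event | | |
| Number of metadata | doca_eth_rxq_set_metadata_num() | doca_eth_rxq_cap_get_max_metadata_num() |
| Enable flow tag | doca_eth_rxq_set_flow_tag() | – |
| Enable RX hash | doca_eth_rxq_set_rx_hash() | – |
| Enable timestamp | doca_eth_rxq_set_timestamp() | – |
| Packet headroom size | doca_eth_rxq_set_packet_headroom() | |
| Packet tailroom size | doca_eth_rxq_set_packet_tailroom() | |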
Event Trigger Condition
The event is triggered every time a packet is received.
Event Success Handler
The success callback (provided in the event registration) is invoked and the user is expected to perform the following:
- Use the pkt parameter to process the received packet
- Use event_user_data to get the application context
- Query L3/L4 checksum results of the packet
- Query the metadata array of metadata values associated with the packet
- Query the flow tag value associated with the packet
- Query the RX hash value associated with the packet
- Query the timestamp associated with the packet
- Free the pkt (doca_buf object) and return it to the library
Note: Not freeing the pkt may lead to packet loss.
Event Failure Handler
The failure callback (provided in the event registration) is invoked, and the following happens:
- The context enters stopping state 
- The pkt parameter becomes NULL
- The event_user_data parameter contains the value provided by the application when registering the event
DOCA ETH TXQ
Error Send Packet
This event is relevant when running DOCA ETH on GPU datapath (see DOCA GPUNetIO). It allows detecting failure in sending packets.
Event Configuration
| Description | API to Set the Configuration | API to Query Support | 
| Register to the event | | Always supported |
Event Trigger Condition
The event is triggered when sending a packet fails.
Event Handler
The callback (provided in the event registration) is invoked and the user can:
- Get the position (index) of the packet that TXQ failed to send 
Notify Send Packet
This event is relevant when running DOCA ETH on GPU datapath (see DOCA GPUNetIO). It notifies the user every time a packet is sent successfully.
Event Configuration
| Description | API to Set the Configuration | API to Query Support | 
| Register to the event | | Always supported |
Event Trigger Condition
The event is triggered when a packet is sent successfully.
Event Handler
The callback (provided in the event registration) is invoked and the user can:
- Get the position (index) of the packet that was sent
- Get the timestamp of when the packet was sent
Task Batch
DOCA ETH exposes asynchronous task batches that leverage the BlueField Platform hardware according to the DOCA Core architecture.
DOCA ETH RXQ
There are currently no task batches in ETH RXQ.
DOCA ETH TXQ
Send Task Batch
This is an extended task batch for Send Task which allows batched sending of packets from a doca_dev.
Task Configuration
| Description | API to Set the Configuration | API to Query Support | 
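| Enable the task batch | calling | |
| Number of task batches | Info: Number of task batches can be expanded using | – |
| Max number of tasks per task batch | | – |
| Max send buffer list length | doca_eth_txq_set_max_send_buf_list_len() Info: Default value is 1. | |
| Default L3/L4 offload checksum | doca_eth_txq_set_l3_chksum_offload()/doca_eth_txq_set_l4_chksum_offload() Info: Disabled by default. | |
| Number of metadata | doca_eth_txq_set_metadata_num() | doca_eth_txq_cap_get_max_metadata_num() |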
Task Input
| Name | Description | Notes | 
| Tasks number | Number of send tasks "behind" the task batch | This number equals the number of packets to send | 
| Batch user data | User data associated for the task batch | – | 
| Packets array | Pointer to an array of buffers pointing at the packets to send per task | The sent packet is the memory in the data segment of each buffer | 
| User data array | Pointer to an array of user data per task | – | 
| Metadata array | Pointer to an array of metadata per task | Array length is  Info: See |
| Offload flags array | Pointer to an array of offload flags per task (if the user wants to override default values) | Offload flags is a bit mask of values in |
Task Output
| Name | Description | 
| Status array | Pointer to an array of statuses per task of the finished task batch | 
Task Completion Success
A task batch is complete if all the send tasks finished successfully and all the packets entered the device's TX hardware. All packets in the "Packet array" are now in the ownership of the user.
Task Completion Failure
If a task batch fails, then one (or more) of the tasks associated with the task batch failed. The user can look at "Status array" to see which task/packet caused the failure.
Also, the following behavior is expected:
- The context enters stopping state 
- The packets' doca_buf objects are not modified
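Locating the failed packet from the "Status array" is a linear scan. The sketch below assumes, for illustration only, that a status of 0 means success. toy_status and first_failed_task() are hypothetical names, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-task status type; 0 means success. */
enum toy_status { TOY_SUCCESS = 0, TOY_ERROR = 1 };

/* Returns the index of the first failed task, or -1 if every task succeeded. */
static int first_failed_task(const enum toy_status *status_array, uint16_t tasks_num)
{
	assert(status_array != NULL || tasks_num == 0);
	for (uint16_t i = 0; i < tasks_num; i++) {
		if (status_array[i] != TOY_SUCCESS)
			return (int)i;
	}
	return -1;
}
```

The returned index is also the index of the corresponding packet in the "Packets array", since both arrays hold one entry per task.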
Task Limitations
In addition to all the Send Task Limitations:
- Task batch completion occurs only when all the tasks are completed (no partial completion) 
LSO Send Task Batch
This is an extended task batch for LSO Send Task which allows batched sending of LSO packets from a doca_dev.
Task Configuration
| Description | API to Set the Configuration | API to Query Support | 
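| Enable the task batch | Calling | |
| Number of task batches | Info: Number of task batches can be expanded using | – |
| Max number of tasks per task batch | | – |
| Max send buffer list length | doca_eth_txq_set_max_send_buf_list_len() Info: Default value is 1. | |
| Default L3/L4 offload checksum | doca_eth_txq_set_l3_chksum_offload()/doca_eth_txq_set_l4_chksum_offload() Info: Disabled by default. | |
| Default MSS | doca_eth_txq_set_mss() Info: Default value is 1500. | – |
| Max LSO headers size | doca_eth_txq_set_max_lso_header_size() | |
| Number of metadata | doca_eth_txq_set_metadata_num() | doca_eth_txq_cap_get_max_metadata_num() |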
Task Input
| Name | Description | Notes | 
| Tasks number | Number of send tasks "behind" the task batch | This number equals the number of packets to send | 
| Batch user data | User data associated for the task batch | – | 
| Packets payload array | Pointer to an array of buffers pointing at the "large" packet's payload to send per task | The sent packet payload is the memory in the data segment of each buffer | 
| Packets headers array | Pointer to an array of gather lists, each of which, when combined, assembles a "large" packet's headers to send per task | See |
| User data array | Pointer to an array of user data per task | – |
| Metadata array | Pointer to an array of metadata per task | Array length is  Info: See |
| Offload flags array | Pointer to an array of offload flags per task (if the user wants to override default values) | Offload flags is a bit mask of values in |
| MSS array | Pointer to an array of MSS per task (if users want to override default values) | – | 
Task Output
| Name | Description | 
| Status array | Pointer to an array of status per task of the finished task batch | 
Task Completion Success
A task batch is complete if all the LSO send tasks finished successfully and all the packets entered the device's TX hardware. All packet payloads in "Packets payload array" and packet headers in "Packets headers array" are now in the ownership of the user.
Task Completion Failure
If a task batch fails, then one (or more) of the tasks associated with the task batch failed. The user can look at the "Status array" to see which task/packet caused the failure.
Also, the following behavior is expected:
- The context enters stopping state 
- The packet payloads (doca_buf objects) are not modified
- The packet headers (doca_gather_list objects) are not modified
Task Limitations
In addition to all the LSO Send Task Limitations:
- Task batch completion happens only when all the tasks are completed (no partial completion) 
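The failure handling described above amounts to scanning the per-task status array once the batch completes. The following is a minimal, library-independent sketch of that scan. The type sketch_status_t and the function sketch_collect_failed() are hypothetical stand-ins introduced for illustration; they are not part of the DOCA ETH API, and the real status type and callback signature should be taken from the library headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's per-task status type. */
typedef enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERROR_IO,
    SKETCH_ERROR_INVALID_VALUE
} sketch_status_t;

/*
 * Collect the indices of failed tasks from a task batch's status array.
 * Writes at most max_failed indices to failed_idx and returns the total
 * number of failed tasks found (which may exceed max_failed).
 */
size_t sketch_collect_failed(const sketch_status_t *status_array, size_t tasks_num,
                             size_t *failed_idx, size_t max_failed)
{
    size_t failed = 0;

    assert(status_array != NULL || tasks_num == 0);
    for (size_t i = 0; i < tasks_num; i++) {
        if (status_array[i] == SKETCH_SUCCESS)
            continue;
        /* Entry i of the status array describes task (packet) i. */
        if (failed_idx != NULL && failed < max_failed)
            failed_idx[failed] = i;
        failed++;
    }
    return failed;
}
```

Because the batch completes only as a whole, a scan like this would run once per batch completion rather than once per packet.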
Event Batch
DOCA ETH exposes asynchronous event batches to notify about changes that happen asynchronously.
DOCA ETH RXQ
Managed Receive Event Batch
This is an extended event batch for Managed Receive Event, which allows receiving packets from a doca_dev (without requiring the application to manage the memory the packets are written to).
Event Configuration
| Description | API to Set the Configuration | API to Query Support | 
| Register to the event batch | Calling | | 
| Max events number: Equal to the maximum number of completed events per event batch completion | | – | 
| Min events number: Equal to the minimum number of completed events per event batch completion | | – | 
| Number of metadata | | | 
| Enable flow tag | | – | 
| Enable RX hash | | – | 
| Enable timestamp | | – | 
| Packet headroom size | | | 
| Packet tailroom size | | | 
Event Trigger Condition
The event batch is triggered each time a number of packets between "Min events number" and "Max events number" is received.
Event Batch Success Handler
The success callback (provided in the event batch registration) is invoked and the user is expected to perform the following:
- Identify the number of received packets using events_number.
- Use the pkt_array parameter to process the received packets.
- Use event_batch_user_data to get the application context.
- Query the L3/L4 checksum results of the packets using l3_ok_array and l4_ok_array.
- Query the metadata values associated with each packet using metadata_array.
- Query the flow tag value associated with each packet using flow_tag_array.
- Query the RX hash value associated with each packet using rx_hash_array.
- Query the timestamp associated with each packet using timestamp_array.
- Free the buffers in pkt_array (each a doca_buf object) and return them to the library. This can be done in two ways:
  - Iterating over the buffers in pkt_array and freeing them using doca_buf_dec_refcount().
  - Freeing all the buffers in pkt_array together (gives better performance) using doca_eth_rxq_event_batch_managed_recv_pkt_array_free().
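The handler steps above rely on parallel arrays: entry i of each per-packet array describes packet i. The following is a minimal sketch of that pattern using mock types only. struct sketch_pkt, struct sketch_app_ctx, sketch_pkt_array_free(), and sketch_on_batch_success() are hypothetical names that stand in for doca_buf, the application context, the bulk free call, and the success callback; the real callback signature is defined by the library and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a received packet buffer (not the real doca_buf). */
struct sketch_pkt {
    size_t len;    /* packet length in bytes */
    int refcount;  /* 1 while the application holds the buffer */
};

/* Hypothetical application context carried in the event batch user data. */
struct sketch_app_ctx {
    uint64_t good_pkts;   /* packets whose L3 and L4 checksums are both valid */
    uint64_t good_bytes;  /* total bytes of those packets */
    uint64_t bad_pkts;    /* packets that failed either checksum */
};

/* Stand-in for releasing the whole packet array in one call. */
static void sketch_pkt_array_free(struct sketch_pkt **pkt_array, size_t events_number)
{
    for (size_t i = 0; i < events_number; i++)
        pkt_array[i]->refcount--;
}

/* Success handler: walk the parallel arrays, then return all buffers. */
void sketch_on_batch_success(size_t events_number, void *event_batch_user_data,
                             struct sketch_pkt **pkt_array,
                             const uint8_t *l3_ok_array, const uint8_t *l4_ok_array)
{
    struct sketch_app_ctx *app = (struct sketch_app_ctx *)event_batch_user_data;

    assert(app != NULL && pkt_array != NULL);
    for (size_t i = 0; i < events_number; i++) {
        if (l3_ok_array[i] && l4_ok_array[i]) {
            app->good_pkts++;
            app->good_bytes += pkt_array[i]->len;
        } else {
            app->bad_pkts++;
        }
    }
    /* Buffers must go back to the library once processing is done. */
    sketch_pkt_array_free(pkt_array, events_number);
}
```

The sketch frees the whole array in one call after processing, mirroring the bulk free option, which the text notes gives better performance than freeing buffer by buffer.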
 
Event Batch Failure Handler
The failure callback (provided in the event batch registration) is invoked, and the following happens:
- The context enters stopping state 
- The pkt_array parameter is NULL
- The l3_ok_array parameter is NULL
- The l4_ok_array parameter is NULL
- The event_batch_user_data parameter contains the value provided by the application when registering the event
DOCA ETH TXQ
There are no event batches in ETH TXQ at the moment.
The DOCA ETH library follows the Context state machine as described in DOCA Core Context State Machine.
The following sections describe how to reach each state and what is allowed in it.
Idle
In this state, the application is expected to either:
- Destroy the context 
- Start the context 
Allowed operations:
- Configuring the context according to Configurations 
- Starting the context 
It is possible to reach this state as follows:
| Previous State | Transition Action | 
| None | Creating the context | 
| Running | Calling stop after: 
 | 
| Stopping | Calling progress until: 
 | 
Starting
This state cannot be reached.
Running
In this state, the application is expected to:
- Allocate and submit tasks 
- Call progress to complete tasks and/or receive events 
Allowed operations:
- Allocate previously configured task 
- Submit a task 
- Call doca_eth_rxq_get_flow_queue_id() to connect the RX queue to DOCA Flow
- Call stop 
It is possible to reach this state as follows:
| Previous State | Transition Action | 
| Idle | Call start after configuration | 
Stopping
In this state, the application is expected to:
- Call progress to complete all inflight tasks (tasks complete with failure) 
- Free any completed tasks 
- Free doca_buf objects returned by the Managed Receive Event callback
Allowed operations:
- Call progress 
It is possible to reach this state as follows:
| Previous State | Transition Action | 
| Running | Call progress and a fatal error occurs | 
| Running | Call stop without either: 
 | 
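The transitions between Idle, Running, and Stopping can be modeled as a small state machine. The following is a hedged sketch, not the library's implementation: struct sketch_ctx and the sketch_* functions are hypothetical, and the model assumes that stop leads directly to Idle only when no tasks are outstanding, and to Stopping otherwise. It also simplifies progress so that each call retires exactly one outstanding task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SKETCH_IDLE, SKETCH_RUNNING, SKETCH_STOPPING } sketch_state_t;

/* Hypothetical model of a context: its state plus outstanding work. */
struct sketch_ctx {
    sketch_state_t state;
    size_t inflight;  /* tasks submitted but not yet completed and freed */
};

/* Idle -> Running. Fails if the context is not idle. */
bool sketch_start(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->state != SKETCH_IDLE)
        return false;
    ctx->state = SKETCH_RUNNING;
    return true;
}

/* Submitting a task is allowed only while running. */
bool sketch_submit(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->state != SKETCH_RUNNING)
        return false;
    ctx->inflight++;
    return true;
}

/* Running -> Idle when nothing is outstanding, else Running -> Stopping. */
bool sketch_stop(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->state != SKETCH_RUNNING)
        return false;
    ctx->state = (ctx->inflight == 0) ? SKETCH_IDLE : SKETCH_STOPPING;
    return true;
}

/* Retire one outstanding task; Stopping -> Idle once everything is drained. */
void sketch_progress(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->state == SKETCH_IDLE)
        return;
    if (ctx->inflight > 0)
        ctx->inflight--;
    if (ctx->state == SKETCH_STOPPING && ctx->inflight == 0)
        ctx->state = SKETCH_IDLE;
}
```

In this model, an application that calls stop with work still in flight must keep calling progress until the context reports Idle before it can reconfigure or destroy it.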
In addition to the CPU datapath (mentioned in Execution Phase), DOCA ETH supports running on GPU datapath. This frees the CPU from datapath management and enables low-latency GPU processing of network traffic.
To export the handles, the application should call doca_ctx_set_datapath_on_gpu() before doca_ctx_start() to program the library to set up a GPU operated context.
To get the GPU context handle, the user should call doca_eth_rxq_get_gpu_handle(), which returns a pointer to a handle in the GPU memory space.
The datapath cannot be managed concurrently by the GPU and the CPU.
The DOCA ETH context is configured on the CPU and then exported to the GPU.
 
The following example shows the expected flow for a GPU-managed datapath with packets being scattered to GPU memory (for doca_eth_rxq):
- Create a DOCA GPU device handler. 
- Create doca_eth_rxq and configure its parameters.
- Set the datapath of the context to GPU. 
- Start the context. 
- Get a GPU handle of the context. 
 
For more information regarding the GPU datapath see DOCA GPUNetIO.
This section describes DOCA ETH samples based on the DOCA ETH library.
The samples illustrate how to use the DOCA ETH API to do the following:
- Send "regular" packets (smaller than MTU) using DOCA ETH TXQ 
- Send "large" packets (larger than MTU) using DOCA ETH TXQ 
- Receive packets using DOCA ETH RXQ in Regular Receive mode 
- Receive packets using DOCA ETH RXQ in Managed Memory Pool Receive mode 
All the DOCA samples described in this section are governed under the BSD-3 software license agreement.
Running the Samples
- Refer to the following documents:
  - DOCA Installation Guide for Linux for details on how to install BlueField-related software. 
  - DOCA Troubleshooting for any issue you may encounter with the installation, compilation, or execution of DOCA samples. 
 
- To build a given sample (in this case, eth_txq_send_ethernet_frames), run the following commands. If you downloaded the sample from GitHub, update the path in the first line to reflect the location of the sample file:
    cd /opt/mellanox/doca/samples/doca_eth/eth_txq_send_ethernet_frames
    meson /tmp/build
    ninja -C /tmp/build
  The binary eth_txq_send_ethernet_frames is created under /tmp/build/.
- Sample (e.g., eth_txq_send_ethernet_frames) usage:
    Usage: doca_eth_txq_send_ethernet_frames [DOCA Flags] [Program Flags]

    DOCA Flags:
      -h, --help              Print a help synopsis
      -v, --version           Print program version information
      -l, --log-level         Set the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      --sdk-log-level         Set the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      -j, --json <path>       Parse all command flags from an input json file

    Program Flags:
      -d, --device            IB device name - default: mlx5_0
      -m, --mac-addr          Destination MAC address to associate with the ethernet frames - default: FF:FF:FF:FF:FF:FF
- For additional information per sample, use the -h option:
    /tmp/build/<sample_name> -h
Samples
- The following samples are for the CPU datapath. For GPU datapath samples, see DOCA GPUNetIO. 
- These samples are also available on GitHub. 
ETH TXQ Send Ethernet Frames
This sample illustrates how to send a "regular" packet (smaller than MTU) using DOCA ETH TXQ.
The sample logic includes:
- Locating DOCA device. 
- Initializing the required DOCA Core structures. 
- Populating DOCA memory map with one buffer for the packet's data. 
- Writing the packet's content into the allocated buffer. 
- Allocating elements from DOCA buffer inventory for the buffer. 
- Initializing and configuring DOCA ETH TXQ context. 
- Starting the DOCA ETH TXQ context. 
- Allocating DOCA ETH TXQ send task. 
- Submitting DOCA ETH TXQ send task into the progress engine. 
- Retrieving DOCA ETH TXQ send task from the progress engine. 
- Handling the completed task using the provided callback. 
- Stopping the DOCA ETH TXQ context. 
- Destroying DOCA ETH TXQ context. 
- Destroying all DOCA Core structures. 
Reference:
- /opt/mellanox/doca/samples/doca_eth/eth_txq_send_ethernet_frames/eth_txq_send_ethernet_frames_sample.c
- /opt/mellanox/doca/samples/doca_eth/eth_txq_send_ethernet_frames/eth_txq_send_ethernet_frames_main.c
- /opt/mellanox/doca/samples/doca_eth/eth_txq_send_ethernet_frames/meson.build
ETH TXQ LSO Send Ethernet Frames
This sample illustrates how to send a "large" packet (larger than MTU) using DOCA ETH TXQ.
The sample logic includes:
- Locating DOCA device. 
- Initializing the required DOCA Core structures. 
- Populating DOCA memory map with one buffer for the packet's payload. 
- Writing the packet's payload into the allocated buffer. 
- Allocating elements from DOCA Buffer inventory for the buffer. 
- Allocating a DOCA gather list consisting of one node for the packet's headers. 
- Writing the packet's headers into the allocated gather list node. 
- Initializing and configuring DOCA ETH TXQ context. 
- Starting the DOCA ETH TXQ context. 
- Allocating DOCA ETH TXQ LSO send task. 
- Submitting DOCA ETH TXQ LSO send task into the progress engine. 
- Retrieving DOCA ETH TXQ LSO send task from the progress engine. 
- Handling the completed task using the provided callback. 
- Stopping the DOCA ETH TXQ context. 
- Destroying DOCA ETH TXQ context. 
- Destroying all DOCA Core structures. 
Reference:
- /opt/mellanox/doca/samples/doca_eth/eth_txq_lso_send_ethernet_frames/eth_txq_lso_send_ethernet_frames_sample.c
- /opt/mellanox/doca/samples/doca_eth/eth_txq_lso_send_ethernet_frames/eth_txq_lso_send_ethernet_frames_main.c
- /opt/mellanox/doca/samples/doca_eth/eth_txq_lso_send_ethernet_frames/meson.build
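To make the relationship between a "large" packet and the MSS setting concrete, the following sketch computes how many wire packets a single LSO send would produce. It assumes standard large-send-offload behavior, where the payload is cut into segments of at most MSS bytes and the headers are replicated in front of each segment; this is an assumption about the offload in general, not a statement taken from the sample code. The function names are hypothetical, and a zero-length payload is treated as producing no segments purely for the sake of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of segments needed to carry payload_len bytes at most mss bytes each. */
size_t sketch_lso_segment_count(size_t payload_len, uint16_t mss)
{
    assert(mss > 0);
    /* Integer ceiling division written to avoid overflow in payload_len + mss. */
    return payload_len / mss + (payload_len % mss != 0 ? 1 : 0);
}

/* Payload bytes carried by segment idx (0-based); only the last may be short. */
size_t sketch_lso_segment_len(size_t payload_len, uint16_t mss, size_t idx)
{
    size_t count = sketch_lso_segment_count(payload_len, mss);

    assert(idx < count);
    if (idx + 1 < count)
        return mss;
    return payload_len - (size_t)mss * (count - 1);
}
```

With the default MSS of 1500 mentioned in the configuration table, a 4000-byte payload would be split into three segments under these assumptions: two of 1500 bytes and a final one of 1000 bytes.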
ETH TXQ Batch Send Ethernet Frames
This sample illustrates how to send a batch of "regular" packets (smaller than MTU) using DOCA ETH TXQ.
The sample logic includes:
- Locating DOCA device. 
- Initializing the required DOCA Core structures. 
- Populating DOCA memory map with multiple buffers, each representing a packet's data. 
- Writing the packets' content into the allocated buffers. 
- Allocating elements from DOCA Buffer inventory for the buffers. 
- Initializing and configuring DOCA ETH TXQ context. 
- Starting the DOCA ETH TXQ context. 
- Allocating DOCA ETH TXQ send task batch. 
- Copying all buffers' pointers to the task batch's pkt_array. 
- Submitting DOCA ETH TXQ send task batch into the progress engine. 
- Retrieving DOCA ETH TXQ send task batch from the progress engine. 
- Handling the completed task batch using the provided callback. 
- Stopping the DOCA ETH TXQ context. 
- Destroying DOCA ETH TXQ context. 
- Destroying all DOCA Core structures. 
Reference:
- /opt/mellanox/doca/samples/doca_eth/eth_txq_batch_send_ethernet_frames/eth_txq_batch_send_ethernet_frames_sample.c
- /opt/mellanox/doca/samples/doca_eth/eth_txq_batch_send_ethernet_frames/eth_txq_batch_send_ethernet_frames_main.c
- /opt/mellanox/doca/samples/doca_eth/eth_txq_batch_send_ethernet_frames/meson.build
ETH TXQ Batch LSO Send Ethernet Frames
This sample illustrates how to send a batch of "large" packets (larger than MTU) using DOCA ETH TXQ.
The sample logic includes:
- Locating DOCA device. 
- Initializing the required DOCA Core structures. 
- Populating DOCA memory map with multiple buffers, each representing a packet's payload. 
- Writing the packets' payload into the allocated buffers. 
- Allocating elements from DOCA Buffer inventory for the buffers. 
- Allocating DOCA gather lists each consisting of one node for the packet's headers. 
- Writing the packets' headers into the allocated gather list nodes. 
- Initializing and configuring DOCA ETH TXQ context. 
- Starting the DOCA ETH TXQ context. 
- Allocating DOCA ETH TXQ LSO send task batch. 
- Copying all buffers' pointers to the task batch's pkt_payload_array. 
- Copying all gather lists' pointers to the task batch's headers_array. 
- Submitting DOCA ETH TXQ LSO send task batch into the progress engine. 
- Retrieving DOCA ETH TXQ LSO send task batch from the progress engine. 
- Handling the completed task batch using the provided callback. 
- Stopping the DOCA ETH TXQ context. 
- Destroying DOCA ETH TXQ context. 
- Destroying all DOCA Core structures. 
Reference:
- /opt/mellanox/doca/samples/doca_eth/eth_txq_batch_lso_send_ethernet_frames/eth_txq_batch_lso_send_ethernet_frames_sample.c
- /opt/mellanox/doca/samples/doca_eth/eth_txq_batch_lso_send_ethernet_frames/eth_txq_batch_lso_send_ethernet_frames_main.c
- /opt/mellanox/doca/samples/doca_eth/eth_txq_batch_lso_send_ethernet_frames/meson.build
ETH RXQ Regular Receive
This sample illustrates how to receive a packet using DOCA ETH RXQ in Regular Receive mode.
The sample logic includes:
- Locating DOCA device. 
- Initializing the required DOCA Core structures. 
- Populating DOCA memory map with one buffer for the packet's data. 
- Allocating an element from DOCA Buffer inventory for each buffer. 
- Initializing DOCA Flow. 
- Initializing and configuring DOCA ETH RXQ context. 
- Starting the DOCA ETH RXQ context. 
- Starting DOCA Flow. 
- Creating a pipe connecting to DOCA ETH RXQ's RX queue. 
- Allocating DOCA ETH RXQ receive task. 
- Submitting DOCA ETH RXQ receive task into the progress engine. 
- Retrieving DOCA ETH RXQ receive task from the progress engine. 
- Handling the completed task using the provided callback. 
- Stopping DOCA Flow. 
- Stopping the DOCA ETH RXQ context. 
- Destroying DOCA ETH RXQ context. 
- Destroying DOCA Flow. 
- Destroying all DOCA Core structures. 
Reference:
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_regular_receive/eth_rxq_regular_receive_sample.c
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_regular_receive/eth_rxq_regular_receive_main.c
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_regular_receive/meson.build
ETH RXQ Managed Receive
This sample illustrates how to receive packets using DOCA ETH RXQ in Managed Memory Pool Receive mode.
The sample logic includes:
- Locating DOCA device. 
- Initializing the required DOCA Core structures. 
- Calculating the required size of the buffer to receive the packets from DOCA ETH RXQ. 
- Populating DOCA memory map with a packet buffer. 
- Initializing DOCA Flow. 
- Initializing and configuring DOCA ETH RXQ context. 
- Registering DOCA ETH RXQ managed receive event. 
- Starting the DOCA ETH RXQ context. 
- Starting DOCA Flow. 
- Creating a pipe connecting to DOCA ETH RXQ's RX queue. 
- Retrieving DOCA ETH RXQ managed receive events from the progress engine. 
- Handling the completed events using the provided callback. 
- Stopping DOCA Flow. 
- Stopping the DOCA ETH RXQ context. 
- Destroying DOCA ETH RXQ context. 
- Destroying DOCA Flow. 
- Destroying all DOCA Core structures. 
Reference:
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_managed_mempool_receive/eth_rxq_managed_mempool_receive_sample.c
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_managed_mempool_receive/eth_rxq_managed_mempool_receive_main.c
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_managed_mempool_receive/meson.build
ETH RXQ Batch Managed Receive
This sample illustrates how to receive batches of packets using DOCA ETH RXQ in Managed Memory Pool Receive mode.
The sample logic includes:
- Locating DOCA device. 
- Initializing the required DOCA Core structures. 
- Calculating the required size of the buffer to receive the packets from DOCA ETH RXQ. 
- Populating DOCA memory map with a packet buffer. 
- Initializing DOCA Flow. 
- Initializing and configuring DOCA ETH RXQ context. 
- Registering DOCA ETH RXQ managed receive event batch. 
- Starting the DOCA ETH RXQ context. 
- Starting DOCA Flow. 
- Creating a pipe connecting to DOCA ETH RXQ's RX queue. 
- Retrieving DOCA ETH RXQ managed receive event batches from the progress engine. 
- Handling the completed event batches using the provided callback. 
- Stopping DOCA Flow. 
- Stopping the DOCA ETH RXQ context. 
- Destroying DOCA ETH RXQ context. 
- Destroying DOCA Flow. 
- Destroying all DOCA Core structures. 
Reference:
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_batch_managed_mempool_receive/eth_rxq_batch_managed_mempool_receive_sample.c
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_batch_managed_mempool_receive/eth_rxq_batch_managed_mempool_receive_main.c
- /opt/mellanox/doca/samples/doca_eth/eth_rxq_batch_managed_mempool_receive/meson.build