DOCA Comm Channel – New

This guide provides instructions on building and developing applications that require communication channels between the x86 host and the Arm cores on the DPU.

DOCA Comm Channel provides a communication channel between client applications on the host and servers on the DPU.

Benefits of using Comm Channel:

  • Security – Comm Channel is isolated from the network

  • Network independent – the state of the Comm Channel does not depend on the state and configuration of the network

  • Ease of use

Comm Channel provides two different data path APIs:

  • Basic Comm Channel send/receive for control messages

  • High bandwidth, low latency, zero-copy, multi-producer, multi-consumer API

The following table summarizes the differences between the two data path APIs:

| Features | Basic Send/Receive | Fast Path (using doca_cc_consumer/doca_cc_producer) |
|---|---|---|
| Zero-copy | No | Yes |
| Takes network bandwidth | Yes | No |
| Isolated from network | Yes | Yes |
| Max msg size | Fixed | 1GB or more (depends on hardware cap) |
| Multi-threaded | Safe for a single thread | Allows creation of consumers/producers per thread |
| Multi-consumer | No | Yes |
| Multi-producer | Yes – allows multiple clients per server | Yes – allows multiple producers/consumers per connection |
| Requires doca_mmap and doca_buf | No | Yes |

This library follows the architecture of a DOCA Core Context. It is recommended to read the following sections first:

DOCA Comm Channel based applications can run either on the host machine or on the NVIDIA® BlueField® DPU target.

Sending messages between the host and the DPU requires the DPU to be configured in DPU mode, as described in NVIDIA BlueField DPU Modes of Operation.

For basic DOCA Comm Channel send and receive, the following configuration is required:

  • doca_cc_server context must run on the DPU

  • doca_cc_client context must run on the host machine

Warning

Producer and consumer objects can run on both the host and DPU. However, there must be a valid client/server connection already established on the channel.

DOCA Comm Channel comprises four DOCA Core Contexts. All DOCA Comm Channel contexts leverage the DOCA Core architecture to expose asynchronous tasks/events that are offloaded to hardware.

A doca_cc_server context runs on the DPU and listens for incoming connections from the host side. Such host side connections are initiated by a doca_cc_client context.

Servers can receive connections from multiple clients in parallel; however, a client can connect to only one server. An established 1-to-1 connection between a client and a server is represented by a doca_cc_connection.

Once an established connection exists between a client and a server, the doca_cc_producer and doca_cc_consumer contexts can be used to run fast path channels.

The following diagram provides examples of how the contexts are used:

[Figure: DOCA Comm Channel client/server contexts]

| Objects | Description | DPU/Host | Scope |
|---|---|---|---|
| doca_cc_server | Allows applications on the DPU to listen on a specific server name and accept new incoming connections from the host | DPU only | Per host PCIe function (doca_dev + doca_dev_rep) |
| doca_cc_client | Allows client applications to connect to a specific server name on the DPU | Host only | Per host PCIe function (doca_dev) |
| doca_cc_connection | A connection handle created on the client side or the server side when a new connection is established. This handle is used to send/receive messages or to create doca_cc_consumers and doca_cc_producers. | DPU and host | Per client/server pair |
| doca_cc_producer | A handle for a FIFO-like send queue that provides a zero-copy API to send messages to a specific doca_cc_consumer on the same doca_cc_connection. Multiple doca_cc_producers can be created per doca_cc_connection. | DPU and host | Per doca_cc_connection |
| doca_cc_consumer | A handle for a FIFO-like receive queue that provides a zero-copy API to receive messages from a doca_cc_producer | DPU and host | Per doca_cc_connection |


Security Considerations

  • DOCA Comm Channel guarantees:

    • The client is connected to the server by providing the exact server name on the client side

    • Only clients on the PF/VF/SF represented by the doca_dev_rep provided upon server creation can connect to the server

    • The connection requests and data path are isolated from the network

  • DOCA Comm Channel does not provide security at the application level:

    • It is up to the user to implement application-level security and verify the identity of the client application

    • A server handles applications from a single PF/VF/SF. If a server application detects a compromised client application, the server app should consider all clients (from that PF/VF/SF) compromised.

Initialization Flow

doca_cc_server Initialization Flow

  1. A doca_cc_server is created on a specific doca_dev and a specific doca_dev_rep.

  2. A doca_cc_server must have a unique name per doca_dev/doca_dev_rep (i.e., two servers on the same doca_dev and doca_dev_rep cannot have the same name).

  3. Once doca_ctx_start() is called, the doca_cc_server can start receiving new connection requests.

  4. For the doca_cc_server to process new connection requests and messages, the user must periodically call doca_pe_progress().

  5. When a new connection request arrives, doca_cc_server calls the connection request handler function and passes a doca_cc_connection object.

The server can now send and receive messages on the connection represented by doca_cc_connection.
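The naming rule in step 2 can be illustrated with a small stand-alone model. This is only a sketch and does not call the DOCA library: the registry, its key layout, and the PCIe address strings in the usage note are hypothetical, chosen to show that a name may be reused on a different doca_dev/doca_dev_rep pair but not on the same one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SERVERS 8

/* Hypothetical key: device address, representor address, server name. */
struct server_key {
    const char *dev;
    const char *rep;
    const char *name;
};

struct server_registry {
    struct server_key entries[MAX_SERVERS];
    size_t count;
};

/* Returns 0 on success, -1 if the name is already taken on this
 * dev/rep pair or if the registry is full. */
int registry_add(struct server_registry *r, const char *dev,
                 const char *rep, const char *name)
{
    assert(r != NULL && dev != NULL && rep != NULL && name != NULL);

    for (size_t i = 0; i < r->count; i++) {
        const struct server_key *k = &r->entries[i];
        if (strcmp(k->dev, dev) == 0 && strcmp(k->rep, rep) == 0 &&
            strcmp(k->name, name) == 0)
            return -1; /* duplicate name on the same dev/rep pair */
    }
    if (r->count == MAX_SERVERS)
        return -1;

    r->entries[r->count].dev = dev;
    r->entries[r->count].rep = rep;
    r->entries[r->count].name = name;
    r->count++;
    return 0;
}
```

In this model, adding "svc" twice for the pair ("03:00.0", "b1:00.0") fails the second time, while adding "svc" for ("03:00.0", "b1:00.1") succeeds.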

doca_cc_client Initialization Flow

  1. A doca_cc_client is created on a specific doca_dev and targets a specific doca_cc_server.

  2. Once doca_ctx_start() is called, doca_cc_client asynchronously tries to connect to the server.

  3. To establish the connection and receive messages, the user must periodically call doca_pe_progress().

  4. When the connection is established, doca_cc_client invokes the state change callback to indicate the transition to the "RUNNING" state.

The client can now send and receive messages.
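The start-then-poll pattern in the steps above can be sketched with a mock in place of the real context. Everything here is hypothetical (mock_client, mock_start, mock_progress, wait_until_running, count_running); in a real application the corresponding roles are played by doca_ctx_start(), doca_pe_progress(), and the state change callback. The mock simply pretends the connection completes after a fixed number of progress calls.

```c
#include <assert.h>
#include <stddef.h>

enum mock_state { MOCK_IDLE, MOCK_STARTING, MOCK_RUNNING };

typedef void (*mock_state_cb)(void *user, enum mock_state prev,
                              enum mock_state next);

struct mock_client {
    enum mock_state state;
    unsigned polls_left; /* progress calls needed before "connected" */
    mock_state_cb cb;    /* optional state change callback */
    void *user;
};

void mock_set_state(struct mock_client *c, enum mock_state next)
{
    enum mock_state prev = c->state;
    c->state = next;
    if (c->cb != NULL)
        c->cb(c->user, prev, next);
}

/* Stands in for starting the context: the connection attempt begins. */
void mock_start(struct mock_client *c)
{
    assert(c != NULL && c->state == MOCK_IDLE);
    mock_set_state(c, MOCK_STARTING);
}

/* Stands in for one progress call: advances the pending connection. */
void mock_progress(struct mock_client *c)
{
    if (c->state != MOCK_STARTING)
        return;
    if (c->polls_left > 0)
        c->polls_left--;
    if (c->polls_left == 0)
        mock_set_state(c, MOCK_RUNNING);
}

/* Polls until RUNNING; returns the number of polls used, or -1. */
int wait_until_running(struct mock_client *c, int max_polls)
{
    for (int i = 1; i <= max_polls; i++) {
        mock_progress(c);
        if (c->state == MOCK_RUNNING)
            return i;
    }
    return -1;
}

/* Example callback: counts transitions into RUNNING. */
void count_running(void *user, enum mock_state prev, enum mock_state next)
{
    (void)prev;
    if (next == MOCK_RUNNING)
        (*(int *)user)++;
}
```

The bounded loop in wait_until_running is a design choice for the sketch: a real application typically keeps calling progress as part of its main loop rather than giving up after a fixed number of polls.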

The following diagram describes the initialization of a basic client/server connection on DOCA Comm Channel:

[Figure: client/server connection initialization]

doca_cc_consumer Initialization Flow

  1. A doca_cc_consumer is created on a specific doca_cc_connection.

  2. doca_pe_progress() must be periodically called on the client/server PE to allow registration of the consumer.

  3. After the doca_cc_consumer moves to "RUNNING" state:

    1. doca_cc_consumer notifies its existence to the peer (invoking a new consumer event).

    2. The application can start posting receive tasks.

    3. A doca_cc_producer on the peer side can start sending messages to that consumer.

The initialization is described in the following diagram:

[Figure: consumer creation flow]

To start using the library, users must go through a configuration phase as described in DOCA Core Context Configuration Phase.

This section describes how to configure and start the context to allow execution of tasks and retrieval of events.

Configurations

The context can be configured to match the application use case.

To find out if a certain configuration is supported, or what the min/max value for it is, refer to Device Support.

Mandatory Configurations

These configurations are mandatory and must be set by the application before attempting to start the context:

For a basic send/receive client or server:

  • A send task callback

  • A receive event callback

  • A device with appropriate support must be provided on creation

  • A valid server name must be provided on creation (for clients this is the server to connect to)

  • A connection event callback (server only)

For fast path producer or consumer:

  • A device with appropriate support must be provided on creation

  • An established client to server connection must be provided on creation

  • A doca_mmap with PCIe read/write permissions of where data should be received must be provided on creation (consumer only)

  • A post receive task callback (consumer only)

  • A send task callback (producer only)

  • A new consumer callback (triggered upon creation/destruction of a remote consumer)

Optional Configurations

The following configurations are optional. If they are not set, a default value is used:

For a basic send/receive client or server:

  • doca_cc_(server|client)_set_max_msg_size – set the maximum size of a message that can be sent. If set, the value must match between server and client.

  • doca_cc_(server|client)_set_recv_queue_size – set the size of the queue to receive new messages on
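The two constraints that follow from these options (the maximum message size must match on both sides, and a message must not exceed it) can be expressed as simple checks. The sketch below is a stand-alone model, not DOCA code: cc_cfg_model, cfg_sizes_match, and msg_len_ok are hypothetical names, and the sizes in the usage note are arbitrary illustrative values, not library defaults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the two optional settings. */
struct cc_cfg_model {
    uint32_t max_msg_size;    /* maximum message size in bytes */
    uint32_t recv_queue_size; /* receive queue size */
};

/* The maximum message size must be identical on server and client.
 * The receive queue size is local and may differ. */
bool cfg_sizes_match(const struct cc_cfg_model *server,
                     const struct cc_cfg_model *client)
{
    assert(server != NULL && client != NULL);
    return server->max_msg_size == client->max_msg_size;
}

/* A message is acceptable only if it fits the configured maximum. */
bool msg_len_ok(const struct cc_cfg_model *cfg, size_t len)
{
    assert(cfg != NULL);
    return len <= cfg->max_msg_size;
}
```

For example, a server configured with {4080, 16} and a client with {4080, 8} match, because only the message size is compared; a 4081-byte message is rejected by msg_len_ok on either side.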

Device Support

DOCA Comm Channel requires a device to operate. For instructions on picking a device, see DOCA Core Device Discovery.

As device capabilities are subject to change (see DOCA Core Device Support), it is recommended to select a device using the following methods:

  • For basic client and server:

    • doca_cc_cap_server_is_supported

    • doca_cc_cap_client_is_supported

  • For extended fast path functionality:

    • doca_cc_producer_cap_is_supported

    • doca_cc_consumer_cap_is_supported

Devices may differ in the following capabilities:

  • The maximum server name length

  • The maximum message size

  • The maximum receive queue length

  • The maximum number of clients that can connect to a server

  • The maximum number of send tasks or post receive tasks

  • The maximum buffer length for fast path

Buffer Support

Basic send and receive between a client and server does not use DOCA buffers and so has no restrictions on buffer type.

  • For producers, supplied buffers need only be from a local mmap

  • For consumers, post receive buffers are required to be from a PCIe export mmap

Warning

Chained buffers are not supported in DOCA Comm Channel.


This section describes execution on CPU using DOCA Core Progress Engine. For additional execution environments, refer to section "Alternative Datapath Options".

Tasks

DOCA Comm Channel exposes asynchronous tasks that leverage the DPU hardware according to DOCA Core architecture.

Control Channel Send Task

This task allows the sending of messages between connected client and server objects.

Configuration

| Description | API to Set the Configuration | API to Query Support |
|---|---|---|
| Number of tasks | doca_cc_server_task_send_set_conf, doca_cc_client_task_send_set_conf | doca_cc_cap_get_max_send_tasks |
| Maximal Message Size | doca_cc_server_set_max_msg_size, doca_cc_client_set_max_msg_size | doca_cc_server_get_max_msg_size, doca_cc_client_get_max_msg_size |


Input

Common input as described in DOCA Core Task.

| Name | Description | Notes |
|---|---|---|
| Peer | Established client/server connection | |
| Message | Data string to send to remote client/server | There is no requirement for the message to be in DOCA mmap registered memory |
| Length | Number of bytes in the message | Must not exceed configured max size |


Output

Common output as described in DOCA Core Task.

Task Successful Completion

After the task completes successfully:

  • The message is delivered to the connection's remote client/server

  • A receive event is triggered on the remote side

Task Failed Completion

If the task fails midway:

  • The context may enter stopping state if a fatal error occurs

  • The message is not delivered to the remote side

Limitations

  • The operation is not atomic

  • Once the task has been submitted, the message should not be updated

  • Other limitations are described in DOCA Core Task

Consumer Post Receive Task

This task allows consumer objects to publish buffers which are available for remote producers to write to.

Configuration

| Description | API to Set the Configuration | API to Query Support |
|---|---|---|
| Enable the task | doca_cc_consumer_task_post_recv_set_conf | doca_cc_consumer_cap_is_supported |
| Number of tasks | doca_cc_consumer_task_post_recv_set_conf | doca_cc_consumer_cap_get_max_num_tasks |
| Maximal Buffer Size | | doca_cc_consumer_cap_get_max_buf_size |


Input

Common input as described in DOCA Core Task.

| Name | Description | Notes |
|---|---|---|
| Buffer | Buffer that the consumer can receive data on | Data is appended to the tail of the buffer |

Note

The buffer's doca_mmap must have the DOCA_ACCESS_FLAG_PCI_READ_WRITE flag set.


Output

Common output as described in DOCA Core Task.

Task Successful Completion

The task only completes once a producer has written to the advertised buffer, not when the post receive has completed.

Upon successful completion, the buffer contains the data written by the producer and its length is updated appropriately.
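What "appended to the tail" and "length is updated" mean for the posted buffer can be modeled without the library. The sketch below assumes the usual head / data segment / tail layout of a buffer; buf_model and buf_append_tail are hypothetical names that only mimic the bookkeeping and are not doca_buf calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a buffer with a data segment and a tail. */
struct buf_model {
    char *head;       /* start of the underlying memory */
    size_t total_len; /* size of the underlying memory */
    size_t data_off;  /* offset of the data segment from head */
    size_t data_len;  /* bytes currently in the data segment */
};

/* Writes len bytes right after the current data and extends the data
 * length. Returns 0 on success, -1 if the tail space is too small. */
int buf_append_tail(struct buf_model *b, const void *src, size_t len)
{
    assert(b != NULL && src != NULL);

    size_t tail_off = b->data_off + b->data_len;
    if (tail_off > b->total_len || len > b->total_len - tail_off)
        return -1;

    memcpy(b->head + tail_off, src, len);
    b->data_len += len;
    return 0;
}
```

With an empty 8-byte buffer, appending "abc" and then "de" leaves "abcde" in the data segment with a length of 5; a further 4-byte append fails and leaves the length unchanged.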

Task Failed Completion

Task failure occurs if a buffer has not been successfully posted to receive data.

If the task fails midway:

  • The context may enter stopping state if a fatal error occurs

  • Producers are not aware of the buffer and therefore do not write to it

Limitations

  • The operation is not atomic

  • Once the task has been submitted, the buffer should not be read/written to

  • Buffer must come from memory with PCIe read/write access

  • Chained buffer lists are not supported

  • Other limitations are described in DOCA Core Task

Producer Send Task

This task allows producer objects to copy buffers for use by remote consumers.

Configuration

| Description | API to Set the Configuration | API to Query Support |
|---|---|---|
| Enable the task | doca_cc_producer_task_send_set_conf | doca_cc_producer_cap_is_supported |
| Number of tasks | doca_cc_producer_task_send_set_conf | doca_cc_producer_cap_get_max_num_tasks |
| Maximal Buffer Size | | doca_cc_producer_cap_get_max_buf_size |


Input

Common input as described in DOCA Core Task.

| Name | Description | Notes |
|---|---|---|
| Buffer | Buffer that should be copied to a consumer | Only the data residing in the data segment is copied |
| Consumer ID | Identifier for the target consumer to write to | Active consumers and their IDs are advertised through consumer events |


Output

Common output as described in DOCA Core Task.

Task Successful Completion

After the task is completed successfully:

  • The data is copied from the buffer to the next free buffer posted by the given consumer

  • Consumers process buffers from a given producer in the order they are sent

Task Failed Completion

If the task fails midway:

  • The context may enter stopping state if a fatal error occurs

  • The source and destination doca_buf objects are not modified

  • The destination memory may be modified

Limitations

  • The operation is not atomic

  • Once the task has been submitted, the buffer should not be read/written to

  • The buffer length should not be greater than consumer post receive buffers (an invalid value is returned otherwise)

  • All limitations described in DOCA Core Task
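The interaction between posted receive buffers and producer sends can be pictured as a FIFO of buffers on the consumer side. The following is a stand-alone sketch under stated assumptions, not the library's implementation: consumer_model, posted_buf, consumer_post_recv, and producer_send are hypothetical, the queue depth is arbitrary, and the model leaves a posted buffer in place when a send is rejected for being too long (the text above does not say what the library does with it).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_DEPTH 4

struct posted_buf {
    char *addr;      /* memory the consumer made available */
    size_t capacity; /* size of that memory */
    size_t data_len; /* bytes written by a producer */
};

struct consumer_model {
    struct posted_buf *slots[QUEUE_DEPTH];
    size_t head;  /* index of the oldest posted buffer */
    size_t count; /* number of buffers currently posted */
};

/* Models a post receive: the buffer becomes visible to producers. */
int consumer_post_recv(struct consumer_model *c, struct posted_buf *b)
{
    assert(c != NULL && b != NULL);
    if (c->count == QUEUE_DEPTH)
        return -1;
    b->data_len = 0;
    c->slots[(c->head + c->count) % QUEUE_DEPTH] = b;
    c->count++;
    return 0;
}

/* Models a producer send: copies into the next free posted buffer.
 * Returns the filled buffer, or NULL if nothing is posted or the
 * message is longer than the posted buffer. */
struct posted_buf *producer_send(struct consumer_model *c,
                                 const void *msg, size_t len)
{
    assert(c != NULL && msg != NULL);
    if (c->count == 0)
        return NULL;

    struct posted_buf *b = c->slots[c->head];
    if (len > b->capacity)
        return NULL;

    memcpy(b->addr, msg, len);
    b->data_len = len;
    c->head = (c->head + 1) % QUEUE_DEPTH;
    c->count--;
    return b;
}
```

Posting two 8-byte buffers and then sending "abc" and "de" fills them in posting order; a 10-byte send is rejected, and a send with no posted buffer left returns NULL.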

Events

DOCA Comm Channel exposes asynchronous events to notify about changes that happen unexpectedly, according to the DOCA Core architecture. See DOCA Core Event.

Common events as described in DOCA Core Event.

Control Channel Receive Event

This event triggers whenever a remote client/server has sent a message to the local client/server object.

Configuration

| Description | API to Set the Configuration | API to Query Support |
|---|---|---|
| Register to the event | doca_cc_server_event_msg_recv_register, doca_cc_client_event_msg_recv_register | |


Trigger Condition

The event is triggered when a remote message is received on any currently active connection associated with the client or server.

Output

Upon event detection, the registered callback is triggered, passing the following parameters:

  • A pointer to the message data

    Note

    The data is only valid in the context of the callback.

  • The length in bytes of the message

  • The active connection on which the message was received
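Because the message data is only valid inside the callback, a handler that needs the message later must copy it out before returning. The sketch below shows that copy in isolation. msg_store and store_message are hypothetical helpers, not part of the DOCA API, and the 64-byte capacity is an arbitrary value for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STORE_CAPACITY 64

/* Application-owned storage that outlives the callback. */
struct msg_store {
    unsigned char data[STORE_CAPACITY];
    size_t len;
};

/* Intended to be called from a receive callback: copies the transient
 * message into the store. Returns 0 on success, -1 if it does not fit. */
int store_message(struct msg_store *s, const void *msg, size_t len)
{
    assert(s != NULL && msg != NULL);
    if (len > STORE_CAPACITY)
        return -1;
    memcpy(s->data, msg, len);
    s->len = len;
    return 0;
}
```

After store_message returns, the original message memory can be reused or overwritten without affecting the stored copy.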

Connection Status Changed Event (Server Only)

This event provides asynchronous updates on the state of any connections associated with a server.

Warning

A client object can only connect to a single server, so its connection state can be tracked through its doca_ctx state and the generic doca_ctx_set_state_changed_cb function.

Configuration

| Description | API to Set the Configuration | API to Query Support |
|---|---|---|
| Register to the event | doca_cc_server_event_connection_register | |


Trigger Condition

The event is triggered when a new connection is established or an existing connection is disconnected on a server.

Output

Separate callbacks are registered for connection or disconnection events with the appropriate one triggered based on the specific event.

Both callbacks contain a Boolean indicating if the connection or disconnection was successful.

Consumer Event

This event indicates that a new consumer object has been created or an existing consumer object has been destroyed.

Configuration

| Description | API to Set the Configuration | API to Query Support |
|---|---|---|
| Register to the event | doca_cc_server_event_consumer_register, doca_cc_client_event_consumer_register | |


Trigger Condition

The event is triggered whenever a new consumer is created or an existing consumer is destroyed on the remote side of an established Comm Channel connection.

Output

A separate callback is invoked for the creation and for the destruction of a consumer.

Callback parameters include:

  • The established Comm Channel connection on which the consumer is connected (on the remote side)

  • The ID of the consumer (a unique value per Comm Channel connection)
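A producer needs to know which remote consumer IDs are currently valid, so an application typically keeps a small per-connection table that is updated from the two consumer callbacks. The following is a minimal stand-alone sketch of that bookkeeping. consumer_table, on_new_consumer, on_expired_consumer, and consumer_is_active are hypothetical names, and the table size is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REMOTE_CONSUMERS 8

/* Remote consumer IDs known on one connection. */
struct consumer_table {
    uint32_t ids[MAX_REMOTE_CONSUMERS];
    size_t count;
};

bool consumer_is_active(const struct consumer_table *t, uint32_t id)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->ids[i] == id)
            return true;
    }
    return false;
}

/* Intended to be called from the "new consumer" callback. */
int on_new_consumer(struct consumer_table *t, uint32_t id)
{
    if (consumer_is_active(t, id) || t->count == MAX_REMOTE_CONSUMERS)
        return -1;
    t->ids[t->count] = id;
    t->count++;
    return 0;
}

/* Intended to be called from the "consumer destroyed" callback. */
int on_expired_consumer(struct consumer_table *t, uint32_t id)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->ids[i] == id) {
            t->count--;
            t->ids[i] = t->ids[t->count]; /* fill the gap with the last ID */
            return 0;
        }
    }
    return -1;
}
```

Since consumer IDs are unique only per connection, one such table would be kept for each established connection.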

The DOCA Comm Channel library follows the Context state machine described in DOCA Core Context State Machine.

The following section describes how to move to each state and what is allowed in each state.

Idle

In this state it is expected that the application either:

  • Destroys the context

  • Starts the context

Allowed operations:

  • Configuring the context according to Configurations

  • Starting the context

It is possible to reach this state as follows:

| Previous State | Transition Action |
|---|---|
| None | Create the context |
| Running | Call stop after making sure all tasks have been freed |
| Stopping | Call progress until all tasks are completed and freed |


Starting

In this state it is expected that the application will:

  • Call progress to allow transition to next state (e.g., when a connection attempt completes)

Allowed operations:

  • Call progress

It is possible to reach this state as follows:

| Previous State | Transition Action |
|---|---|
| Idle | Call start after configuration |


Running

In this state, it is expected that the application:

  • Allocates and submits tasks

  • Calls progress to complete tasks and/or receive events

Allowed operations:

  • Allocate a previously configured task

  • Submit an allocated task

  • Call stop

It is possible to reach this state as follows:

| Previous State | Transition Action |
|---|---|
| Idle | Call start after configuration |
| Starting | Call progress until context state transitions |


Stopping

In this state, it is expected that the application will:

  • Free any completed tasks

Allowed operations:

  • Allocate previously configured task

  • Submit an allocated task

  • Call stop

It is possible to reach this state as follows:

| Previous State | Transition Action |
|---|---|
| Running | Call progress and fatal error occurs |
| Running | Call stop without freeing all tasks |
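The transitions listed in the four state tables can be collected into one small transition function. This is a model for illustration only, not DOCA code: ctx_model, ctx_action, and ctx_next are hypothetical, the async_start flag stands for whether start passes through Starting (as when a connection attempt must complete first), and each progress call in Stopping is assumed to complete and free exactly one pending task. Actions not listed in the tables leave the state unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ctx_state { CTX_IDLE, CTX_STARTING, CTX_RUNNING, CTX_STOPPING };
enum ctx_action { ACT_START, ACT_PROGRESS, ACT_STOP, ACT_FATAL_ERROR };

struct ctx_model {
    enum ctx_state state;
    bool async_start;       /* true: start goes through STARTING */
    unsigned pending_tasks; /* tasks not yet completed and freed */
};

/* Applies one action and returns the resulting state. */
enum ctx_state ctx_next(struct ctx_model *c, enum ctx_action a)
{
    assert(c != NULL);

    switch (c->state) {
    case CTX_IDLE:
        if (a == ACT_START)
            c->state = c->async_start ? CTX_STARTING : CTX_RUNNING;
        break;
    case CTX_STARTING:
        if (a == ACT_PROGRESS)
            c->state = CTX_RUNNING; /* model: one call completes start */
        break;
    case CTX_RUNNING:
        if (a == ACT_FATAL_ERROR)
            c->state = CTX_STOPPING;
        else if (a == ACT_STOP)
            c->state = (c->pending_tasks > 0) ? CTX_STOPPING : CTX_IDLE;
        break;
    case CTX_STOPPING:
        if (a == ACT_PROGRESS) {
            if (c->pending_tasks > 0)
                c->pending_tasks--; /* model: one task drained per call */
            if (c->pending_tasks == 0)
                c->state = CTX_IDLE;
        }
        break;
    }
    return c->state;
}
```

For example, a context with async_start set goes Idle, Starting, Running on start followed by progress; stopping it with two pending tasks moves it to Stopping, and two further progress calls bring it back to Idle.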


DOCA Comm Channel can only be run on a CPU datapath. See Execution Phase for details.

This section describes DOCA Comm Channel samples based on the DOCA Comm Channel library.

The samples illustrate how to use the DOCA Comm Channel API to do the following:

  • Set up a client/server between host and DPU and use it to send text messages

  • Configure fast path producers and consumers, and send messages between them

Running the Samples

  1. Refer to the following documents:

  2. To build a given sample:

    cd /opt/mellanox/doca/samples/doca_cc/<sample_name>
    meson /tmp/build
    ninja -C /tmp/build

    The binary doca_<sample_name> is created under /tmp/build/.

  3. All DOCA Comm Channel samples accept the same input arguments:

    Samples: doca_cc_ctrl_path_server, doca_cc_ctrl_path_client, doca_cc_data_path_high_speed_server, doca_cc_data_path_high_speed_client

    | Argument | Description |
    |---|---|
    | -p, --pci-addr | DOCA Comm Channel device PCIe address |
    | -r, --rep-pci | DOCA Comm Channel device representor PCIe address (required only on DPU) |
    | -t, --text | Text to be sent to the other side of channel (overwrites default) |

  4. For additional information per sample, use the -h option:

    /tmp/build/doca_<sample_name> -h

Samples

DOCA Comm Channel Control Path Client/Server

Warning

doca_cc_ctrl_path_server must be run on the DPU side and started before doca_cc_ctrl_path_client is started on the host.

This sample sets up a client/server connection between the host and DPU.

The connection is used to pass two messages, the first sent by the client when the connection is established and the second by the server on receipt of the client's message.

The sample logic includes:

  1. Locating DOCA device.

  2. Initializing the core DOCA structures.

  3. Initializing and configuring client/server contexts.

  4. Registering tasks and events for sending/receiving messages and tracking connection changes.

  5. Allocating and submitting tasks for sending control path messages.

  6. Handling event completions for receiving messages.

  7. Stopping and destroying client/server objects.

References:

  • /opt/mellanox/doca/samples/doca_cc/cc_ctrl_path_client/cc_ctrl_path_client_main.c

  • /opt/mellanox/doca/samples/doca_cc/cc_ctrl_path_client/cc_ctrl_path_client_sample.c

  • /opt/mellanox/doca/samples/doca_cc/cc_ctrl_path_server/cc_ctrl_path_server_main.c

  • /opt/mellanox/doca/samples/doca_cc/cc_ctrl_path_server/cc_ctrl_path_server_sample.c

  • /opt/mellanox/doca/samples/doca_cc/cc_ctrl_path_common.c

  • /opt/mellanox/doca/samples/doca_cc/cc_ctrl_path_common.h

DOCA Comm Channel Data Path Client/Server

Warning

doca_cc_data_path_high_speed_server should be run on the DPU side and should be started before doca_cc_data_path_high_speed_client is started on the host.

This sample sets up a client/server connection between the host and DPU.

The connection is used to create a producer and consumer on both sides and pass a message across the two fast path connections.

The sample logic includes:

  1. Locating DOCA device.

  2. Initializing the core DOCA structures.

  3. Initializing and configuring client/server contexts.

  4. Initializing and configuring producer/consumer contexts on top of an established connection.

  5. Submitting post receive tasks for population by producers.

  6. Submitting send tasks from producers to write to consumers.

  7. Stopping and destroying producer/consumer objects.

  8. Stopping and destroying client/server objects.

References:

  • /opt/mellanox/doca/samples/doca_cc/cc_data_path_high_speed_client/cc_data_path_high_speed_client_main.c

  • /opt/mellanox/doca/samples/doca_cc/cc_data_path_high_speed_client/cc_data_path_high_speed_client_sample.c

  • /opt/mellanox/doca/samples/doca_cc/cc_data_path_high_speed_server/cc_data_path_high_speed_server_main.c

  • /opt/mellanox/doca/samples/doca_cc/cc_data_path_high_speed_server/cc_data_path_high_speed_server_sample.c

  • /opt/mellanox/doca/samples/doca_cc/cc_data_path_high_speed_common.c

  • /opt/mellanox/doca/samples/doca_cc/cc_data_path_high_speed_common.h

© Copyright 2023, NVIDIA. Last updated on Feb 9, 2024.