DOCA Storage Target RDMA Application Guide

Introduction

The doca_storage_target_rdma application provides a simple storage target backed by volatile memory.

System Design

The doca_storage_target_rdma application performs the following key functions:

  • Provide a storage region for users of the storage use case to interact with.

  • Perform reads and writes of memory using doca_rdma.

To accomplish these tasks, the application establishes a TCP server and listens for an incoming connection from a storage service.
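The listening step can be illustrated with plain POSIX sockets. This is a minimal sketch under stated assumptions, not the application's actual implementation: the helper name open_listen_socket is hypothetical, and the choice to bind on all interfaces with a backlog of 1 is an assumption made for illustration.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>

// Hypothetical helper (not taken from the application source): open a TCP
// socket bound to the given port on all interfaces and start listening.
// Returns the listening file descriptor, or -1 on failure.
int open_listen_socket(std::uint16_t port)
{
    int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    // Allow the port to be reused quickly after a restart.
    int enable = 1;
    if (::setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &enable, sizeof(enable)) != 0) {
        ::close(fd);
        return -1;
    }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    // Assumption: a backlog of 1 is enough because a single storage service connects.
    if (::bind(fd, reinterpret_cast<sockaddr const *>(&addr), sizeof(addr)) != 0 ||
        ::listen(fd, 1) != 0) {
        ::close(fd);
        return -1;
    }
    return fd;
}
```

A caller would then block in accept() on the returned descriptor until the storage service connects, and use the accepted socket for the control messages described below.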

Application architecture

The doca_storage_target_rdma application is divided into two main functional areas:

  • Control-time and shared resources

  • Per-thread data path resources

[Figure: Target RDMA objects]

The application execution follows two primary phases:

  • Control phase

  • Data path phase

Control Phase

This phase begins once a storage service connects via TCP. The application then waits for specific control commands:

  • Query storage

    • Report on the storage dimensions

  • Init storage

    • Validate requested core count

    • Prepare local memory

    • Import remote memory

    • Create worker objects

  • Wait for RDMA connection requests

    • Wait for a number of RDMA connections based on the specified core count from init storage

  • Start storage

    • Wait until all connections are ready to process tasks

    • Submit initial tasks

    • Launch threads

Issuing the start storage command initiates the data path phase. While the data threads begin execution, the main thread continues to wait for final control commands to complete the application's lifecycle:

  • Stop storage

  • Shutdown
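The ordering of the control commands above can be pictured as a small state machine. The sketch below is illustrative only: the type and enumerator names are hypothetical (the real control message types are not shown in this guide), and it assumes one RDMA connection per requested core, which the guide does not state explicitly.

```cpp
// Hypothetical names throughout: the real control message types are not shown in this guide.
enum class control_command {
    query_storage,
    init_storage,
    create_rdma_connection,
    start_storage,
    stop_storage,
    shutdown
};

enum class app_state { connected, queried, initialised, running, stopped, finished };

struct control_sequencer {
    app_state state = app_state::connected;
    unsigned expected_connections = 0; // derived from the init storage core count
    unsigned created_connections = 0;

    // Returns true and advances the state when cmd is valid in the current state.
    // core_count is only consulted for init_storage.
    bool on_command(control_command cmd, unsigned core_count = 0)
    {
        switch (cmd) {
        case control_command::query_storage:
            if (state != app_state::connected) return false;
            state = app_state::queried;
            return true;
        case control_command::init_storage:
            if (state != app_state::queried || core_count == 0) return false;
            expected_connections = core_count; // assumption: one connection per core
            state = app_state::initialised;
            return true;
        case control_command::create_rdma_connection:
            if (state != app_state::initialised || created_connections == expected_connections)
                return false;
            ++created_connections;
            return true;
        case control_command::start_storage:
            // Starting is only valid once every expected connection exists.
            if (state != app_state::initialised || created_connections != expected_connections)
                return false;
            state = app_state::running;
            return true;
        case control_command::stop_storage:
            if (state != app_state::running) return false;
            state = app_state::stopped;
            return true;
        case control_command::shutdown:
            if (state != app_state::stopped) return false;
            state = app_state::finished;
            return true;
        }
        return false;
    }
};
```

Modelling the sequence this way makes out-of-order commands, such as start storage before all connections exist, easy to reject in one place.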

Data Path Phase

This phase is executed per thread and involves each thread performing I/O operations requested by the client:

  1. Receive IO request

  2. Perform RDMA read / write

  3. Send IO response
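The three steps above can be sketched without any DOCA calls by letting a plain memory copy stand in for the RDMA transfer. Everything here is an assumption made for illustration: the io_request layout, the io_status values, and the use of a std::vector as the initiator's remote buffer are hypothetical and are not the application's real message format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical request and status types, not the application's wire format.
enum class io_type { read, write };
enum class io_status { ok, out_of_range };

struct io_request {
    io_type type;
    std::size_t block_index;   // storage block to access
    std::size_t remote_offset; // offset into the initiator's buffer
};

struct storage_target {
    std::size_t block_size;
    std::vector<std::uint8_t> blocks; // local storage region

    // Handles one request; `remote` stands in for the initiator's memory.
    io_status handle(io_request const &req, std::vector<std::uint8_t> &remote)
    {
        // 1. Receive the request and validate both ranges (overflow-safe checks).
        if (block_size == 0 || req.block_index >= blocks.size() / block_size)
            return io_status::out_of_range;
        if (remote.size() < block_size || req.remote_offset > remote.size() - block_size)
            return io_status::out_of_range;

        std::uint8_t *local = blocks.data() + req.block_index * block_size;
        std::uint8_t *peer = remote.data() + req.remote_offset;

        // 2. Transfer one block. A write request pulls data from the initiator
        //    (RDMA read in a real target); a read request pushes data to it
        //    (RDMA write in a real target).
        if (req.type == io_type::write)
            std::memcpy(local, peer, block_size);
        else
            std::memcpy(peer, local, block_size);

        // 3. The returned status plays the role of the I/O response.
        return io_status::ok;
    }
};
```

In the real application the transfer is asynchronous and completes through the progress engine, so the response would be sent from a completion callback rather than returned directly.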

DOCA Libraries

This application leverages the following DOCA libraries:

Compiling the Application

This application is compiled as part of the set of storage applications. For compilation instructions, refer to the DOCA Storage Applications page.

Running the Application

Application Execution

DOCA Storage Target RDMA is provided in source form. Therefore, compilation is required before the application can be executed.

  • Application usage instructions:

    Usage: doca_storage_target_rdma [DOCA Flags] [Program Flags]

    DOCA Flags:
      -h, --help                        Print a help synopsis
      -v, --version                     Print program version information
      -l, --log-level                   Set the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      --sdk-log-level                   Set the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
      -j, --json <path>                 Parse all command flags from an input json file

    Program Flags:
      -d, --device                      Device identifier
      --cpu                             CPU core to which the process affinity can be set
      --listen-port                     TCP listen port number
      --binary-content                  Path to binary .sbc file containing the initial content to be represented by this storage instance
      --block-count                     Number of available storage blocks. (Ignored when using content binary file) Default: 128
      --block-size                      Block size used by the storage. (Ignored when using content binary file) Default: 4096

    This usage printout can be displayed on the command line using the -h (or --help) option:

    ./doca_storage_target_rdma -h

    For additional information, refer to section "Command-line Flags".

  • CLI example for running the application:

    ./doca_storage_target_rdma -d 3b:00.0 --listen-port 12345 --block-size 4096 --block-count 64 --cpu 0

    The DOCA device PCIe address (3b:00.0) should match the address of the desired PCIe device.

  • The application also supports a JSON-based deployment mode in which all command-line arguments are provided through a JSON file:

    ./doca_storage_target_rdma --json [json_file]

    For example:

    ./doca_storage_target_rdma --json doca_storage_target_rdma_params.json

    Before execution, ensure that the JSON file contains valid configuration parameters, particularly the correct PCIe device addresses required for deployment.

Command-line Flags

| Flag Type | Short Flag | Long Flag/JSON Key | Description | JSON Content |
|---|---|---|---|---|
| General flags | h | help | Print a help synopsis | N/A |
| General flags | v | version | Print program version information | N/A |
| General flags | l | log-level | Set the log level for the application: DISABLE=10, CRITICAL=20, ERROR=30, WARNING=40, INFO=50, DEBUG=60, TRACE=70 (requires compilation with TRACE log level support) | "log-level": 60 |
| General flags | N/A | sdk-log-level | Set the SDK log level for the program: DISABLE=10, CRITICAL=20, ERROR=30, WARNING=40, INFO=50, DEBUG=60, TRACE=70 | "sdk-log-level": 40 |
| General flags | j | json | Parse all command flags from an input JSON file | N/A |
| Program flags | d | device | DOCA device identifier. One of: PCIe address (3b:00.0), InfiniBand name (mlx5_0), or network interface name (en3f0pf0sf0). This flag is mandatory. | "device": "03:00.0" |
| Program flags | N/A | cpu | Index of CPU to use. One data path thread is spawned per CPU. Index starts at 0. The user can specify this argument multiple times to create more threads. This flag is mandatory. | "cpu": 6 |
| Program flags | N/A | listen-port | Port on which to listen for incoming TCP connections. This flag is mandatory. | "listen-port": 12345 |
| Program flags | N/A | binary-content | Path to a file used to provide initial content to the storage instance | "binary-content": "data_1.sbc" |
| Program flags | N/A | block-count | Number of storage blocks to provide | "block-count": 64 |
| Program flags | N/A | block-size | Size of each storage block | "block-size": 4096 |

A user should provide one of the following flag combinations:

  • --binary-content : Where the file is a .sbc file

    • The .sbc file provides the storage dimensions and the data used to populate the blocks

OR (Random / uninitialised bytes with user-defined dimensions)

  • --block-count

  • --block-size

OR (Initialised bytes with user-defined dimensions)

  • --block-count

  • --block-size

  • --binary-content : Where the file is plain content to be distributed across the storage and its size equals block count * block size
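The three combinations above can be expressed as a small classification function. This is a sketch under stated assumptions rather than the application's own logic: the type and function names are hypothetical, the documented defaults for --block-count and --block-size are deliberately not modelled (a partially specified combination is simply reported as invalid), and the size of the content file is passed in by the caller instead of being read from disk.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical types used only for this illustration.
enum class storage_source { sbc_file, uninitialised, plain_content, invalid };

struct storage_options {
    std::optional<std::string> binary_content; // --binary-content
    std::optional<std::uint32_t> block_count;  // --block-count
    std::optional<std::uint32_t> block_size;   // --block-size
};

// content_size is the size in bytes of the --binary-content file; it is only
// consulted when all three flags are given.
storage_source classify(storage_options const &opts, std::uint64_t content_size = 0)
{
    bool const has_file = opts.binary_content.has_value();
    bool const has_count = opts.block_count.has_value();
    bool const has_size = opts.block_size.has_value();

    // Only a file: the .sbc file carries both dimensions and data.
    if (has_file && !has_count && !has_size)
        return storage_source::sbc_file;

    // Every other valid combination needs both dimensions.
    if (!(has_count && has_size))
        return storage_source::invalid;

    // Dimensions only: storage content is left uninitialised.
    if (!has_file)
        return storage_source::uninitialised;

    // Dimensions plus plain content: the file must fill the storage exactly.
    std::uint64_t const expected = std::uint64_t{*opts.block_count} * *opts.block_size;
    return content_size == expected ? storage_source::plain_content : storage_source::invalid;
}
```

For example, 64 blocks of 4096 bytes require a plain content file of exactly 262144 bytes.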

Troubleshooting

Refer to the NVIDIA BlueField Platform Software Troubleshooting Guide for any issue encountered with the installation or execution of the DOCA applications.

Application Code Flow

Control Phase

    target_rdma_app app{parse_target_rdma_app_cli_args(argc, argv)};

    Parse CLI arguments, apply default values, and create the application instance.

    app.wait_for_client_connection();

    Wait for a storage service app to connect via TCP.

    app.wait_for_and_process_query_storage();

    Wait for the storage service to send a query storage control message, then:

    • Send a query storage response with the dimensions of this storage instance

    app.wait_for_and_process_init_storage();

    Wait for the storage service to send an init storage control message, then:

    • Verify that the requested core count does not exceed the available cores

    • Create local storage doca_mmap

    • Import remote memory doca_mmap

    • Create data path resources:

      • Worker objects

      • IO message memory regions

      • doca_pe objects

      • doca_rdma objects

    • Send an init storage response

    app.wait_for_and_process_create_rdma_connections();

    Wait for the storage service to send a number of create RDMA connection control messages, processing each by:

    • Creating a connection and exporting its connection details using the doca_rdma context specified in the command

    • Starting the connection using the provided remote connection details

    app.wait_for_and_process_start_storage();

    Wait for the storage service to send a start storage control message, then:

    • Wait until all doca_rdma contexts are ready to execute tasks (both sides have started their respective connections)

    • Allocate task objects

    • Submit receive tasks

    • Create and start worker threads

    app.wait_for_and_process_stop_storage();

    Wait for the storage service to send a stop storage control message (test complete), then:

    • Signal worker threads to stop

    • Join worker threads
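The signal-then-join pattern used when stopping can be sketched with standard C++ threads. The worker_pool class below is hypothetical and is not the application's worker implementation; it assumes a shared atomic stop flag that every worker polls, and it replaces the real per-thread I/O processing with a simple iteration counter.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical worker pool, not taken from the application source.
class worker_pool {
public:
    explicit worker_pool(unsigned thread_count) : m_iterations(thread_count, std::uint64_t{0})
    {
        m_threads.reserve(thread_count);
        for (unsigned i = 0; i != thread_count; ++i)
            m_threads.emplace_back([this, i] { run(i); });
    }

    ~worker_pool() { stop_and_join(); }

    worker_pool(worker_pool const &) = delete;
    worker_pool &operator=(worker_pool const &) = delete;

    // Signal every worker to stop, then wait for each one to exit.
    void stop_and_join()
    {
        m_stop.store(true);
        for (auto &thread : m_threads)
            if (thread.joinable())
                thread.join();
    }

    bool all_joined() const
    {
        for (auto const &thread : m_threads)
            if (thread.joinable())
                return false;
        return true;
    }

    // Only meaningful after stop_and_join(): join() makes each worker's writes visible.
    std::uint64_t iterations(unsigned index) const { return m_iterations.at(index); }

private:
    void run(unsigned index)
    {
        // Stand-in for the data path loop: each worker touches only its own counter.
        while (!m_stop.load()) {
            ++m_iterations[index];
            std::this_thread::yield();
        }
    }

    std::atomic<bool> m_stop{false};
    std::vector<std::uint64_t> m_iterations;
    std::vector<std::thread> m_threads;
};
```

Joining before reading any per-thread results is what makes it safe to collect statistics afterwards without further synchronisation.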

    app.wait_for_and_process_shutdown();

    Wait for the storage service to send a shutdown control message, then:

    • Collect and store run statistics

    • Destroy data path objects

    • Send a shutdown response

    app.display_stats();

    Display collected statistics and destroy all control path objects.

References

  • /opt/mellanox/doca/applications/storage/
