NVIDIA DOCA Storage Zero Copy Target RDMA Application Guide
DOCA Storage Zero Copy Target RDMA (target_rdma) acts as a mock storage service. It prepares an area of memory equal in size to the block created by doca_storage_zero_copy_initiator_comch (initiator_comch). The application waits for IO messages from doca_storage_zero_copy_comch_to_rdma (comch_to_rdma) and performs the RDMA read or write operations needed to fulfill the initiator's read or write requests (i.e., an RDMA write for a read IO message, an RDMA read for a write IO message).
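The direction mapping described above is easy to get backwards, so here is a minimal sketch of it. The enum and function names are hypothetical and are not taken from the DOCA sources; only the mapping itself comes from this guide.

```cpp
#include <cstdint>

// Hypothetical names, used only to illustrate the mapping described in this guide.
enum class io_type : uint8_t { read, write };
enum class rdma_op : uint8_t { rdma_read, rdma_write };

// A read IO moves data from the target's storage to the initiator, so the
// target pushes the data with an RDMA write. A write IO moves data from the
// initiator to the target's storage, so the target pulls it with an RDMA read.
constexpr rdma_op select_rdma_op(io_type type) noexcept
{
    return type == io_type::read ? rdma_op::rdma_write : rdma_op::rdma_read;
}
```

The operation is named from the target's point of view, which is why it is the opposite of the IO type requested by the initiator.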
DOCA Storage Zero Copy Target RDMA uses a TCP socket for out-of-band control messages and two DOCA RDMA connections:
One for the data path, to receive and reply to IO messages.
Another to perform the RDMA read and write operations that actually move data to or from the memory created by initiator_comch.
DOCA Storage Zero Copy Target RDMA executes in three stages:
Preparation Stage
During this stage the application performs the following:
Creates a TCP server socket.
Waits for comch_to_rdma to connect.
Waits for a configure data path control message (buffer count, buffer size, doca_mmap export details) from comch_to_rdma.
Imports the received doca_mmap.
Creates a local memory region.
Creates a local doca_mmap.
Creates a doca_buf_inventory.
Sends a configure data path control message response to comch_to_rdma.
Waits for N "create RDMA connection" control messages from comch_to_rdma:
Creates the RDMA context.
Exports the connection details.
Starts connecting using the provided remote connection details.
Sends a create RDMA connection control message response to comch_to_rdma.
Waits for a "start data path connections" control message from comch_to_rdma.
Verifies that all RDMA connections are ready to use.
Sends a start data path connections control message response to comch_to_rdma.
Waits for a start storage control message from comch_to_rdma.
Starts data path threads.
Sends a start storage control message response to comch_to_rdma.
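The ordering of control messages in the preparation stage can be pictured as a small state machine. The sketch below is illustrative only: the message names, state names, and the preparation_sequencer class are hypothetical and chosen to mirror the steps listed above; they are not the application's actual types.

```cpp
#include <cstdint>

// Hypothetical message and state names that mirror the preparation steps.
enum class control_msg : uint8_t {
    configure_data_path,
    create_rdma_connection,
    start_data_path_connections,
    start_storage,
};

enum class prep_state : uint8_t {
    awaiting_configure,
    awaiting_connections,
    awaiting_start_connections,
    awaiting_start_storage,
    running,
};

class preparation_sequencer {
public:
    // expected_connections is the N "create RDMA connection" messages to wait for.
    explicit preparation_sequencer(uint32_t expected_connections) noexcept
        : m_remaining(expected_connections) {}

    // Returns true and advances the state when msg is legal in the current
    // state; returns false and leaves the state unchanged otherwise.
    bool on_message(control_msg msg) noexcept
    {
        switch (m_state) {
        case prep_state::awaiting_configure:
            if (msg != control_msg::configure_data_path) return false;
            m_state = (m_remaining == 0) ? prep_state::awaiting_start_connections
                                         : prep_state::awaiting_connections;
            return true;
        case prep_state::awaiting_connections:
            if (msg != control_msg::create_rdma_connection) return false;
            if (--m_remaining == 0) m_state = prep_state::awaiting_start_connections;
            return true;
        case prep_state::awaiting_start_connections:
            if (msg != control_msg::start_data_path_connections) return false;
            m_state = prep_state::awaiting_start_storage;
            return true;
        case prep_state::awaiting_start_storage:
            if (msg != control_msg::start_storage) return false;
            m_state = prep_state::running;
            return true;
        case prep_state::running:
            return false;
        }
        return false;
    }

    prep_state state() const noexcept { return m_state; }

private:
    prep_state m_state = prep_state::awaiting_configure;
    uint32_t m_remaining;
};
```

In a real control loop, each accepted message would also trigger the work listed for that step (importing the mmap, creating the RDMA context, starting threads) before the response is sent.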
Data Path Stage
In this stage, the data path threads start. Each thread first submits its RDMA receive tasks, then runs a tight loop that polls the progress engine (PE) as quickly as possible until a "data path stop" IO message is received.
The process of handling an IO message involves the following steps:
Decode the IO message to determine the memory locations to use.
Submit an RDMA read/RDMA write operation.
Upon completion of the RDMA read/write, send a response IO message to BlueField.
Resubmit the RDMA receive task.
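The first step above, determining the memory locations, amounts to offset arithmetic plus a bounds check. The sketch below assumes a decoded request that carries a storage offset, a remote offset, and a length. This guide does not describe the real IO message layout, so the io_request fields and the resolve_addresses helper are hypothetical.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical decoded IO message; the real wire format is not described here.
struct io_request {
    uint64_t storage_offset; // offset into the target's local storage region
    uint64_t remote_offset;  // offset into the initiator's imported memory
    uint32_t length;         // number of bytes to transfer
};

struct transfer_addresses {
    uint64_t local;  // address inside the local memory region
    uint64_t remote; // address inside the imported (remote) memory region
};

// Returns both addresses, or std::nullopt when the request would overrun
// either region. Addresses are kept as integers so that no pointer
// arithmetic is performed on memory this process does not own.
std::optional<transfer_addresses> resolve_addresses(io_request const &req,
                                                    uint64_t local_base,
                                                    uint64_t local_size,
                                                    uint64_t remote_base,
                                                    uint64_t remote_size)
{
    auto const fits = [&req](uint64_t offset, uint64_t size) {
        // Written this way to avoid overflow in offset + length.
        return offset <= size && req.length <= size - offset;
    };
    if (!fits(req.storage_offset, local_size) || !fits(req.remote_offset, remote_size))
        return std::nullopt;
    return transfer_addresses{local_base + req.storage_offset,
                              remote_base + req.remote_offset};
}
```

Rejecting out-of-range requests before submitting the RDMA task keeps a malformed IO message from turning into a failed or misdirected transfer.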
Teardown Stage
In this stage the application performs the following:
Waits for a destroy objects control message from comch_to_rdma.
Destroys data path objects.
Sends a destroy objects control message response to comch_to_rdma.
Destroys control path objects.
This application leverages the following DOCA libraries:
This application is compiled as part of the set of storage zero copy applications. For compilation instructions, refer to NVIDIA DOCA Storage Zero Copy.
Application Execution
DOCA Storage Zero Copy Target RDMA is provided in source form, so it must be compiled before it can be executed.
Application usage instructions:
Usage: doca_storage_zero_copy_target_rdma [DOCA Flags] [Program Flags]

DOCA Flags:
  -h, --help           Print a help synopsis
  -v, --version        Print program version information
  -l, --log-level      Set the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
  --sdk-log-level      Set the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>
  -j, --json <path>    Parse all command flags from an input json file

Program Flags:
  -d, --device         Device identifier
  -r, --representor    Device host side representor identifier
  --listen-port        TCP Port on which to listen for incoming connections
  --cpu                CPU core to which the process affinity can be set

Info: This usage printout can be printed to the command line using the -h (or --help) options:
./doca_storage_zero_copy_target_rdma -h
For additional information, refer to section "Command Line Flags".
CLI example for running the application on the BlueField:
./doca_storage_zero_copy_target_rdma -d 03:00.0 --listen-port 12345 --cpu 12
Info: The DOCA device PCIe address, 03:00.0, should match the address of the desired PCIe device.
The application also supports a JSON-based deployment mode, in which all command-line arguments are provided through a JSON file:
./doca_storage_zero_copy_target_rdma --json [json_file]
For example:
./doca_storage_zero_copy_target_rdma --json doca_storage_zero_copy_comch_to_rdma_params.json
Note: Before execution, ensure that the JSON file contains the correct configuration parameters, especially the PCIe addresses necessary for the deployment.
Command Line Flags
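Flag Type | Short Flag | Long Flag/JSON Key | Description | JSON Content |
General flags | -h | --help | Print a help synopsis | N/A |
General flags | -v | --version | Print program version information | N/A |
General flags | -l | --log-level | Set the log level for the application: 10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE | |
General flags | N/A | --sdk-log-level | Set the log level for the program: 10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE | |
General flags | -j | --json | Parse all command flags from an input JSON file | N/A |
Program flags | -d | --device | DOCA device identifier (for example, the PCIe address 03:00.0). Note: This is a mandatory flag. | |
Program flags | N/A | --listen-port | TCP port on which to listen for incoming connections. Note: This is a mandatory flag. | |
Program flags | N/A | --cpu | Index of CPU to use. One data path thread is spawned per CPU. Index starts at 0. Note: The user can specify this argument multiple times to create more threads. Note: This is a mandatory flag. | |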
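To make the mandatory-flag rules concrete, the following stand-alone sketch parses only the three mandatory program flags and rejects incomplete input. It deliberately does not use doca_argp. The target_config struct and parse_program_flags function are hypothetical stand-ins, and the real application may validate its input differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical configuration holder for the three mandatory program flags.
struct target_config {
    std::string device_id;
    uint16_t listen_port = 0;
    std::vector<uint32_t> cpu_set; // one data path thread per entry
};

// Parses "flag value" pairs. Returns std::nullopt on an unknown flag, a
// missing value, an invalid number, or when a mandatory flag is absent.
std::optional<target_config> parse_program_flags(std::vector<std::string> const &args)
{
    target_config cfg;
    for (std::size_t i = 0; i < args.size(); i += 2) {
        if (i + 1 >= args.size())
            return std::nullopt; // flag without a value
        auto const &flag = args[i];
        auto const &value = args[i + 1];
        try {
            if (flag == "-d" || flag == "--device") {
                cfg.device_id = value;
            } else if (flag == "--listen-port") {
                auto const port = std::stoul(value);
                if (port == 0 || port > 65535)
                    return std::nullopt;
                cfg.listen_port = static_cast<uint16_t>(port);
            } else if (flag == "--cpu") {
                // May be given several times; each occurrence adds a thread.
                cfg.cpu_set.push_back(static_cast<uint32_t>(std::stoul(value)));
            } else {
                return std::nullopt;
            }
        } catch (std::exception const &) {
            return std::nullopt; // std::stoul could not convert the value
        }
    }
    if (cfg.device_id.empty() || cfg.listen_port == 0 || cfg.cpu_set.empty())
        return std::nullopt;
    return cfg;
}
```

Passing --cpu twice yields two entries in cpu_set, matching the note that one data path thread is spawned per CPU.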
Troubleshooting
Refer to the NVIDIA DOCA Troubleshooting Guide for any issue encountered with the installation or execution of the DOCA applications.
Control Thread Flow
Parse application arguments:
auto const cfg = parse_cli_args(argc, argv);
Prepare the parser (doca_argp_init).
Register parameters (doca_argp_param_create).
Parse the arguments (doca_argp_start).
Destroy the parser (doca_argp_destroy).
Display the configuration:
print_config(cfg);
Create application instance:
g_app.reset(storage::zero_copy::make_storage_application(cfg));
Run the application:
g_app->run();
Find and open the specified device:
m_dev = storage::common::open_device(m_cfg.device_id);
Start the TCP server and wait for comch_to_rdma to connect:
start_listening();
wait_for_tcp_client();
Wait for a "configure storage" control message from comch_to_rdma.
Configure storage:
configure_storage(configuration);
Create thread contexts:
Create transaction contexts.
Create IO messages.
Create PE.
Create mmap for IO message buffers.
Send configure storage control message response to comch_to_rdma.
Wait for N "create RDMA connection" control messages from comch_to_rdma:
Create RDMA context.
Export connection details.
Start connection using received remote connection details.
Send a "create RDMA connection" control message response (containing RDMA connection details from target_rdma RDMA context) to comch_to_rdma.
Wait for "start data path" control message from comch_to_rdma:
Verify all connections are ready (comch and RDMA):
establish_rdma_connections();
Send a "start data path" control message response to comch_to_rdma.
Wait for start storage control message from comch_to_rdma:
Create data path threads.
Start data path threads.
Send a "start storage" control message response to comch_to_rdma.
Run all threads until completion.
Wait for "destroy objects" control message.
Destroy data path objects.
Send destroy objects control message response to BlueField.
Display stats:
printf("+================================================+\n");
printf("| Stats\n");
printf("+================================================+\n");
for (uint32_t ii = 0; ii != stats.size(); ++ii) {
    printf("| Thread[%u]\n", ii);
    auto const pe_hit_rate_pct =
        (static_cast<double>(stats[ii].pe_hit_count) /
         (static_cast<double>(stats[ii].pe_hit_count) + static_cast<double>(stats[ii].pe_miss_count))) *
        100.;
    printf("| PE hit rate: %2.03lf%% (%lu:%lu)\n", pe_hit_rate_pct, stats[ii].pe_hit_count, stats[ii].pe_miss_count);
    printf("+------------------------------------------------+\n");
}
printf("+================================================+\n");
Destroy control path objects.
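The PE hit rate printed in the stats is the share of polls that made progress. As written in the snippet, the division would produce NaN for a thread that never polled, so the sketch below adds a guard for that case. The thread_stats struct and the pe_hit_rate_pct helper are illustrative and assume 64-bit counters; they are not the application's actual definitions.

```cpp
#include <cstdint>

// Illustrative per-thread counters, assumed to be 64-bit.
struct thread_stats {
    uint64_t pe_hit_count;  // polls where the PE made progress
    uint64_t pe_miss_count; // polls where the PE had nothing to do
};

// Percentage of polls that made progress. Returns 0 when no polls were
// recorded so that the caller never prints NaN.
double pe_hit_rate_pct(thread_stats const &s) noexcept
{
    auto const total =
        static_cast<double>(s.pe_hit_count) + static_cast<double>(s.pe_miss_count);
    if (total == 0.0)
        return 0.0;
    return static_cast<double>(s.pe_hit_count) / total * 100.0;
}
```

A low hit rate is expected for a busy-polling design: most polls find no completed task, which is the cost of reacting to each IO message with minimal latency.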
Performance Data Path Thread Flow
The data path involves polling the PE as quickly as possible to receive IO messages from BlueField.
Run until BlueField sends a stop IO message:
while (hot_data->running_flag) {
    doca_pe_progress(pe) ? ++(hot_data->pe_hit_count) : ++(hot_data->pe_miss_count);
}
Handle BlueField IO message:
Calculate memory addresses to use for local and remote memory.
Set the source and destination buffer addresses and sizes in the RDMA task.
Start RDMA read/write task.
Upon completion of the RDMA task, respond to BlueField.
Re-submit the RDMA receive task.
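The polling loop can be tried without DOCA by replacing doca_pe_progress(pe) with any callable that returns non-zero when work was done. The sketch below makes that substitution. The hot_data layout is an assumption based on the field names used in the loop above, and the atomic running flag is an illustrative choice rather than a statement about the real implementation.

```cpp
#include <atomic>
#include <cstdint>

// Assumed layout, based on the field names used in the loop above.
struct hot_data {
    std::atomic<bool> running_flag{true};
    uint64_t pe_hit_count = 0;
    uint64_t pe_miss_count = 0;
};

// progress is a stand-in for doca_pe_progress(pe): it returns non-zero when
// at least one task completed during the call, and zero otherwise.
template <typename ProgressFn>
void poll_until_stopped(hot_data &hd, ProgressFn &&progress)
{
    while (hd.running_flag.load(std::memory_order_relaxed)) {
        if (progress())
            ++hd.pe_hit_count;  // a completion callback ran during this poll
        else
            ++hd.pe_miss_count; // nothing was ready
    }
}
```

In the application, the stop IO message is handled inside a completion callback, which is what clears the running flag and ends the loop. The stand-in callable plays that role here.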
/opt/mellanox/doca/applications/storage/