RDMA Aware Networks Programming User Manual

Typical Application

This document provides two program examples:

  • The first, RDMA_RC_example, uses the VPI verbs API to demonstrate Send, Receive, RDMA Read, and RDMA Write operations over an RC (Reliable Connection) transport.

  • The second, the multicast example, uses the RDMA_CM verbs API to demonstrate multicast over UD (Unreliable Datagram).

The structure of a typical application is as follows. Each step names the functions in the programming example that implement it.

  1. Get the device list;
    First, retrieve the list of available IB devices on the local host. Every device in this list has both a name and a GUID. For example, device names can be mthca0 or mlx4_1.
    Implemented in programming example by resources_create.

  2. Open the requested device;
    Iterate over the device list, choose a device according to its GUID or name and open it.
    Implemented in programming example by resources_create.

  3. Query the device capabilities;
    Querying the device capabilities tells the user which features (for example APM, SRQ) and limits the opened device supports.
    Implemented in programming example by resources_create.

  4. Allocate a Protection Domain to contain your resources;
    A Protection Domain (PD) lets the user restrict which components can interact with each other: only components in the same PD can work together. These components can be AH, QP, MR, MW, and SRQ.
    Implemented in programming example by resources_create.

  5. Register a memory region;
    VPI works only with registered memory. Any memory buffer that is valid in the process's virtual address space can be registered. During registration the user sets the memory permissions and receives local and remote keys (lkey/rkey), which are later used to refer to this memory buffer.
    Implemented in programming example by resources_create.

  6. Create a Completion Queue (CQ);
    A CQ contains completed work requests (WR). Each WR will generate a completion queue entry (CQE) that is placed on the CQ. The CQE will specify if the WR was completed successfully or not.
    Implemented in programming example by resources_create.

  7. Create a Queue Pair (QP);
    Creating a QP will also create an associated send queue and receive queue.
    Implemented in programming example by resources_create.

  8. Bring up a QP;
    A created QP cannot be used until it has been transitioned through several states, eventually reaching Ready To Send (RTS). These transitions supply the information the QP needs to be able to send and receive data.
    Implemented in programming example by connect_qp, modify_qp_to_init, post_receive, modify_qp_to_rtr, and modify_qp_to_rts.

  9. Post work requests and poll for completion;
    Use the created QP for communication operations.
    Implemented in programming example by post_send and poll_completion.

  10. Cleanup;
    Destroy objects in the reverse order you created them:
    Delete QP
    Delete CQ
    Deregister MR
    Deallocate PD
    Close device
    Implemented in programming example by resources_destroy.
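
The QP bring-up order in step 8 can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: it does not call libibverbs, and every name in it (model_qp, model_qp_modify, model_post_receive, model_can_post_send, model_connect_qp) is hypothetical and chosen for illustration. It assumes that a newly created QP starts in the RESET state, that the bring-up path is RESET, INIT, RTR, RTS one step at a time, that receives can be posted once the QP has left RESET, and that sends are accepted only in RTS. A real application would perform each transition with ibv_modify_qp and would have to handle further states that this model omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the QP states used during bring-up (step 8).
 * Real code would use enum ibv_qp_state from <infiniband/verbs.h>. */
enum qp_state { QP_RESET, QP_INIT, QP_RTR, QP_RTS };

/* Illustrative stand-in for a queue pair; not a libibverbs type. */
struct model_qp {
    enum qp_state state;
    int posted_recvs;   /* number of receive WRs posted so far */
};

/* Assumption: a newly created QP starts in the RESET state. */
static void model_qp_create(struct model_qp *qp)
{
    assert(qp != NULL);
    qp->state = QP_RESET;
    qp->posted_recvs = 0;
}

/* Accept only the next step on the path RESET -> INIT -> RTR -> RTS.
 * Returns false and leaves the state unchanged for any other request. */
static bool model_qp_modify(struct model_qp *qp, enum qp_state next)
{
    if ((int)next != (int)qp->state + 1)
        return false;
    qp->state = next;
    return true;
}

/* Simplification: receives are accepted once the QP has left RESET. */
static bool model_post_receive(struct model_qp *qp)
{
    if (qp->state == QP_RESET)
        return false;
    qp->posted_recvs++;
    return true;
}

/* Simplification: sends are accepted only once the QP has reached RTS. */
static bool model_can_post_send(const struct model_qp *qp)
{
    return qp->state == QP_RTS;
}

/* Bring-up in the order the example's helpers are listed in step 8:
 * modify_qp_to_init, post_receive, modify_qp_to_rtr, modify_qp_to_rts. */
static bool model_connect_qp(struct model_qp *qp)
{
    return model_qp_modify(qp, QP_INIT)
        && model_post_receive(qp)
        && model_qp_modify(qp, QP_RTR)
        && model_qp_modify(qp, QP_RTS);
}
```

To try the model, call model_qp_create on a struct model_qp and then model_connect_qp. Before the call model_can_post_send returns false, and after a successful call it returns true. Posting the receive before the transition to RTR mirrors the order of the helpers listed in step 8, so that a receive buffer is already available when the QP becomes able to receive.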

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.