Connection Manager (CM) ID Operations
Template:
int rdma_create_id(struct rdma_event_channel *channel, struct rdma_cm_id **id, void *context, enum rdma_port_space ps)
Input Parameters:
channel The communication channel that events associated with the allocated rdma_cm_id will be reported on.
id A reference where the allocated communication identifier will be returned.
context User specified context associated with the rdma_cm_id.
ps RDMA port space.
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
Creates an identifier that is used to track communication information.
Notes:
rdma_cm_ids are conceptually equivalent to a socket for RDMA communication. The difference is that RDMA communication requires explicitly binding to a specified RDMA device before communication can occur, and most operations are asynchronous in nature. Communication events on an rdma_cm_id are reported through the associated event channel. Users must release the rdma_cm_id by calling rdma_destroy_id.
PORT SPACES
Details of the services provided by the different port spaces are outlined below.
RDMA_PS_TCP Provides reliable, connection-oriented QP communication. Unlike TCP, the RDMA port space provides message, not stream, based communication.
RDMA_PS_UDP Provides unreliable, connectionless QP communication. Supports both datagram and multicast communication.
See Also:
rdma_cm, rdma_create_event_channel, rdma_destroy_id, rdma_get_devices, rdma_bind_addr, rdma_resolve_addr, rdma_connect, rdma_listen, rdma_set_option
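Every routine in this section that returns int uses the same convention: 0 on success, -1 on error with errno set. The following is a minimal sketch of a helper that centralizes that check. The name report_cm_result is hypothetical and not part of librdmacm, and the block deliberately avoids calling the library so that it compiles on its own.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of librdmacm): report a failed call that
 * follows the "0 on success, -1 with errno set" convention.
 * Returns rc unchanged so the caller can still branch on it.
 */
int report_cm_result(int rc, const char *what)
{
    int saved = errno;   /* capture errno before stdio can overwrite it */

    if (rc != 0)
        fprintf(stderr, "%s failed: %s\n", what, strerror(saved));
    return rc;
}
```

A caller would typically wrap the library call directly, for example report_cm_result(rdma_create_id(channel, &id, NULL, RDMA_PS_TCP), "rdma_create_id"), and stop on a non-zero result.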
Template:
int rdma_destroy_id (struct rdma_cm_id *id)
Input Parameters:
id The communication identifier to destroy.
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
Destroys the specified rdma_cm_id and cancels any outstanding asynchronous operation.
Notes:
Users must free any QP associated with the rdma_cm_id and acknowledge any related events before calling this routine.
See Also:
rdma_create_id, rdma_destroy_qp, rdma_ack_cm_event
Template:
int rdma_migrate_id(struct rdma_cm_id *id, struct rdma_event_channel *channel)
Input Parameters:
id An existing RDMA communication identifier to migrate
channel The new event channel for rdma_cm_id events
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_migrate_id migrates a communication identifier to a different event channel and moves any pending events associated with the rdma_cm_id to the new channel.
Do not poll for events on the rdma_cm_id's current channel, or run any routines on the rdma_cm_id, while it is being migrated between channels. rdma_migrate_id will block while there are any unacknowledged events on the current event channel.
If the channel parameter is NULL, then the specified rdma_cm_id will be placed into synchronous operation mode. All calls on the id will block until the operation completes.
Template:
int rdma_set_option(struct rdma_cm_id *id, int level, int optname, void *optval, size_t optlen)
Input Parameters:
id RDMA communication identifier
level Protocol level of the option to set
optname Name of the option to set
optval Reference to the option data
optlen The size of the option data (optval) buffer
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_set_option sets communication options for an rdma_cm_id. Option levels and details may be found in the enums in the relevant header files.
Template:
int rdma_create_ep(struct rdma_cm_id **id, struct rdma_addrinfo *res, struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr)
Input Parameters:
id A reference where the allocated communication identifier will be returned
res Address information associated with the rdma_cm_id returned from rdma_getaddrinfo
pd Optional protection domain if a QP is associated with the rdma_cm_id
qp_init_attr Optional initial QP attributes
Output Parameters:
id The communication identifier is returned through this reference
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure
Description:
rdma_create_ep creates an identifier and optional QP used to track communication information.
If qp_init_attr is not NULL, then a QP will be allocated and associated with the rdma_cm_id, id. If a protection domain (PD) is provided, then the QP will be created on that PD. Otherwise the QP will be allocated on a default PD.
The rdma_cm_id will be set to use synchronous operations (connect, listen and get_request). To use asynchronous operations, rdma_cm_id must be migrated to a user allocated event channel using rdma_migrate_id.
The rdma_cm_id must be released after use, using rdma_destroy_ep.
struct rdma_addrinfo is defined as follows:
struct rdma_addrinfo {
    int ai_flags;
    int ai_family;
    int ai_qp_type;
    int ai_port_space;
    socklen_t ai_src_len;
    socklen_t ai_dst_len;
    struct sockaddr *ai_src_addr;
    struct sockaddr *ai_dst_addr;
    char *ai_src_canonname;
    char *ai_dst_canonname;
    size_t ai_route_len;
    void *ai_route;
    size_t ai_connect_len;
    void *ai_connect;
    struct rdma_addrinfo *ai_next;
};
ai_flags Hint flags which control the operation. Supported flags are: RAI_PASSIVE, RAI_NUMERICHOST and RAI_NOROUTE
ai_family Address family for the source and destination address (AF_INET, AF_INET6, AF_IB)
ai_qp_type The type of RDMA QP used
ai_port_space RDMA port space used (RDMA_PS_UDP or RDMA_PS_TCP)
ai_src_len Length of the source address referenced by ai_src_addr
ai_dst_len Length of the destination address referenced by ai_dst_addr
ai_src_addr Address of local RDMA device, if provided
ai_dst_addr Address of destination RDMA device, if provided
ai_src_canonname The canonical name for the source
ai_dst_canonname The canonical name for the destination
ai_route_len Size of the routing information buffer referenced by ai_route
ai_route Routing information for RDMA transports that require routing data as part of connection establishment
ai_connect_len Size of connection information referenced by ai_connect
ai_connect Data exchanged as part of the connection establishment process
ai_next Pointer to the next rdma_addrinfo structure in the list
Template:
int rdma_destroy_ep (struct rdma_cm_id *id)
Input Parameters:
id The communication identifier to destroy
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure
Description:
rdma_destroy_ep destroys the specified rdma_cm_id and all associated resources, including QPs associated with the id.
Template:
int rdma_resolve_addr (struct rdma_cm_id *id, struct sockaddr *src_addr, struct sockaddr *dst_addr, int timeout_ms)
Input Parameters:
id RDMA identifier.
src_addr Source address information. This parameter may be NULL.
dst_addr Destination address information.
timeout_ms Time to wait for resolution to complete.
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_resolve_addr resolves destination and optional source addresses from IP addresses to an RDMA address. If successful, the specified rdma_cm_id will be bound to a local device.
Notes:
This call is used to map a given destination IP address to a usable RDMA address. The IP to RDMA address mapping is done using the local routing tables, or via ARP. If a source address is given, the rdma_cm_id is bound to that address, the same as if rdma_bind_addr were called. If no source address is given, and the rdma_cm_id has not yet been bound to a device, then the rdma_cm_id will be bound to a source address based on the local routing tables. After this call, the rdma_cm_id will be bound to an RDMA device. This call is typically made from the active side of a connection before calling rdma_resolve_route and rdma_connect.
InfiniBand Specific
This call maps the destination and, if given, source IP addresses to GIDs. In order to perform the mapping, IPoIB must be running on both the local and remote nodes.
See Also:
rdma_create_id, rdma_resolve_route, rdma_connect, rdma_create_qp, rdma_get_cm_event, rdma_bind_addr, rdma_get_src_port, rdma_get_dst_port, rdma_get_local_addr, rdma_get_peer_addr
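The notes for rdma_resolve_addr give the active-side call order: rdma_resolve_addr, then rdma_resolve_route, then rdma_connect. The sketch below encodes only that ordering as a small lookup. The enum and function names are hypothetical illustration aids, not librdmacm types, and no library calls are made.

```c
#include <stddef.h>

/* Hypothetical progress markers for the active (client) side.
 * Illustration only; these are not librdmacm definitions. */
enum active_step {
    ACTIVE_ID_CREATED,      /* rdma_create_id has succeeded      */
    ACTIVE_ADDR_RESOLVED,   /* rdma_resolve_addr has completed   */
    ACTIVE_ROUTE_RESOLVED,  /* rdma_resolve_route has completed  */
    ACTIVE_CONNECT_ISSUED   /* rdma_connect has been called      */
};

/* Return the name of the routine to call next, or NULL once the
 * sequence described in the notes has been carried out. */
const char *next_active_call(enum active_step step)
{
    switch (step) {
    case ACTIVE_ID_CREATED:     return "rdma_resolve_addr";
    case ACTIVE_ADDR_RESOLVED:  return "rdma_resolve_route";
    case ACTIVE_ROUTE_RESOLVED: return "rdma_connect";
    case ACTIVE_CONNECT_ISSUED: return NULL;
    }
    return NULL;
}
```

In asynchronous operation each step would advance only after the corresponding event has been retrieved with rdma_get_cm_event and acknowledged.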
Template:
int rdma_bind_addr (struct rdma_cm_id *id, struct sockaddr *addr)
Input Parameters:
id RDMA identifier.
addr Local address information. Wildcard values are permitted.
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_bind_addr associates a source address with an rdma_cm_id. The address may be wildcarded. If binding to a specific local address, the rdma_cm_id will also be bound to a local RDMA device.
Notes:
Typically, this routine is called before calling rdma_listen to bind to a specific port number, but it may also be called on the active side of a connection before calling rdma_resolve_addr to bind to a specific address. If used to bind to port 0, the rdma_cm will select an available port, which can be retrieved with rdma_get_src_port.
See Also:
rdma_create_id, rdma_listen, rdma_resolve_addr, rdma_create_qp, rdma_get_local_addr, rdma_get_src_port
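As an illustration of the wildcard binding described for rdma_bind_addr, the following sketch builds an IPv4 wildcard address with a chosen port using only standard socket headers. The helper name make_ipv4_wildcard is an assumption made for this example and is not part of librdmacm.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Build an IPv4 wildcard address for the given port (host byte order).
 * Passing port 0 leaves the port selection to the rdma_cm. */
struct sockaddr_in make_ipv4_wildcard(uint16_t port)
{
    struct sockaddr_in sin;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local address */
    sin.sin_port = htons(port);               /* network byte order */
    return sin;
}
```

The result would be passed to rdma_bind_addr as (struct sockaddr *)&sin. When port 0 is used, rdma_get_src_port can be called afterwards to learn which port was selected.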
Template:
int rdma_resolve_route (struct rdma_cm_id *id, int timeout_ms)
Input Parameters:
id RDMA identifier.
timeout_ms Time to wait for resolution to complete.
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_resolve_route resolves an RDMA route to the destination address in order to establish a connection. The destination must already have been resolved by calling rdma_resolve_addr.
Thus this function is called on the client side after rdma_resolve_addr but before calling rdma_connect. For InfiniBand connections, the call obtains a path record which is used by the connection.
Template:
int rdma_listen(struct rdma_cm_id *id, int backlog)
Input Parameters:
id RDMA communication identifier
backlog The backlog of incoming connection requests
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_listen initiates a listen for incoming connection requests or datagram service lookup. The listen is restricted to the locally bound source address.
Please note that the rdma_cm_id must already have been bound to a local address by calling rdma_bind_addr before calling rdma_listen. If the rdma_cm_id is bound to a specific IP address, the listen will be restricted to that address and the associated RDMA device. If the rdma_cm_id is bound to an RDMA port number only, the listen will occur across all RDMA devices.
Template:
int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
Input Parameters:
id RDMA communication identifier
conn_param Optional connection parameters
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_connect initiates an active connection request. For a connected rdma_cm_id, id, the call initiates a connection request to a remote destination. For an unconnected rdma_cm_id, it initiates a lookup of the remote QP providing the datagram service. The user must already have resolved a route to the destination address by having called rdma_resolve_route or rdma_create_ep before calling this method.
For InfiniBand specific connections, the QPs are configured with minimum RNR NAK timer and local ACK values. The minimum RNR NAK timer value is set to 0, for a delay of 655 ms. The local ACK timeout is calculated based on the packet lifetime and local HCA ACK delay. The packet lifetime is determined by the InfiniBand Subnet Administrator and is part of the resolved route (path record) information. The HCA ACK delay is a property of the locally used HCA. Retry count and RNR retry count values are 3-bit values.
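Because the retry count and RNR retry count are 3-bit values, only 0 through 7 can be represented. The sketch below shows one way an application might clamp a requested count before storing it in retry_count or rnr_retry_count. The helper and macro names are hypothetical, and clamping (rather than rejecting) out-of-range input is a design choice of this example, not something the library is stated to do.

```c
#include <stdint.h>

#define CM_RETRY_FIELD_MAX 7u   /* largest value a 3-bit field can hold */

/* Clamp a requested retry count into the 3-bit range 0..7. */
uint8_t clamp_retry_count(unsigned int requested)
{
    if (requested > CM_RETRY_FIELD_MAX)
        return (uint8_t)CM_RETRY_FIELD_MAX;
    return (uint8_t)requested;
}
```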
Connections established over iWarp RDMA devices currently require that the active side of the connection send the first message.
struct rdma_conn_param is defined as follows:
struct rdma_conn_param {
    const void *private_data;
    uint8_t private_data_len;
    uint8_t responder_resources;
    uint8_t initiator_depth;
    uint8_t flow_control;
    uint8_t retry_count;      /* ignored when accepting */
    uint8_t rnr_retry_count;
    uint8_t srq;              /* ignored if QP created on the rdma_cm_id */
    uint32_t qp_num;          /* ignored if QP created on the rdma_cm_id */
};
Here is a more detailed description of the rdma_conn_param structure members:
private_data References a user-controlled data buffer. The contents of the buffer are copied and transparently passed to the remote side as part of the communication request. May be NULL if private_data is not required.
private_data_len Specifies the size of the user-controlled data buffer. Note that the actual amount of data transferred to the remote side is transport dependent and may be larger than that requested.
responder_resources The maximum number of outstanding RDMA read and atomic operations that the local side will accept from the remote side. Applies only to RDMA_PS_TCP. This value must be less than or equal to the local RDMA device attribute max_qp_rd_atom and remote RDMA device attribute max_qp_init_rd_atom. The remote endpoint can adjust this value when accepting the connection.
initiator_depth The maximum number of outstanding RDMA read and atomic operations that the local side will have to the remote side. Applies only to RDMA_PS_TCP. This value must be less than or equal to the local RDMA device attribute max_qp_init_rd_atom and remote RDMA device attribute max_qp_rd_atom. The remote endpoint can adjust this value when accepting the connection.
flow_control Specifies if hardware flow control is available. This value is exchanged with the remote peer and is not used to configure the QP. Applies only to RDMA_PS_TCP.
retry_count The maximum number of times that a data transfer operation should be retried on the connection when an error occurs. This setting controls the number of times to retry send, RDMA, and atomic operations when timeouts occur. Applies only to RDMA_PS_TCP.
rnr_retry_count The maximum number of times that a send operation from the remote peer should be retried on a connection after receiving a receiver not ready (RNR) error. RNR errors are generated when a send request arrives before a buffer has been posted to receive the incoming data. Applies only to RDMA_PS_TCP.
srq Specifies if the QP associated with the connection is using a shared receive queue. This field is ignored by the library if a QP has been created on the rdma_cm_id. Applies only to RDMA_PS_TCP.
qp_num Specifies the QP number associated with the connection. This field is ignored by the library if a QP has been created on the rdma_cm_id. Applies only to RDMA_PS_TCP.
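Since private_data_len is declared as uint8_t, a buffer length held in a size_t has to be range-checked before it is stored. The following sketch shows such a check. The function name fit_private_data_len is hypothetical, and the check covers only the width of the field; the transport may impose a tighter limit that this code does not know about.

```c
#include <stddef.h>
#include <stdint.h>

/* Store len into a uint8_t length field if it fits.
 * Returns 0 on success, or -1 (leaving *out_len untouched) if len is
 * larger than a uint8_t can represent. */
int fit_private_data_len(size_t len, uint8_t *out_len)
{
    if (len > UINT8_MAX)
        return -1;
    *out_len = (uint8_t)len;
    return 0;
}
```

A caller would pass &conn_param.private_data_len as out_len after pointing private_data at the buffer.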
Template:
int rdma_get_request (struct rdma_cm_id *listen, struct rdma_cm_id **id)
Input Parameters:
listen Listening rdma_cm_id
id rdma_cm_id associated with the new connection
Output Parameters:
id A pointer to rdma_cm_id associated with the request
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_get_request retrieves the next pending connection request event. The call may only be used on listening rdma_cm_ids operating synchronously. If the call is successful, a new rdma_cm_id (id) representing the connection request will be returned to the user. The new rdma_cm_id will reference event information associated with the request until the user calls rdma_reject, rdma_accept, or rdma_destroy_id on the newly created identifier. For a description of the event data, see rdma_get_cm_event.
If QP attributes are associated with the listening endpoint, the returned rdma_cm_id will also reference an allocated QP.
Template:
int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
Input Parameters:
id RDMA communication identifier
conn_param Optional connection parameters (described under rdma_connect)
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_accept is called from the listening side to accept a connection or datagram service lookup request.
Unlike the socket accept routine, rdma_accept is not called on a listening rdma_cm_id. Instead, after calling rdma_listen, the user waits for an RDMA_CM_EVENT_CONNECT_REQUEST event to occur. Connection request events give the user a newly created rdma_cm_id, similar to a new socket, but the rdma_cm_id is bound to a specific RDMA device. rdma_accept is called on the new rdma_cm_id.
Template:
int rdma_reject(struct rdma_cm_id *id, const void *private_data, uint8_t private_data_len)
Input Parameters:
id RDMA communication identifier
private_data Optional private data to send with the reject message
private_data_len Size (in bytes) of the private data being sent
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_reject is called from the listening side to reject a connection or datagram service lookup request.
After receiving a connection request event, a user may call rdma_reject to reject the request. The optional private data will be passed to the remote side if the underlying RDMA transport supports private data in the reject message.
Template:
int rdma_notify(struct rdma_cm_id *id, enum ibv_event_type event)
Input Parameters:
id RDMA communication identifier
event Asynchronous event
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_notify is used to notify the librdmacm of asynchronous events which have occurred on a QP associated with the rdma_cm_id, id.
Asynchronous events that occur on a QP are reported through the user's device event handler. This routine is used to notify the librdmacm of communication events. In most cases, use of this routine is not necessary. However, if connection establishment is done out of band (such as done through InfiniBand), it is possible to receive data on a QP that is not yet considered connected. In that case, this routine forces the connection into an established state in order to handle the rare situation where the connection never forms on its own. Calling this routine ensures the delivery of the RDMA_CM_EVENT_ESTABLISHED event to the application. Events that should be reported to the CM are: IB_EVENT_COMM_EST.
Template:
int rdma_disconnect(struct rdma_cm_id *id)
Input Parameters:
id RDMA communication identifier
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_disconnect disconnects a connection and transitions any associated QP to the error state. This action will result in any posted work requests being flushed to the completion queue. rdma_disconnect may be called by both the client and server side of the connection. After successfully disconnecting, an RDMA_CM_EVENT_DISCONNECTED event will be generated on both sides of the connection.
Template:
uint16_t rdma_get_src_port(struct rdma_cm_id *id)
Input Parameters:
id RDMA communication identifier
Output Parameters:
None
Return Value:
Returns the 16-bit port number associated with the local endpoint, or 0 if the rdma_cm_id, id, is not bound to a port.
Description:
rdma_get_src_port retrieves the local port number for an rdma_cm_id (id) which has been bound to a local address. If the id is not bound to a port, the routine will return 0.
Template:
uint16_t rdma_get_dst_port(struct rdma_cm_id *id)
Input Parameters:
id RDMA communication identifier
Output Parameters:
None
Return Value:
Returns the 16-bit port number associated with the peer endpoint, or 0 if the rdma_cm_id, id, is not connected.
Description:
rdma_get_dst_port retrieves the port associated with the peer endpoint. If the rdma_cm_id, id, is not connected, then the routine will return 0.
Template:
struct sockaddr * rdma_get_local_addr (struct rdma_cm_id *id)
Input Parameters:
id RDMA communication identifier
Output Parameters:
None
Return Value:
Returns a pointer to the local sockaddr address of the rdma_cm_id, id. If the id is not bound to an address, then the contents of the sockaddr structure will be set to all zeros
Description:
rdma_get_local_addr retrieves the local IP address for the rdma_cm_id which has been bound to a local device.
Template:
struct sockaddr * rdma_get_peer_addr (struct rdma_cm_id *id)
Input Parameters:
id RDMA communication identifier
Output Parameters:
None
Return Value:
A pointer to the sockaddr address of the connected peer. If the rdma_cm_id is not connected, then the contents of the sockaddr structure will be set to all zeros
Description:
rdma_get_peer_addr retrieves the remote IP address of a bound rdma_cm_id.
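Both rdma_get_local_addr and rdma_get_peer_addr return a generic struct sockaddr pointer, so the caller has to inspect the address family before reading fields. The sketch below extracts the port in host byte order for IPv4 and IPv6. The name sockaddr_port is a hypothetical helper, and only AF_INET and AF_INET6 are handled; other families, including the all-zero structure reported for an unbound or unconnected id, yield 0.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Return the port stored in an IPv4 or IPv6 sockaddr, in host byte order.
 * Returns 0 for any other address family. The caller must pass a pointer
 * to storage that is large enough for the family it declares. */
uint16_t sockaddr_port(const struct sockaddr *sa)
{
    if (sa->sa_family == AF_INET) {
        struct sockaddr_in sin;
        memcpy(&sin, sa, sizeof(sin));    /* copy instead of casting */
        return ntohs(sin.sin_port);
    }
    if (sa->sa_family == AF_INET6) {
        struct sockaddr_in6 sin6;
        memcpy(&sin6, sa, sizeof(sin6));
        return ntohs(sin6.sin6_port);
    }
    return 0;
}
```

Typical use would be sockaddr_port(rdma_get_peer_addr(id)) once the connection has been established.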
Template:
struct ibv_context ** rdma_get_devices (int *num_devices)
Input Parameters:
num_devices If non-NULL, set to the number of devices returned
Output Parameters:
num_devices Number of RDMA devices currently available
Return Value:
Array of available RDMA devices on success or NULL if the request fails
Description:
rdma_get_devices retrieves an array of RDMA devices currently available. Devices remain opened while librdmacm is loaded and the array must be released by calling rdma_free_devices.
Template:
void rdma_free_devices (struct ibv_context **list)
Input Parameters:
list List of devices returned from rdma_get_devices
Output Parameters:
None
Return Value:
None
Description:
rdma_free_devices frees the device array returned by the rdma_get_devices routine.
Template:
int rdma_getaddrinfo(char *node, char *service, struct rdma_addrinfo *hints, struct rdma_addrinfo **res)
Input Parameters:
node Optional: name, dotted-decimal IPv4 or IPv6 hex address to resolve
service The service name or port number of the address
hints Reference to an rdma_addrinfo structure containing hints about the type of service the caller supports
res A pointer to a linked list of rdma_addrinfo structures containing response information
Output Parameters:
res An rdma_addrinfo structure which returns information needed to establish communication
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_getaddrinfo provides transport independent address translation. It resolves the destination node and service address and returns information required to establish device communication. It is the functional equivalent of getaddrinfo.
Please note that either node or service must be provided. If hints are provided, the operation will be controlled by hints.ai_flags. If RAI_PASSIVE is specified, the call will resolve address information for use on the passive side of a connection.
The rdma_addrinfo structure is described under the rdma_create_ep routine.
Template:
void rdma_freeaddrinfo(struct rdma_addrinfo *res)
Input Parameters:
res The rdma_addrinfo structure to free
Output Parameters:
None
Return Value:
None
Description:
rdma_freeaddrinfo releases the rdma_addrinfo (res) structure returned by the rdma_getaddrinfo routine. Note that if ai_next is not NULL, rdma_freeaddrinfo will free the entire list of addrinfo structures.
Template:
int rdma_create_qp (struct rdma_cm_id *id, struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr)
Input Parameters:
id RDMA identifier.
pd Protection domain for the QP.
qp_init_attr Initial QP attributes.
Output Parameters:
qp_init_attr The actual capabilities and properties of the created QP are returned through this structure
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_create_qp allocates a QP associated with the specified rdma_cm_id and transitions it for sending and receiving. The actual capabilities and properties of the created QP will be returned to the user through the qp_init_attr parameter.
Notes:
The rdma_cm_id must be bound to a local RDMA device before calling this function, and the protection domain must be for that same device. QPs allocated to an rdma_cm_id are automatically transitioned by the librdmacm through their states. After being allocated, the QP will be ready to handle posting of receives. If the QP is unconnected, it will be ready to post sends.
See Also:
rdma_bind_addr, rdma_resolve_addr, rdma_destroy_qp, ibv_create_qp, ibv_modify_qp
Template:
void rdma_destroy_qp (struct rdma_cm_id *id)
Input Parameters:
id RDMA identifier.
Output Parameters:
None
Return Value:
None
Description:
rdma_destroy_qp destroys a QP allocated on the rdma_cm_id.
Notes:
Users must destroy any QP associated with an rdma_cm_id before destroying the ID.
See Also:
rdma_create_qp, rdma_destroy_id, ibv_destroy_qp
Template:
int rdma_join_multicast (struct rdma_cm_id *id, struct sockaddr *addr, void *context)
Input Parameters:
id Communication identifier associated with the request.
addr Multicast address identifying the group to join.
context User-defined context associated with the join request.
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_join_multicast joins a multicast group and attaches an associated QP to the group.
Notes:
Before joining a multicast group, the rdma_cm_id must be bound to an RDMA device by calling rdma_bind_addr or rdma_resolve_addr. Use of rdma_resolve_addr requires the local routing tables to resolve the multicast address to an RDMA device, unless a specific source address is provided. The user must call rdma_leave_multicast to leave the multicast group and release any multicast resources. After the join operation completes, any associated QP is automatically attached to the multicast group, and the join context is returned to the user through the private_data field in the rdma_cm_event.
See Also:
rdma_leave_multicast, rdma_bind_addr, rdma_resolve_addr, rdma_create_qp, rdma_get_cm_event
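The addr argument of rdma_join_multicast is expected to identify a multicast group. As a hedged illustration, an application could sanity-check an IP address before attempting the join, as sketched below. Both function names are hypothetical, only AF_INET and AF_INET6 are examined (AF_IB addresses are not handled), and the check says nothing about whether the join itself will succeed.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Return 1 if sa holds an IPv4 or IPv6 multicast address, else 0. */
int sockaddr_is_multicast(const struct sockaddr *sa)
{
    if (sa->sa_family == AF_INET) {
        struct sockaddr_in sin;
        memcpy(&sin, sa, sizeof(sin));
        /* IN_MULTICAST expects the address in host byte order */
        return IN_MULTICAST(ntohl(sin.sin_addr.s_addr)) ? 1 : 0;
    }
    if (sa->sa_family == AF_INET6) {
        struct sockaddr_in6 sin6;
        memcpy(&sin6, sa, sizeof(sin6));
        return IN6_IS_ADDR_MULTICAST(&sin6.sin6_addr) ? 1 : 0;
    }
    return 0;
}

/* Convenience wrapper for dotted-decimal IPv4 text.
 * Returns -1 if the text does not parse as an IPv4 address. */
int ipv4_text_is_multicast(const char *text)
{
    struct sockaddr_in sin;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    if (inet_pton(AF_INET, text, &sin.sin_addr) != 1)
        return -1;
    return sockaddr_is_multicast((const struct sockaddr *)&sin);
}
```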
Template:
int rdma_leave_multicast (struct rdma_cm_id *id, struct sockaddr *addr)
Input Parameters:
id Communication identifier associated with the request.
addr Multicast address identifying the group to leave.
Output Parameters:
None
Return Value:
0 on success, -1 on error. If the call fails, errno will be set to indicate the reason for the failure.
Description:
rdma_leave_multicast leaves a multicast group and detaches an associated QP from the group.
Notes:
Calling this function before a group has been fully joined results in canceling the join operation. Users should be aware that messages received from the multicast group may still be queued for completion processing immediately after leaving a multicast group. Destroying an rdma_cm_id will automatically leave all multicast groups.
See Also:
rdma_join_multicast, rdma_destroy_qp