VMA Library Architecture

The VMA library is a dynamically linked user-space library. Using it requires no code changes to user applications and no recompilation. Instead, it is loaded dynamically through the Linux environment variable LD_PRELOAD. It is also possible to load the VMA library dynamically without LD_PRELOAD, but this requires minor application modifications. Using VMA without code modification is described in Running VMA.
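As a rough illustration of the LD_PRELOAD mechanism, the sketch below builds an environment with a library preloaded and launches an unmodified program with it. The helper names `preload_env` and `run_preloaded` are hypothetical, and the default library name `libvma.so` assumes the library is installed where the dynamic loader can find it; adjust the name or path for your installation.

```python
import os
import subprocess


def preload_env(library="libvma.so", base_env=None):
    """Return a copy of the environment with `library` first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Keep any libraries that were already preloaded; put ours in front.
    env["LD_PRELOAD"] = f"{library}:{existing}" if existing else library
    return env


def run_preloaded(argv, library="libvma.so"):
    """Run an unmodified program with the library preloaded (assumed name)."""
    return subprocess.run(argv, env=preload_env(library), check=False)
```

Calling `run_preloaded(["./my_app"])` is then roughly equivalent to running `LD_PRELOAD=libvma.so ./my_app` from a shell, where `./my_app` stands in for any socket application.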

When a user application sends IPv4 TCP or UDP traffic, unicast or multicast, or listens for such traffic, the VMA library:

  • Intercepts the socket receive and send calls made on stream sockets and datagram sockets.

  • Implements the underlying work in user space (instead of allowing the buffers to pass on to the standard OS network kernel libraries).

VMA is built on the native RDMA verbs API. The native RDMA verbs have been extended to RDMA-capable Ethernet NICs, enabling packets to pass directly between the user application and the Ethernet NIC, bypassing the kernel and its TCP/UDP network stack.

You get the benefit of the native RDMA verbs API without making any changes to your applications. The VMA library does all the heavy lifting under the hood and redirects the data flow, while transparently presenting the same standard socket API to the application.
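To make "the same standard socket API" concrete, here is a minimal sketch of an ordinary UDP exchange written only against the standard socket calls. Nothing in it is VMA-specific, which is the point: this is the kind of code that runs unchanged whether or not the library is preloaded. The function name `udp_round_trip` is hypothetical, and the loopback address is used only to keep the sketch self-contained.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a local receiver and return what it received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # do not block forever if nothing arrives
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```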

The VMA library operates in a standard networking stack fashion to serve multiple network interfaces.

The VMA library determines the interface to use for a socket's traffic from the way the application calls the bind, connect, and setsockopt directives and from the route lookup set by the administrator. The library knows whether data is passing to or from a supported HCA or Ethernet NIC. If it is, the VMA library intercepts the call and does the bypass work. If the data is passing to or from an unsupported HCA or Ethernet NIC, the VMA library passes the call to the usual kernel libraries responsible for handling network traffic. Thus, the same application can listen on multiple HCAs or Ethernet NICs, without requiring any configuration changes for the hybrid environment.

The VMA library has an internal thread that performs general operations needed to maintain a high level of performance. These operations run in a thread separate from the main application thread.
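The decision described above can be pictured with a small toy model: look up the outgoing interface for a destination, then offload only if that interface is backed by supported hardware. This is not VMA's actual code. The routing table, the interface names, the set of supported interfaces, and the functions `route_lookup` and `choose_path` are all hypothetical, and a longest-prefix match is assumed for the route lookup.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),
    (ipaddress.ip_network("192.168.10.0/24"), "eth1"),
]

# Hypothetical set of interfaces backed by a supported HCA or Ethernet NIC.
SUPPORTED = {"eth1"}


def route_lookup(dest, routes=ROUTES):
    """Return the interface of the most specific matching route, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, ifname) for net, ifname in routes if addr in net]
    if not matches:
        return None
    _net, ifname = max(matches, key=lambda m: m[0].prefixlen)
    return ifname


def choose_path(dest, routes=ROUTES, supported=SUPPORTED):
    """Return "offload" for supported interfaces, otherwise "kernel"."""
    return "offload" if route_lookup(dest, routes) in supported else "kernel"
```

In this model, traffic to `192.168.10.7` leaves through `eth1` and is offloaded, while traffic matching only the default route falls back to the kernel path.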

The main activities performed by the internal thread are:

  • Poll the completion queue (CQ) if the application does not do so, to avoid packet drops

  • Synchronize the card clock with the system clock

  • Handle TCP housekeeping for the application's connections, such as sending acknowledgements and retransmissions (which should not impact the application's data path performance)

  • Handle the final closing of TCP sockets

  • Update VMA statistics tool data

  • Update epoll file descriptor contexts for available non-offloaded data

  • Handle bond management

Several parameters let the user configure the characteristics of the internal thread. See section VMA Configuration Parameters for a detailed description.
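The general pattern of an internal thread that runs periodic maintenance apart from the application's own thread can be sketched as follows. This illustrates the pattern only and says nothing about how VMA implements it. The class `HousekeepingThread`, its `tasks` list, and its `interval` argument are hypothetical and do not correspond to any VMA configuration parameter.

```python
import threading


class HousekeepingThread(threading.Thread):
    """Run a list of maintenance callables periodically in the background."""

    def __init__(self, tasks, interval=0.01):
        super().__init__(daemon=True)
        self._tasks = list(tasks)
        self._interval = interval
        self._stop_event = threading.Event()
        self.rounds = 0  # number of completed maintenance rounds

    def run(self):
        # Event.wait() returns False on timeout (run another round)
        # and True once stop() has been called (leave the loop).
        while not self._stop_event.wait(self._interval):
            for task in self._tasks:
                task()
            self.rounds += 1

    def stop(self):
        """Ask the thread to finish and wait until it has exited."""
        self._stop_event.set()
        self.join()
```

The application thread keeps doing its own work between `start()` and `stop()`, and the maintenance callables never run in its context.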

The following Internet socket types are supported:

  • Datagram sockets, also known as connectionless sockets, which use User Datagram Protocol (UDP)

  • Stream sockets, also known as connection-oriented sockets, which use Transmission Control Protocol (TCP)
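In standard socket terms, the two supported types correspond to `SOCK_DGRAM` and `SOCK_STREAM` on the IPv4 address family. The short sketch below shows that mapping. The helper `open_ipv4_socket` and its `"udp"`/`"tcp"` keys are hypothetical conveniences, not part of any VMA interface.

```python
import socket

# Map the protocol names used above to standard socket types.
SOCKET_TYPES = {
    "udp": socket.SOCK_DGRAM,   # datagram, connectionless
    "tcp": socket.SOCK_STREAM,  # stream, connection-oriented
}


def open_ipv4_socket(kind):
    """Create an IPv4 socket of the requested kind ("udp" or "tcp")."""
    return socket.socket(socket.AF_INET, SOCKET_TYPES[kind])
```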

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.