System Requirements

The platform and server requirements for GPUDirect RDMA are detailed in the following table:

Platform            Type and Version

HCAs                • NVIDIA® ConnectX®-4 (VPI/EN)
                    • NVIDIA® ConnectX®-4 Lx
                    • NVIDIA® ConnectX®-5 (VPI/EN)
                    • NVIDIA® ConnectX®-6 (VPI/EN)
                    • NVIDIA® ConnectX®-6 Dx
                    • NVIDIA® ConnectX®-6 Lx

GPUs                • NVIDIA® Tesla™ / Quadro™ K-Series or Tesla™ / Quadro™ P-Series GPU

Software/Plugins

Recommendations

Once the NVIDIA software components are installed, make sure that the GPUDirect kernel module is loaded on every compute system that will run jobs requiring GPUDirect RDMA. On systems that manage the module through a service script, run:

service nv_peer_mem status                                         

On other Linux distributions, run:

lsmod | grep nv_peer_mem                                         

Usually, the system startup service loads this kernel module by default. If the module is not loaded, GPUDirect RDMA will not work, and message communications will suffer very high latency.

In this case, to start the module, run:

service nv_peer_mem start                                              

Or, on other Linux distributions, run:

modprobe nv_peer_mem   
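
The check-then-load steps above can be combined into one small script. The following is a minimal sketch, not part of the official tooling: the helper names module_loaded and ensure_nv_peer_mem are hypothetical, and it assumes lsmod and modprobe are available and that the script runs with root privileges.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if module "$1" appears in the first column
# of an lsmod-style listing read from standard input.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Hypothetical helper: report whether nv_peer_mem is loaded and, if it is
# not, try to load it with modprobe (requires root privileges).
ensure_nv_peer_mem() {
    if lsmod | module_loaded nv_peer_mem; then
        echo "nv_peer_mem is loaded"
    else
        echo "nv_peer_mem is not loaded; attempting to load it" >&2
        modprobe nv_peer_mem
    fi
}
```

To use it, call ensure_nv_peer_mem on each compute node before launching the job. The match is made against the first column only, so a module that merely lists nv_peer_mem in its "Used by" column is not mistaken for the module itself.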

To achieve the best GPUDirect RDMA performance, both the HCA and the GPU must be physically located on the same PCIe I/O root complex.

To inspect the system's PCIe topology, either consult the system manual or run:

lspci -tv | grep NVIDIA
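
As a complement to reading the lspci tree by eye, the placement of two devices can be compared through sysfs. This is a rough sketch under stated assumptions: the helper names pci_root_of and same_pci_root are hypothetical, it assumes a Linux system where PCI devices resolve to paths of the form /sys/devices/pciDDDD:BB/..., and it treats a shared root bus as a proxy for a shared root complex, which may not hold on every platform.

```shell
#!/bin/sh
# Hypothetical helper: print the root-bus component (for example pci0000:80)
# of a resolved sysfs device path, or nothing if the path has another shape.
pci_root_of() {
    printf '%s\n' "$1" |
        sed -n 's|^/sys/devices/\(pci[0-9a-f]*:[0-9a-f]*\)/.*|\1|p'
}

# Hypothetical helper: succeeds if the two PCI addresses given as "$1" and
# "$2" (domain:bus:device.function) resolve to the same root bus.
same_pci_root() {
    a=$(pci_root_of "$(readlink -f "/sys/bus/pci/devices/$1")")
    b=$(pci_root_of "$(readlink -f "/sys/bus/pci/devices/$2")")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, same_pci_root 0000:3b:00.0 0000:5e:00.0 would compare a GPU and an HCA at those addresses; both addresses are placeholders, and the real ones can be taken from the lspci output.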