
    Introduction

This document describes how to enable PVRDMA in VMware vSphere 6.5/6.7 with NVIDIA ConnectX network cards.

    This guide assumes that the following software and drivers have been pre-installed:

    • VMware ESXi 6.7 Update 2, build 13006603
    • vCenter 6.7 Update 2, build 13007421
    • Distributed Switch 6.6.0
    • ConnectX® Ethernet Driver for VMware® ESXi Server 4.17.13.1-1vmw.670.2.48.13006603 
    • CentOS 7.6


    Components Overview

    vSphere Distributed Switch

    A vSphere Distributed Switch provides centralized management and monitoring of the networking configuration of all hosts that are associated with the switch. You must set up a distributed switch on a vCenter server system, and its settings will be propagated to all hosts that are associated with the switch.

    Paravirtual RDMA (PVRDMA)

    Direct Memory Access (DMA) - A device's capability to access the host memory directly, without the intervention of the CPU.

Remote Direct Memory Access (RDMA) - The ability to access memory (read, write) on a remote machine without interrupting the processing of the CPU(s) on that system.

    RDMA Advantages:

    • Zero-copy - Allows applications to perform data transfers without involving the network software stack. Data is sent and received directly to the buffers without being copied between the network layers.

    • Kernel bypass - Allows applications to perform data transfers directly from the user-space without the kernel's involvement.

• CPU offload - Allows applications to access remote memory without consuming any CPU time on the remote server. The remote server's memory is read without any intervention from the remote process (or processor). Moreover, the cache of the remote CPU is not filled with the accessed memory content.

PVRDMA Architecture

VM

  • A virtual PCIe device is exposed to the VM
  • PVRDMA device driver
  • InfiniBand user library
  • Provides direct HW access for the data path
  • RDMA API calls are proxied to the PVRDMA backend

PVRDMA backend

  • Creates virtual RDMA resources for the VM
  • Guests operate on these virtual resources

ESXi

  • Leverages the native RDMA and core drivers
  • Creates the corresponding resources in the HCA

Accelerating VM Data

  • VM memory address translations are registered with the HCA (buffer registration).
  • The application issues a request (Work Request) to read/write from a particular guest address and size.
  • The PVRDMA backend intercepts these requests and issues them to the mapped hardware resources.
  • The HCA performs DMAs to/from application memory without any SW involvement, enabling direct zero-copy data transfers in HW.

Solution Overview

Setup

Solution Logical Design

Bill of Materials

Solution Physical Network Wiring

    Configuration

    Network Configuration

    The below table provides the ESXi server names and details on their network configuration:

ESXi Server    Server Name      High-Speed Ethernet Network    Management Network (192.168.1.0)
ESXi-01        sl01w01esx21     none                           eno0: From DHCP (reserved)
ESXi-02        sl01w01esx22     none                           eno0: From DHCP (reserved)
ESXi-03        sl01w01esx23     none                           eno0: From DHCP (reserved)
ESXi-04        sl01w01esx24     none                           eno0: From DHCP (reserved)


    The below table provides the VM names and details on their network configuration:

VM Server    Name           High-Speed Ethernet Network    Management Network (192.168.1.0)
VM-01        pvrdma-vm01    192.168.11.51                  eno0: From DHCP (reserved)
VM-02        pvrdma-vm02    192.168.11.52                  eno0: From DHCP (reserved)
VM-03        pvrdma-vm03    192.168.11.53                  eno0: From DHCP (reserved)
VM-04        pvrdma-vm04    192.168.11.54                  eno0: From DHCP (reserved)

    ESXi Host Configuration

    Check the host configurations:

1. Enable SSH access to the ESXi server.
    2. Log into the ESXi vSphere Command-Line Interface with root permissions.
    3. Verify that the host is equipped with an NVIDIA adapter card:

      Code Block
      languagetext
      themeFadeToGrey
      titleESXi Console
~ lspci | grep Mellanox
      
      0000:02:00.0 Network controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] [vmnic2]
      0000:02:00.1 Network controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] [vmnic3]
      Info
      Note: in this case, the NVIDIA card is using vmnic2 and vmnic3.
4. Verify that the logical RDMA devices are currently registered on the system:

      Code Block
      languagetext
      themeFadeToGrey
      titleESXi Console
      ~ esxcli rdma device list
Name     Driver      State   MTU   Speed     Paired Uplink  Description
-------  ----------  ------  ----  --------  -------------  -----------------------------------
vmrdma0  nmlx5_rdma  Active  1024  100 Gbps  vmnic2         MT28800 Family [ConnectX-5 MT28831]
vmrdma1  nmlx5_rdma  Down    1024  0         vmnic3         MT28800 Family [ConnectX-5 MT28831]
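
If one of the RDMA devices shows as Down, it can help to confirm that the corresponding physical uplink has link before continuing. This is a hedged check using standard ESXi shell commands (vmnic2/vmnic3 are the uplinks identified above; output columns vary by ESXi build):

  Code Block
  languagetext
  themeFadeToGrey
  titleESXi Console
  ~ esxcfg-nics -l | grep vmnic2
  ~ esxcli network nic get -n vmnic2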

Deployment Guide

    Prerequisites

    vSphere

    Before starting the deployment process, a vSphere Distributed Switch (vDS) must be created:

    Creating a vDS

To create a new vDS, perform the following steps:

1. Launch the vSphere Web Client, and connect to a vCenter Server instance.
2. On the vSphere Web Client home screen, select the vCenter object from the list on the left. Hover over Distributed Switches in the Inventory Lists area, and click the New Distributed Switch icon (it looks like a switch with a green plus mark in the corner). This launches the New vDS creation wizard.
3. Provide a name for the new distributed switch, and select the location within the vCenter inventory where you would like to store the new vDS (a datacenter object or a folder). Click Next.
4. Select the version of the vDS you would like to create.
5. Set the number of uplink ports to 2, tick the Create a default port group box, and give a name to that group.
6. Click Next, and then click Finish.

Adding Hosts to the vDS

To add an ESXi host to an existing vDS:

1. Launch the vSphere Web Client, and connect to a vCenter Server instance.
2. Navigate to the list of distributed switches.
3. Choose the new distributed switch from the list of objects on the right, and select Add and Manage Hosts from the Actions menu.
4. Select the Add hosts button, and click Next.
5. Click on the New hosts option (a green plus icon) to add an ESXi host. This opens the Select New Host dialog box.
6. From the list of new hosts, tick the boxes with the names of the ESXi hosts you wish to add to the vDS. Click OK when you are done, and then click Next to continue.
7. In the next screen, make sure both the Manage physical adapters and Manage VMkernel adapters checkboxes are ticked. Click Next to continue.
8. Configure vmnic2 in each ESXi host as Uplink 1 for the vDS.
9. Create and attach the vmkernel adapter vmk2 to the sl01-w01-vds02-pvrdma vDS port group: click the green plus icon, select one of the existing networks, and click OK. Click Next.
10. Provide an IPv4 address and Subnet mask for the vmk2 vmkernel adapter.
11. Click Next until the wizard is finished.
12. Click Finish.
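
Once each host has vmk2 attached to the PVRDMA port group, it is worth confirming that the hosts see the distributed switch and can reach each other over the new vmkernel interfaces. A minimal sketch from an ESXi host's shell (replace <peer-vmk2-IP> with the IPv4 address assigned to vmk2 on another host):

  Code Block
  languagetext
  themeFadeToGrey
  titleESXi Console
  ~ esxcli network vswitch dvs vmware list
  ~ vmkping -I vmk2 <peer-vmk2-IP>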

    Configure an ESXi Host for PVRDMA

To use PVRDMA in vSphere 6.5/6.7, your environment must meet several configuration requirements.

To configure an ESXi host for PVRDMA, follow the steps below.

    Tag a VMkernel Adapter for PVRDMA

To tag a VMkernel adapter, select it and enable it for PVRDMA communication by performing the following steps:

1. In the vSphere Web Client, navigate to the host.
2. On the Configure tab, expand the System subheading, and click Advanced System Settings.
3. Locate Net.PVRDMAvmknic and click Edit.
4. Enter the value of the VMkernel adapter that you want to use, and click OK.
   In this example, vmk2 was used.


Note

Optional: to tag a vmknic created on the DVS and used by PVRDMA for the TCP channel, you can use the ESXi CLI by running the following command:

Code Block
languagetext
themeFadeToGrey
titleESXi Console
esxcli system settings advanced set -o /Net/PVRDMAVmknic -s vmk2
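
To confirm that the option is set as expected, you can read it back. A hedged example (the output formatting may differ between ESXi builds):

  Code Block
  languagetext
  themeFadeToGrey
  titleESXi Console
  esxcli system settings advanced list -o /Net/PVRDMAVmknic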

Enable the Firewall Rule for PVRDMA

To enable the firewall rule for PVRDMA in the security profile of the ESXi host, use the following procedure:

1. In the vSphere Web Client, navigate to the host.
2. In the Configure tab, expand the System subheading.
3. Go to the Security Profile → Firewall (6.7) or Firewall (6.5) section, and click Edit.
4. Scroll to the pvrdma rule, and tick the box next to it.
5. Click OK to finish.


Note

Optional: you can use the ESXi CLI to enable the pvrdma firewall rule with the following command:

    Code Block
    languagetext
    themeFadeToGrey
    titleESXi Console
    esxcli network firewall ruleset set -e true -r pvrdma
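
To verify that the rule is now enabled, you can list the ruleset. A hedged example (grep is used only to filter the output):

  Code Block
  languagetext
  themeFadeToGrey
  titleESXi Console
  esxcli network firewall ruleset list | grep pvrdma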



Assign a PVRDMA Adapter to a Virtual Machine

To enable a virtual machine to exchange data using RDMA, you must associate the VM with a PVRDMA network adapter. To do so:

1. Locate the VM in the vSphere Web Client:
   a. Select a data center, folder, cluster, resource pool or a host, and click on the VMs tab.
   b. Click Virtual Machines, and double-click the VM from the list.
2. Power off the VM.
3. In the Configure tab of the VM, expand the Settings subheading, and select VM Hardware.
4. Click Edit, and select the Virtual Hardware tab in the dialog box displaying the settings.
5. At the bottom of the window next to New device, select Network, and click Add.
6. Expand the New Network section, and connect the VM to a distributed port group.
7. For Adapter Type, select PVRDMA.
8. Expand the Memory section, and tick the box next to Reserve all guest memory (All locked).
9. Click OK to close the dialog window.
10. Power on the virtual machine.
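
After the VM powers on, you can optionally confirm from inside a Linux guest that the paravirtual RDMA PCIe device is visible before installing any drivers. A hedged check (the exact device description string printed by lspci depends on the pci.ids database in the guest; look for the additional VMware network controller alongside the existing vmxnet3 NICs):

  Code Block
  languagetext
  themeFadeToGrey
  titleVM Console
  lspci | grep -i vmware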

    Configure Guest OS for PVRDMA

Warning
A prerequisite of this step is assigning a PVRDMA adapter to a Virtual Machine running CentOS 7.2 or later (CentOS 7.6 is preferable) or Ubuntu 18.04.

To configure a Guest OS for PVRDMA, you need to install a PVRDMA driver.


    The installation process depends on the ESXi version, VM tools and Guest OS version: 

Guest OS: CentOS 7.3 and later, VM hardware version 14, ESXi v6.7


    1. Create a VM with VM Compatibility version 14, and install CentOS version 7.3 or later.
    2. Add the PVRDMA adapter over a DVS portgroup from the vCenter.
    3. Install the InfiniBand packages, and reload the pvrdma driver with the following command line:

      Code Block
      languagetext
      themeFadeToGrey
      titleVM Console
      yum groupinstall "Infiniband Support" –y
      rmmod vmw_pvrdma
      modprobe vmw_pvrdma
      ibv_devinfo
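
If ibv_devinfo does not list a vmw_pvrdma device at this point, it can help to confirm that the kernel module actually reloaded and to look at the kernel log. A hedged check:

  Code Block
  languagetext
  themeFadeToGrey
  titleVM Console
  lsmod | grep vmw_pvrdma
  dmesg | grep -i pvrdma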

Guest OS: CentOS 7.2, VM hardware version 13, ESXi v6.5
    1. Create a VM with VM Compatibility version 13, and install CentOS 7.2.
    2. Add the PVRDMA adapter over a DVS portgroup from the vCenter.
    3. Install the InfiniBand drivers with the following command line:

      Code Block
      languagetext
      themeFadeToGrey
      titleVM Console
      yum groupinstall "Infiniband Support" –y
4. Install the pvrdma driver:

      Code Block
      languagetext
      themeFadeToGrey
      titleVM Console
      tar xf vrdma_ib_devel.tar
      cd vrdma_ib_devel/
      make
      cp pvrdma.ko /lib/modules/3.10.0-327.el7.x86_64/extra/
      depmod –a
      modprobe pvrdma
5. Install the vrdma lib:

      Code Block
      languagetext
      themeFadeToGrey
      titleVM Console
      cd /tmp/
      tar xf libvrdma_devel.tar
      cd libvrdma_devel/
      ./autogen.sh
      ./configure --libdir=/lib64
      make
      make install
      cp pvrdma.driver /etc/libibverbs.d/
      rmmod pvrdma
      modprobe pvrdma
      ibv_devinfo

Guest OS: CentOS 7.2, VM hardware version 14, ESXi v6.7

1. Create a VM with VM Compatibility version 14, and install CentOS 7.2.
2. Add the PVRDMA adapter over a DVS portgroup from the vCenter.
3. Install the InfiniBand drivers:

  Code Block
  languagetext
  themeFadeToGrey
  titleVM Console
  yum groupinstall "Infiniband Support" –y
4. Upgrade the VM kernel to a version greater than 4.10:

  Code Block
  languagetext
  themeFadeToGrey
  titleVM Console
  rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
  rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
  yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
  yum --enablerepo=elrepo-kernel install kernel-ml
  reboot
  Edit /boot/grub/grub.conf with the new kernel version (vmlinuz and initramfs), as found in /boot/.
5. Install the rdma-core user-space libraries, and reboot.

Guest OS: Ubuntu 18.04, VM hardware version 14, ESXi v6.7

The vmw_pvrdma driver is already included in Ubuntu 18.04. The user-level libraries can be installed using:

  Code Block
  languagetext
  themeFadeToGrey
  titleVM Console
  apt-get install rdma-core

Info
In case the VM compatibility version does not match the above, upgrade the VM compatibility to version 13 in ESXi 6.5, and to version 14 in ESXi 6.7.
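
Optionally, after installing the libraries in the Ubuntu guest, you can confirm that the vmw_pvrdma device is usable. A minimal sketch, assuming the ibverbs-utils package (which provides ibv_devinfo) is available in the distribution:

  Code Block
  languagetext
  themeFadeToGrey
  titleVM Console
  apt-get install ibverbs-utils
  modprobe vmw_pvrdma
  ibv_devinfo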

    Deployment Verification

    To test the communication using PVRDMA, Perftest is used. This is a collection of tests written over uverbs intended for use as a performance micro-benchmark.

    The tests may be used for hardware or software tuning, as well as for functional testing.

    To install and run the benchmark:

    1. Install Perftest:

      Code Block
      languagetext
      themeFadeToGrey
      titleVM Console
      yum install git
      git clone https://github.com/linux-rdma/perftest.git
      cd perftest/
      yum install autotools-dev automake
      yum install libtool
      yum install libibverbs-devel
      ./autogen.sh
      ./configure
      make -j 8
    2. Check the network interface name:

      Code Block
      languagetext
      themeFadeToGrey
      titleVM Console
ifconfig
      ...
      ens224f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
3. Add a static IP configuration to the network interface by modifying /etc/sysconfig/network-scripts/ifcfg-ens224f0:

  Code Block
  languagetext
  themeFadeToGrey
  titleAdd to file
  HWADDR=00:50:56:aa:65:92
  DNS1=192.168.1.21
  DOMAIN=vwd.clx
  BOOTPROTO="static"
  NAME="ens224f0"
  DEVICE="ens224f0"
  ONBOOT="yes"
  USERCTL=no
  IPADDR=192.168.11.51
  NETMASK=255.255.255.0
  PEERDNS=no
  IPV6INIT=no
  IPV6_AUTOCONF=no

  Then verify the address with ping 192.168.11.51.
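
For the new ifcfg settings to take effect, the interface typically needs to be brought up again. A minimal sketch, assuming the standard CentOS 7 network service (NetworkManager-managed setups may differ):

  Code Block
  languagetext
  themeFadeToGrey
  titleVM Console
  systemctl restart network
  ip addr show ens224f0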

    Repeat steps 1-3 for the second VM.

    On the first VM ("Server"), run the following:

    Code Block
    languagetext
    themeFadeToGrey
    titleVM01 "Server" Console
systemctl stop firewalld
systemctl disable firewalld
    firewall-cmd --state
    ./ib_write_bw -x 0 -d vmw_pvrdma0 --report_gbits

    On the second VM ("Client"), run the following:

    Code Block
    languagetext
    themeFadeToGrey
    titleVM02 "Client" Console
    ./ib_write_bw -x 0 -F 192.168.11.51 -d vmw_pvrdma0 --report_gbits
    ************************************
    
    * Waiting for client to connect... *
    ************************************
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : vmw_pvrdma0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 1024[B]
 Link type       : Ethernet
 GID index       : 0
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0004 PSN 0xfb9486 RKey 0x000005 VAddr 0x007f68c62a1000
 GID: 254:128:00:00:00:00:00:00:02:80:86:255:254:170:101:146
 remote address: LID 0000 QPN 0x0002 PSN 0xe72165 RKey 0x000003 VAddr 0x007f2ab4361000
 GID: 254:128:00:00:00:00:00:00:02:80:86:255:254:170:58:174
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]    MsgRate[Mpps]
 65536      5000           90.56              90.39                 0.172405
---------------------------------------------------------------------------------------
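
Bandwidth is only one side of the picture; the perftest suite also ships latency benchmarks. As an optional extra check (same assumptions as above: device vmw_pvrdma0, GID index 0, server IP 192.168.11.51), you could run:

  Code Block
  languagetext
  themeFadeToGrey
  titleVM Console
  # on the first VM ("Server")
  ./ib_write_lat -x 0 -d vmw_pvrdma0
  # on the second VM ("Client")
  ./ib_write_lat -x 0 -F 192.168.11.51 -d vmw_pvrdma0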


    Done!

    Authors

Boris Kovalev
