RDG for VMware vSphere 8.0 with NSX 4.x Accelerated by NVIDIA Networking

Created on Dec 12 2022

Scope

This document provides details on the Enhanced Data Path (EDP) mode configuration on a VDS switch, using an NVIDIA network fabric to achieve a high performance data plane.

Note

VMware ESXi, vSphere Cluster, vCenter and NSX installation and configuration are out of the scope of this document.

Abbreviations and Acronyms

DPDK: Data Plane Development Kit

PMD: Poll Mode Driver

DAC: Direct Attach Copper

SR-IOV: Single Root I/O Virtualization

EDP: Enhanced Data Path

RSS: Receive Side Scaling

ENS: Enhanced Networking Stack

TEP: Tunnel Endpoint

FPO: Flow Processing Offload

VDS: vSphere Distributed Switch

MPPS: Million Packets per Second

VM: Virtual Machine

NFV: Network Function Virtualization

VF: Virtual Function

NIC: Network Interface Card

VNF-C: Virtual Network Function Component

NUMA: Non-Uniform Memory Access

vNIC: Virtual Network Interface

Introduction

This document describes how to configure a VMware vSphere 8.0 cluster and VMware NSX 4.0.1.1 with the NVIDIA ConnectX-6 Dx network adapter. It covers configuring ENS Model 1 and Model 0 for a DPDK application running in an Ubuntu 20.04 virtual machine on top of the ESXi 8.0 GA native driver.

This guide assumes that the software and drivers listed in the Setup Configuration section below are already installed.

ENS Model 0

In Model 0 there is no hardware offload. All processing is performed in software (ENS).

Model 0 is supported on NVIDIA ConnectX-4 and newer cards from ESXi Version 6.7 and above, and supports the following features:

  • RSS (from ESXi Version 7.0u2).

  • Offload of inner and/or outer checksum validation to NVIDIA ConnectX hardware.

ENS Model 1

In ENS Model 1, NIC offloads support partial flow processing, such as packet classification and decapsulation on Rx. Therefore, even when the flow processing is offloaded to the hardware, the VMkernel networking stack must still see the packets and finish the processing on the Rx path, or preprocess them for encapsulation offload.

ENS Model 1 works with GENEVE but not with VXLAN.

EDP Modes

Interrupt mode: In this mode, the network interface card (NIC) driver receives packets through interrupt signals. Although this mode has relatively high latency, it does not consume significant CPU overhead.

Poll mode: In this mode, the NIC driver constantly checks the NIC hardware for incoming packets. This mode offers low latency reception of packets but requires dedicated CPU cores to continuously poll for incoming packets.

Model 1 supports both modes — Poll mode and Interrupt mode.

Solution Architecture

Key Components and Technologies

  • NVIDIA Cumulus Linux

    NVIDIA® Cumulus® Linux is the industry's most innovative open network operating system that allows you to automate, customize, and scale your data center network like no other.

  • NVIDIA Spectrum Ethernet Switches

    Flexible form-factors with 16 to 128 physical ports, supporting 1GbE through 400GbE speeds.
    Based on a ground-breaking silicon technology optimized for performance and scalability, NVIDIA Spectrum switches are ideal for building high-performance, cost-effective, and efficient Cloud Data Center Networks, Ethernet Storage Fabric, and Deep Learning Interconnects.
NVIDIA combines the benefits of NVIDIA Spectrum switches, based on an industry-leading application-specific integrated circuit (ASIC) technology, with a wide variety of modern network operating system choices, including NVIDIA Cumulus® Linux, SONiC and NVIDIA Onyx®.

  • NVIDIA ConnectX SmartNICs
    10/25/40/50/100/200 and 400G Ethernet Network Adapters
The industry-leading NVIDIA® ConnectX® family of smart network interface cards (SmartNICs) offers advanced hardware offloads and accelerations.
    NVIDIA Ethernet adapters enable the highest ROI and lowest Total Cost of Ownership for hyperscale, public and private clouds, storage, machine learning, AI, big data, and telco platforms.

  • NVIDIA LinkX Cables

The NVIDIA® LinkX® product family of cables and transceivers provides the industry's most complete line of 10, 25, 40, 50, 100, 200, and 400GbE Ethernet and 100, 200 and 400Gb/s InfiniBand products for Cloud, HPC, hyperscale, Enterprise, telco, storage, and artificial intelligence data center applications.

  • NSX Data Center

    NSX is a multi-hypervisor capable networking and security platform. It is a highly extensible platform that can address endpoint heterogeneity including containers, Kubernetes support, public clouds and cross cloud services with AWS, and other hypervisors.

  • DPDK

    DPDK is a set of libraries and drivers that support accelerated software processing for greater throughput and scalability.
    DPDK enables higher levels of packet processing throughput than what is achievable using the standard Linux kernel network stack.
    This optimized library gives application developers the ability to address challenging data plane processing needs that are typically found in Telecom workloads.

  • Enhanced Datapath (EDP)

    Enhanced Network Stack (also appears as Enhanced Datapath) is a networking stack mode which provides superior network performance when configured and enabled. It is primarily utilized in NFV workloads, which require the performance benefits this mode provides.
    ENS utilizes the DPDK Poll Mode driver model to significantly improve packet rate and latency for small message sizes.

    • Enhanced Datapath - Standard: This mode is a variant of the Enhanced Data Path mode. It is available only on ESXi hypervisor 7.0 and later versions. Please consult your account representative for applicability.

    • Enhanced Datapath - Performance: This is the Enhanced Data Path switch mode for ESXi host transport node. This mode provides accelerated networking performance. It requires nodes to use VMXNET3 vNIC enabled network cards. It is not supported on NSX Edge nodes and Public Gateways. The supported hypervisor is ESXi. It is recommended to run ESXi v6.7 U2 and later versions.

  • TRex - Realistic Traffic Generator

    TRex is an open-source, low-cost, stateful, and stateless traffic generator fueled by DPDK. It generates L4-7 traffic based on pre-processing and smart replay of L7 traffic templates. TRex amplifies both client and server side traffic and can scale up to 200Gb/sec with one UCS.

Logical Design

The setup used here includes 4 ESXi servers connected to one NVIDIA® Spectrum®-2 SN3000 Ethernet switch.

Logical_Design.png

Host Network Design

Host_network_design.png

Bill of Materials

vSphere Cluster

BoM.png

Deployment and Configuration

Wiring

wiring.png

Setup Configuration

This document does not cover highly available VMware vSphere cluster deployment.

Before starting the configuration process, make sure you are familiar with VMware vSphere, vCenter and NSX deployment and management procedures.

This guide does not cover the required vSphere and vCenter installation and configuration, as these are often based on customer's requirements.

The installation process requires administrator privileges on the target machine.

In addition, the following components are required:

1. ESXi server platform with NVIDIA® ConnectX®-6 Dx network adapter with inbox ESXi driver and firmware version 22.34.1002.

2. NVIDIA® Spectrum®-2 SN3000 Ethernet switch.

3. VMware vSphere 8.0 Cluster installed and configured.

4. VMware vCenter 8.0a installed and configured.

5. VMware NSX 4.0.1.1.

Network

Prerequisites

Switch OS
NVIDIA Cumulus 5.1.

Management Network
DHCP and DNS services are required.

Warning

The components installation and configuration are not covered in this guide.

Network Switch Configuration

Switch Configuration

Run the following commands on both Leaf switches in the vSphere Cluster to configure the VLANs.

Switch console

nv set interface swp1-32 bridge domain br_default
nv set bridge domain br_default vlan 215,1620,1624
nv set bridge domain br_default untagged 215
nv config apply
nv config save
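To confirm that the VLANs were applied to the bridge, the running configuration can be reviewed from the same NVUE CLI. This is a minimal verification sketch; the exact command paths and output layout may differ between Cumulus Linux 5.x releases.

Switch console

nv config show
nv show bridge domain br_default vlan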

Hosts Preparation

Hosts in the vSphere Cluster must be configured before a data plane intensive workload can be attached to the VDS Enhanced switch.

To prepare the host for setup:

  • Physical server configuration

    All ESXi servers must have the same PCIe placement for the NIC and expose the same interface name.

  • vSphere cluster with minimum 3 VMware vSphere ESXi 8.0 hosts

  • vCenter 8.0 server

    Warning

Installation of vCenter and ESXi hosts, and configuration of the vSphere cluster, are beyond the scope of this document.

  • Installer privileges: The installation requires administrator privileges on the target machine

  • Connection to ESXi host management interface

  • High speed network connectivity

  • Verify that NTP is configured in your environment, and that it works properly.

    NTP_on_ESXi.PNG

    NTP_on_vCenter.PNG

  • Make sure to:

    • Disable Virtualization (SR-IOV) in the BIOS (prerequisites).

    • Disable SR-IOV in the firmware and in the MLNX_OFED driver (a hedged mlxconfig sketch is shown below). See the following document for further information.
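If NVIDIA Firmware Tools (MFT) are installed on the host, SR-IOV can be disabled in the adapter firmware from the ESXi shell. This is a hedged sketch: the device name (mt4125_pciconf0) is an example and should be taken from the output of mst status, and a host reboot is required for the new firmware settings to take effect.

ESXi CLI

# List MST devices and query the current SR-IOV settings
/opt/mellanox/bin/mst status
/opt/mellanox/bin/mlxconfig -d mt4125_pciconf0 query | grep -E "SRIOV_EN|NUM_OF_VFS"

# Disable SR-IOV in the firmware (takes effect after a reboot)
/opt/mellanox/bin/mlxconfig -d mt4125_pciconf0 set SRIOV_EN=0 NUM_OF_VFS=0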

To ensure optimal performance:

  • Configure BIOS for optimal performance.

  • Enable CPU hyperthreading.

  • Enable Turbo Boost.

  • Disable NUMA node interleaving.

  1. Power Management: Set this setting to "High" or "Maximum Performance" (the exact wording depends on the vendor) to ensure that the CPUs always run at least at the base frequency and use the shallowest idle states. The VMworld 2019 "Extreme Performance Series: Performance Best Practices (HBI2526BE)" presentation is an excellent source of information about Power Management technology. Its major conclusion, however, does not apply to this workload, which may not benefit from higher maximum Turbo Boost frequencies and could be at greater risk of jitter.

  2. Enable Hyperthreading on the ESXi server:

    Enable this setting on systems that support it. Hyperthreading allows a single processor core to run two independent threads simultaneously. On processors with hyperthreading, each core can have two logical threads that share the core's resources, such as memory caches and functional units. BIOS providers might refer to the hyperthreaded core as a ‘Logical Processor’.

  3. Turbo Boost: Enable this setting in the BIOS. It allows the processor to operate faster than the rated frequency for peak loads. For more information about Turbo Boost, see Frequently Asked Questions on the Intel Turbo Boost Technology page on Intel's website.

  4. NUMA Node Interleaving: Ensure that this setting is disabled. With the NUMA node interleaving setting enabled, the hypervisor sees the available memory as one contiguous area. Therefore, the ability to place memory pages local to the CPU is lost, and the hypervisor sees all resources on the host as local.

Network Adapter Compatibility

To check the NIC's model, go to the vSphere Client, and navigate to:

ESXi server > "Configure" > "Networking" > "Physical Adapters".

Look for the relevant device on the right pane.
For example, in the below image, the MCX623106AE-CDA adapter card OPN is shown.

nic_compatibility_01.png

To check if the NIC supports Enhanced Data Path, go to the VMware Compatibility Guide, and follow the below steps:

  1. Go to "Product Release Version", and select the ESXi version.

  2. Go to "Features", and select "Enhanced Data Path - Interrupt Mode" and "Enhanced Data Path – Poll Mode".

  3. Under Keyword, insert the NIC OPN.

  4. Look for your NIC OPN in the list under "I/O Device and Model Information".

    nic_compatibility_02.png

Click the model link to check the driver and firmware versions.

nic_compatibility_03.png

Network Adapter Firmware update

To update the firmware to version 22.34.1002, please refer to the following link: HowTo Install NVIDIA Firmware Tools (MFT) on VMware ESXi 6.7 and 7.0 - Solutions - NVIDIA Networking Docs.
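Once MFT is installed, the firmware version currently running on the adapter can also be queried directly from the ESXi shell. A minimal sketch (the path assumes the default MFT installation location):

ESXi CLI

/opt/mellanox/bin/mlxfwmanager --query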

Hosts Network Configuration

Prerequisites

  • A vSphere cluster (SL-WL01-Cluster) with VMware vSphere ESXi 8.0 or above hosts.

  • vCenter 8.0a or above.

  • The installation requires administrator privileges on the target machine.

  • Connection to ESXi host management interface.

  • High speed network connectivity.

This table provides details on the ESXi servers and switches in the SL-WL01-Cluster cluster, system names and their network configurations.

Server  | Name         | High-Speed Ethernet Network    | Management Network (192.168.1.0/24)
--------|--------------|--------------------------------|------------------------------------------
ESXi-01 | sl01w01esx11 | vmk1: 192.168.22.111 (vMotion) | vmk0: 192.168.1.111 from DHCP (reserved)
ESXi-02 | sl01w01esx12 | vmk1: 192.168.22.112 (vMotion) | vmk0: 192.168.1.112 from DHCP (reserved)
ESXi-03 | sl01w01esx13 | vmk1: 192.168.22.113 (vMotion) | vmk0: 192.168.1.113 from DHCP (reserved)
ESXi-04 | sl01w01esx14 | vmk1: 192.168.22.114 (vMotion) | vmk0: 192.168.1.114 from DHCP (reserved)
Leaf-01 | clx-swx-035  | -                              | 10.7.215.37

Creating a Distributed Switch for DPDK Traffic

To create a new vDS:

  1. Launch the vSphere Web Client, and connect to a vCenter Server instance.

    creating_vds_01.png

  2. Create a distributed switch for DPDK traffic.

    creating_vds_02.png

  3. Provide a name for the new distributed switch, and select the location in which you would like to store the new vDS within the vCenter inventory (a data center object or a folder).

    Click NEXT.

    creating_vds_03.png

  4. Select the version of the vDS to create.

    Click NEXT.

    creating_vds_04.png

  5. Specify the Network Offloads compatibility as None, and the number of uplink ports as 2. Uncheck the Create a default port group box, and enter a name for that group.

    Click NEXT.

    creating_vds_05.png

  6. Click Finish.

    creating_vds_06.png

  7. Set the MTU for the newly created distributed switch.

    Right-click the new distributed switch in the list of objects, and select Settings → Edit Settings... from the Actions menu.

    creating_vds_07.png

  8. In the Edit Settings dialog box for the new distributed switch, set the MTU to 9000, the Discovery Protocol to Link Layer Discovery Protocol, and Operation to Both.

    Click OK.

    creating_vds_08.png

Adding Hosts to a vDS

To add an ESXi host to an existing vDS:

  1. Launch the vSphere Web Client, and connect to a vCenter Server instance.

  2. Navigate to the list of hosts in the cluster, and select an ESXi host.

  3. Select Configure → Networking → Physical adapters.

  4. Check the network ports that you are going to use. In this case, vmnic4 and vmnic5 are used.

    image2023-1-29_13-50-43.png

  5. Navigate to the list of distributed switches.

  6. Right-click on the new distributed switch in the list of objects, and select Add and Manage Hosts from the Actions menu.

    adding_host_to_vds_02.png

  7. Select the Add hosts button, and click NEXT .

    adding_host_to_vds_03.png

  8. From the list of the new hosts, check the boxes with the names of each ESXi host you would like to add to the VDS.

    Click NEXT.

    adding_host_to_vds_04.png

  9. In the next Manage physical adapters menu, click on Adapters on all hosts, and configure the adapters (in this case, vmnic4 and vmnic5) on each ESXi host as Uplink 1 and Uplink 2 for the VDS.

    adding_host_to_vds_05.png

  10. In the next Manage VMkernel adapters and Migrate VM networking menus, click NEXT to continue.

  11. Click FINISH.

    adding_host_to_vds_06.png
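Optionally, the new distributed switch, its MTU, and its uplinks can be verified from any member ESXi host. A minimal sketch using a standard esxcli command:

ESXi CLI

esxcli network vswitch dvs vmware list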

Warning

Adding a Port Group and a VMkernel Network for vMotion traffic is out of the scope of this document.

VMware NSX Manager Installation and Configuration

Prerequisites

Below are the prerequisites for deploying an NSX Manager.

  • Necessary resources to deploy NSX-Manager appliance on vCenter.

  • Port group for the Management network. It is preferable to keep it on the same network as vCenter.

  • Allocate an IP address for the NSX Manager. Allocate 4 IPs if you plan to grow to a 3-node cluster.

    In this guide we use only 1 NSX Manager.

  • NSX licenses and media OVA file downloaded from the VMware website.

  • Create a DNS A record for the NSX-Manager Hostnames.

Network Latency Requirements

The maximum network latency between NSX Managers in an NSX Manager cluster is 10ms.

The maximum network latency between NSX Managers and Transport Nodes is 150ms.

Storage Requirements

  • Disk access latency must be under 10ms.

  • It is recommended to place the NSX Managers on shared storage.

  • Storage should be highly available, so that a storage failure does not cause all NSX Manager file systems to be set to read-only mode.

    Please consult relevant documentation for your storage technology on how to optimally design a highly available storage solution.

Deployment

  1. Download the NSX manager OVA from my.vmware.com.

    NSXT_Install_00.png

  2. Login to the vCenter.

  3. Create a new VM Folder.

    NSXT_Install_00b.png

  4. Click VMs and Templates -> right-click the appropriate folder (where you would like to deploy the NSX Manager) -> click Deploy OVF Template.

    NSXT_Install_01.png

  5. Select Local file, and click on UPLOAD FILES.

    NSXT_Install_02.png

  6. Browse to the OVA file for the NSX unified appliance. Click on Open and NEXT.

    NSXT_Install_03.png

    NSXT_Install_04.png

  7. Assign a VM name, and select the folder in which you wish to place the NSX manager. Click NEXT.

    NSXT_Install_05.png

  8. Select the compute resource where you wish to place your NSX Manager, i.e. the cluster, resource pool or host. Verify that the compatibility check succeeds, and click NEXT.

    NSXT_Install_06.png

  9. Review the details, and click NEXT.

    NSXT_Install_07.png

  10. Select deployment size (Medium), and click NEXT.

    Warning

    When selecting "Small" as the deployment configuration, some of the services do not work. Always use "Medium" or "Large".

    NSXT_Install_08.png

  11. Select the storage/datastore where the NSX Manager should be placed. Click NEXT.

    NSXT_Install_09.png

  12. Select the Management network, and click NEXT to move to the "Customize Template" screen.

    NSXT_Install_10.png

  13. Specify the GRUB root password, and scroll down.

    Important

    A complex password (minimum 12 characters) is required.

    NSXT_Install_11.png

  14. Specify the root password, and scroll down.

    NSXT_Install_12.png

  15. Specify the admin account password, and scroll down.

    NSXT_Install_13.png

  16. Specify the audit account password, and scroll down.

    NSXT_Install_14.png

  17. Provide the Hostname, Rolename (the NSX Manager has 3 roles, as seen below) and networking details. Then, scroll down.

    NSXT_Install_15.png

  18. Assign the DNS and NTP details, and choose whether you need SSH to be enabled on the NSX Manager. Click NEXT.

    NSXT_Install_16.png

  19. Review the details, and click FINISH to complete the wizard. Wait for the NSX manager appliance to be successfully deployed.

    NSXT_Install_17.png

As seen below, the NSX Manager deployment completes within 5-10 minutes.

NSXT_Install_fin.png

Once the appliance is successfully deployed, power it on and wait at least 15 minutes for all the services to come up.

Once the services are up, a login prompt is shown. At this point, wait about 10 more minutes, as the backend services must start the web portal-related services.

Post Deployment Health Checks

Once done, open the console and log in with the admin credentials using PuTTY. You will be shown the version number and the role.

Next, verify the network configuration by using the command below.

NSX-T Manager Console


get interface eth0

nsxt_validation_01.png

Check the services status by running the following command.

NSX-T Manager Console


get services

nsxt_validation_02a.png

Warning

Liagent, migration-coordinator, and SNMP services are not started by default.
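The overall state of the management cluster can be checked from the same console. A minimal sketch; with a single NSX Manager deployed, the cluster reports only one node:

NSX-T Manager Console

get cluster status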

Configuration

Before starting to configure NSX, make sure that all ESXi hosts in the cluster are compliant.

NSXT_config_00.png

  1. Log in to the NSX Manager UI by using the following URL: https://<NSX-Manager-IP-or-FQDN>.

    NSXT_config_01.png

  2. Accept the EULA to log into your first NSX manager.

    NSXT_config_02.png

  3. Here you can decide whether you want to join the Customer Experience Improvement Program.

    NSXT_config_03.png

  4. You may take a look at the "What's new" page, or skip it if you wish to move on.

  5. Finally, the NSX manager interface should open up.

    NSXT_config_04.png

  6. To add your NSX license, navigate to System -> Licenses, and click on + ADD LICENSE.

    NSXT_config_05.png

  7. Add a license, and click on ADD.

    NSXT_config_06.png

    NSXT_config_07.png

Add a Compute Manager

  1. To add a vCenter as a compute manager, navigate to System -> Fabric -> Compute Managers, and click on ADD COMPUTE MANAGER.

    Add_Compute_Manager_01.png

  2. Fill in the vCenter details, and click ADD.

    Add_Compute_Manager_02.png

  3. In the vCenter server thumbprint window, click on ADD.

    Add_Compute_Manager_03.png

  4. Wait for the registration to complete. You will see the status change to registered, and the connection status set to "Up".

    Add_Compute_Manager_04.png


Optional

To deploy additional NSX manager nodes, navigate to System → Appliances, and click on Add NSX appliance.

Add_NSX_Appliance_01.png

Fill in the additional manager details, such as Hostname, Management IP/Netmask, Gateway, DNS and NTP, and select the Node Size of the manager (the size should be identical to the size of the first manager). Click NEXT.

Add_NSX_Appliance_02.png

Add_NSX_Appliance_03.png

Select the appliance placement and network configurations, and click NEXT.

Add_NSX_Appliance_04.png

Add_NSX_Appliance_05.png

Complete the NSX Manager passwords for access (use the same password used in the first manager), and enable SSH access.

Click on INSTALL APPLIANCE.

Add_NSX_Appliance_06.png

Repeat the same steps for deploying a third NSX manager node. Wait for both nodes to be deployed, and for the joined cluster to be created.

Assign the cluster VIP for ease of management. Note that the cluster VIP does not provide load balancing across the managers.

Add_NSX_Appliance_07.png

Add_NSX_Appliance_08.png

The deployment of the NSX Manager and controller cluster is now complete.

Enhanced Data Path Configuration

Network administrators must prepare the network with the supported NICs and drivers before creating VDS enabled transport zones in the enhanced data path mode.

Uplink Profiles

An uplink profile defines the policies for links from hypervisor hosts to the NSX logical switches.

The settings defined by these profiles may include teaming policies, active/standby links, transport VLAN ID and MTU settings.

Uplink profiles enable consistent configuration of identical capabilities for network adapters across multiple hosts and nodes. By default, NSX provides two uplink profiles, but since they cannot be edited, new profiles should be created for both the Edge uplinks and the host uplinks.

See Create an Uplink Profile for further details.

Create a Host Uplink Profile

To create a host uplink profile in the NSX Manager, navigate to System → Fabric → Profiles → Uplink Profiles → +ADD.

Create_Host_Uplink_Profile_01.png

  1. Assign a name to the profile, and fill the description (optional).

  2. Under "Teamings", set the Teaming Policy to "Failover Order".

  3. Set the Active Uplinks to uplink-1 and the Standby Uplinks to uplink-2.

The Transport VLAN will be an Overlay VLAN ID (in our example - 1624), since these uplinks are connected directly to the hosts, and must be tagged accordingly.

Create_Host_Uplink_Profile_02.PNG

Create_Host_Uplink_Profile_03.png

Transport Zones

Transport Zones dictate which hosts (and consequently, which VMs) can participate in the use of a particular network. There are two types of transport zones - an Overlay and a VLAN.

  1. The overlay transport zone is used by both host transport nodes and NSX Edges, and is responsible for communication over the overlay network.

  2. The VLAN transport zone is used by the NSX Edge for its VLAN uplinks.

Both types create a VDS on the host or Edge to allow virtual-to-physical packet flow by binding logical router uplinks and downlinks to physical NICs. For more information, please see Transport Zones and Profiles.

Creating a Transport Zone

To create a Transport Zone in the NSX Manager, navigate to System → Fabric → Transport Zones → +ADD.

Create_Transport_Zone_01.PNG

Transport Zone with Overlay Traffic Type

Provide a Name, and select the Traffic Type as Overlay (Geneve). Click ADD.

Create_Transport_Zone_02.PNG

Create_Transport_Zone_03.png

Important

Traffic Type as VLAN is out of the scope of this document.

Creating an NSX IP Pool

Each transport node (i.e., each hypervisor) is assigned an IP address for its TEP interface.

You can use DHCP, a static IP list, or an IP address pool to assign IP addresses to the TEP (Tunnel Endpoint) interfaces.

To configure the ESXi hosts as transport nodes, create an NSX IP Pool to assign IP addresses for the TEP interfaces.

To create an IP Pool in the NSX Manager:

  1. Navigate to Networking → IP Address Pools → ADD IP ADDRESS POOL .

    Create_NSX-T_IP_Pool_01.png

  2. Specify the name and description of the IP address pool, and click "Set".

    Create_NSX-T_IP_Pool_02.png

  3. Click ADD SUBNET → Select "IP Ranges".

    Create_NSX-T_IP_Pool_03.png

  4. Specify the IP address ranges, along with the CIDR and the gateway IP address for the specified ranges. Click "ADD".

    Create_NSX-T_IP_Pool_04.png

  5. Click Apply.

    Create_NSX-T_IP_Pool_05.png

  6. Click “Save” to create the IP Pool.

    Create_NSX-T_IP_Pool_06.png

    Once the IP Pool is created, the status changes to "Success".

    Create_NSX-T_IP_Pool_07.png

Host Transport Node

A host transport node is a node that participates in NSX overlay or VLAN networking.

VDS Modes:

  • Standard applies to all the supported hosts.

  • Enhanced Datapath - Standard is an interrupt driven variant of the Enhanced Data Path mode.

  • Enhanced Datapath - Performance is the Enhanced Data Path switch mode. This mode provides accelerated networking performance, but also introduces additional prerequisites. In order to benefit from this mode, workloads must be compiled with DPDK and use VMXNET3 for their vNIC.

Installing NSX components on a Host Transport Node

To add a Host Transport Node:

1. Ensure that at least one Physical NIC (pNIC) is available for NSX. In the below example, the host is configured with a ConnectX-6 Dx pNIC, vmnic0 on SL-WL01-Cluster.

2. Make sure the ConnectX-6 Dx card is running firmware version 22.34.1002:

    1. Enable SSH Access to ESXi server.

    2. Log into ESXi vSphere Command-Line Interface with root permissions.

    3. Verify that the host is equipped with an NVIDIA adapter card.

      ESXi CLI


      lspci | grep Mellanox

Add_Host_Transport_Node_01.png

3. Verify that driver firmware version 22.34.1002 is installed.

ESXi CLI


esxcli network nic get -n vmnic4

Add_Host_Transport_Node_01b.png

If the above driver versions are not installed, refer to the below user guides to learn how to upgrade your drivers to the latest versions: HowTo Install NVIDIA Firmware Tools (MFT) on VMware ESXi 6.7 and 7.0 - Solutions - NVIDIA Networking Docs
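To see all physical NICs together with their driver and link state in a single view, and to confirm which vmnic numbers map to the ConnectX-6 Dx ports, the standard esxcli NIC listing can be used. A minimal sketch:

ESXi CLI

esxcli network nic list
esxcli network nic get -n vmnic5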

4. Push out the necessary NSX components to the hosts.

In the NSX Manager, navigate to System → Fabric → Nodes → Host Transport Nodes, and from the “Managed By” dropdown menu, change the Managed by from Standalone Hosts to the vCenter server you just added.
By expanding the name of the cluster, you will be able to see that your hosts are not in a prepared state and are not connected to the NSX Controller.

Select the Cluster and click Configure NSX.

Add_Host_Transport_Node_02.png

5. Click on Create New Transport Node Profile.

Add_Node_02.PNG

6. Provide the below details in the Add Transport Node Profile window:

    • Provide a profile Name (SL-WL01-TNP-EDP-STD).

    • Select the vCenter and the VDS name.

    • From the Transport Zone dropdown menu, select the SL-WL01-Overlay-TZ - Overlay transport zone(s) created earlier.

    • From the Uplink Profile dropdown menu, select the host uplink profile created earlier (Uplink-profile-01).

    • From the IP Assignment dropdown menu, select Use IP Pool.

    • From the IP Pool dropdown menu, select the SL-WL01-TEP-IPpool Tunnel EndPoint (TEP) IP Pool created earlier.

    • Select uplink-1 and uplink-2 in the Teaming Policy Uplink Mapping.

    • From the Mode dropdown menu, select Enhanced Datapath - Standard or Enhanced Datapath - Performance.

Add_Host_Transport_Node_03.png

If you selected the Enhanced Datapath - Performance mode, you must also provide the CPU Config.

7. To provide the CPU Config, check to which NUMA node the NIC is connected:

    1. Enable the SSH Access to ESXi server.

    2. Log into ESXi vSphere Command-Line Interface with root permissions.

    3. Run the following commands in ESXi CLI:

    4. To find out the number of NUMA nodes on your host, run the "esxcli hardware memory get" command.

ESXi Host Console

~ esxcli hardware memory get
   Physical Memory: 274731274240 Bytes
   Reliable Memory: 0 Bytes
   NUMA Node Count: 4

8. To find out the NIC’s affinity, run the following command:


vsish -e cat /net/pNics/<vmnicX>/properties | grep -i numa

Sample:

ESXi CLI


vsish -e cat /net/pNics/vmnic4/properties | grep -i numa

Add_Host_Transport_Node_01perf.png

9. Enter the CPU Config: select the NUMA node and the number of CPU Lcores.

Add_Host_Transport_Node_04.png

10. Click ADD.

11. In the NSX Installation window, select the Transport Node Profile created above and click Apply.

Add_Host_Transport_Node_05.png

This starts the host preparation process, so allow a few minutes for the NSX VIBs to be installed on the hosts and for the transport nodes to be configured.

Add_Host_Transport_Node_07.png

In the vSphere cluster:

Add_Host_Transport_Node_06.png

Click the refresh button as needed, until you see that the status is “Success” and that the status indicators are “Up”.

Add_Host_Transport_Node_08.png

In the vSphere Cluster:

Add_Host_Transport_Node_09.png

Verifying FPO Model 1 Level 1 is Enabled

Log into the ESXi vSphere Command-Line Interface with root permissions, and run the following commands in the ESXi CLI to get details on the new VDS switch and VMK interface(s):

ESXi CLI


esxcfg-vswitch -l

Add_Host_Transport_Node_10.png

ESXi CLI


esxcfg-vmknic -l

Add_Host_Transport_Node_11.png
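Optionally, TEP-to-TEP connectivity between transport nodes can be verified from the ESXi shell. This is a hedged sketch: it assumes the TEP vmkernel interfaces were created on the NSX (vxlan) netstack, and <peer-TEP-IP> is a placeholder for a TEP address of another host from the IP pool created earlier.

ESXi CLI

# List the TEP vmkernel interfaces created by NSX
esxcli network ip interface ipv4 get -N vxlan

# Ping a peer TEP with a Geneve-sized payload and no fragmentation
vmkping ++netstack=vxlan -d -s 1572 <peer-TEP-IP>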

Check the FPO model on the VDS (SL-WL01-DS01).

ESXi CLI


nsxdp-cli ens fpo get -dvs SL-WL01-DS01

FPO_enabled.png

Check the FPO Model on the vmnic4 (in our environment).

ESXi CLI


nsxdp-cli ens fpo status get -n vmnic4

FPO_status.png

Run the nsxdp-cli command to see the ENS switch.

ESXi CLI


nsxdp-cli ens switch list

Add_Host_Transport_Node_12.png

Check the ENS ports.

To check the ENS ports in the Enhanced Datapath - Performance (formerly Poll) mode, run the following command:

ESXi CLI


nsxdp-cli ens port list

ESXi CLI

~ nsxdp-cli ens port list
portID     ensPID TxQ RxQ hwMAC             numMACs type    Queue Placement(tx|rx)
------------------------------------------------------------------------------
67108883   0      1   1   00:50:56:6e:bc:a4 0       GENERIC 0 |0
67108884   1      1   1   00:50:56:62:ed:f9 0       GENERIC 0 |0
2214592526 2      1   2   00:00:00:00:00:00 0       UPLINK  0 |0
2214592533 2      1   2   00:00:00:00:00:00 0       UPLINK  0 |0

To check the ENS ports in the Enhanced Datapath - Standard (formerly Interrupt) mode, run the following command:

ESXi CLI


nsxdp-cli ens port list

Show the ENS status of the interfaces: make sure that ENS driven is TRUE, and check whether the Enhanced Datapath - Standard mode is Enabled or Disabled on vmnic4 and vmnic5.

ESXi CLI


esxcfg-nics -e

ENS_status_of_interfaces.png

Make sure that vmnic4 and vmnic5 are UP.

ESXi CLI


esxcfg-nics -l

Port_UP.png

Creating a Segment

An NSX Segment (formerly called a Logical Switch) is a representation of layer-2 connectivity across transport nodes (ESXi hosts), with layer-3 IP reachability between segments. Each segment is assigned a VNI (virtual network identifier), which is similar to a VLAN ID.

Virtual machines attached to the same segment can communicate with each other, even across separate physical hosts via Geneve tunnels.

To create the Segment:

  1. Log in to the NSX Manager UI by using the URL https://<NSX-Manager-IP-or-FQDN>.

  2. Navigate to Networking → Segments.

    create_segment_01.png

  3. Click ADD Segment.

    create_segment_02.png

  4. Fill in the Segment Name, Transport Zone, Subnets and VLAN. Click SAVE.

    create_segment_03a.png

  5. Click NO.

    create_segment_04.png

    create_segment_05.png

  6. If you open your vCenter and navigate to Networking → VDS, you will now see this new Segment (Distributed Port Group) to which you can attach a VM.

    create_segment_06.png

Configuring a Virtual Machine

This section explains how to run a DPDK TestPMD workload in the ENS Flow Processing Offload (FPO) Model 1 Level 1.

Warning

To configure the ENS Flow Processing Offload (FPO) Model 0, please see Appendix A.

Adding a Network Adapter to a VM

  1. Locate the Virtual Machine in the vSphere Web Client.

    1. Select a data center, folder, cluster, resource pool, or a host, and click the Related Objects tab.

    2. Click Virtual Machines, and select the virtual machine from the list.

  2. Shut down the guest OS on the virtual machine. The VM in this guide runs the Ubuntu 20.04 operating system.

  3. For optimal performance, and to attach the VM to the Segment created earlier, select the VM and go to "Edit Settings" → Virtual Hardware.

    VM_configuration_01.png

  4. Click ADD NEW DEVICE → Network adapter.

    VM_configuration_02.png

  5. Click on the NEW Network drop-down menu, and select Browse...

    Select the relevant Segment. In this case, the group is SL-WL01-EDP-Seg01. Click OK.

    VM_configuration_03.png

  6. Expand the New Network section.

    Verify that the Adapter Type is VMXNET3, and change Shares to High.

    VM_configuration_04.png

  7. For optimal performance, select logical processor affinity for this VM. To check the range of logical processors, run the following command.

    ESXi CLI


    vsish -e cat /hardware/cpuTopology/numa/nodes/0

    VM_configuration_01a.png

  8. Select the VM, right-click it, and select Edit Settings.
    Set Shares and Scheduling Affinity.
    Sample:

    VM_configuration_06.png

  9. Select the VM Options tab. Expand the CPU Topology section.

    VM_configuration_07.png

  10. Select Cores per Socket, NUMA Node and Device Assignment for New Network.
    Example:

    VM_configuration_08.png

  11. Select Reserve all guest memory and set Memory Shares to High.

    VM_configuration_09.png

  12. Enable Huge Page Support.

    The virtual infrastructure supports backing Guest OS memory with 1GB Huge Pages for memory intensive and large memory DPDK applications.

    To use 1GB Huge Pages to back the DPDK application's guest operating system memory, set sched.mem.lpage.enable1GPage = TRUE on the VM.

    Important

    Before you enable Huge Pages, consider the resource availability for other workloads in the cluster and for cluster operations such as HA and DRS.

    To set Huge Pages, perform the following steps:

    1. Click VM Options, and select Advanced.

    2. Scroll down to Configuration Parameters, and select Edit Settings.

    3. Click Add Configuration Params, and enter sched.mem.lpage.enable1GPage = TRUE. Click OK.

      VM_configuration_10.png

  13. Power on the VM.

  14. Set up a static IP on the VM's second interface.

    set_IP.png

  15. Check the ENS ports on the ESXi server.

    ESXi CLI


    nsxdp-cli ens port list

    VM_configuration_11.png

    Important

    Create a VM on the second ESXi host, and repeat the Add Network Adapter to the VM step.

  16. Run Ping.

    ping.png

  17. Check the Offload traffic running on the ESXi host:

    ESXi CLI

    [root@sl01w01esx12:~] nsxdp-cli ens flow-table dump -l 0
    FT dstMAC            srcMAC            VLAN srcPort srcIP           dstIP           proto VNI   srcPort/type dstPort/code Actions                                                                          hwHits hwBytes
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------
    L4 00:50:56:9e:64:1a 00:50:56:9e:28:e7 0    3       192.168.121.122 192.168.121.123 1     66560 8            0            bmap:0x1400080 inval(s):131 cg:1091 dp:0x4 len:704; GENEVE DECAP; DFW on dstPort; 0      0

    [root@sl01w01esx12:~] nsxdp-cli ens flow-table dump -l 1
    FT dstMAC            srcMAC            VLAN srcPort srcIP           dstIP           proto VNI srcPort/type dstPort/code Actions                                                                                          hwHits hwBytes
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------
    L4 00:50:56:9e:28:e7 00:50:56:9e:64:1a 0    4       192.168.121.123 192.168.121.122 1     0   0            0            bmap:0x20000c0 inval(s):119 cg:1091 dp:0x3 len:814; DFW on srcPort; VNI: 66560; GENEVE ENCAP VNI: 66560; 0 0

Preparing the Driver for the DPDK Application

To prepare the driver for DPDK, run the following steps in the VM OS:

Build DPDK v20.05

  • Clone DPDK:

    VM console

    cd /root
    git clone git://dpdk.org/dpdk dpdk_repo
    cd dpdk_repo
    git checkout v20.05

  • Install meson:

    VM console

    apt install python3-pip
    sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 1
    pip3 install meson

  • Install ninja:

    VM console

    cd /tmp
    wget https://github.com/ninja-build/ninja/releases/download/v1.10.2/ninja-linux.zip
    apt install unzip
    unzip ninja-linux.zip
    cp ninja /usr/local/bin/

  • Build DPDK:

    VM console

    cd /root/dpdk_repo
    meson -Dtests=false -Ddisable_drivers=net/ark,net/atlantic,net/avp,net/axgbe,net/bond,net/i*,net/p*,net/netvsc --prefix=`pwd`/dpdk-install build
    ninja -C build

  • Prepare the Driver

Before the driver preparation, retrieve the second network interface name and ID.

To identify the second interface (the one that does not carry the management IP), follow this example:

VM console

~# ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:bc:e3:6c brd ff:ff:ff:ff:ff:ff
    inet 10.7.215.62/24 brd 10.7.215.255 scope global dynamic ens160
       valid_lft 4602sec preferred_lft 4602sec
    inet6 fe80::250:56ff:febc:e36c/64 scope link
       valid_lft forever preferred_lft forever
3: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:bc:5b:10 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.62/24 brd 192.168.1.255 scope global ens192
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:febc:5b10/64 scope link
       valid_lft forever preferred_lft forever

In our case, ens192 is the second adapter.

Check the PCI ID of the second adapter.

Example:

VM console

~# lspci | grep VMXNET3
03:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)
0b:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)

In our case, it is 0b:00.0.

To prepare the driver, refer to the following example:

VM console

~# ifconfig ens192 0
~# /root/dpdk_repo/usertools/dpdk-devbind.py -u 0b:00.0
~# modprobe uio
~# insmod /root/dpdk-kmods/linux/igb_uio/igb_uio.ko
~# /root/dpdk_repo/usertools/dpdk-devbind.py -b igb_uio 0b:00.0
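The example above loads igb_uio.ko from /root/dpdk-kmods, which is not built by the previous steps. Below is a minimal sketch of how the module can be obtained and compiled inside the VM; it assumes build tools and the kernel headers for the running kernel are installed.

VM console

apt install -y build-essential linux-headers-$(uname -r)
cd /root
git clone git://dpdk.org/dpdk-kmods
cd dpdk-kmods/linux/igb_uio
make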

Verification

For verification, examine the network devices using the DPDK-compatible driver:

VM console

~# /root/dpdk_repo/usertools/dpdk-devbind.py --status-dev net

Network devices using DPDK-compatible driver
============================================
0000:0b:00.0 'VMXNET3 Ethernet Controller 07b0' drv=igb_uio unused=vmxnet3,vfio-pci

Network devices using kernel driver
===================================
0000:03:00.0 'VMXNET3 Ethernet Controller 07b0' if=ens160 drv=vmxnet3 unused=igb_uio,vfio-pci *Active*
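The testpmd command used in Appendix B is started with --socket-mem, which requires hugepages inside the guest. Below is a minimal sketch of reserving 1GB hugepages on Ubuntu 20.04; it assumes the sched.mem.lpage.enable1GPage VM option set earlier and requires a guest reboot. The number of pages (4) is an example.

VM console

# Add the following parameters to GRUB_CMDLINE_LINUX in /etc/default/grub:
#   default_hugepagesz=1G hugepagesz=1G hugepages=4
# then regenerate the GRUB configuration and reboot
update-grub
reboot

# After the reboot, verify the allocation and mount hugetlbfs
grep Huge /proc/meminfo
mkdir -p /mnt/huge
mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge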

Appendix A - Enable FPO Model 0

To Enable FPO Model 0, perform the following steps:

  1. Check the FPO model.

    1. Log into the ESXi vSphere Command-Line Interface with root permissions, and run the following commands in the ESXi CLI to view information about the new N-VDS switch and VMK interface(s).

      ESXi CLI


      esxcfg-vswitch -l

      Add_Host_Transport_Node_10.png

      ESXi Host Console


      esxcfg-vmknic -l

      Add_Host_Transport_Node_11.png

    2. Check the FPO model on the VDS (SL-WL01-DS01) switch.

      ESXi CLI


      ~ nsxdp-cli ens fpo get -dvs SL-WL01-DS01

      FPO_enabled.png

    3. Check the running FPO Model on the vmnic4 (in our environment).

      ESXi Host Console


      nsxdp-cli ens fpo status get -n vmnic4

      FPO_status.png

  2. Enable FPO Model 0.

    1. Enable Model 0.

      ESXi CLI

      ~ nsxdp-cli ens fpo set -dvs SL-WL01-DS01 --disable
      The new setting will take effect after reactivating ENS.

    2. Run the nsxdp-cli command to see the ENS switch.

      ESXi CLI

      ~ nsxdp-cli ens switch list
      name         swID maxPorts numActivePorts numPorts mtu  numLcores lcoreIDs
      ------------------------------------------------------------------------------
      DvsPortset-0 0    16384    3              3        9000 1         0

    3. Disable ENS by running the following command, and wait 5-10 seconds.

      ESXi CLI


      ~ esxcfg-vswitch -Y DvsPortset-0

    4. Enable the ENS.
      To enable ENS in the Enhanced Datapath - Performance (formerly Poll) mode, run the following command.

      ESXi Host Console


      ~ esxcfg-vswitch -y DvsPortset-0

      To enable ENS in the Enhanced Datapath - Standard (formerly Interrupt) mode, run the following command.

      ESXi Host Console


      ~ esxcfg-vswitch -y -i DvsPortset-0

  3. Check the FPO model.

    ESXi CLI

    ~ nsxdp-cli ens fpo get -dvs SL-WL01-DS01
    FPO is globally enabled.
    FPO is disabled on SL-WL01-DS01.
    FPO Model is set to Model 1 Level 1 but disabled on SL-WL01-DS01.
    Geneve Critical bit is enabled.
    IP Discovery through DHCP is enabled.
    IP Discovery through ICMP6 ND is disabled.

  4. Check the ENS ports.

    To check the ENS ports in the Enhanced Datapath - Standard (formerly Interrupt) mode, run the following command.

    ESXi CLI

    ~ nsxdp-cli ens port list
    portID     ensPID TxQ RxQ hwMAC             numMACs type   Queue Placement(tx|rx)
    ------------------------------------------------------------------------------
    67108884   0      4   1   00:50:56:6c:57:41 0       VNIC   4 4 4 4 |5
    67108885   1      4   1   00:50:56:63:0b:10 0       VNIC   2 2 2 2 |3
    2214592528 2      24  24  04:3f:72:f5:92:6b 0       UPLINK 1 3 5 7 - - - - - - - - - - - - - - - - - - - - |0 0 0 0 0 2 4 6 - - - - - - - - - - - - - - - -
    2214592526 3      24  24  04:3f:72:f5:92:6a 0       UPLINK 1 3 5 7 - - - - - - - - - - - - - - - - - - - - |0 0 0 0 0 2 4 6 - - - - - - - - - - - - - - - -

  5. Show the ENS status of the interfaces. Verify that ENS driven is TRUE on vmnic4 and vmnic5.

    ESXi CLI


    esxcfg-nics -e

    ENS_status_of_interfaces.png

  6. Check that vmnic4 and vmnic5 are UP.

    ESXi CLI


    esxcfg-nics -l

    Port_UP.png

  7. Go to the Configuring a Virtual Machine section and follow the steps there.

Appendix B - Performance Testing

Performance Testing

Warning

Disclaimer: The performance results listed in this document are indicative, and should not be considered formal performance targets for NVIDIA products.

Setup

image2021-11-15_12-20-8.png

Setup Specs:

  • Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz, 16 cores, 2 NUMA nodes

  • RAM 96 GB

  • NIC: ConnectX-6 Dx dual-port (traffic is sent to a single 100 Gb/s port only)

  • Driver v4.23.0.36

  • Firmware v22.34.1002

  • VMware ESXi 8.0.0 build-20513097

  • VMware vCenter 8.0.0 build-20920323

  • VMware NSX 4.0.1.1.0.20598726

  • DPDK & testpmd v21.05 with 5tswap forward-mode

  • TRex v3.02 as a packet generator

Warning

The environment was optimized according to this document.

Commands

On the testpmd side:

In the test lab, 4 Cores are used for testpmd and 4 RX/TX queues.

VM console

# cd dpdk/build/app/
# ./dpdk-testpmd -l 0-5 --master-lcore=0 -w 0000:0b:00.0,rxq_cqe_comp_en=1,mprq_en=1,rxqs_min_mprq=1 --socket-mem=1024 -- --burst=64 --txd=2048 --rxd=2048 --mbcache=512 --rxq=4 --txq=4 --nb-cores=4 --forward-mode=5tswap -i -a --rss-udp --port-topology=loop
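Once TRex starts sending traffic, the forwarding behavior can be observed from the interactive testpmd prompt. A minimal sketch using standard testpmd runtime commands:

VM console

testpmd> show port stats all
testpmd> show fwd stats all
testpmd> clear port stats all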

On the TRex side:

VM console

# cd v3.2/
# nohup ./t-rex-64 --no-scapy-server --no-ofed-check -i -c 8 &
# ./trex-console

> start -f udp_64_simple.py -m 2mpps -p 0
> tui

udp_64_simple.py file:

VM console

from trex_stl_lib.api import *

class STLS1(object):

    def create_stream(self):

        pkt = Ether()/IP(src="16.0.0.1", dst="48.0.0.1")/UDP(dport=12)/(22*'x')

        vm = STLScVmRaw([ STLVmFlowVar(name="v_port", min_value=4337, max_value=5337, size=2, op="inc"),
                          STLVmWrFlowVar(fv_name="v_port", pkt_offset="UDP.sport"),
                          STLVmFixChecksumHw(l3_offset="IP", l4_offset="UDP", l4_type=CTRexVmInsFixHwCs.L4_TYPE_UDP),
                        ])

        return STLStream(packet=STLPktBuilder(pkt=pkt, vm=vm),
                         mode=STLTXCont(pps=8000000))

    def get_streams(self, direction=0, **kwargs):
        # create 1 stream
        return [ self.create_stream() ]

# dynamic load - used for trex console or simulator
def register():
    return STLS1()

Overlay Results

results.png

Conclusion

The benchmark results in this performance study show the great advantages of the Model 1 configuration in terms of MPPS.

Model 1 provides up to 16% performance improvement compared to Model 0.

The ConnectX-6 Dx network adapter allows offloading the entire data path to the ConnectX hardware. This relieves the CPU from performing data communication tasks and generates a significant performance boost, which is critical in the new era of accelerated computing associated with massive amounts of data transfer.

Authors

image2020-11-17_6-50-16.png

Boris Kovalev

Boris Kovalev has worked for the past several years as a Solutions Architect, focusing on NVIDIA Networking/Mellanox technology, and is responsible for complex machine learning, Big Data and advanced VMware-based cloud research and design. Boris previously spent more than 20 years as a senior consultant and solutions architect at multiple companies, most recently at VMware. He has written multiple reference designs covering VMware, machine learning, Kubernetes, and container solutions which are available at the Mellanox Documents website.

Last updated on Sep 12, 2023.