



Scope

This Quick Start Guide (QSG) demonstrates how to build and install FD.io Vector Packet Processing (VPP) with NVIDIA ConnectX NICs, and how to evaluate VPP performance with the TRex traffic generator, achieving full line rate.

Abbreviations and Acronyms

  • VPP: Vector Packet Processing
  • DPDK: Data Plane Development Kit
  • NIC: Network Interface Card
  • DAC: Direct Attach Copper
  • PMD: Poll Mode Driver
  • TG: Traffic Generator
  • DUT: Device Under Test
  • MPPS: Millions of Packets Per Second

References

NVIDIA DPDK Poll Mode Driver (PMD)

DPDK Data Plane Development Kit

VPP - How_To_Optimize_Performance_(System_Tuning)

Installing TRex in a few steps using Nvidia ConnectX adapters.

Introduction

VPP is an open source project that delivers high-performance network packet processing. It is built on a modular 'packet processing graph' approach that operates on vectors of packets, and it integrates with the Data Plane Development Kit (DPDK) plugin to achieve fast I/O.

The NVIDIA ConnectX Smart Network Interface Card (SmartNIC) family, together with the NVIDIA DPDK Poll Mode Driver (PMD), constitutes an ideal hardware and software stack for achieving high performance with VPP.

This document walks you through compiling VPP with the NVIDIA DPDK PMD, running VPP, and measuring L3 IPv4 routing performance with the TRex traffic generator.

Solution Architecture

Logical Design

The simplest setup is used here: two bare-metal servers connected back-to-back.
A single TRex server is connected to a single VPP server with two Ethernet Direct Attach Copper (DAC) cables.


Software Stack Components

Bill of Materials

Deployment and Configuration

Wiring

IPv4 Routing Scheme

  • The VPP server is set with two interfaces: 1.1.1.2/24 and 2.2.2.2/24
  • The TRex server is set with two interfaces: 1.1.1.1/24 and 2.2.2.1/24
  • A route for subnet 16.0.0.0/8 is set on the VPP server so that packets destined to 16.0.0.1 are routed back to the TRex server
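The addressing scheme above can be sanity-checked with a short Python snippet using the standard-library ipaddress module (the addresses are the ones used throughout this guide):

```python
import ipaddress

# Interface addressing used in this guide.
vpp_if1 = ipaddress.ip_interface("1.1.1.2/24")    # VPP side of link 1
trex_if1 = ipaddress.ip_interface("1.1.1.1/24")   # TRex side of link 1
vpp_if2 = ipaddress.ip_interface("2.2.2.2/24")    # VPP side of link 2
trex_if2 = ipaddress.ip_interface("2.2.2.1/24")   # TRex side of link 2

# Each back-to-back link must share a subnet.
assert vpp_if1.network == trex_if1.network
assert vpp_if2.network == trex_if2.network

# Traffic destined to 16.0.0.1 matches the static route 16.0.0.0/8,
# so VPP forwards it back toward TRex via next hop 1.1.1.1.
route = ipaddress.ip_network("16.0.0.0/8")
assert ipaddress.ip_address("16.0.0.1") in route
print("addressing scheme is consistent")
```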

Host

BIOS Settings

Apply the following settings on both the TRex and VPP servers:

  1. Set the power-saving profile to performance mode.
  2. Enable the VT-d flag.
  3. Enable Turbo Boost.

GRUB File Settings

On both the TRex and VPP servers, edit the GRUB file to include the following kernel parameters.

intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=32 intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable nohz_full=<core-list> rcu_nocbs=<core-list> rcu_nocb_poll isolcpus=<core-list>

For more information on each parameter, please visit How_To_Optimize_Performance_(System_Tuning).
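The yum commands later in this guide suggest a RHEL/CentOS-family host; on such systems these parameters typically go on the GRUB_CMDLINE_LINUX line in /etc/default/grub, after which the GRUB configuration is regenerated. A sketch of the procedure (paths and tooling are assumptions for a legacy-BIOS grub2 setup; adjust for your distribution and boot mode):

```shell
# 1. Append the parameters to the GRUB_CMDLINE_LINUX line in /etc/default/grub,
#    replacing <core-list> with the cores to isolate (e.g. 2-9):
#    GRUB_CMDLINE_LINUX="... intel_iommu=on iommu=pt default_hugepagesz=1G \
#        hugepagesz=1G hugepages=32 intel_idle.max_cstate=0 processor.max_cstate=0 \
#        intel_pstate=disable nohz_full=2-9 rcu_nocbs=2-9 rcu_nocb_poll isolcpus=2-9"

# 2. Regenerate the GRUB configuration and reboot.
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot

# 3. After the reboot, confirm the parameters and hugepages took effect.
cat /proc/cmdline
grep HugePages_Total /proc/meminfo   # expect 32 with 1G hugepages
```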

Building and Configuring VPP with L3 Routes

  1. Install the prerequisite packages.

    yum install git make gcc nasm rdma-core-devel libmnl-devel epel-release dnf-plugins-core -y
    yum config-manager --set-enabled powertools
  2. Download the VPP source code. 

    git clone https://github.com/FDio/vpp
    cd vpp
    git checkout origin/stable/2009
  3. Install the VPP build dependencies. 

    cd $HOME-PATH/vpp
    make install-dep
  4. Compile VPP with NVIDIA DPDK PMD. 

    cd $HOME-PATH/vpp
    make build-release DPDK_MLX5_PMD=y
    cp $HOME-PATH/vpp/build-root/install-vpp-native/external/lib/librte_pmd_mlx5_glue.so* /usr/lib/
    cp $HOME-PATH/vpp/build-root/install-vpp-native/external/lib/librte_pmd_mlx5_glue.so* /usr/lib64/
  5. Configure the VPP configuration file $HOME-PATH/vpp/src/vpp/conf/startup.conf.

    • Set the main VPP application core: main-core 1
    • Set the number of data plane workers cores: corelist-workers 2-9
    • Set the number of RX queues: num-rx-queues 8
    • Set the number of TX queues: num-tx-queues 8
    • Set the first interface: dev 0000:61:00.0
    • Set the second interface: dev 0000:61:00.1
      To find the interfaces' PCI slot numbers, use this command:

      lspci | grep "Mellanox" | cut -d " " -f 1
    • Uncomment the 'no-multi-seg' parameter to improve performance.

      It is recommended to set an equal number of queues and cores.

      See example file:

      startup.conf
      unix {
        nodaemon
        log /var/log/vpp/vpp.log
        full-coredump
        cli-listen /run/vpp/cli.sock
        gid vpp
      }
      
      api-trace {
      ## This stanza controls binary API tracing. Unless there is a very strong reason,
      ## please leave this feature enabled.
        on
      ## Additional parameters:
      ##
      ## To set the number of binary API trace records in the circular buffer, configure nitems
      ##
      ## nitems <nnn>
      ##
      ## To save the api message table decode tables, configure a filename. Results in /tmp/<filename>
      ## Very handy for understanding api message changes between versions, identifying missing
      ## plugins, and so forth.
      ##
      ## save-api-table <filename>
      }
      
      api-segment {
        gid vpp
      }
      
      socksvr {
        default
      }
      
      cpu {
      	## In the VPP there is one main thread and optionally the user can create worker(s)
      	## The main thread and worker thread(s) can be pinned to CPU core(s) manually or automatically
      
      	## Manual pinning of thread(s) to CPU core(s)
      
      	## Set logical CPU core where main thread runs, if main core is not set
      	## VPP will use core 1 if available
      	main-core 1
      
      	## Set logical CPU core(s) where worker threads are running
      	corelist-workers 2-9
      
      	## Automatic pinning of thread(s) to CPU core(s)
      
      	## Sets number of CPU core(s) to be skipped (1 ... N-1)
      	## Skipped CPU core(s) are not used for pinning main thread and working thread(s).
      	## The main thread is automatically pinned to the first available CPU core and worker(s)
      	## are pinned to next free CPU core(s) after core assigned to main thread
      	# skip-cores 4
      
      	## Specify a number of workers to be created
      	## Workers are pinned to N consecutive CPU cores while skipping "skip-cores" CPU core(s)
      	## and main thread's CPU core
      	# workers 2
      
      	## Set scheduling policy and priority of main and worker threads
      
      	## Scheduling policy options are: other (SCHED_OTHER), batch (SCHED_BATCH)
      	## idle (SCHED_IDLE), fifo (SCHED_FIFO), rr (SCHED_RR)
      	# scheduler-policy fifo
      
      	## Scheduling priority is used only for "real-time policies (fifo and rr),
      	## and has to be in the range of priorities supported for a particular policy
      	# scheduler-priority 50
      }
      
      # buffers {
      	## Increase number of buffers allocated, needed only in scenarios with
      	## large number of interfaces and worker threads. Value is per numa node.
      	## Default is 16384 (8192 if running unprivileged)
      	# buffers-per-numa 128000
      
      	## Size of buffer data area
      	## Default is 2048
      	# default data-size 2048
      # }
      
      dpdk {
      	## Change default settings for all interfaces
      	dev default {
      		## Number of receive queues, enables RSS
      		## Default is 1
      		num-rx-queues 8
      
      		## Number of transmit queues, Default is equal
      		## to the number of worker threads, or 1 if there are no worker threads
      		num-tx-queues 8 
      
      		## Number of descriptors in transmit and receive rings
      		## increasing or reducing number can impact performance
      		## Default is 1024 for both rx and tx
      		# num-rx-desc 512
      		# num-tx-desc 512
      
      		## VLAN strip offload mode for interface
      		## Default is off
      		# vlan-strip-offload on
      
      		## TCP Segment Offload
      		## Default is off
      		## To enable TSO, 'enable-tcp-udp-checksum' must be set
      		# tso on
      
      		## Devargs
                      ## device specific init args
                      ## Default is NULL
      		# devargs safe-mode-support=1,pipeline-mode-support=1
      
      		## rss-queues
      		## set valid rss steering queues
      		# rss-queues 0,2,5-7
      	}
      
      	## Whitelist specific interface by specifying PCI address
      	dev 0000:61:00.0
      	dev 0000:61:00.1
      	## Blacklist specific device type by specifying PCI vendor:device
              ## Whitelist entries take precedence
      	# blacklist 8086:10fb
      
      	## Set interface name
      	# dev 0000:02:00.1 {
      	#	name eth0
      	# }
      
      	## Whitelist specific interface by specifying PCI address and in
      	## addition specify custom parameters for this interface
      	# dev 0000:02:00.1 {
      	#	num-rx-queues 2
      	# }
      
      	## Change UIO driver used by VPP, Options are: igb_uio, vfio-pci,
      	## uio_pci_generic or auto (default)
      	# uio-driver vfio-pci
      
      	## Disable multi-segment buffers, improves performance but
      	## disables Jumbo MTU support
      	no-multi-seg
      
      	## Change hugepages allocation per-socket, needed only if there is need for
      	## larger number of mbufs. Default is 256M on each detected CPU socket
      	# socket-mem 2048,2048
      
      	## Disables UDP / TCP TX checksum offload. Typically needed to use
      	## faster vector PMDs (together with no-multi-seg)
      	# no-tx-checksum-offload
      
      	## Enable UDP / TCP TX checksum offload
      	## This is the reversed option of 'no-tx-checksum-offload'
      	# enable-tcp-udp-checksum
      }
      
      ## node variant defaults
      #node {
      
      ## specify the preferred default variant
      #	default	{ variant avx512 }
      
      ## specify the preferred variant, for a given node
      #	ip4-rewrite { variant avx2 }
      
      #}
      
      
      # plugins {
      	## Adjusting the plugin path depending on where the VPP plugins are
      	#path /root/vpp/build-root/install-vpp-native/vpp/lib/vpp_plugins
      
      	## Disable all plugins by default and then selectively enable specific plugins
      	#plugin default { disable }
      	#plugin dpdk_plugin.so { enable }
      	# plugin acl_plugin.so { enable }
      
      	## Enable all plugins by default and then selectively disable specific plugins
      	# plugin dpdk_plugin.so { disable }
      	# plugin acl_plugin.so { disable }
      # }
      
      ## Statistics Segment
      # statseg {
          # socket-name <filename>, name of the stats segment socket
          #     defaults to /run/vpp/stats.sock
          # size <nnn>[KMG], size of the stats segment, defaults to 32mb
          # per-node-counters on | off, defaults to none
          # update-interval <f64-seconds>, sets the segment scrape / update interval
      # }
      
  6. Add VPP group:

    groupadd vpp
  7. Run VPP binary:

    $HOME-PATH/vpp/build-root/build-vpp-native/vpp/bin/vpp -c $HOME-PATH/vpp/src/vpp/conf/startup.conf
  8. Configure VPP with L3 interfaces and routes - make sure to configure L3 IPv4 interfaces with the correct IPs.
    This document sets the first interface to be 1.1.1.2/24 and second interface to 2.2.2.2/24. 

    ## VPP CLI binary (the vppctl commands below are run from the shell) ##
    $HOME-PATH/vpp/build-root/build-vpp-native/vpp/bin/vppctl
    ## Set 2 L3 interfaces with IP and subnet ##
    vppctl set int ip address HundredGigabitEthernet61/0/0 1.1.1.2/24
    vppctl set interface state HundredGigabitEthernet61/0/0 up
    vppctl set int ip address HundredGigabitEthernet61/0/1 2.2.2.2/24
    vppctl set interface state HundredGigabitEthernet61/0/1 up
    ## Set static route ##
    vppctl ip route add 16.0.0.0/8 via 1.1.1.1 HundredGigabitEthernet61/0/0
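Once the interfaces and route are configured, their state can be sanity-checked from the shell (assuming vppctl is on the PATH; otherwise use the full binary path from step 7):

```shell
## Interfaces should be 'up' and carry the configured addresses ##
vppctl show interface
vppctl show interface address

## The static route should appear in the IPv4 FIB ##
vppctl show ip fib 16.0.0.0/8
```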

Installing TRex Server with L3 Interfaces

For building the TRex server, please follow the steps in the following guide: Installing TRex in a few steps using Nvidia ConnectX adapters.

Make sure to configure L3 IPv4 interfaces with the correct IPs.

This document sets the first interface to be 1.1.1.1/24 and the second interface to 2.2.2.1/24. 

Installation process example:

./dpdk_setup_ports.py -i
By default, IP based configuration file will be created. Do you want to use MAC based config? (y/N)n
+----+------+---------+-------------------+-----------------------------------------+-----------+----------+----------+
| ID | NUMA |   PCI   |        MAC        |                  Name                   |  Driver   | Linux IF |  Active  |
+====+======+=========+===================+=========================================+===========+==========+==========+
| 0  | 0    | 02:00.0 | 38:63:bb:33:16:28 | NetXtreme BCM5719 Gigabit Ethernet PCIe | tg3       | eno1     | *Active* |
+----+------+---------+-------------------+-----------------------------------------+-----------+----------+----------+
| 1  | 0    | 02:00.1 | 38:63:bb:33:16:29 | NetXtreme BCM5719 Gigabit Ethernet PCIe | tg3       | eno2     |          |
+----+------+---------+-------------------+-----------------------------------------+-----------+----------+----------+
| 2  | 0    | 02:00.2 | 38:63:bb:33:16:2a | NetXtreme BCM5719 Gigabit Ethernet PCIe | tg3       | eno3     |          |
+----+------+---------+-------------------+-----------------------------------------+-----------+----------+----------+
| 3  | 0    | 02:00.3 | 38:63:bb:33:16:2b | NetXtreme BCM5719 Gigabit Ethernet PCIe | tg3       | eno4     |          |
+----+------+---------+-------------------+-----------------------------------------+-----------+----------+----------+
| 4  | 0    | 08:00.0 | ec:0d:9a:8a:28:3a | MT28800 Family [ConnectX-5 Ex]          | mlx5_core | ens1f0   |          |
+----+------+---------+-------------------+-----------------------------------------+-----------+----------+----------+
| 5  | 0    | 08:00.1 | ec:0d:9a:8a:28:3b | MT28800 Family [ConnectX-5 Ex]          | mlx5_core | ens1f1   |          |
+----+------+---------+-------------------+-----------------------------------------+-----------+----------+----------+
Please choose even number of interfaces from the list above, either by ID , PCI or Linux IF
Stateful will use order of interfaces: Client1 Server1 Client2 Server2 etc. for flows.
Stateless can be in any order.
Enter list of interfaces separated by space (for example: 1 3) : 4 5

For interface 4, assuming loopback to it's dual interface 5.
Putting IP 1.1.1.1, default gw 2.2.2.2 Change it?(y/N).y
Please enter IP address for interface 4: 1.1.1.1
Please enter default gateway for interface 4: 1.1.1.2
For interface 5, assuming loopback to it's dual interface 4.
Putting IP 2.2.2.2, default gw 1.1.1.1 Change it?(y/N).y
Please enter IP address for interface 5: 2.2.2.1
Please enter default gateway for interface 5: 2.2.2.2
Print preview of generated config? (Y/n)y
### Config file generated by dpdk_setup_ports.py ###

- version: 2
  interfaces: ['08:00.0', '08:00.1']
  port_info:
      - ip: 1.1.1.1
        default_gw: 1.1.1.2
      - ip: 2.2.2.1
        default_gw: 2.2.2.2

  platform:
      master_thread_id: 0
      latency_thread_id: 6
      dual_if:
        - socket: 0
          threads: [1,2,3,4,5,12,13,14,15,16,17]


Save the config to file? (Y/n)y
Default filename is /etc/trex_cfg.yaml
Press ENTER to confirm or enter new file:
File /etc/trex_cfg.yaml already exist, overwrite? (y/N)y
Saved to /etc/trex_cfg.yaml.

Verification

Configure TRex to generate L3 UDP packets:

  1. Copy the file udp_1pkt_simple.py to the TRex home directory.
  2. Enter the TRex console and generate the packets. 
$TREX-HOME-PATH/trex-console
trex>start -f udp_1pkt_simple.py -m 1mpps -p 0

where:

  • -f: Loads the script that constructs the packet format.
  • -m: Specifies the packet rate.
  • -p: Specifies which TRex port to use.

  3. Check that the packets are routed back to TRex:
     In the "tui" terminal, validate that port 0 is sending packets and port 1 is receiving packets.

trex>tui
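For reference, a stateless profile like the one loaded above defines one or more streams built from a Scapy packet. The sketch below is modeled on the udp_1pkt_simple.py example bundled with TRex; the exact field values in your copy may differ, and the source address here is chosen to match this guide's 16.0.0.0/8 routing scheme. It requires the TRex stateless Python library shipped with the TRex server package.

```python
# Sketch of a TRex stateless traffic profile (not runnable outside a TRex install).
from trex_stl_lib.api import *  # STLStream, STLPktBuilder, STLTXCont, Ether/IP/UDP

class STLS1(object):
    def create_stream(self):
        # One fixed UDP packet, transmitted continuously at the rate given by -m.
        pkt = Ether() / IP(src="16.0.0.1", dst="48.0.0.1") / UDP(dport=12, sport=1025) / (10 * 'x')
        return STLStream(packet=STLPktBuilder(pkt=pkt), mode=STLTXCont())

    def get_streams(self, direction=0, **kwargs):
        return [self.create_stream()]

# TRex loads the profile through this hook.
def register():
    return STLS1()
```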

Appendix

Performance Testing

This graph shows the maximum number of packets per second and the bandwidth measured on an Intel Xeon Platinum 8168 CPU @ 2.70GHz.

Note

The NVIDIA Gen3 ConnectX 100G NIC can provide 148 Mpps, achieving full line rate with 64-byte packets.

A stronger CPU will yield better performance.
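The 148 Mpps figure matches the theoretical Ethernet line rate: each 64-byte frame occupies 84 bytes of wire time once the 7-byte preamble, 1-byte start-of-frame delimiter, and 12-byte inter-frame gap are counted. A quick check:

```python
LINE_RATE_BPS = 100e9   # 100 Gb/s Ethernet
FRAME_BYTES = 64        # minimum Ethernet frame size
WIRE_OVERHEAD = 20      # preamble (7) + SFD (1) + inter-frame gap (12)

pps = LINE_RATE_BPS / ((FRAME_BYTES + WIRE_OVERHEAD) * 8)
print(f"{pps / 1e6:.2f} Mpps")  # 148.81 Mpps
```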

An extended DPDK performance report is available on the official DPDK site: perf-reports

Authors


Amir Zeidner

For the past several years, Amir has worked as a Solutions Architect primarily in the Telco space, leading advanced solutions to answer 5G, NFV, and SDN networking infrastructures requirements. Amir’s expertise in data plane acceleration technologies, such as Accelerated Switching and Network Processing (ASAP²) and DPDK, together with a deep knowledge of open source cloud-based infrastructures, allows him to promote and deliver unique end-to-end NVIDIA Networking solutions throughout the Telco world.


  


Notice

This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. Neither NVIDIA Corporation nor any of its direct or indirect subsidiaries and affiliates (collectively: “NVIDIA”) make any representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or functionality.
NVIDIA reserves the right to make corrections, modifications, enhancements, improvements, and any other changes to this document, at any time without notice.
Customer should obtain the latest relevant information before placing orders and should verify that such information is current and complete.
NVIDIA products are sold subject to the NVIDIA standard terms and conditions of sale supplied at the time of order acknowledgement, unless otherwise agreed in an individual sales agreement signed by authorized representatives of NVIDIA and customer (“Terms of Sale”). NVIDIA hereby expressly objects to applying any customer general terms and conditions with regards to the purchase of the NVIDIA product referenced in this document. No contractual obligations are formed either directly or indirectly by this document.
NVIDIA products are not designed, authorized, or warranted to be suitable for use in medical, military, aircraft, space, or life support equipment, nor in applications where failure or malfunction of the NVIDIA product can reasonably be expected to result in personal injury, death, or property or environmental damage. NVIDIA accepts no liability for inclusion and/or use of NVIDIA products in such equipment or applications and therefore such inclusion and/or use is at customer’s own risk.
NVIDIA makes no representation or warranty that products based on this document will be suitable for any specified use. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to evaluate and determine the applicability of any information contained in this document, ensure the product is suitable and fit for the application planned by customer, and perform the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this document. NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.
No license, either expressed or implied, is granted under any NVIDIA patent right, copyright, or other NVIDIA intellectual property right under this document. Information published by NVIDIA regarding third-party products or services does not constitute a license from NVIDIA to use such products or services or a warranty or endorsement thereof. Use of such information may require a license from a third party under the patents or other intellectual property rights of the third party, or a license from NVIDIA under the patents or other intellectual property rights of NVIDIA.
Reproduction of information in this document is permissible only if approved in advance by NVIDIA in writing, reproduced without alteration and in full compliance with all applicable export laws and regulations, and accompanied by all associated conditions, limitations, and notices.
THIS DOCUMENT AND ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL NVIDIA BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF ANY USE OF THIS DOCUMENT, EVEN IF NVIDIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the products described herein shall be limited in accordance with the Terms of Sale for the product.

Trademarks
NVIDIA, the NVIDIA logo, and Mellanox are trademarks and/or registered trademarks of NVIDIA Corporation and/or Mellanox Technologies Ltd. in the U.S. and in other countries. Other company and product names may be trademarks of the respective companies with which they are associated.

Copyright
© 2022 NVIDIA Corporation & affiliates. All Rights Reserved.