Created on July 7, 2021.
Scope
The following Reference Deployment Guide (RDG) explains how to build a high-performing Kubernetes (K8s) cluster with the containerd container runtime that is capable of running DPDK-based applications over NVIDIA Networking end-to-end Ethernet infrastructure.
This RDG describes a solution with multiple servers connected to a single switch that provides secondary network for the Kubernetes cluster. A more complex scale-out network topology of multiple L2 domains is beyond the scope of this document.
Abbreviations and Acronyms
Term | Definition | Term | Definition |
---|---|---|---|
CNI | Container Network Interface | LLDP | Link Layer Discovery Protocol |
CR | Custom Resources | NFD | Node Feature Discovery |
CRD | Custom Resources Definition | OCI | Open Container Initiative |
CRI | Container Runtime Interface | PF | Physical Function |
DHCP | Dynamic Host Configuration Protocol | QSG | Quick Start Guide |
DNS | Domain Name System | RDG | Reference Deployment Guide |
DP | Device Plugin | RDMA | Remote Direct Memory Access |
DPDK | Data Plane Development Kit | RoCE | RDMA over Converged Ethernet |
EVPN | Ethernet VPN | SR-IOV | Single Root Input Output Virtualization |
HWE | Hardware Enablement | VF | Virtual Function |
IPAM | IP Address Management | VPN | Virtual Private Network |
K8s | Kubernetes | VXLAN | Virtual eXtensible Local Area Network |
Introduction
Provisioning a Kubernetes cluster with the containerd container runtime for running DPDK-based workloads can be an extremely complicated task.
Proper design and selection of software and hardware components can be a gating factor for a successful deployment.
This guide provides a complete solution cycle including technology overview, design, component selection, and deployment steps.
The solution will be delivered on top of standard servers over the NVIDIA end-to-end Ethernet infrastructure.
In this document, we use the new NVIDIA Network Operator, which is in charge of deploying and configuring the SR-IOV Device Plugin and SR-IOV CNI. These components make it possible to run DPDK workloads on a Kubernetes worker node.
References
- What is K8s?
- NVIDIA Network Operator
- RDMA CNI
- Data Plane Development Kit (DPDK)
- NVIDIA Poll Mode Driver (PMD)
Solution Architecture
Key Components and Technologies
NVIDIA ConnectX SmartNICs
10/25/40/50/100/200 and 400G Ethernet Network Adapters
The industry-leading NVIDIA® ConnectX® family of smart network interface cards (SmartNICs) offers advanced hardware offloads and accelerations.
NVIDIA Ethernet adapters enable the highest ROI and lowest Total Cost of Ownership for hyperscale, public and private clouds, storage, machine learning, AI, big data, and telco platforms.
NVIDIA LinkX Cables
The NVIDIA® LinkX® product family of cables and transceivers provides the industry's most complete line of 10, 25, 40, 50, 100, 200, and 400GbE Ethernet and 100, 200, and 400Gb/s InfiniBand products for Cloud, HPC, hyperscale, Enterprise, telco, storage, and artificial intelligence data center applications.
- NVIDIA Spectrum Ethernet Switches
Flexible form-factors with 16 to 128 physical ports, supporting 1GbE through 400GbE speeds.
Based on a ground-breaking silicon technology optimized for performance and scalability, NVIDIA Spectrum switches are ideal for building high-performance, cost-effective, and efficient Cloud Data Center Networks, Ethernet Storage Fabric, and Deep Learning Interconnects.
NVIDIA combines the benefits of NVIDIA Spectrum™ switches, based on an industry-leading application-specific integrated circuit (ASIC) technology, with a wide variety of modern network operating system choices, including NVIDIA Cumulus® Linux, SONiC and NVIDIA Onyx®.
- NVIDIA Cumulus Linux
NVIDIA® Cumulus® Linux is the industry's most innovative open network operating system that allows you to automate, customize, and scale your data center network like no other.
RDMA
RDMA is a technology that allows computers in a network to exchange data without involving the processor, cache or operating system of either computer.
Like locally based DMA, RDMA improves throughput and performance and frees up compute resources.
Kubernetes
Kubernetes is an open-source container orchestration platform for deployment automation, scaling, and management of containerized applications.
- Kubespray
Kubespray is a composition of Ansible playbooks, inventory, provisioning tools, and domain knowledge for generic OS/Kubernetes cluster configuration management tasks and provides:
- A highly available cluster
- Composable attributes
- Support for most popular Linux distributions
- NVIDIA Network Operator
An analog to the NVIDIA GPU Operator, the NVIDIA Network Operator simplifies scale-out network design for Kubernetes by automating aspects of network deployment and configuration that would otherwise require manual work. It loads the required drivers, libraries, device plugins, and CNIs on any cluster node with an NVIDIA network interface. Paired with the NVIDIA GPU Operator, the Network Operator enables GPUDirect RDMA, a key technology that accelerates cloud-native AI workloads by orders of magnitude. The NVIDIA Network Operator uses Kubernetes CRD and the Operator Framework to provision the host software needed for enabling accelerated networking.
What is containerd?
An industry-standard container runtime with an emphasis on simplicity, robustness and portability. containerd is available as a daemon for Linux and Windows. It manages the complete container lifecycle of its host system, from image transfer and storage to container execution and supervision to low-level storage to network attachments and beyond.
NVIDIA PMDs
NVIDIA Poll Mode Driver (PMD) is an open-source upstream driver, embedded within dpdk.org releases, designed for fast packet processing and low latency by providing kernel bypass for receive and send and by avoiding the interrupt processing performance overhead.
TRex—Realistic Traffic Generator
TRex is an open source stateful and stateless traffic generator fueled by DPDK. It generates L3-7 traffic and provides in one tool capabilities provided by commercial tools. TRex can scale up to 200Gb/sec with one server.
Logical Design
The logical design includes the following parts:
Deployment node running Kubespray that deploys Kubernetes clusters.
K8s master node running all Kubernetes management components.
K8s worker nodes.
TRex server.
High-speed Ethernet fabric for the DPDK tenant network.
Deployment and K8s management network.
Fabric Design
The high-performance network is a secondary network for the Kubernetes cluster and requires an L2 network topology.
This RDG describes a solution with multiple servers connected to a single switch that provides secondary network for the Kubernetes cluster.
A more complex scale-out network topology of multiple L2 domains is beyond the scope of this document.
Software Stack Components
Bill of Materials
The following hardware setup is utilized in this guide.

The above table does not contain the management network connectivity components.
Deployment and Configuration
Wiring
On each K8s worker node and TRex server, the first port of each NVIDIA Network Adapter is wired to the NVIDIA switch in high-performance fabric using NVIDIA LinkX DAC cables.
Deployment and Management network is part of IT infrastructure and is not covered in this guide.
Fabric
Prerequisites
- High-performance Ethernet fabric
  - Single switch: NVIDIA SN2100
  - Switch OS: Cumulus Linux v4.2.1
- Deployment and management network
DNS and DHCP network services and network topology are part of the IT infrastructure. The component installation and configuration are not covered in this guide.
Network Configuration
Below are the server names with their relevant network configurations.
Node | Name | High-speed network (IP and NICs) | Management network 1/25 GbE (IP and NICs)
---|---|---|---
Deployment node | depserver | | ens4f0: DHCP
Master node | node1 | | ens4f0: DHCP
Worker node | node2 | ens2f0: no IP set | ens4f0: DHCP
Worker node | node3 | ens2f0: no IP set | ens4f0: DHCP
TRex server | node4 | ens2f0: no IP set, ens2f1: no IP set | ens4f0: DHCP 192.168.222.103
High-speed switch | leaf01 | | mgmt0: DHCP
ensXf0 high-speed network interfaces do not require additional configuration.
Fabric Configuration
This solution is based on the Cumulus Linux v4.2.1 switch operating system.
Intermediate-level Linux knowledge is assumed for this guide. Familiarity with basic text editing, Linux file permissions, and process monitoring is required. A variety of text editors are pre-installed, including vi and nano.
Networking engineers who are unfamiliar with Linux concepts should refer to this reference guide to compare the Cumulus Linux CLI and configuration options and their equivalent Cisco Nexus 3000 NX-OS commands and settings. There is also a series of short videos with an introduction to Linux and Cumulus-Linux-specific concepts.
A Greenfield deployment is assumed for this guide. Please refer to the following guide for Upgrading Cumulus Linux.
Fabric configuration steps:
Administratively enable all physical ports.
Create a bridge and configure one or more front panel ports as members of the bridge.
Commit configuration.
Run the following commands on the switch console:
Linux swx-mld-l03 4.19.0-cl-1-amd64 #1 SMP Cumulus 4.19.94-1+cl4.2.1u1 (2020-08-28) x86_64

Welcome to NVIDIA Cumulus (R) Linux (R)

For support and online technical documentation, visit
http://www.cumulusnetworks.com/support

The registered trademark Linux (R) is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide basis.

cumulus@leaf01:mgmt:~$ net add interface swp1-16
cumulus@leaf01:mgmt:~$ net add bridge bridge ports swp1-16
cumulus@leaf01:mgmt:~$ net commit
To view link status, use the net show interface all command. The following examples show the output of ports in admin down, down, and up modes.
cumulus@leaf01:mgmt:~$ net show interface all
State  Name    Spd   MTU    Mode        LLDP                         Summary
-----  ------  ----  -----  ----------  ---------------------------  ------------------------
UP     lo      N/A   65536  Loopback                                 IP: 127.0.0.1/8
       lo                                                            IP: ::1/128
UP     eth0    1G    1500   Mgmt        mgmt-xxx-xxx-xxx-xxx (8)     Master: mgmt(UP)
       eth0                                                          IP: 192.168.222.201/24(DHCP)
UP     swp1    100G  9216   Access/L2                                Master: bridge(UP)
UP     swp2    100G  9216   Access/L2   node2 (0c:42:a1:2b:74:ae)    Master: bridge(UP)
UP     swp3    100G  9216   Access/L2                                Master: bridge(UP)
UP     swp4    100G  9216   Access/L2   node3 (0c:42:a1:24:05:4a)    Master: bridge(UP)
UP     swp5    100G  9216   Access/L2                                Master: bridge(UP)
UP     swp6    100G  9216   Access/L2   node4 (0c:42:a1:24:05:1a)    Master: bridge(UP)
UP     swp7    100G  9216   Access/L2                                Master: bridge(UP)
UP     swp8    100G  9216   Access/L2   node4 (0c:42:a1:24:05:1b)    Master: bridge(UP)
DN     swp9    N/A   9216   Access/L2                                Master: bridge(UP)
DN     swp10   N/A   9216   Access/L2                                Master: bridge(UP)
DN     swp11   N/A   9216   Access/L2                                Master: bridge(UP)
DN     swp12   N/A   9216   Access/L2                                Master: bridge(UP)
DN     swp13   N/A   9216   Access/L2                                Master: bridge(UP)
DN     swp14   N/A   9216   Access/L2                                Master: bridge(UP)
DN     swp15   N/A   9216   Access/L2                                Master: bridge(UP)
DN     swp16   N/A   9216   Access/L2                                Master: bridge(UP)
UP     bridge  N/A   9216   Bridge/L2
UP     mgmt    N/A   65536  VRF                                      IP: 127.0.0.1/8
       mgmt                                                          IP: ::1/128
Nodes Configuration
General Prerequisites:
Hardware
All the K8s worker nodes have the same hardware specification (see BoM for details).
Host BIOS
Verify that an SR-IOV supported server platform is being used and review the BIOS settings in the server platform vendor documentation to enable SR-IOV in the BIOS.
Host OS
Ubuntu Server 20.04 operating system should be installed on all servers with OpenSSH server packages.
Experience with Kubernetes
Familiarization with the Kubernetes Cluster architecture is essential.
Make sure that the BIOS settings on the worker nodes servers have SR-IOV enabled and that the servers are tuned for maximum performance.
All worker nodes must have the same PCIe placement for the NIC and expose the same interface name.
Host OS Prerequisites
Make sure the Ubuntu Server 20.04 operating system with OpenSSH server packages is installed on all servers, and create a non-root depuser account with passwordless sudo privileges.
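A minimal sketch of creating such an account, assuming the depuser name used throughout this guide (the passwordless sudo rule itself is added in the sudoers step below):
$ sudo adduser depuser
$ sudo usermod -aG sudo depuser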
Update the Ubuntu software packages by running the following commands:
$ sudo apt-get update
$ sudo apt-get upgrade -y
$ sudo reboot
In this solution, the following lines were added to the end of /etc/sudoers:
$ sudo vim /etc/sudoers
#includedir /etc/sudoers.d

#K8s cluster deployment user with sudo privileges without password
depuser ALL=(ALL) NOPASSWD:ALL
NIC Firmware Upgrade
It is recommended to upgrade the NIC firmware on the worker nodes to the latest released version.
Download mlxup firmware update and query utility to each worker node and update the NIC firmware.
The most recent version of mlxup can be downloaded from the official download page. mlxup can download and update the NIC firmware to the latest firmware over the Internet.
Running the utility requires sudo privileges:
# wget http://www.mellanox.com/downloads/firmware/mlxup/4.15.2/SFX/linux_x64/mlxup
# chmod +x mlxup
# ./mlxup -online -u
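To verify the result, the installed driver and firmware versions can be checked with ethtool (a hedged example; ens2f0 is the high-speed interface used in this setup):
# ethtool -i ens2f0 | grep -E "driver|firmware"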
RDMA Subsystem Configuration
RDMA subsystem configuration is required on each worker node.
Install LLDP Daemon and RDMA Core Userspace Libraries and Daemons.
Worker Node console
# apt install -y lldpd rdma-core
LLDPD is a daemon able to receive and send LLDP frames. The Link Layer Discovery Protocol (LLDP) is a vendor-neutral Layer 2 protocol that allows a network device to advertise its identity and capabilities on the local network.
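As a quick wiring sanity check, lldpd can be queried for the switch port seen on the high-speed interface. A minimal sketch, assuming the ens2f0 interface selected below:
# lldpcli show neighbors ports ens2f0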
Identify the name of the RDMA-capable interface for high-performance K8s network.
In this guide, the ens2f0 network interface was chosen for the high-performance K8s network; it will be activated by the NVIDIA Network Operator deployment:
Worker Node console
# rdma link
link rocep7s0f0/1 state DOWN physical_state DISABLED netdev ens2f0
link rocep7s0f1/1 state DOWN physical_state DISABLED netdev ens2f1
link rocep131s0f0/1 state ACTIVE physical_state LINK_UP netdev ens4f0
link rocep131s0f1/1 state DOWN physical_state DISABLED netdev ens4f1
Set RDMA subsystem network namespace mode to exclusive mode.
Setting the RDMA subsystem network namespace mode (the netns_mode parameter of the ib_core module) to exclusive enables network namespace isolation for RDMA workloads on the worker node servers. Create the /etc/modprobe.d/ib_core.conf configuration file to change the ib_core module parameters:
/etc/modprobe.d/ib_core.conf
# Set netns to exclusive mode for namespace isolation
options ib_core netns_mode=0
Then re-generate the initial RAM disks and reboot servers:
Worker Node console
# update-initramfs -u
# reboot
After the server comes back, check netns mode:
Worker Node console
# rdma system
netns exclusive
K8s Cluster Deployment and Configuration
The Kubernetes cluster in this solution will be installed using Kubespray with a non-root depuser account from the deployment node.
SSH Private Key and SSH Passwordless Login
Log in to the deployment node as a deployment user (in this case, depuser) and create an SSH private key for configuring the passwordless authentication on your computer by running the following commands:
$ ssh-keygen Generating public/private rsa key pair. Enter file in which to save the key (/home/depuser/.ssh/id_rsa): Created directory '/home/depuser/.ssh'. Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /home/depuser/.ssh/id_rsa Your public key has been saved in /home/depuser/.ssh/id_rsa.pub The key fingerprint is: SHA256:IfcjdT/spXVHVd3n6wm1OmaWUXGuHnPmvqoXZ6WZYl0 depuser@depserver The key's randomart image is: +---[RSA 3072]----+ | *| | .*| | . o . . o=| | o + . o +E| | S o .**O| | . .o=OX=| | . o%*.| | O.o.| | .*.ooo| +----[SHA256]-----+
Copy your SSH private key, such as ~/.ssh/id_rsa, to all nodes in the deployment by running the following command (example):
$ ssh-copy-id depuser@192.168.222.111 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/depuser/.ssh/id_rsa.pub" The authenticity of host '192.168.222.111 (192.168.222.111)' can't be established. ECDSA key fingerprint is SHA256:6nhUgRlt9gY2Y2ofukUqE0ltH+derQuLsI39dFHe0Ag. Are you sure you want to continue connecting (yes/no/[fingerprint])? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys depuser@192.168.222.111's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'depuser@192.168.222.111'" and check to make sure that only the key(s) you wanted were added.
Verify that you have passwordless SSH connectivity to all nodes in your deployment by running the following command (example):
$ ssh depuser@192.168.222.111
Kubespray Deployment and Configuration
General Setting
To install the dependencies for running Kubespray with Ansible on the deployment node, run the following commands:
$ cd ~
$ sudo apt -y install python3-pip jq
$ wget https://github.com/kubernetes-sigs/kubespray/archive/v2.15.0.tar.gz
$ tar -zxf v2.15.0.tar.gz
$ cd kubespray-2.15.0
$ sudo pip3 install -r requirements.txt
Deployment Customization
Create a new cluster configuration and host configuration file.
Replace the IP addresses below with your nodes' IP addresses:
$ cp -rfp inventory/sample inventory/mycluster
$ declare -a IPS=(192.168.222.111 192.168.222.101 192.168.222.102)
$ CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
As a result, the inventory/mycluster/hosts.yaml file will be created.
Review and change the host configuration in the file. Below is an example for this deployment:
all:
  hosts:
    node1:
      ansible_host: 192.168.222.111
      ip: 192.168.222.111
      access_ip: 192.168.222.111
    node2:
      ansible_host: 192.168.222.101
      ip: 192.168.222.101
      access_ip: 192.168.222.101
    node3:
      ansible_host: 192.168.222.102
      ip: 192.168.222.102
      access_ip: 192.168.222.102
  children:
    kube-master:
      hosts:
        node1:
    kube-node:
      hosts:
        node2:
        node3:
    etcd:
      hosts:
        node1:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
Review and change cluster installation parameters in the files:
- inventory/mycluster/group_vars/all/all.yml
- inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
In inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml, set the default Kubernetes CNI by setting the desired kube_network_plugin value (default: calico):
...
# Choose network plugin (cilium, calico, contiv, weave or flannel. Use cni for generic cni plugin)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: calico

# Setting multi_networking to true will install Multus: https://github.com/intel/multus-cni
kube_network_plugin_multus: false
...
Choosing the Container Runtime
In this guide, containerd was chosen as the container runtime for the K8s cluster deployment because Docker support (via dockershim) is being deprecated in Kubernetes.
To use the containerd container runtime, set the following variables:
In inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml:
inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
...
## Container runtime
## docker for docker, crio for cri-o and containerd for containerd.
container_manager: containerd
...
In inventory/mycluster/group_vars/all/all.yml:
inventory/mycluster/group_vars/all/all.yml
...
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: true
...
In inventory/mycluster/group_vars/etcd.yml:
inventory/mycluster/group_vars/etcd.yml
...
## Settings for etcd deployment type
etcd_deployment_type: host
...
Deploying the Cluster Using KubeSpray Ansible Playbook
Run the following line to start the deployment process:
$ ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
The deployment takes a while to complete; please make sure no errors are encountered.
A successful result should look something like the following:
...
PLAY RECAP *********************************************************************************************************
localhost   : ok=3    changed=0    unreachable=0    failed=0    skipped=0     rescued=0    ignored=0
node1       : ok=554  changed=81   unreachable=0    failed=0    skipped=1152  rescued=0    ignored=2
node2       : ok=360  changed=42   unreachable=0    failed=0    skipped=633   rescued=0    ignored=1
node3       : ok=360  changed=42   unreachable=0    failed=0    skipped=632   rescued=0    ignored=1

Sunday 11 July 2021  22:36:04 +0000 (0:00:00.053)       0:06:51.785 ************
===============================================================================
kubernetes/kubeadm : Join to cluster ----------------------------------------------------------------------- 37.24s
kubernetes/control-plane : kubeadm | Initialize first master ----------------------------------------------- 28.29s
download_file | Download item ------------------------------------------------------------------------------ 16.57s
kubernetes/control-plane : Master | wait for kube-scheduler ------------------------------------------------ 14.23s
download_container | Download image if required ------------------------------------------------------------ 11.06s
download_container | Download image if required ------------------------------------------------------------- 9.18s
download_file | Download item ------------------------------------------------------------------------------- 8.61s
kubernetes-apps/ansible : Kubernetes Apps | Start Resources ------------------------------------------------- 7.02s
container-engine/crictl : download_file | Download item ----------------------------------------------------- 5.78s
download_container | Download image if required ------------------------------------------------------------- 5.52s
Configure | Check if etcd cluster is healthy ---------------------------------------------------------------- 5.24s
download_file | Download item ------------------------------------------------------------------------------- 4.89s
download_container | Download image if required ------------------------------------------------------------- 4.81s
kubernetes-apps/ansible : Kubernetes Apps | Lay Down CoreDNS templates -------------------------------------- 4.68s
reload etcd ------------------------------------------------------------------------------------------------- 4.65s
download_file | Download item ------------------------------------------------------------------------------- 4.24s
kubernetes/preinstall : Get current calico cluster version -------------------------------------------------- 3.70s
network_plugin/calico : Start Calico resources -------------------------------------------------------------- 3.42s
container-engine/crictl : extract_file | Unpacking archive -------------------------------------------------- 3.35s
kubernetes-apps/cluster_roles : Apply workaround to allow all nodes with cert O=system:nodes to register ---- 3.32s
K8s Cluster Customization
Now that the K8S cluster is deployed, connect to the K8S master node with the root user account in order to customize deployment.
Label the worker nodes.
Master Node console
# kubectl label nodes node2 node-role.kubernetes.io/worker=
# kubectl label nodes node3 node-role.kubernetes.io/worker=
K8S Cluster Deployment Verification
Following is an output example of K8s cluster deployment information using the Calico CNI plugin.
To ensure that the Kubernetes cluster is installed correctly, run the following commands:
# kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME node1 Ready master 44m v1.19.7 192.168.222.111 <none> Ubuntu 20.04.2 LTS 5.4.0-72-generic containerd://1.4.4 node2 Ready worker 42m v1.19.7 192.168.222.101 <none> Ubuntu 20.04.2 LTS 5.4.0-72-generic containerd://1.4.4 node3 Ready worker 42m v1.19.7 192.168.222.102 <none> Ubuntu 20.04.2 LTS 5.4.0-72-generic containerd://1.4.4 # kubectl -n kube-system get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES calico-kube-controllers-8b5ff5d58-ph86x 1/1 Running 0 43m 192.168.222.101 node2 <none> <none> calico-node-l48qg 1/1 Running 0 43m 192.168.222.102 node3 <none> <none> calico-node-ldx7w 1/1 Running 0 43m 192.168.222.111 node1 <none> <none> calico-node-x9bh5 1/1 Running 0 43m 192.168.222.101 node2 <none> <none> coredns-85967d65-pslmm 1/1 Running 0 27m 10.233.96.1 node2 <none> <none> coredns-85967d65-qp2rl 1/1 Running 0 43m 10.233.90.230 node1 <none> <none> dns-autoscaler-5b7b5c9b6f-8wb67 1/1 Running 0 43m 10.233.90.229 node1 <none> <none> etcd-node1 1/1 Running 0 45m 192.168.222.111 node1 <none> <none> kube-apiserver-node1 1/1 Running 0 45m 192.168.222.111 node1 <none> <none> kube-controller-manager-node1 1/1 Running 0 45m 192.168.222.111 node1 <none> <none> kube-proxy-6p4rm 1/1 Running 0 44m 192.168.222.101 node2 <none> <none> kube-proxy-8bj6s 1/1 Running 0 44m 192.168.222.111 node1 <none> <none> kube-proxy-dj4l8 1/1 Running 0 44m 192.168.222.102 node3 <none> <none> kube-scheduler-node1 1/1 Running 0 45m 192.168.222.111 node1 <none> <none> nginx-proxy-node2 1/1 Running 0 44m 192.168.222.101 node2 <none> <none> nginx-proxy-node3 1/1 Running 0 44m 192.168.222.102 node3 <none> <none> nodelocaldns-8b6kf 1/1 Running 0 43m 192.168.222.102 node3 <none> <none> nodelocaldns-kzmmh 1/1 Running 0 43m 192.168.222.101 node2 <none> <none> nodelocaldns-zh9fz 1/1 Running 0 43m 192.168.222.111 node1 <none> <none>
NVIDIA Network Operator Installation for K8S Cluster
NVIDIA Network Operator leverages Kubernetes CRDs and Operator SDK to manage networking-related components in order to enable fast networking and RDMA for workloads in K8s cluster. The Fast Network is a secondary network of the K8s cluster for applications that require high bandwidth or low latency.
To make it work, several components need to be provisioned and configured. All operator configuration and installation steps should be performed from the K8S master node with the root user account.
Prerequisites
Install Helm.
Master Node console
# snap install helm --classic
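To confirm the Helm client is available, a quick check can be run (a hedged example, not part of the original procedure):
Master Node console
# helm version --short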
Install additional RDMA CNI plugin
RDMA CNI plugin allows network namespace isolation for RDMA workloads in a containerized environment.
Deploy the CNI using the following YAML file:
Master Node console
# kubectl apply -f https://raw.githubusercontent.com/Mellanox/rdma-cni/master/deployment/rdma-cni-daemonset.yaml
To ensure the plugin is installed correctly, run the following command:
Master Node console# kubectl -n kube-system get pods -o wide | egrep "rdma" kube-rdma-cni-ds-5zl8d 1/1 Running 0 11m 192.168.222.102 node3 <none> <none> kube-rdma-cni-ds-q74n5 1/1 Running 0 11m 192.168.222.101 node2 <none> <none> kube-rdma-cni-ds-rnqkr 1/1 Running 0 11m 192.168.222.111 node1 <none> <none>
Deployment
Add the NVIDIA Network Operator Helm repository:
# helm repo add mellanox https://mellanox.github.io/network-operator
# helm repo update
Create the values.yaml file in user home folder (example):
nfd:
  enabled: true
sriovNetworkOperator:
  enabled: true
# NicClusterPolicy CR values:
deployCR: true
ofedDriver:
  deploy: false
nvPeerDriver:
  deploy: false
rdmaSharedDevicePlugin:
  deploy: false
sriovDevicePlugin:
  deploy: false
secondaryNetwork:
  deploy: true
  cniPlugins:
    deploy: true
    image: containernetworking-plugins
    repository: mellanox
    version: v0.8.7
    imagePullSecrets: []
  multus:
    deploy: true
    image: multus
    repository: nfvpe
    version: v3.6
    imagePullSecrets: []
    config: ''
  ipamPlugin:
    deploy: true
    image: whereabouts
    repository: mellanox
    version: v0.3
    imagePullSecrets: []
Deploy the operator:
# helm install -f ./values.yaml -n network-operator --create-namespace --wait mellanox/network-operator --generate-name
NAME: network-operator
LAST DEPLOYED: Sun Jul 11 23:06:54 2021
NAMESPACE: network-operator
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Get Network Operator deployed resources by running the following commands:

$ kubectl -n network-operator get pods
$ kubectl -n mlnx-network-operator-resources get pods
To ensure that the Operator is deployed correctly, run the following commands:
# kubectl -n network-operator get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES network-operator-1627211751-5bd467cbd9-2hwqx 1/1 Running 0 29h 10.233.90.5 node1 <none> <none> network-operator-1627211751-node-feature-discovery-master-dgs69 1/1 Running 0 29h 10.233.90.6 node1 <none> <none> network-operator-1627211751-node-feature-discovery-worker-7n6gs 1/1 Running 0 29h 10.233.90.3 node1 <none> <none> network-operator-1627211751-node-feature-discovery-worker-sjdxw 1/1 Running 1 29h 10.233.96.7 node2 <none> <none> network-operator-1627211751-node-feature-discovery-worker-vzpvg 1/1 Running 1 29h 10.233.92.5 node3 <none> <none> network-operator-1627211751-sriov-network-operator-5f869696sdzp 1/1 Running 0 29h 10.233.90.4 node1 <none> <none>
High-Speed Network Configuration
After installing the operator, please check the SriovNetworkNodeState CRs to see all SRIOV-enabled devices in your node.
In our deployment, the network interface named ens2f0 was chosen. To review the interface status, use the following command:
# kubectl -n network-operator get sriovnetworknodestates.sriovnetwork.openshift.io node2 -o yaml
...
status:
  interfaces:
  - deviceID: 101d
    driver: mlx5_core
    linkSpeed: 100000 Mb/s
    linkType: ETH
    mac: 0c:42:a1:2b:74:ae
    mtu: 1500
    name: ens2f0
    pciAddress: "0000:07:00.0"
    totalvfs: 8
    vendor: 15b3
  - deviceID: 101d
    driver: mlx5_core
    linkType: ETH
    mac: 0c:42:a1:2b:74:af
    mtu: 1500
    name: ens2f1
    pciAddress: "0000:07:00.1"
    totalvfs: 8
    vendor: 15b3
...
Create a SriovNetworkNodePolicy CR file, policy.yaml, specifying the chosen interface in the nicSelector (in this example, the ens2f0 interface):
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: mlnxnics
  namespace: network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  resourceName: mlnx2f0
  priority: 98
  mtu: 9000
  numVfs: 8
  nicSelector:
    vendor: "15b3"
    pfNames: [ "ens2f0" ]
  deviceType: netdevice
  isRdma: true
Deploy policy.yaml:
# kubectl apply -f policy.yaml
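As an optional check (a hedged sketch, not part of the original procedure), the node state can be re-queried to confirm that the policy was applied; the syncStatus field should eventually report Succeeded:
Master Node console
# kubectl -n network-operator get sriovnetworknodestates.sriovnetwork.openshift.io node2 -o jsonpath='{.status.syncStatus}'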
Create a SriovNetwork CR file, network.yaml, which refers to the resourceName defined in the SriovNetworkNodePolicy (in this example, it references the mlnx2f0 resource and sets 192.168.101.0/24 as the CIDR range for the high-speed network):
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: "netmlnx2f0"
  namespace: network-operator
spec:
  ipam: |
    {
      "datastore": "kubernetes",
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
      },
      "log_file": "/tmp/whereabouts.log",
      "log_level": "debug",
      "type": "whereabouts",
      "range": "192.168.101.0/24"
    }
  vlan: 0
  networkNamespace: "default"
  spoofChk: "off"
  resourceName: "mlnx2f0"
  linkState: "enable"
  metaPlugins: |
    {
      "type": "rdma"
    }
Deploy network.yaml:
# kubectl apply -f network.yaml
Validating the Deployment
Check if the deployment is finished successfully:
# kubectl -n nvidia-network-operator-resources get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES cni-plugins-ds-f548q 1/1 Running 1 30m 192.168.222.101 node2 <none> <none> cni-plugins-ds-qw7hx 1/1 Running 1 30m 192.168.222.102 node3 <none> <none> kube-multus-ds-cjbf9 1/1 Running 1 30m 192.168.222.102 node3 <none> <none> kube-multus-ds-rgc95 1/1 Running 1 30m 192.168.222.101 node2 <none> <none> whereabouts-gwr7p 1/1 Running 1 30m 192.168.222.101 node2 <none> <none> whereabouts-n29nq 1/1 Running 1 30m 192.168.222.102 node3 <none> <none>
Check deployed network:
# kubectl get network-attachment-definitions.k8s.cni.cncf.io NAME AGE netmlnx2f0 4m56s
Check worker node resources:
# kubectl describe nodes node2
...
Addresses:
  InternalIP:  192.168.222.101
  Hostname:    node2
Capacity:
  cpu:                 24
  ephemeral-storage:   229698892Ki
  hugepages-1Gi:       0
  hugepages-2Mi:       0
  memory:              264030604Ki
  nvidia.com/mlnx2f0:  8
  pods:                110
Allocatable:
  cpu:                 23900m
  ephemeral-storage:   211690498517
  hugepages-1Gi:       0
  hugepages-2Mi:       0
  memory:              242694540Ki
  nvidia.com/mlnx2f0:  8
  pods:                110
...
Manage HugePages
Kubernetes supports the allocation and consumption of pre-allocated HugePages by applications in a Pod. The nodes automatically discover and report all HugePages resources as schedulable resources. For additional information on K8s HugePages management, please refer here.
To allocate HugePages, modify the GRUB_CMDLINE_LINUX_DEFAULT parameter in /etc/default/grub on each worker node. The setting below allocates 1GB * 16 pages = 16GB and 2MB * 2048 pages = 4GB of HugePages at boot time:
... GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=16 hugepagesz=2M hugepages=2048" ...
Run update-grub to apply the config to grub and reboot server:
# update-grub
# reboot
After the server comes back, check the HugePages allocation from the master node with the following command:
# kubectl describe nodes node2
...
Capacity:
  cpu:                 24
  ephemeral-storage:   229698892Ki
  hugepages-1Gi:       16Gi
  hugepages-2Mi:       4Gi
  memory:              264030604Ki
  nvidia.com/mlnx2f0:  8
  pods:                110
Allocatable:
  cpu:                 23900m
  ephemeral-storage:   211690498517
  hugepages-1Gi:       16Gi
  hugepages-2Mi:       4Gi
  memory:              242694540Ki
  nvidia.com/mlnx2f0:  8
  pods:                110
...
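The allocation can also be confirmed directly on the worker node (a hedged example):
Worker Node console
# grep Huge /proc/meminfo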
Enable CPU and Topology Management
CPU Manager manages groups of CPUs and constrains workloads to specific CPUs.
CPU Manager is useful for workloads that have some of these attributes:
- Require as much CPU time as possible
- Are sensitive to processor cache misses
- Are low-latency network applications
- Coordinate with other processes and benefit from sharing a single processor cache
Topology Manager uses topology information from collected hints to decide if a pod can be accepted or rejected on a node, based on the configured Topology Manager policy and Pod resources requested. In order to extract the best performance, optimizations related to CPU isolation and memory and device locality are required.
Topology Manager is useful for workloads that use hardware accelerators to support latency-critical execution and high throughput parallel computation.
To use Topology Manager, CPU Manager with static policy must be used.
For additional information, please refer to Control CPU Management Policies on the Node and Control Topology Management Policies on a Node.
To enable the CPU Manager and Topology Manager, add the following lines to the kubelet configuration file /etc/kubernetes/kubelet-config.yaml on each worker node:
...
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 10s
topologyManagerPolicy: single-numa-node
featureGates:
  CPUManager: true
  TopologyManager: true
Due to changes in cpuManagerPolicy, remove /var/lib/kubelet/cpu_manager_state and restart kubelet service on each affected K8s worker node.
# rm -f /var/lib/kubelet/cpu_manager_state
# service kubelet restart
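To confirm that the static policy took effect, the regenerated state file can be inspected on the worker node (a hedged example; kubelet recreates the file on restart and its policyName field should show static):
Worker Node console
# cat /var/lib/kubelet/cpu_manager_state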
Application
DPDK traffic emulation is shown in the Testbed Flow Diagram below. Traffic is pushed from the TRex server via the ens2f0 interface to the TESTPMD pod via the SR-IOV network interface net1. The TESTPMD pod swaps the MAC addresses and routes the ingress traffic back through the same net1 interface to the same interface on the TRex server.
Verification
Create a sample deployment test-deployment.yaml (container image should include InfiniBand userspace drivers and performance tools):
test-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlnx-inbox-pod
  labels:
    app: sriov
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sriov
  template:
    metadata:
      labels:
        app: sriov
      annotations:
        k8s.v1.cni.cncf.io/networks: netmlnx2f0
    spec:
      containers:
      - image: < Container image >
        name: mlnx-inbox-ctr
        securityContext:
          capabilities:
            add: [ "IPC_LOCK" ]
        resources:
          requests:
            cpu: 4
            nvidia.com/mlnx2f0: 1
          limits:
            cpu: 4
            nvidia.com/mlnx2f0: 1
        command:
        - sh
        - -c
        - sleep inf
Deploy the sample deployment.
Master Node console# kubectl apply -f test-deployment.yaml
Verify the deployment is running.
Master Node console# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES mlnx-inbox-pod-599dc445c8-72x6g 1/1 Running 0 12s 10.233.96.5 node2 <none> <none> mlnx-inbox-pod-599dc445c8-v5lnx 1/1 Running 0 12s 10.233.92.4 node3 <none> <none>
Check available network interfaces in POD.
Master Node console# kubectl exec -it mlnx-inbox-pod-599dc445c8-72x6g -- bash root@mlnx-inbox-pod-599dc445c8-72x6g:/tmp# rdma link link rocep7s0f0v2/1 state ACTIVE physical_state LINK_UP netdev net1 root@mlnx-inbox-pod-599dc445c8-72x6g:/tmp# ip a s 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0 4: eth0@if208: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default link/ether 12:51:ab:b3:ef:26 brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 10.233.96.5/32 brd 10.233.96.5 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::1051:abff:feb3:ef26/64 scope link valid_lft forever preferred_lft forever 201: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000 link/ether 02:40:7d:5e:5f:af brd ff:ff:ff:ff:ff:ff inet 192.168.101.2/24 brd 192.168.101.255 scope global net1 valid_lft forever preferred_lft forever inet6 fe80::40:7dff:fe5e:5faf/64 scope link valid_lft forever preferred_lft forever
Run a synthetic RDMA benchmark with ib_write_bw, a bandwidth test that uses RDMA write transactions.
Server
ib_write_bw -F -d $IB_DEV_NAME --report_gbits
Client
ib_write_bw -F $SERVER_IP -d $IB_DEV_NAME --report_gbits
Please open two consoles to the K8s master node: one for the server side and one for the client side.
In the first console (server side), run the following commands:
Master Node console
# kubectl exec -it mlnx-inbox-pod-599dc445c8-72x6g -- bash
root@mlnx-inbox-pod-599dc445c8-72x6g:/tmp# ip a s net1 | grep inet
    inet 192.168.101.2/24 brd 192.168.101.255 scope global net1
    inet6 fe80::40:7dff:fe5e:5faf/64 scope link
root@mlnx-inbox-pod-599dc445c8-72x6g:/tmp# rdma link
link rocep7s0f0v2/1 state ACTIVE physical_state LINK_UP netdev net1
root@mlnx-inbox-pod-599dc445c8-72x6g:/tmp# ib_write_bw -F -d rocep7s0f0v2 --report_gbits

************************************
* Waiting for client to connect... *
************************************
In a second console (client side) to K8s master node, run the following commands:
Master Node console# kubectl exec -it mlnx-inbox-pod-599dc445c8-v5lnx -- bash root@mlnx-inbox-pod-599dc445c8-v5lnx:/tmp# rdma link link rocep7s0f0v3/1 state ACTIVE physical_state LINK_UP netdev net1 root@mlnx-inbox-pod-599dc445c8-v5lnx:/tmp# ib_write_bw -F -d rocep7s0f0v3 192.168.101.2 --report_gbits --------------------------------------------------------------------------------------- RDMA_Write BW Test Dual-port : OFF Device : rocep7s0f0v3 Number of qps : 1 Transport type : IB Connection type : RC Using SRQ : OFF TX depth : 128 CQ Moderation : 100 Mtu : 4096[B] Link type : Ethernet GID index : 2 Max inline data : 0[B] rdma_cm QPs : OFF Data ex. method : Ethernet --------------------------------------------------------------------------------------- local address: LID 0000 QPN 0x01f2 PSN 0x75e7cf RKey 0x050e26 VAddr 0x007f51e51b9000 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:101:01 remote address: LID 0000 QPN 0x00f2 PSN 0x13427f RKey 0x010e26 VAddr 0x007f1ecaac8000 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:101:02 --------------------------------------------------------------------------------------- #bytes #iterations BW peak[Gb/sec] BW average[Gb/sec] MsgRate[Mpps] 65536 5000 94.26 92.87 0.169509 ---------------------------------------------------------------------------------------
TRex Server Deployment
This guide uses TRex package v2.87.
For detailed TRex installation and configuration guide, please refer to TRex Documentation.
The TRex installation and configuration steps are performed with the root user account.
Prerequisites
For the TRex server, a standard server with the RDMA subsystem installed is used.
Activate the network interfaces used by the TRex application with netplan.
In our deployment, interfaces ens2f0 and ens2f1 are used:
# This is the network config written by 'subiquity'
network:
  ethernets:
    ens4f0:
      dhcp4: true
      dhcp-identifier: mac
    ens2f0: {}
    ens2f1: {}
  version: 2
Then re-apply netplan and check link status for ens2f0/ens2f1 network interfaces.
# netplan apply
# rdma link
link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev ens2f0
link mlx5_1/1 state ACTIVE physical_state LINK_UP netdev ens2f1
link mlx5_2/1 state ACTIVE physical_state LINK_UP netdev ens4f0
link mlx5_3/1 state DOWN physical_state DISABLED netdev ens4f1
Update the MTU size for interfaces ens2f0 and ens2f1.
# ip link set ens2f0 mtu 9000
# ip link set ens2f1 mtu 9000
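To verify the link state and MTU before starting TRex (a hedged example):
TRex server console
# ip link show ens2f0 | grep mtu
# ip link show ens2f1 | grep mtu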
Installation
Create the TRex working directory and obtain the TRex package.
# cd /tmp
# wget https://trex-tgn.cisco.com/trex/release/v2.87.tar.gz --no-check-certificate
# mkdir /scratch
# cd /scratch
# tar -zxf /tmp/v2.87.tar.gz
# chmod 777 -R /scratch
First-Time Scripts
The next steps continue from the /scratch/v2.87 folder.
Run TRex configuration script in interactive mode. Follow the instructions on the screen to create a basic config file /etc/trex_cfg.yaml:
# ./dpdk_setup_ports.py -i
The /etc/trex_cfg.yaml configuration file is created. Later we'll change it to suit our setup.
Appendix
Performance Testing
Below, a performance test is shown of DPDK traffic emulation between TRex traffic generator and TESTPMD application running on the K8s worker node, in accordance with the Testbed diagram presented above.
Prerequisites
Before starting the test, update the TRex configuration file /etc/trex_cfg.yaml with the MAC address of the high-performance interface of the TESTPMD pod. Below are the steps to complete this update.
Run a pod with the TESTPMD application on the K8s cluster according to the YAML configuration file testpmd-inbox.yaml presented below (the container image should include InfiniBand userspace drivers and the TESTPMD application):
testpmd-inbox.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
  labels:
    app: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
      annotations:
        k8s.v1.cni.cncf.io/networks: netmlnx2f0
    spec:
      containers:
      - image: < container image >
        name: test-pod
        securityContext:
          capabilities:
            add: [ "IPC_LOCK" ]
        volumeMounts:
        - mountPath: /hugepages
          name: hugepage
        resources:
          requests:
            hugepages-1Gi: 2Gi
            memory: 16Gi
            cpu: 8
            nvidia.com/mlnx2f0: 1
          limits:
            hugepages-1Gi: 2Gi
            memory: 16Gi
            cpu: 8
            nvidia.com/mlnx2f0: 1
        command:
        - sh
        - -c
        - sleep inf
      volumes:
      - name: hugepage
        emptyDir:
          medium: HugePages
Deploy the deployment with the following command:
Master Node console# kubectl apply -f testpmd-inbox.yaml
Get the network information from the deployed pod by running the following:
Master Node console
# kubectl get pod -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
test-deployment-676476c78d-glbfs   1/1     Running   0          30s   10.233.92.5   node3   <none>           <none>

# kubectl exec -it test-deployment-676476c78d-glbfs -- ip a s net1
193: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 32:f9:3f:e3:dc:89 brd ff:ff:ff:ff:ff:ff
    inet 192.168.101.3/24 brd 192.168.101.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::30f9:3fff:fee3:dc89/64 scope link
       valid_lft forever preferred_lft forever
Update the TRex configuration file /etc/trex_cfg.yaml with the MAC address of the NET1 network interface, 32:f9:3f:e3:dc:89:
/etc/trex_cfg.yaml
### Config file generated by dpdk_setup_ports.py ###
- version: 2
  interfaces: ['07:00.0', '0d:00.0']
  port_info:
    - dest_mac: 32:f9:3f:e3:dc:89  # MAC OF NET1 INTERFACE
      src_mac:  0c:42:a1:24:05:1a
    - dest_mac: 32:f9:3f:e3:dc:89  # MAC OF NET1 INTERFACE
      src_mac:  0c:42:a1:24:05:1b
  platform:
    master_thread_id: 0
    latency_thread_id: 12
    dual_if:
      - socket: 0
        threads: [1,2,3,4,5,6,7,8,9,10,11]
DPDK Emulation Test
Run TESTPMD apps in container:
Master Node console
# kubectl exec -it test-deployment-676476c78d-glbfs -- bash
root@test-deployment-676476c78d-glbfs:/tmp# dpdk-testpmd -c 0x1fe -m 1024 -w $PCIDEVICE_NVIDIA_COM_MLNX2F0 -- --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=8 --txq=8 --nb-cores=4 --rss-udp --forward-mode=macswap -a -i
...
testpmd>
Specific TESTPMD parameters:
$PCIDEVICE_NVIDIA_COM_MLNX2F0 - an environment variable that contains the PCI address of the VF allocated for the NET1 interface
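This variable is injected into the pod environment by the SR-IOV device plugin for the requested mlnx2f0 resource; it can be inspected from the master node (a hedged example) to see the allocated PCI address:
Master Node console
# kubectl exec -it test-deployment-676476c78d-glbfs -- env | grep PCIDEVICE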
More information about additional TESTPMD parameters:
https://doc.dpdk.org/guides/testpmd_app_ug/run_app.html?highlight=testpmd
https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html
Run the TRex traffic generator on the TRex server:
TRex server console
# cd /scratch/v2.87/
# ./t-rex-64 -v 7 -i -c 11 --no-ofed-check
Open a second screen to the TRex server and create a traffic generation file mlnx-trex.py in the /scratch/v2.87 folder:
mlnx-trex.py
from trex_stl_lib.api import *

class STLS1(object):

    def create_stream (self):
        pkt = Ether()/IP(src="16.0.0.1",dst="48.0.0.1")/UDP(dport=12)/(22*'x')
        vm = STLScVmRaw( [
            STLVmFlowVar(name="v_port", min_value=4337, max_value=5337, size=2, op="inc"),
            STLVmWrFlowVar(fv_name="v_port", pkt_offset="UDP.sport"),
            STLVmFixChecksumHw(l3_offset="IP", l4_offset="UDP", l4_type=CTRexVmInsFixHwCs.L4_TYPE_UDP),
        ] )
        return STLStream(packet = STLPktBuilder(pkt = pkt, vm = vm),
                         mode = STLTXCont(pps = 8000000))

    def get_streams (self, direction = 0, **kwargs):
        # create 1 stream
        return [ self.create_stream() ]

# dynamic load - used for trex console or simulator
def register():
    return STLS1()
Then run the TRex console and generate traffic to the TESTPMD pod:
TRex server console
# cd /scratch/v2.87/
# ./trex-console
Using 'python3' as Python interpeter
Connecting to RPC server on localhost:4501       [SUCCESS]
Connecting to publisher server on localhost:4500 [SUCCESS]
Acquiring ports [0, 1]:                          [SUCCESS]

Server Info:
Server version:   v2.87 @ STL
Server mode:      Stateless
Server CPU:       11 x Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Ports count:      2 x 100Gbps @ MT2892 Family [ConnectX-6 Dx]

-=TRex Console v3.0=-

Type 'help' or '?' for supported actions

trex> tui<enter>
...
tui> start -f mlnx-trex.py -m 45mpps -p 0
...
Global Statistitcs
connection   : localhost, Port 4501                    total_tx_L2  : 23.9 Gbps
version      : STL @ v2.87                             total_tx_L1  : 30.93 Gbps
cpu_util.    : 82.88% @ 11 cores (11 per dual port)    total_rx     : 25.31 Gbps
rx_cpu_util. : 0.0% / 0 pps                            total_pps    : 44.84 Mpps
async_util.  : 0.05% / 11.22 Kbps                      drop_rate    : 0 bps
total_cps.   : 0 cps                                   queue_full   : 0 pkts
...
Summary
From the above test, it is evident that the desired traffic rate of 45 Mpps is reached with an SR-IOV network port in the pod.
To obtain better results, additional application tuning is required for TRex and TESTPMD.
Done!
Authors
Vitaliy Razinkov
Over the past few years, Vitaliy Razinkov has been working as a Solutions Architect on the NVIDIA Networking team, responsible for complex Kubernetes/OpenShift and Microsoft's leading solutions, research and design. He previously spent more than 25 years in senior positions at several companies. Vitaliy has written several reference design guides on Microsoft technologies, RoCE/RDMA accelerated machine learning in Kubernetes/OpenShift, and container solutions, all of which are available on the NVIDIA Networking Documentation website.
Amir Zeidner
For the past several years, Amir has worked as a Solutions Architect primarily in the Telco space, leading advanced solutions to answer 5G, NFV, and SDN networking infrastructure requirements. Amir's expertise in data plane acceleration technologies, such as Accelerated Switching and Network Processing (ASAP²) and DPDK, together with a deep knowledge of open source cloud-based infrastructures, allows him to promote and deliver unique end-to-end NVIDIA Networking solutions throughout the Telco world.
Related Documents