Performance Tests
Apply the following NetworkPolicy to enable stateless traffic:
stateless_netpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: multi-port-egress
  namespace: default
  annotations:
    k8s.ovn.org/acl-stateless: "true"
spec:
  podSelector: {}
  policyTypes:
    - Egress
    - Ingress
  egress:
    - {}
  ingress:
    - {}
Jump Node Console
$ kubectl apply -f stateless_netpolicy.yaml
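Optionally, verify that the policy was created with the stateless annotation (a quick check using standard kubectl; the k8s.ovn.org/acl-stateless: "true" annotation applied above should appear in the output):
Jump Node Console
$ kubectl get networkpolicy multi-port-egress -n default -o yaml | grep acl-stateless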
Create a test Deployment using the following YAML to create 2 replicas on 2 different worker nodes:
Note: The container image specified below must include NVIDIA user space drivers and perftest.
testapp-performance-test-deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: testapp-performance
  labels:
    app: testapp-performance
spec:
  replicas: 2
  selector:
    matchLabels:
      app: testapp-performance
  template:
    metadata:
      labels:
        app: testapp-performance
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: testapp-performance
      containers:
        - name: testapp-pod
          image: <container_image>
          imagePullPolicy: Always
          command: ['sh', '-c', 'trap : TERM INT; sleep infinity & wait']
          securityContext:
            capabilities:
              add: ["IPC_LOCK"]
          resources:
            requests:
              cpu: '24'
              memory: '8Gi'
            limits:
              cpu: '24'
              memory: '8Gi'
Apply the resource:
Jump Node Console
$ kubectl apply -f testapp-performance-test-deployment.yaml
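Optionally, wait for the rollout to complete before checking the pods (standard kubectl, not specific to this setup):
Jump Node Console
$ kubectl rollout status deployment/testapp-performance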
Validate that the deployment is running successfully:
Jump Node Console
$ kubectl get pods -o wide
NAME                                   READY   STATUS    RESTARTS   AGE   IP            NODE      NOMINATED NODE   READINESS GATES
testapp-performance-799bfd6767-4bp9h   1/1     Running   0          94s   10.233.68.3   worker2   <none>           <none>
testapp-performance-799bfd6767-gmz8f   1/1     Running   0          94s   10.233.67.3   worker1   <none>           <none>
Connect to one of the pods in the Deployment:
Jump Node Console
$ kubectl exec -it testapp-performance-799bfd6767-4bp9h -- bash
From within the container, check the IP address on its interface and verify that the interface is recognized as an RDMA device:
First Pod Console
root@testapp-performance-799bfd6767-4bp9h:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
132: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8940 qdisc mq state UP group default qlen 1000
    link/ether 0a:58:0a:e9:44:03 brd ff:ff:ff:ff:ff:ff permaddr 96:7d:4c:cd:19:09
    altname enp137s0f0v38
    inet 10.233.68.3/24 brd 10.233.68.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::947d:4cff:fecd:1909/64 scope link
       valid_lft forever preferred_lft forever
root@testapp-performance-799bfd6767-4bp9h:/# rdma link | grep eth0
link mlx5_40/1 state ACTIVE physical_state LINK_UP netdev eth0
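Optionally, you can inspect the RDMA device reported above in more detail with ibv_devinfo (part of the rdma-core user space tools; the device name mlx5_40 is taken from the rdma link output and may differ in your environment):
First Pod Console
root@testapp-performance-799bfd6767-4bp9h:/# ibv_devinfo -d mlx5_40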
Start the ib_read_lat server side:
First Pod Console
root@testapp-performance-799bfd6767-4bp9h:/# ib_read_lat -F -n 20000

************************************
* Waiting for client to connect... *
************************************
Using another console window, reconnect to the jump node and connect to the second pod in the deployment.
Jump Node Console
$ kubectl exec -it testapp-performance-799bfd6767-gmz8f -- bash
From within the container, start the ib_read_lat client (use the IP address from the server-side container) and check the latency results:
Second Pod Console
root@testapp-performance-799bfd6767-gmz8f:/# ib_read_lat -F -n 20000 10.233.68.3
---------------------------------------------------------------------------------------
                    RDMA_Read Latency Test
 Dual-port       : OFF          Device         : mlx5_36
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : ON
 TX depth        : 1
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 3
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0a0b PSN 0x91cb30 OUT 0x10 RKey 0x075605 VAddr 0x006041c4ae0000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:233:67:03
 remote address: LID 0000 QPN 0x09eb PSN 0x6adcbb OUT 0x10 RKey 0x06a505 VAddr 0x0059cc9a982000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:233:68:03
---------------------------------------------------------------------------------------
 #bytes  #iterations  t_min[usec]  t_max[usec]  t_typical[usec]  t_avg[usec]  t_stdev[usec]  99% percentile[usec]  99.9% percentile[usec]
 2       20000        4.45         13.84        4.66             5.61         1.23           9.37                  11.25
---------------------------------------------------------------------------------------
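If you also want a quick RDMA bandwidth figure in addition to latency, perftest ships ib_write_bw, which follows the same server/client pattern (a sketch only; not part of the measured example above):
First Pod Console
root@testapp-performance-799bfd6767-4bp9h:/# ib_write_bw -F --report_gbits
Second Pod Console
root@testapp-performance-799bfd6767-gmz8f:/# ib_write_bw -F --report_gbits 10.233.68.3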
Create a test Deployment using the YAML from the previous example to create a pod on each worker that you can use to test TCP connectivity and performance.
Note: The container image specified in the test must include iperf3.
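You can optionally confirm that the tool is available in the running pods before continuing (a quick sanity check using standard kubectl):
Jump Node Console
$ kubectl exec testapp-performance-799bfd6767-4bp9h -- iperf3 --version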
Connect to one of the pods in the deployment:
Jump Node Console
$ kubectl exec -it testapp-performance-799bfd6767-4bp9h -- bash
Before starting the iperf3 server listeners, and in order to achieve good results, check in another tab which cores the pod is currently running on.
Note: To be able to bind to specific cores, make sure the pod is scheduled in the Guaranteed QoS class.
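Since the Deployment above sets equal CPU and memory requests and limits, its pods should be placed in the Guaranteed QoS class; you can verify this with standard kubectl (the command should print Guaranteed):
Jump Node Console
$ kubectl get pod testapp-performance-799bfd6767-4bp9h -o jsonpath='{.status.qosClass}'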
Check which worker node the pod is running on:
Jump Node Console
$ kubectl get pods -o wide | grep 4bp9h
testapp-performance-799bfd6767-4bp9h   1/1     Running   0          12m   10.233.68.3   worker2   <none>           <none>
SSH to the worker:
Jump Node Console
depuser@jump:~$ ssh worker2
depuser@worker2:~$ sudo -i
root@worker2:~#
Inspect the pod's current cores:
Worker2 Console
root@worker2:~# crictl ps | grep testapp
805685f357d27       138fed3c17cc0       12 minutes ago      Running             testapp-pod       0                   1c0cd6bc90cd5       testapp-performance-799bfd6767-4bp9h
root@worker2:~# crictl inspect 805685f357d27 | jq '.status.resources.linux.cpusetCpus'
Output example:
Worker2 Console
"28-51"
Back within the container of the pod, use the following script to start multiple iperf3 servers (one for each core) on different ports:
iperf_server.sh
#!/bin/bash
# Cores to bind the iperf3 server processes to
CORES=$1

# Calculate the first_core and last_core to provide the CPU range
first_core=$(echo $CORES | cut -d"-" -f1)
last_core=$(echo $CORES | cut -d"-" -f2)

# Loop over the ports (5201 + i*2) for i in the given CPU range and run iperf3 servers
for i in $(seq $first_core $last_core); do
    echo "Running iperf3 server on core $i"
    taskset -c $i iperf3 -s -p $((5201 + i * 2)) > /dev/null 2>&1 &
done
Start the script using the previous CPU range (leave 1 core as a buffer):
First Pod Console
root@testapp-performance-799bfd6767-4bp9h:/# chmod +x iperf_server.sh
root@testapp-performance-799bfd6767-4bp9h:/# ./iperf_server.sh 28-50
Running iperf3 server on core 28
Running iperf3 server on core 29
...
...
Running iperf3 server on core 49
Running iperf3 server on core 50
root@testapp-performance-799bfd6767-4bp9h:/# ps -ef | grep iperf3
root        39     1  0 13:02 pts/1    00:00:00 iperf3 -s -p 5257
root        40     1  0 13:02 pts/1    00:00:00 iperf3 -s -p 5259
...
...
root        60     1  0 13:02 pts/1    00:00:00 iperf3 -s -p 5299
root        61     1  0 13:02 pts/1    00:00:00 iperf3 -s -p 5301
Connect to the second pod:
Jump Node Console
$ kubectl exec -it testapp-performance-799bfd6767-gmz8f -- bash
Follow the previously described method to identify the CPU cores the second pod is running on, as sketched below.
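For example, since the second pod runs on worker1 (per the earlier kubectl get pods output), the same crictl/jq inspection can be repeated there; the container ID below is a placeholder:
Worker1 Console
root@worker1:~# crictl ps | grep testapp
root@worker1:~# crictl inspect <container_id> | jq '.status.resources.linux.cpusetCpus'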
Use the following script to start multiple iperf3 clients, each connecting to an iperf3 server in the first pod:
Note: The script receives 3 parameters: the server IP to connect to, the cores to spawn the iperf3 processes on, and the duration the iperf3 test will run. Make sure to pass all 3 when running the script, providing the CPU cores as a range (28-50 in this example). jq and bc must be installed in the pod for the script to run properly.
iperf_client.sh
#!/bin/bash

# IP address of the server where iperf3 servers are running
SERVER_IP=$1   # Change to your server's IP

# Cores to bind the iperf3 client processes to
CORES=$2

# Duration to run the iperf3 test
DUR=$3

# Variable to accumulate the total bandwidth in Gbit/sec
total_bandwidth_Gbit=0

# Calculate the first_core and last_core to provide the CPU range
first_core=$(echo $CORES | cut -d"-" -f1)
last_core=$(echo $CORES | cut -d"-" -f2)

# Array to store the PIDs of background tasks
pids=()

# Loop over the ports (5201 + i*2) for i in the given CPU range
for i in $(seq $first_core $last_core); do
    port=$((5201 + i * 2))
    cpu_core=$i   # Assign CPU core based on the value of i
    output_file="iperf3_client_results_$port.log"

    # Run the iperf3 client in the background with CPU core binding
    timeout $(( DUR + 5 )) taskset -c $cpu_core iperf3 -c $SERVER_IP -p $port -t $DUR -J > $output_file &
    pid=$!
    pids+=("$pid")
done

# Wait for all background tasks to complete and check their status
for pid in "${pids[@]}"; do
    wait $pid
    if [[ $? -ne 0 ]]; then
        echo "Process with PID $pid failed or timed out."
    fi
done

# Summarize the results from each log file
echo "Summary of iperf3 client results:"
for i in $(seq $first_core $last_core); do
    port=$((5201 + i * 2))
    output_file="iperf3_client_results_$port.log"
    if [[ -f $output_file ]]; then
        echo "Results for port $port:"
        # Parse the results and print a summary
        bandwidth_bps=$(jq '.end.sum_received.bits_per_second' $output_file)
        if [[ -n $bandwidth_bps ]]; then
            # Convert bandwidth from bps to Gbit/sec
            bandwidth_Gbit=$(echo "scale=3; $bandwidth_bps / 1000000000" | bc)
            echo "  Bandwidth: $bandwidth_Gbit Gbit/sec"
            # Accumulate the bandwidth for the total summary
            total_bandwidth_Gbit=$(echo "scale=3; $total_bandwidth_Gbit + $bandwidth_Gbit" | bc)
            # Delete current log file
            rm $output_file
        else
            echo "No bandwidth data found in $output_file"
        fi
    else
        echo "No results found for port $port"
    fi
done

# Print the total bandwidth summary
echo "Total Bandwidth across all streams: $total_bandwidth_Gbit Gbit/sec"
Run the script and check the performance results:
Second Pod Console
root@testapp-performance-799bfd6767-gmz8f:/# chmod +x iperf_client.sh
root@testapp-performance-799bfd6767-gmz8f:/# ./iperf_client.sh 10.233.68.3 28-50 30
Summary of iperf3 client results:
Results for port 5257:
  Bandwidth: 8.396 Gbit/sec
Results for port 5259:
  Bandwidth: 18.691 Gbit/sec
Results for port 5261:
  Bandwidth: 11.018 Gbit/sec
Results for port 5263:
  Bandwidth: 14.241 Gbit/sec
Results for port 5265:
  Bandwidth: 10.375 Gbit/sec
Results for port 5267:
  Bandwidth: 25.607 Gbit/sec
Results for port 5269:
  Bandwidth: 17.870 Gbit/sec
Results for port 5271:
  Bandwidth: 20.561 Gbit/sec
Results for port 5273:
  Bandwidth: 22.912 Gbit/sec
Results for port 5275:
  Bandwidth: 19.654 Gbit/sec
Results for port 5277:
  Bandwidth: 18.455 Gbit/sec
Results for port 5279:
  Bandwidth: 21.913 Gbit/sec
Results for port 5281:
  Bandwidth: 24.110 Gbit/sec
Results for port 5283:
  Bandwidth: 15.013 Gbit/sec
Results for port 5285:
  Bandwidth: 20.377 Gbit/sec
Results for port 5287:
  Bandwidth: 19.042 Gbit/sec
Results for port 5289:
  Bandwidth: 11.378 Gbit/sec
Results for port 5291:
  Bandwidth: 14.644 Gbit/sec
Results for port 5293:
  Bandwidth: 17.035 Gbit/sec
Results for port 5295:
  Bandwidth: 9.829 Gbit/sec
Results for port 5297:
  Bandwidth: 14.023 Gbit/sec
Results for port 5299:
  Bandwidth: 14.055 Gbit/sec
Results for port 5301:
  Bandwidth: 18.672 Gbit/sec
Total Bandwidth across all streams: 387.871 Gbit/sec