Enable 25 Gigabit Ethernet on QSFP Port#
In the Jetson AGX Thor Developer Kit, the QSFP port can support both 10 Gigabit Ethernet (10GbE) and 25 Gigabit Ethernet (25GbE). The default is 10GbE.
The QSFP port can support only one configuration at a time. To enable 25GbE, edit the flashing configuration file jetson-agx-thor-devkit.conf and set ODMDATA as follows:
ODMDATA="uphy1-config-8,mgbe0-speed-3,mgbe1-speed-3,mgbe2-speed-3,mgbe3-speed-3";
For the Jetson Thor T4000 module, which has three MGBE ports, edit jetson-agx-thor-t4000.conf and set ODMDATA as follows:
ODMDATA="uphy1-config-8,mgbe0-speed-3,mgbe1-speed-3,mgbe2-speed-3";
Then reflash the device.
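The two ODMDATA values above differ only in how many mgbeN-speed-3 entries follow uphy1-config-8. As a minimal sketch, the string can be generated for a given port count. The helper name build_odmdata is hypothetical, and the sketch only reproduces the two values shown above; it does not validate other UPHY or speed settings.

```shell
# Build the ODMDATA value for a given number of MGBE ports.
# The uphy1-config-8 and mgbeN-speed-3 tokens are copied from the two
# examples above; build_odmdata itself is a hypothetical helper.
build_odmdata() {
    ports=$1
    value="uphy1-config-8"
    i=0
    while [ "$i" -lt "$ports" ]; do
        value="$value,mgbe$i-speed-3"
        i=$((i + 1))
    done
    printf '%s\n' "$value"
}

# Print the line to place in the board .conf file, for example:
#   printf 'ODMDATA="%s";\n' "$(build_odmdata 4)"
```

Calling build_odmdata 4 yields the developer kit value and build_odmdata 3 yields the T4000 value.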
How to Optimize the Performance of 25GbE#
Run the following commands as root to enable threaded NAPI on each MGBE interface used for 25GbE.
echo 1 > /sys/devices/platform/bus@0/a808a10000.ethernet/net/mgbe0_0/threaded
echo 1 > /sys/devices/platform/bus@0/a808e10000.ethernet/net/mgbe3_0/threaded
echo 1 > /sys/devices/platform/bus@0/a808b10000.ethernet/net/mgbe1_0/threaded
echo 1 > /sys/devices/platform/bus@0/a808d10000.ethernet/net/mgbe2_0/threaded
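The four writes above can also be expressed as a loop. The sketch below assumes the interfaces are visible under /sys/class/net, whose entries are symlinks to the same net/<interface> directories used in the full device paths. The function name and the NET_SYSFS override are hypothetical and exist only so the loop can be tried against a scratch directory.

```shell
# Enable threaded NAPI on every listed interface.
# NET_SYSFS defaults to /sys/class/net; each entry there is a symlink to
# the same net/<iface> directory that the full device paths above reach.
# Run as root on the target, because the attribute is only root-writable.
enable_threaded_napi() {
    net_sysfs=${NET_SYSFS:-/sys/class/net}
    for iface in "$@"; do
        attr="$net_sysfs/$iface/threaded"
        if [ -w "$attr" ]; then
            echo 1 > "$attr"
            printf '%s threaded=%s\n' "$iface" "$(cat "$attr")"
        else
            printf '%s: %s is missing or not writable\n' "$iface" "$attr" >&2
        fi
    done
}

# Example (as root):
#   enable_threaded_napi mgbe0_0 mgbe1_0 mgbe2_0 mgbe3_0
```

Reading the attribute back after the write confirms that the kernel accepted the setting.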
Optimizing Performance of 25 Gigabit Ethernet on QSFP Port#
It is possible to achieve up to 100 Gbps network throughput using the 4 MGBE interfaces behind the QSFP port of the Jetson AGX Thor Developer Kit.
Connect two Jetson AGX Thor Developer Kits back-to-back through their QSFP ports. Make sure that the cable supports 4×25 Gbps.
For testing 4×10Gbps, the QSFP cable used was:
6COMGIGA 40GBASE-SR4 QSFP+ 850nm 150m DDM MMF Transceiver, 40 Gigabit QSFP+ MPO Multi-Mode Module Compatible with Cisco QSFP-40G-SR4
https://www.amazon.com/dp/B0CXXC9LYV?ref_=pe_153780060_1227138240&th=1
For testing 4×25Gbps, the QSFP cable used was:
MCP1600 QSFP28 Copper Cable
Perform these steps after the Linux kernel boots up.
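Set up the 4 MGBE network interfaces with IP addresses and 9K MTU. The commands below set only the MTU. Also assign an IP address to each interface, because the iperf3 commands in the later steps bind to 192.168.100.10, 192.168.101.10, 192.168.102.10, and 192.168.103.10 on the receiver.
sudo ifconfig mgbe0_0 down mtu 9000 up
sudo ifconfig mgbe1_0 down mtu 9000 up
sudo ifconfig mgbe2_0 down mtu 9000 up
sudo ifconfig mgbe3_0 down mtu 9000 up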
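As a sketch of the interface setup step, the function below prints iproute2 commands that set the MTU and assign one address per interface. Several details are assumptions, not part of the original procedure: the use of ip instead of ifconfig, the /24 prefix, the mapping of mgbeN_0 to subnet 192.168.10N.0 (inferred from the iperf3 commands later in this section), and the function name print_mgbe_setup.

```shell
# Print iproute2 commands that set a 9000-byte MTU and one address per
# MGBE interface. Assumptions: mgbeN_0 uses 192.168.10N.<host>, the /24
# prefix, and the host-octet argument are illustrative choices.
print_mgbe_setup() {
    host=${1:-10}   # last octet; 10 matches the receiver in this guide
    for n in 0 1 2 3; do
        printf 'ip link set dev mgbe%s_0 down\n' "$n"
        printf 'ip link set dev mgbe%s_0 mtu 9000\n' "$n"
        printf 'ip addr replace 192.168.10%s.%s/24 dev mgbe%s_0\n' "$n" "$host" "$n"
        printf 'ip link set dev mgbe%s_0 up\n' "$n"
    done
}

# Review the output first, then apply it as root, for example:
#   print_mgbe_setup 10 | sudo sh -x
```

On the transmitting board, pass a different last octet (for example, print_mgbe_setup 20) so that the two boards do not share addresses.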
Enable Linux kernel threaded NAPI mode.
sudo sh -c "echo 1 > /sys/devices/platform/bus@0/a808a10000.ethernet/net/mgbe0_0/threaded"
sudo sh -c "echo 1 > /sys/devices/platform/bus@0/a808b10000.ethernet/net/mgbe1_0/threaded"
sudo sh -c "echo 1 > /sys/devices/platform/bus@0/a808d10000.ethernet/net/mgbe2_0/threaded"
sudo sh -c "echo 1 > /sys/devices/platform/bus@0/a808e10000.ethernet/net/mgbe3_0/threaded"
Boost clocks on the Jetson AGX Thor CPU.
sudo jetson_clocks
Enable deferred DMA unmap for MGBE.
# MGBE0_0
GROUP=$(basename $(readlink /sys/class/net/mgbe0_0/device/iommu_group))
ls /sys/kernel/iommu_groups/$GROUP/devices
echo DMA-FQ | sudo tee /sys/kernel/iommu_groups/$GROUP/type
echo GROUP=$GROUP
cat /sys/kernel/iommu_groups/$GROUP/type
# MGBE1_0
GROUP=$(basename $(readlink /sys/class/net/mgbe1_0/device/iommu_group))
ls /sys/kernel/iommu_groups/$GROUP/devices
echo DMA-FQ | sudo tee /sys/kernel/iommu_groups/$GROUP/type
echo GROUP=$GROUP
cat /sys/kernel/iommu_groups/$GROUP/type
# MGBE2_0
GROUP=$(basename $(readlink /sys/class/net/mgbe2_0/device/iommu_group))
ls /sys/kernel/iommu_groups/$GROUP/devices
echo DMA-FQ | sudo tee /sys/kernel/iommu_groups/$GROUP/type
echo GROUP=$GROUP
cat /sys/kernel/iommu_groups/$GROUP/type
# MGBE3_0
GROUP=$(basename $(readlink /sys/class/net/mgbe3_0/device/iommu_group))
ls /sys/kernel/iommu_groups/$GROUP/devices
echo DMA-FQ | sudo tee /sys/kernel/iommu_groups/$GROUP/type
echo GROUP=$GROUP
cat /sys/kernel/iommu_groups/$GROUP/type
If dynamic domain update does not work, unbind the driver, set the IOMMU group type, and bind the driver again.
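The unbind, set, and bind fallback can be sketched as follows. The function only prints the three writes so that they can be reviewed before being run as root. The function name and the SYSFS_ROOT override are hypothetical, and the sketch assumes the standard sysfs layout in which the device directory exposes driver and iommu_group symlinks.

```shell
# Print the unbind / set-type / bind sequence for one interface.
# SYSFS_ROOT defaults to /sys. All paths are resolved up front because the
# net/<iface> directory disappears once the driver is unbound.
print_rebind_plan() {
    iface=$1
    root=${SYSFS_ROOT:-/sys}
    link="$root/class/net/$iface/device"
    if [ ! -e "$link/driver" ] || [ ! -e "$link/iommu_group" ]; then
        printf '%s: no bound driver or IOMMU group found\n' "$iface" >&2
        return 1
    fi
    dev=$(readlink -f "$link")
    drv=$(readlink -f "$dev/driver")
    grp=$(readlink -f "$dev/iommu_group")
    name=$(basename "$dev")
    printf 'echo %s > %s/unbind\n' "$name" "$drv"
    printf 'echo DMA-FQ > %s/type\n' "$grp"
    printf 'echo %s > %s/bind\n' "$name" "$drv"
}

# Example: print_rebind_plan mgbe0_0
```

After the driver is bound again, the interface is created anew, so the MTU, IP address, and threaded NAPI settings from the earlier steps likely need to be reapplied.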
Run the network receiver application on the 4 MGBE network interfaces. Replace 192.168.100.10 through 192.168.103.10 with the IP addresses assigned to the MGBE interfaces of the Jetson board that receives data.
# Use -S on the iperf3 client to send using different TX DMA channels.
sudo killall iperf3
iperf3 -s -B 192.168.100.10 -p 10010 &
iperf3 -s -B 192.168.100.10 -p 10011 &
iperf3 -s -B 192.168.100.10 -p 10012 &
iperf3 -s -B 192.168.100.10 -p 10013 &
iperf3 -s -B 192.168.100.10 -p 10014 &
iperf3 -s -B 192.168.100.10 -p 10015 &
iperf3 -s -B 192.168.100.10 -p 10016 &
iperf3 -s -B 192.168.100.10 -p 10017 &
iperf3 -s -B 192.168.101.10 -p 10110 &
iperf3 -s -B 192.168.101.10 -p 10111 &
iperf3 -s -B 192.168.101.10 -p 10112 &
iperf3 -s -B 192.168.101.10 -p 10113 &
iperf3 -s -B 192.168.101.10 -p 10114 &
iperf3 -s -B 192.168.101.10 -p 10115 &
iperf3 -s -B 192.168.101.10 -p 10116 &
iperf3 -s -B 192.168.101.10 -p 10117 &
iperf3 -s -B 192.168.102.10 -p 10210 &
iperf3 -s -B 192.168.102.10 -p 10211 &
iperf3 -s -B 192.168.102.10 -p 10212 &
iperf3 -s -B 192.168.102.10 -p 10213 &
iperf3 -s -B 192.168.102.10 -p 10214 &
iperf3 -s -B 192.168.102.10 -p 10215 &
iperf3 -s -B 192.168.102.10 -p 10216 &
iperf3 -s -B 192.168.102.10 -p 10217 &
iperf3 -s -B 192.168.103.10 -p 10310 &
iperf3 -s -B 192.168.103.10 -p 10311 &
iperf3 -s -B 192.168.103.10 -p 10312 &
iperf3 -s -B 192.168.103.10 -p 10313 &
iperf3 -s -B 192.168.103.10 -p 10314 &
iperf3 -s -B 192.168.103.10 -p 10315 &
iperf3 -s -B 192.168.103.10 -p 10316 &
iperf3 -s -B 192.168.103.10 -p 10317 &
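The receiver commands above follow a regular pattern: 8 listeners per interface, with port numbers of the form 10, interface index, 1, listener index. A minimal sketch that generates the same list with a nested loop is shown below. The function name and the RX_HOST variable (the last octet of the receiver addresses) are hypothetical.

```shell
# Print one iperf3 server command per (interface, listener) pair.
# Ports follow the pattern used above, e.g. 10215 is listener 5 on mgbe2_0.
# RX_HOST is a hypothetical variable holding the receiver's last octet.
print_iperf3_servers() {
    host=${RX_HOST:-10}
    for n in 0 1 2 3; do
        for k in 0 1 2 3 4 5 6 7; do
            printf 'iperf3 -s -B 192.168.10%s.%s -p 10%s1%s &\n' \
                "$n" "$host" "$n" "$k"
        done
    done
}

# Start all 32 listeners in the current shell, for example:
#   eval "$(print_iperf3_servers)"
```

Printing the commands first makes it easy to check the address and port pairs before any listener is started.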
Run the network transmitter application on the 4 MGBE network interfaces. Use multiple DMA channels while transmitting, and use multiple sockets for each MGBE network interface and TX DMA channel combination. Replace 192.168.100.10 through 192.168.103.10 with the IP addresses assigned to the MGBE interfaces of the Jetson board that receives data.
Create a shell script named iperf3_tx.sh with the following commands:
#!/bin/bash
# iperf3 client (TCP TX)
# Use -S to send over multiple TX DMA channels.
# MGBE0_0
iperf3 -c 192.168.100.10 -p 10010 -i 1 -t 10 -S 0 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.100.10 -p 10011 -i 1 -t 10 -S 32 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.100.10 -p 10012 -i 1 -t 10 -S 64 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.100.10 -p 10013 -i 1 -t 10 -S 96 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.100.10 -p 10014 -i 1 -t 10 -S 128 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.100.10 -p 10015 -i 1 -t 10 -S 160 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.100.10 -p 10016 -i 1 -t 10 -S 192 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.100.10 -p 10017 -i 1 -t 10 -S 224 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
# MGBE1_0
iperf3 -c 192.168.101.10 -p 10110 -i 1 -t 10 -S 0 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.101.10 -p 10111 -i 1 -t 10 -S 32 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.101.10 -p 10112 -i 1 -t 10 -S 64 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.101.10 -p 10113 -i 1 -t 10 -S 96 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.101.10 -p 10114 -i 1 -t 10 -S 128 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.101.10 -p 10115 -i 1 -t 10 -S 160 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.101.10 -p 10116 -i 1 -t 10 -S 192 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.101.10 -p 10117 -i 1 -t 10 -S 224 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
# MGBE2_0
iperf3 -c 192.168.102.10 -p 10210 -i 1 -t 10 -S 0 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.102.10 -p 10211 -i 1 -t 10 -S 32 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.102.10 -p 10212 -i 1 -t 10 -S 64 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.102.10 -p 10213 -i 1 -t 10 -S 96 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.102.10 -p 10214 -i 1 -t 10 -S 128 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.102.10 -p 10215 -i 1 -t 10 -S 160 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.102.10 -p 10216 -i 1 -t 10 -S 192 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.102.10 -p 10217 -i 1 -t 10 -S 224 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
# MGBE3_0
iperf3 -c 192.168.103.10 -p 10310 -i 1 -t 10 -S 0 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.103.10 -p 10311 -i 1 -t 10 -S 32 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.103.10 -p 10312 -i 1 -t 10 -S 64 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.103.10 -p 10313 -i 1 -t 10 -S 96 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.103.10 -p 10314 -i 1 -t 10 -S 128 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.103.10 -p 10315 -i 1 -t 10 -S 160 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.103.10 -p 10316 -i 1 -t 10 -S 192 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
iperf3 -c 192.168.103.10 -p 10317 -i 1 -t 10 -S 224 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
For 100 Gbps total TCP throughput, make the script executable (chmod +x iperf3_tx.sh) and run:
./iperf3_tx.sh -b 0 -P 24 | grep send | grep SUM
For 100 Gbps total UDP throughput, run:
./iperf3_tx.sh -u -b 0 -P 24 | grep send | grep SUM
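The structure of iperf3_tx.sh can be summarized as a nested loop: 4 interfaces, each with 8 clients whose -S (IP TOS) value steps by 32 from 0 to 224. The sketch below prints the equivalent commands without running them. The function name is hypothetical, and any extra arguments are appended to every command in the same way that the script appends $1 through $9.

```shell
# Print the iperf3 client commands from iperf3_tx.sh as a nested loop.
# TOS values step by 32 (0, 32, ..., 224), one per server port.
# Extra arguments (for example -u -b 0 -P 24) are appended to every command.
print_iperf3_clients() {
    for n in 0 1 2 3; do
        k=0
        while [ "$k" -lt 8 ]; do
            printf 'iperf3 -c 192.168.10%s.10 -p 10%s1%s -i 1 -t 10 -S %s %s &\n' \
                "$n" "$n" "$k" "$((k * 32))" "$*"
            k=$((k + 1))
        done
    done
}

# Example: print_iperf3_clients -b 0 -P 24
```

With -P 24, each of the 32 clients opens 24 parallel streams, which gives 4 × 8 × 24 = 768 sockets in total.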
Verifying 4×25 Gbps Aggregate Throughput on the QSFP Port#
Prerequisite: Follow the instructions in Optimizing Performance of 25 Gigabit Ethernet on QSFP Port.
UDP (4×25 Gbps, 9K MTU)#
For UDP, with the 4×25 Gbps network interfaces set to 9K MTU, run the following command:
./iperf3_tx.sh -u -b 0 -P 24 | grep send | grep SUM
The approximate expected throughput is 102 Gbps, using 4 MGBE interfaces, 8 IP TOS traffic classes, and 24 sockets per MGBE/IP TOS combination, for a total of 768 sockets.
Add the bitrate values from all 32 sender lines, converting any Mbits/sec values to Gbits/sec first. For example, 3.49 + 2.88 + 3.44 + … + 3.49 should total approximately 100 Gbps.
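Adding 32 values by hand is error-prone, so the addition can be sketched as a small awk filter. It reads the [SUM] sender lines from standard input, takes the number immediately before Gbits/sec or Mbits/sec, converts Mbits/sec to Gbits/sec, and prints the total. The function name sum_sender_gbps is hypothetical.

```shell
# Sum the bitrate column of iperf3 "[SUM] ... sender" lines read from stdin.
# The value is the field just before Gbits/sec or Mbits/sec; Mbits/sec
# values are divided by 1000 so the printed total is in Gbits/sec.
sum_sender_gbps() {
    awk '/SUM/ && /sender/ {
        for (i = 2; i <= NF; i++) {
            if ($i == "Gbits/sec") total += $(i - 1)
            else if ($i == "Mbits/sec") total += $(i - 1) / 1000
        }
    }
    END { printf "%.2f\n", total }'
}

# Example:
#   ./iperf3_tx.sh -u -b 0 -P 24 | grep send | grep SUM | sum_sender_gbps
```

The filter locates the bitrate by its unit, not by column position, so it works for both the UDP and the TCP sender lines, which have different trailing columns.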
Sample sender output:
[SUM] 0.00-10.00 sec 4.06 GBytes 3.49 Gbits/sec 0.000 ms 0/489270 (0%) sender
[SUM] 0.00-10.00 sec 3.35 GBytes 2.88 Gbits/sec 0.000 ms 0/403597 (0%) sender
[SUM] 0.00-10.00 sec 4.00 GBytes 3.44 Gbits/sec 0.000 ms 0/482017 (0%) sender
[SUM] 0.00-10.00 sec 3.11 GBytes 2.67 Gbits/sec 0.000 ms 0/374286 (0%) sender
[SUM] 0.00-10.00 sec 3.69 GBytes 3.17 Gbits/sec 0.000 ms 0/444346 (0%) sender
[SUM] 0.00-10.00 sec 3.31 GBytes 2.84 Gbits/sec 0.000 ms 0/398844 (0%) sender
[SUM] 0.00-10.00 sec 3.12 GBytes 2.68 Gbits/sec 0.000 ms 0/375272 (0%) sender
[SUM] 0.00-10.00 sec 3.87 GBytes 3.32 Gbits/sec 0.000 ms 0/465811 (0%) sender
[SUM] 0.00-10.00 sec 3.12 GBytes 2.68 Gbits/sec 0.000 ms 0/376248 (0%) sender
[SUM] 0.00-10.00 sec 2.95 GBytes 2.53 Gbits/sec 0.000 ms 0/354885 (0%) sender
[SUM] 0.00-10.00 sec 2.94 GBytes 2.53 Gbits/sec 0.000 ms 0/354731 (0%) sender
[SUM] 0.00-10.01 sec 3.67 GBytes 3.15 Gbits/sec 0.000 ms 0/441685 (0%) sender
[SUM] 0.00-10.01 sec 4.28 GBytes 3.67 Gbits/sec 0.000 ms 0/514980 (0%) sender
[SUM] 0.00-10.00 sec 3.44 GBytes 2.96 Gbits/sec 0.000 ms 0/414616 (0%) sender
[SUM] 0.00-10.01 sec 3.24 GBytes 2.78 Gbits/sec 0.000 ms 0/390280 (0%) sender
[SUM] 0.00-10.00 sec 3.38 GBytes 2.90 Gbits/sec 0.000 ms 0/407229 (0%) sender
[SUM] 0.00-10.01 sec 11.0 GBytes 9.42 Gbits/sec 0.000 ms 0/1321898 (0%) sender
[SUM] 0.00-10.01 sec 3.23 GBytes 2.77 Gbits/sec 0.000 ms 0/389316 (0%) sender
[SUM] 0.00-10.00 sec 3.60 GBytes 3.09 Gbits/sec 0.000 ms 0/433871 (0%) sender
[SUM] 0.00-10.00 sec 3.05 GBytes 2.62 Gbits/sec 0.000 ms 0/367969 (0%) sender
[SUM] 0.00-10.00 sec 3.24 GBytes 2.78 Gbits/sec 0.000 ms 0/390430 (0%) sender
[SUM] 0.00-10.00 sec 3.11 GBytes 2.67 Gbits/sec 0.000 ms 0/374839 (0%) sender
[SUM] 0.00-10.00 sec 3.31 GBytes 2.84 Gbits/sec 0.000 ms 0/398356 (0%) sender
[SUM] 0.00-10.01 sec 3.80 GBytes 3.26 Gbits/sec 0.000 ms 0/457882 (0%) sender
[SUM] 0.00-10.00 sec 3.24 GBytes 2.78 Gbits/sec 0.000 ms 0/389838 (0%) sender
[SUM] 0.00-10.00 sec 3.44 GBytes 2.95 Gbits/sec 0.000 ms 0/414366 (0%) sender
[SUM] 0.00-10.00 sec 3.14 GBytes 2.69 Gbits/sec 0.000 ms 0/378008 (0%) sender
[SUM] 0.00-10.01 sec 3.41 GBytes 2.93 Gbits/sec 0.000 ms 0/411320 (0%) sender
[SUM] 0.00-10.00 sec 6.14 GBytes 5.27 Gbits/sec 0.000 ms 0/739800 (0%) sender
[SUM] 0.00-10.01 sec 3.23 GBytes 2.77 Gbits/sec 0.000 ms 0/388815 (0%) sender
[SUM] 0.00-10.00 sec 3.68 GBytes 3.16 Gbits/sec 0.000 ms 0/443548 (0%) sender
[SUM] 0.00-10.00 sec 4.06 GBytes 3.49 Gbits/sec 0.000 ms 0/489385 (0%) sender
TCP (4×25 Gbps, 9K MTU)#
For TCP, with the 4×25 Gbps network interfaces set to 9K MTU, run the following command:
./iperf3_tx.sh -b 0 -P 24 | grep send | grep SUM
The approximate expected throughput is 109.36 Gbps, using 4 MGBE interfaces, 8 IP TOS traffic classes, and 24 sockets per MGBE/IP TOS combination, for a total of 768 sockets.
Add the bitrate values from all 32 sender lines, converting any Mbits/sec values to Gbits/sec first. For example, 3.35 + 2.21 + 2.69 + … + 2.26 should total approximately 100 Gbps.
Sample sender output:
[SUM] 0.00-10.00 sec 3.90 GBytes 3.35 Gbits/sec 18 sender
[SUM] 0.00-10.01 sec 2.57 GBytes 2.21 Gbits/sec 1 sender
[SUM] 0.00-10.00 sec 3.13 GBytes 2.69 Gbits/sec 7 sender
[SUM] 0.00-10.01 sec 4.47 GBytes 3.84 Gbits/sec 21 sender
[SUM] 0.00-10.00 sec 5.33 GBytes 4.58 Gbits/sec 17 sender
[SUM] 0.00-10.00 sec 7.02 GBytes 6.03 Gbits/sec 79 sender
[SUM] 0.00-10.01 sec 2.69 GBytes 2.30 Gbits/sec 2 sender
[SUM] 0.00-10.01 sec 2.78 GBytes 2.39 Gbits/sec 5 sender
[SUM] 0.00-10.00 sec 2.66 GBytes 2.28 Gbits/sec 2 sender
[SUM] 0.00-10.05 sec 2.88 GBytes 2.46 Gbits/sec 5 sender
[SUM] 0.00-10.02 sec 7.19 GBytes 6.17 Gbits/sec 93 sender
[SUM] 0.00-10.00 sec 5.02 GBytes 4.31 Gbits/sec 58 sender
[SUM] 0.00-10.00 sec 3.06 GBytes 2.63 Gbits/sec 0 sender
[SUM] 0.00-10.05 sec 2.88 GBytes 2.46 Gbits/sec 3 sender
[SUM] 0.00-10.01 sec 3.07 GBytes 2.63 Gbits/sec 6 sender
[SUM] 0.00-10.04 sec 2.60 GBytes 2.22 Gbits/sec 4 sender
[SUM] 0.00-10.01 sec 3.30 GBytes 2.84 Gbits/sec 6 sender
[SUM] 0.00-10.00 sec 3.27 GBytes 2.81 Gbits/sec 5 sender
[SUM] 0.00-10.00 sec 2.74 GBytes 2.35 Gbits/sec 4 sender
[SUM] 0.00-10.01 sec 4.17 GBytes 3.58 Gbits/sec 14 sender
[SUM] 0.00-10.01 sec 2.55 GBytes 2.19 Gbits/sec 3 sender
[SUM] 0.00-10.01 sec 3.02 GBytes 2.59 Gbits/sec 3 sender
[SUM] 0.00-10.00 sec 2.74 GBytes 2.35 Gbits/sec 8 sender
[SUM] 0.00-10.01 sec 2.64 GBytes 2.26 Gbits/sec 9 sender
[SUM] 0.00-10.04 sec 2.64 GBytes 2.26 Gbits/sec 0 sender
[SUM] 0.00-10.01 sec 3.15 GBytes 2.70 Gbits/sec 5 sender
[SUM] 0.00-10.04 sec 3.00 GBytes 2.56 Gbits/sec 2 sender
[SUM] 0.00-10.00 sec 2.63 GBytes 2.26 Gbits/sec 4 sender
[SUM] 0.00-10.01 sec 2.79 GBytes 2.39 Gbits/sec 8 sender
[SUM] 0.00-10.00 sec 2.67 GBytes 2.29 Gbits/sec 4 sender
[SUM] 0.00-10.01 sec 3.28 GBytes 2.82 Gbits/sec 2 sender
[SUM] 0.00-10.01 sec 2.64 GBytes 2.26 Gbits/sec 11 sender