Enable 25 Gigabit Ethernet on QSFP Port

In the Jetson AGX Thor Developer Kit, the QSFP port can support both 10 Gigabit Ethernet (10GbE) and 25 Gigabit Ethernet (25GbE). The default is 10GbE.

The QSFP port can support only one configuration at a time. To enable 25GbE, edit the flashing configuration file jetson-agx-thor-devkit.conf and set ODMDATA as follows:

ODMDATA="uphy1-config-8,mgbe0-speed-3,mgbe1-speed-3,mgbe2-speed-3,mgbe3-speed-3";

For the Jetson Thor T4000 module, which has three MGBE ports, edit jetson-agx-thor-t4000.conf and set ODMDATA as follows:

ODMDATA="uphy1-config-8,mgbe0-speed-3,mgbe1-speed-3,mgbe2-speed-3";

Then reflash the device.
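Because the two ODMDATA values above differ only in the number of mgbeN-speed-3 entries, the string can be generated rather than typed by hand. The following is a minimal sketch, not part of the flashing tools: the helper names odmdata_25gbe and show_odmdata are hypothetical, and the sketch assumes the per-port pattern holds exactly as shown in the two examples above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the flashing tools.

# Build an ODMDATA value that sets the first N MGBE ports to 25GbE.
# Assumption: every port follows the mgbe<N>-speed-3 pattern shown above.
odmdata_25gbe() (
    ports=$1
    value="uphy1-config-8"
    i=0
    while [ "$i" -lt "$ports" ]; do
        value="${value},mgbe${i}-speed-3"
        i=$((i + 1))
    done
    printf '%s\n' "$value"
)

# Print the active (uncommented) ODMDATA assignments in a configuration file,
# so the edit can be checked before reflashing.
show_odmdata() {
    grep -E '^[[:space:]]*ODMDATA=' "$1"
}
```

For example, odmdata_25gbe 4 prints the Developer Kit value and odmdata_25gbe 3 prints the T4000 value, while show_odmdata jetson-agx-thor-devkit.conf lists what the file currently sets.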

How to Optimize the Performance of 25GbE

Run the following commands as root to enable threaded NAPI for 25GbE on each MGBE port.

echo 1 > /sys/devices/platform/bus@0/a808a10000.ethernet/net/mgbe0_0/threaded
echo 1 > /sys/devices/platform/bus@0/a808b10000.ethernet/net/mgbe1_0/threaded
echo 1 > /sys/devices/platform/bus@0/a808d10000.ethernet/net/mgbe2_0/threaded
echo 1 > /sys/devices/platform/bus@0/a808e10000.ethernet/net/mgbe3_0/threaded
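To confirm that the setting took effect, the threaded attribute can be read back for each interface. This is a minimal sketch: the function name show_threaded_napi is hypothetical, and it assumes the interfaces are also reachable through the standard /sys/class/net/<interface> path. The optional argument overrides the sysfs root, which is only useful for trying the function without the hardware.

```shell
#!/bin/sh
# Hypothetical helper for illustration.
# Print the threaded-NAPI state of each MGBE interface (1 = enabled).
# Assumption: interfaces appear under <root>/class/net/<name>; the optional
# first argument overrides the sysfs root (default: /sys).
show_threaded_napi() (
    root=${1:-/sys}
    for iface in mgbe0_0 mgbe1_0 mgbe2_0 mgbe3_0; do
        file="$root/class/net/$iface/threaded"
        if [ -r "$file" ]; then
            printf '%s threaded=%s\n' "$iface" "$(cat "$file")"
        else
            # Interface missing or attribute not readable.
            printf '%s threaded=unavailable\n' "$iface"
        fi
    done
)
```

Calling show_threaded_napi with no arguments after the echo commands should report threaded=1 for every interface that is present.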

Optimizing Performance of 25 Gigabit Ethernet on QSFP Port

It is possible to achieve up to 100 Gbps of aggregate network throughput using the four 25GbE MGBE interfaces behind the QSFP port on the Jetson AGX Thor Developer Kit.

Connect two Jetson AGX Thor Developer Kits back-to-back through their QSFP ports, using a cable rated for 4×25 Gbps.

For testing 4×10Gbps, the QSFP cable used was:

For testing 4×25Gbps, the QSFP cable used was:

  • MCP1600 QSFP28 Copper Cable

Perform these steps after the Linux kernel boots up.

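  1. Set up the 4 MGBE network interfaces with 9K MTU, and assign each one an IP address (the later steps use 192.168.100.10 through 192.168.103.10 on the receiving board).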

    sudo ifconfig mgbe0_0 down mtu 9000 up
    sudo ifconfig mgbe1_0 down mtu 9000 up
    sudo ifconfig mgbe2_0 down mtu 9000 up
    sudo ifconfig mgbe3_0 down mtu 9000 up
    
  2. Enable Linux kernel threaded NAPI mode.

    sudo sh -c "echo 1 > /sys/devices/platform/bus@0/a808a10000.ethernet/net/mgbe0_0/threaded"
    sudo sh -c "echo 1 > /sys/devices/platform/bus@0/a808b10000.ethernet/net/mgbe1_0/threaded"
    sudo sh -c "echo 1 > /sys/devices/platform/bus@0/a808d10000.ethernet/net/mgbe2_0/threaded"
    sudo sh -c "echo 1 > /sys/devices/platform/bus@0/a808e10000.ethernet/net/mgbe3_0/threaded"
    
  3. Boost clocks on the Jetson AGX Thor CPU.

    sudo jetson_clocks
    
  4. Enable deferred DMA unmap for MGBE by switching the IOMMU group of each interface to the DMA-FQ domain type.

    # MGBE0_0
    GROUP=$(basename $(readlink /sys/class/net/mgbe0_0/device/iommu_group))
    ls /sys/kernel/iommu_groups/$GROUP/devices
    echo DMA-FQ | sudo tee /sys/kernel/iommu_groups/$GROUP/type
    echo GROUP=$GROUP
    cat /sys/kernel/iommu_groups/$GROUP/type
    
    # MGBE1_0
    GROUP=$(basename $(readlink /sys/class/net/mgbe1_0/device/iommu_group))
    ls /sys/kernel/iommu_groups/$GROUP/devices
    echo DMA-FQ | sudo tee /sys/kernel/iommu_groups/$GROUP/type
    echo GROUP=$GROUP
    cat /sys/kernel/iommu_groups/$GROUP/type
    
    # MGBE2_0
    GROUP=$(basename $(readlink /sys/class/net/mgbe2_0/device/iommu_group))
    ls /sys/kernel/iommu_groups/$GROUP/devices
    echo DMA-FQ | sudo tee /sys/kernel/iommu_groups/$GROUP/type
    echo GROUP=$GROUP
    cat /sys/kernel/iommu_groups/$GROUP/type
    
    # MGBE3_0
    GROUP=$(basename $(readlink /sys/class/net/mgbe3_0/device/iommu_group))
    ls /sys/kernel/iommu_groups/$GROUP/devices
    echo DMA-FQ | sudo tee /sys/kernel/iommu_groups/$GROUP/type
    echo GROUP=$GROUP
    cat /sys/kernel/iommu_groups/$GROUP/type
    

    If the IOMMU group type cannot be changed while the driver is bound (that is, the dynamic domain update does not work), unbind the driver, set the IOMMU group type, and bind the driver again.

  5. Run the network receiver application (iperf3 in server mode) on the 4 MGBE network interfaces. Replace 192.168.100.10 through 192.168.103.10 with the IP addresses of the MGBE interfaces on the Jetson board that receives data.

    # Use -S on the iperf3 client to send using different TX DMA channels.
    sudo killall iperf3
    
    iperf3 -s -B 192.168.100.10 -p 10010 &
    iperf3 -s -B 192.168.100.10 -p 10011 &
    iperf3 -s -B 192.168.100.10 -p 10012 &
    iperf3 -s -B 192.168.100.10 -p 10013 &
    iperf3 -s -B 192.168.100.10 -p 10014 &
    iperf3 -s -B 192.168.100.10 -p 10015 &
    iperf3 -s -B 192.168.100.10 -p 10016 &
    iperf3 -s -B 192.168.100.10 -p 10017 &
    
    iperf3 -s -B 192.168.101.10 -p 10110 &
    iperf3 -s -B 192.168.101.10 -p 10111 &
    iperf3 -s -B 192.168.101.10 -p 10112 &
    iperf3 -s -B 192.168.101.10 -p 10113 &
    iperf3 -s -B 192.168.101.10 -p 10114 &
    iperf3 -s -B 192.168.101.10 -p 10115 &
    iperf3 -s -B 192.168.101.10 -p 10116 &
    iperf3 -s -B 192.168.101.10 -p 10117 &
    
    iperf3 -s -B 192.168.102.10 -p 10210 &
    iperf3 -s -B 192.168.102.10 -p 10211 &
    iperf3 -s -B 192.168.102.10 -p 10212 &
    iperf3 -s -B 192.168.102.10 -p 10213 &
    iperf3 -s -B 192.168.102.10 -p 10214 &
    iperf3 -s -B 192.168.102.10 -p 10215 &
    iperf3 -s -B 192.168.102.10 -p 10216 &
    iperf3 -s -B 192.168.102.10 -p 10217 &
    
    iperf3 -s -B 192.168.103.10 -p 10310 &
    iperf3 -s -B 192.168.103.10 -p 10311 &
    iperf3 -s -B 192.168.103.10 -p 10312 &
    iperf3 -s -B 192.168.103.10 -p 10313 &
    iperf3 -s -B 192.168.103.10 -p 10314 &
    iperf3 -s -B 192.168.103.10 -p 10315 &
    iperf3 -s -B 192.168.103.10 -p 10316 &
    iperf3 -s -B 192.168.103.10 -p 10317 &
    
  6. Run the network transmitter application (iperf3 in client mode) on the 4 MGBE network interfaces. Use multiple DMA channels while transmitting, and use multiple sockets for each MGBE network interface and TX DMA channel combination. Replace 192.168.100.10 through 192.168.103.10 with the IP addresses of the MGBE interfaces on the Jetson board that receives data.

    Create a shell script named iperf3_tx.sh with the following commands:

    #!/bin/bash
    
    # iperf3 client (TCP TX)
    # Use -S to send over multiple TX DMA channels.
    
    # MGBE0_0
    iperf3 -c 192.168.100.10 -p 10010 -i 1 -t 10 -S 0 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.100.10 -p 10011 -i 1 -t 10 -S 32 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.100.10 -p 10012 -i 1 -t 10 -S 64 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.100.10 -p 10013 -i 1 -t 10 -S 96 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.100.10 -p 10014 -i 1 -t 10 -S 128 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.100.10 -p 10015 -i 1 -t 10 -S 160 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.100.10 -p 10016 -i 1 -t 10 -S 192 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.100.10 -p 10017 -i 1 -t 10 -S 224 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    
    # MGBE1_0
    iperf3 -c 192.168.101.10 -p 10110 -i 1 -t 10 -S 0 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.101.10 -p 10111 -i 1 -t 10 -S 32 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.101.10 -p 10112 -i 1 -t 10 -S 64 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.101.10 -p 10113 -i 1 -t 10 -S 96 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.101.10 -p 10114 -i 1 -t 10 -S 128 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.101.10 -p 10115 -i 1 -t 10 -S 160 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.101.10 -p 10116 -i 1 -t 10 -S 192 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.101.10 -p 10117 -i 1 -t 10 -S 224 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    
    # MGBE2_0
    iperf3 -c 192.168.102.10 -p 10210 -i 1 -t 10 -S 0 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.102.10 -p 10211 -i 1 -t 10 -S 32 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.102.10 -p 10212 -i 1 -t 10 -S 64 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.102.10 -p 10213 -i 1 -t 10 -S 96 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.102.10 -p 10214 -i 1 -t 10 -S 128 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.102.10 -p 10215 -i 1 -t 10 -S 160 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.102.10 -p 10216 -i 1 -t 10 -S 192 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.102.10 -p 10217 -i 1 -t 10 -S 224 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    
    # MGBE3_0
    iperf3 -c 192.168.103.10 -p 10310 -i 1 -t 10 -S 0 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.103.10 -p 10311 -i 1 -t 10 -S 32 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.103.10 -p 10312 -i 1 -t 10 -S 64 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.103.10 -p 10313 -i 1 -t 10 -S 96 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.103.10 -p 10314 -i 1 -t 10 -S 128 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.103.10 -p 10315 -i 1 -t 10 -S 160 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.103.10 -p 10316 -i 1 -t 10 -S 192 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    iperf3 -c 192.168.103.10 -p 10317 -i 1 -t 10 -S 224 $1 $2 $3 $4 $5 $6 $7 $8 $9 &
    

    For 100 Gbps total TCP throughput, run:

    ./iperf3_tx.sh -b 0 -P 24 | grep send | grep SUM
    

    For 100 Gbps total UDP throughput, run:

    ./iperf3_tx.sh -u -b 0 -P 24 | grep send | grep SUM
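The receiver and transmitter listings in steps 5 and 6 follow a regular pattern: for interface 0 to 3 and traffic class 0 to 7, the port is 10010 + 100 × interface + class and the -S value is 32 × class. The sketch below regenerates both command lists from that pattern, which can help when adapting the addresses or checking a hand-edited script. The function names are hypothetical, and the pattern is inferred from the listings above rather than stated by the driver documentation.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the port and TOS pattern is inferred
# from the listings in steps 5 and 6.

# Print the 32 receiver commands (4 interfaces x 8 traffic classes).
iperf3_server_cmds() (
    iface=0
    while [ "$iface" -lt 4 ]; do
        class=0
        while [ "$class" -lt 8 ]; do
            printf 'iperf3 -s -B 192.168.%d.10 -p %d &\n' \
                "$((100 + iface))" "$((10010 + 100 * iface + class))"
            class=$((class + 1))
        done
        iface=$((iface + 1))
    done
)

# Print the 32 transmitter commands. Any arguments (for example -b 0 -P 24)
# are appended to every command, like $1 ... $9 in iperf3_tx.sh.
iperf3_client_cmds() (
    extra=""
    if [ "$#" -gt 0 ]; then
        extra=" $*"
    fi
    iface=0
    while [ "$iface" -lt 4 ]; do
        class=0
        while [ "$class" -lt 8 ]; do
            printf 'iperf3 -c 192.168.%d.10 -p %d -i 1 -t 10 -S %d%s &\n' \
                "$((100 + iface))" "$((10010 + 100 * iface + class))" \
                "$((32 * class))" "$extra"
            class=$((class + 1))
        done
        iface=$((iface + 1))
    done
)
```

The functions only print the commands, so the output can be compared against the listings above or redirected into a script before anything is run.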
    

Verifying 4×25 Gbps Aggregate Throughput on the QSFP Port

Prerequisite: Follow the instructions in Optimizing Performance of 25 Gigabit Ethernet on QSFP Port.

UDP (4×25 Gbps, 9K MTU)

For UDP, with 4×25Gbps network interfaces set to 9K MTU, run the following command:

./iperf3_tx.sh -u -b 0 -P 24 | grep send | grep SUM

The approximate expected throughput is 102 Gbps, using 4 MGBE interfaces, 8 IP TOS traffic classes, and 24 sockets per MGBE/IP TOS combination, for a total of 768 sockets.

Add the Gbits/sec and Mbits/sec values for all 32 sender lines. For example, 3.49 + 2.88 + 3.44 + … + 3.49 should total approximately 100 Gbps.

Sample sender output:

[SUM]   0.00-10.00  sec  4.06 GBytes  3.49 Gbits/sec  0.000 ms  0/489270 (0%)  sender
[SUM]   0.00-10.00  sec  3.35 GBytes  2.88 Gbits/sec  0.000 ms  0/403597 (0%)  sender
[SUM]   0.00-10.00  sec  4.00 GBytes  3.44 Gbits/sec  0.000 ms  0/482017 (0%)  sender
[SUM]   0.00-10.00  sec  3.11 GBytes  2.67 Gbits/sec  0.000 ms  0/374286 (0%)  sender
[SUM]   0.00-10.00  sec  3.69 GBytes  3.17 Gbits/sec  0.000 ms  0/444346 (0%)  sender
[SUM]   0.00-10.00  sec  3.31 GBytes  2.84 Gbits/sec  0.000 ms  0/398844 (0%)  sender
[SUM]   0.00-10.00  sec  3.12 GBytes  2.68 Gbits/sec  0.000 ms  0/375272 (0%)  sender
[SUM]   0.00-10.00  sec  3.87 GBytes  3.32 Gbits/sec  0.000 ms  0/465811 (0%)  sender
[SUM]   0.00-10.00  sec  3.12 GBytes  2.68 Gbits/sec  0.000 ms  0/376248 (0%)  sender
[SUM]   0.00-10.00  sec  2.95 GBytes  2.53 Gbits/sec  0.000 ms  0/354885 (0%)  sender
[SUM]   0.00-10.00  sec  2.94 GBytes  2.53 Gbits/sec  0.000 ms  0/354731 (0%)  sender
[SUM]   0.00-10.01  sec  3.67 GBytes  3.15 Gbits/sec  0.000 ms  0/441685 (0%)  sender
[SUM]   0.00-10.01  sec  4.28 GBytes  3.67 Gbits/sec  0.000 ms  0/514980 (0%)  sender
[SUM]   0.00-10.00  sec  3.44 GBytes  2.96 Gbits/sec  0.000 ms  0/414616 (0%)  sender
[SUM]   0.00-10.01  sec  3.24 GBytes  2.78 Gbits/sec  0.000 ms  0/390280 (0%)  sender
[SUM]   0.00-10.00  sec  3.38 GBytes  2.90 Gbits/sec  0.000 ms  0/407229 (0%)  sender
[SUM]   0.00-10.01  sec  11.0 GBytes  9.42 Gbits/sec  0.000 ms  0/1321898 (0%)  sender
[SUM]   0.00-10.01  sec  3.23 GBytes  2.77 Gbits/sec  0.000 ms  0/389316 (0%)  sender
[SUM]   0.00-10.00  sec  3.60 GBytes  3.09 Gbits/sec  0.000 ms  0/433871 (0%)  sender
[SUM]   0.00-10.00  sec  3.05 GBytes  2.62 Gbits/sec  0.000 ms  0/367969 (0%)  sender
[SUM]   0.00-10.00  sec  3.24 GBytes  2.78 Gbits/sec  0.000 ms  0/390430 (0%)  sender
[SUM]   0.00-10.00  sec  3.11 GBytes  2.67 Gbits/sec  0.000 ms  0/374839 (0%)  sender
[SUM]   0.00-10.00  sec  3.31 GBytes  2.84 Gbits/sec  0.000 ms  0/398356 (0%)  sender
[SUM]   0.00-10.01  sec  3.80 GBytes  3.26 Gbits/sec  0.000 ms  0/457882 (0%)  sender
[SUM]   0.00-10.00  sec  3.24 GBytes  2.78 Gbits/sec  0.000 ms  0/389838 (0%)  sender
[SUM]   0.00-10.00  sec  3.44 GBytes  2.95 Gbits/sec  0.000 ms  0/414366 (0%)  sender
[SUM]   0.00-10.00  sec  3.14 GBytes  2.69 Gbits/sec  0.000 ms  0/378008 (0%)  sender
[SUM]   0.00-10.01  sec  3.41 GBytes  2.93 Gbits/sec  0.000 ms  0/411320 (0%)  sender
[SUM]   0.00-10.00  sec  6.14 GBytes  5.27 Gbits/sec  0.000 ms  0/739800 (0%)  sender
[SUM]   0.00-10.01  sec  3.23 GBytes  2.77 Gbits/sec  0.000 ms  0/388815 (0%)  sender
[SUM]   0.00-10.00  sec  3.68 GBytes  3.16 Gbits/sec  0.000 ms  0/443548 (0%)  sender
[SUM]   0.00-10.00  sec  4.06 GBytes  3.49 Gbits/sec  0.000 ms  0/489385 (0%)  sender

TCP (4×25 Gbps, 9K MTU)

For TCP, with 4×25Gbps network interfaces set to 9K MTU, run the following command:

./iperf3_tx.sh -b 0 -P 24 | grep send | grep SUM

The approximate expected throughput is 109.36 Gbps, using 4 MGBE interfaces, 8 IP TOS traffic classes, and 24 sockets per MGBE/IP TOS combination, for a total of 768 sockets.

Add the Gbits/sec and Mbits/sec values for all 32 sender lines. For example, in the sample output below, 3.35 + 2.21 + 2.69 + … + 2.26 totals approximately 93 Gbps; individual runs vary.

Sample sender output:

[SUM]   0.00-10.00  sec  3.90 GBytes  3.35 Gbits/sec   18             sender
[SUM]   0.00-10.01  sec  2.57 GBytes  2.21 Gbits/sec    1             sender
[SUM]   0.00-10.00  sec  3.13 GBytes  2.69 Gbits/sec    7             sender
[SUM]   0.00-10.01  sec  4.47 GBytes  3.84 Gbits/sec   21             sender
[SUM]   0.00-10.00  sec  5.33 GBytes  4.58 Gbits/sec   17             sender
[SUM]   0.00-10.00  sec  7.02 GBytes  6.03 Gbits/sec   79             sender
[SUM]   0.00-10.01  sec  2.69 GBytes  2.30 Gbits/sec    2             sender
[SUM]   0.00-10.01  sec  2.78 GBytes  2.39 Gbits/sec    5             sender
[SUM]   0.00-10.00  sec  2.66 GBytes  2.28 Gbits/sec    2             sender
[SUM]   0.00-10.05  sec  2.88 GBytes  2.46 Gbits/sec    5             sender
[SUM]   0.00-10.02  sec  7.19 GBytes  6.17 Gbits/sec   93             sender
[SUM]   0.00-10.00  sec  5.02 GBytes  4.31 Gbits/sec   58             sender
[SUM]   0.00-10.00  sec  3.06 GBytes  2.63 Gbits/sec    0             sender
[SUM]   0.00-10.05  sec  2.88 GBytes  2.46 Gbits/sec    3             sender
[SUM]   0.00-10.01  sec  3.07 GBytes  2.63 Gbits/sec    6             sender
[SUM]   0.00-10.04  sec  2.60 GBytes  2.22 Gbits/sec    4             sender
[SUM]   0.00-10.01  sec  3.30 GBytes  2.84 Gbits/sec    6             sender
[SUM]   0.00-10.00  sec  3.27 GBytes  2.81 Gbits/sec    5             sender
[SUM]   0.00-10.00  sec  2.74 GBytes  2.35 Gbits/sec    4             sender
[SUM]   0.00-10.01  sec  4.17 GBytes  3.58 Gbits/sec   14             sender
[SUM]   0.00-10.01  sec  2.55 GBytes  2.19 Gbits/sec    3             sender
[SUM]   0.00-10.01  sec  3.02 GBytes  2.59 Gbits/sec    3             sender
[SUM]   0.00-10.00  sec  2.74 GBytes  2.35 Gbits/sec    8             sender
[SUM]   0.00-10.01  sec  2.64 GBytes  2.26 Gbits/sec    9             sender
[SUM]   0.00-10.04  sec  2.64 GBytes  2.26 Gbits/sec    0             sender
[SUM]   0.00-10.01  sec  3.15 GBytes  2.70 Gbits/sec    5             sender
[SUM]   0.00-10.04  sec  3.00 GBytes  2.56 Gbits/sec    2             sender
[SUM]   0.00-10.00  sec  2.63 GBytes  2.26 Gbits/sec    4             sender
[SUM]   0.00-10.01  sec  2.79 GBytes  2.39 Gbits/sec    8             sender
[SUM]   0.00-10.00  sec  2.67 GBytes  2.29 Gbits/sec    4             sender
[SUM]   0.00-10.01  sec  3.28 GBytes  2.82 Gbits/sec    2             sender
[SUM]   0.00-10.01  sec  2.64 GBytes  2.26 Gbits/sec   11             sender