Installing Tools on Aerial Devkit
This chapter describes how to install the required kernel, driver, and tools on the host. This is a one-time installation and can be skipped if the system has been configured already.
In the following sequence of steps, we assume the target host is the Aerial Devkit.
Depending on the release, tools that are installed in this section may need to be upgraded in the Installing and Upgrading cuBB SDK section.
Once everything is installed and updated, refer to the cuBB Quick Start Guide on how to use the cuBB SDK.
Install the GPU card and CX6-DX NIC.
Connect the CX6-DX port 0 on both servers using a 100GbE cable.
Connect the Internet port to the local network.
Change the following system BIOS settings to improve network performance:
Set the Power Policy to Best Performance.
Set the Power Policy SpeedStep (Pstates) option to “Disabled”.
Disable HyperThreading. Ensure you have performed Step 1 above before doing this step.
Save the BIOS settings, and then reboot the system.
After installing Ubuntu 22.04 Server, please check the following:
Check if the system time is correct to avoid apt update error. If not, see How to fix system time.
Check if the LVM volume uses the whole disk space. If not, see How to resize LVM volume.
Check if the GPU and NIC are detected by the OS, using the following commands:
$ lspci |grep -i nvidia
# If the system has A100 40G GPU installed
b6:00.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
# If the system has A100 80G GPU installed
b6:00.0 3D controller: NVIDIA Corporation Device 20b5 (rev a1)
# If the system has A100X GPU installed
bb:00.0 3D controller: NVIDIA Corporation Device 20b8 (rev a1)
$ lspci |grep -i mellanox
b5:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
b5:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
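The device IDs in the sample output above can be mapped to GPU models with a small helper. This is a minimal sketch, not part of the official procedure; gpu_models is a hypothetical function name, and the mapping covers only the three IDs listed above.

```shell
# Map the PCI device IDs listed above to the GPU models they indicate.
# Reads `lspci` (or `lspci -nn`) output on stdin and prints one model
# name per matching line. Only the three IDs from this guide are known.
gpu_models() {
    awk '
        /NVIDIA/ && /20f1/ { print "A100 40G" }
        /NVIDIA/ && /20b5/ { print "A100 80G" }
        /NVIDIA/ && /20b8/ { print "A100X" }
    '
}

# Usage on the devkit:
#   lspci | gpu_models
```

If the helper prints nothing, either the GPU was not detected or it is a model this sketch does not know about.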
Run the following command to verify that the BIOS setting update took effect:
$ lscpu
Verify that hyperthreading is disabled:
# Thread(s) per core: 1
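The hyperthreading check can be scripted instead of read by eye. The following is a sketch under the assumption that lscpu prints a "Thread(s) per core" line as shown above; the function names are hypothetical.

```shell
# Reads `lscpu` output on stdin and prints the "Thread(s) per core" value.
threads_per_core() {
    awk -F: '/^Thread\(s\) per core/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}

# Succeeds (exit status 0) only when exactly one thread per core is reported.
hyperthreading_disabled() {
    [ "$(threads_per_core)" = "1" ]
}

# Usage on the devkit:
#   lscpu | hyperthreading_disabled && echo "hyperthreading is disabled"
```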
The following installation steps need an Internet connection. Ensure that you have the proper netplan config for your local network.
Note that the network interface names could change after reboot. To ensure persistent network interface names after reboot, create persistent net link files under /etc/systemd/network, one for each interface.
To find the MAC addresses of the CX6-DX NIC, run the ip a command, then look for the MAC addresses that start with “b8:ce:f6”:
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 18:c0:4d:79:49:b6 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.100/24 brd 192.168.1.1 scope global eno1
valid_lft forever preferred_lft forever
inet6 fe80::1ac0:4dff:fe79:49b6/64 scope link
valid_lft forever preferred_lft forever
3: enp6s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 18:c0:4d:79:49:b7 brd ff:ff:ff:ff:ff:ff
13: ens6f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether b8:ce:f6:33:fd:ee brd ff:ff:ff:ff:ff:ff
inet6 fe80::bace:f6ff:fe33:fdee/64 scope link
valid_lft forever preferred_lft forever
14: ens6f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether b8:ce:f6:33:fd:ef brd ff:ff:ff:ff:ff:ff
inet6 fe80::bace:f6ff:fe33:fdef/64 scope link
valid_lft forever preferred_lft forever
Then create the files /etc/systemd/network/11-persistent-net.link and /etc/systemd/network/12-persistent-net.link, each with the desired interface name and the MAC address found in the previous step.
$ sudo nano /etc/systemd/network/11-persistent-net.link
# Update the MAC address to match the converged accelerator port 0 MAC address
[Match]
MACAddress=b8:ce:f6:xx:xx:xx
[Link]
Name=ens6f0
$ sudo nano /etc/systemd/network/12-persistent-net.link
# Update the MAC address to match the converged accelerator port 1 MAC address
[Match]
MACAddress=b8:ce:f6:yy:yy:yy
[Link]
Name=ens6f1
To apply the change:
$ sudo netplan apply
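The contents of the .link files can also be generated from the ip a output rather than typed by hand. This is a minimal sketch; nic_macs and link_file are hypothetical helper names, and the sketch only prints text, so writing the files under /etc/systemd/network is still a separate step that requires root.

```shell
# Print "<ifname> <mac>" for every interface whose MAC address starts
# with the given prefix. Reads `ip a` output on stdin.
nic_macs() {
    awk -v prefix="$1" '
        /^[0-9]+: /  { name = $2; sub(/:$/, "", name) }
        $1 == "link/ether" && index($2, prefix) == 1 { print name, $2 }
    '
}

# Print the contents of a .link file that pins an interface name ($2)
# to a MAC address ($1), in the same layout as the files shown above.
link_file() {
    printf '[Match]\nMACAddress=%s\n[Link]\nName=%s\n' "$1" "$2"
}

# Usage on the devkit (prints the file contents for review):
#   ip a | nic_macs b8:ce:f6 | while read -r name mac; do
#       link_file "$mac" "$name"
#   done
```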
Edit the /etc/apt/apt.conf.d/20auto-upgrades system file, and change the “1” to “0” for both lines. This prevents the installed version of the low latency kernel from being accidentally changed with a subsequent software upgrade.
$ sudo nano /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Unattended-Upgrade "0";
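The same edit can be made non-interactively. The following is a sketch that assumes GNU sed (the default on Ubuntu); disable_auto_upgrades is a hypothetical helper name that takes the file path as its argument.

```shell
# Change the value "1" to "0" on the two APT::Periodic lines of the
# given file, leaving every other line untouched.
disable_auto_upgrades() {
    sed -i -E 's/^(APT::Periodic::(Update-Package-Lists|Unattended-Upgrade)) "1";/\1 "0";/' "$1"
}

# On the devkit the target file is owned by root, so the sed command
# above has to run under sudo against
# /etc/apt/apt.conf.d/20auto-upgrades.
```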
If the low latency kernel is not installed, you should remove the old kernels and keep only the latest generic kernel. Enter the following command to list the installed kernels:
$ dpkg --list | grep -i 'linux-image' | awk '/ii/{ print $2}'
# To remove old kernel
$ sudo apt-get purge linux-image-<old kernel version>
$ sudo apt-get autoremove
Next, install the low-latency kernel with the specific version listed in the release manifest.
$ sudo apt-get install -y linux-image-5.15.0-72-lowlatency
Then, update the grub to change the default boot kernel:
# Update grub to change the default boot kernel
$ sudo sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.15.0-72-lowlatency"/' /etc/default/grub
To set kernel command-line parameters, edit the GRUB_CMDLINE_LINUX_DEFAULT parameter in the grub file /etc/default/grub and append or update the parameters described below. The following kernel parameters are optimized for an Aerial DevKit with a 24-core Xeon Gold 6240R and 96 GB of memory.
To append these parameters to the grub file automatically, enter this command:
$ sudo sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="[^"]*/& default_hugepagesz=1G hugepagesz=1G hugepages=16 tsc=reliable clocksource=tsc intel_idle.max_cstate=0 mce=ignore_ce processor.max_cstate=0 intel_pstate=disable audit=0 idle=poll isolcpus=2-21 nohz_full=2-21 rcu_nocbs=2-21 rcu_nocb_poll nosoftlockup iommu=off intel_iommu=off irqaffinity=0-1,22-23/' /etc/default/grub
Note that the CPU-core-related parameters need to be adjusted depending on the number of CPU cores on the system. In the example above, the “2-21” value represents CPU core numbers 2 to 21; you may need to adjust this parameter depending on the HW configuration. By default, only one DPDK thread is used. The isolated CPUs are used by the entire cuBB software stack. Use the nproc --all command to see how many cores are available. Do not use core numbers that are beyond the number of available cores.
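As an illustration of how the core lists scale, the sketch below derives them from a core count. It assumes the layout of the 24-core example generalizes (the first two and last two cores reserved for housekeeping and IRQs, everything in between isolated); this pattern is extrapolated from that single example and is not a documented rule. isolation_params is a hypothetical helper name, and the minimum of 6 cores is an arbitrary guard.

```shell
# Print the core-list kernel parameters for a machine with N cores.
# ASSUMPTION: cores 0-1 and the last two cores stay non-isolated, as in
# the 24-core example; verify this against your own HW configuration.
isolation_params() {
    n=$1
    if [ "$n" -lt 6 ]; then
        echo "need at least 6 cores" >&2
        return 1
    fi
    first=2
    last=$((n - 3))
    printf 'isolcpus=%s-%s nohz_full=%s-%s rcu_nocbs=%s-%s irqaffinity=0-1,%s-%s\n' \
        "$first" "$last" "$first" "$last" "$first" "$last" \
        "$((n - 2))" "$((n - 1))"
}

# Usage on the devkit:
#   isolation_params "$(nproc --all)"
```

For 24 cores this reproduces the values used in the command above.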
These instructions are specific to Ubuntu 22.04 with a 5.15.0-72-lowlatency kernel provided by Canonical. Please make sure the kernel commands provided here are suitable for your OS and kernel versions and revise these settings to match your system if necessary.
$ sudo update-grub
$ sudo reboot
After rebooting, enter the following command to check whether the system has booted into the low-latency kernel:
$ uname -r
5.15.0-72-lowlatency
Enter this command to check that the kernel command-line parameters are configured properly:
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.15.0-72-lowlatency root=/dev/mapper/ubuntu--vg-ubuntu--lv ro default_hugepagesz=1G hugepagesz=1G hugepages=16 tsc=reliable clocksource=tsc intel_idle.max_cstate=0 mce=ignore_ce processor.max_cstate=0 intel_pstate=disable audit=0 idle=poll isolcpus=2-21 nohz_full=2-21 rcu_nocbs=2-21 rcu_nocb_poll nosoftlockup iommu=off intel_iommu=off irqaffinity=0-1,22-23
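Because the command line is long, a scripted comparison is less error-prone than reading it by eye. This is a minimal sketch; missing_params is a hypothetical helper that takes the command line as its first argument and the expected parameters as the remaining arguments.

```shell
# Print every expected parameter that does not appear as a whole word
# in the given kernel command line. Prints nothing when all are present.
missing_params() {
    cmdline=$1
    shift
    for p in "$@"; do
        case " $cmdline " in
            *" $p "*) ;;          # parameter found
            *) echo "$p" ;;       # parameter missing
        esac
    done
}

# Usage on the devkit:
#   missing_params "$(cat /proc/cmdline)" hugepages=16 isolcpus=2-21 idle=poll
```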
Enter this command to check if hugepages are enabled:
$ grep -i huge /proc/meminfo
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 16
HugePages_Free: 16
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 16777216 kB
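The expected relationship in this output is HugePages_Total multiplied by Hugepagesize, which should equal the Hugetlb value (16 x 1048576 kB = 16777216 kB). A sketch of that arithmetic, with hugepage_total_kb as a hypothetical helper name:

```shell
# Reads /proc/meminfo-style text on stdin and prints the total memory
# reserved for hugepages, in kB (page count times page size).
hugepage_total_kb() {
    awk '
        $1 == "HugePages_Total:" { total = $2 }
        $1 == "Hugepagesize:"    { size = $2 }
        END { printf "%d\n", total * size }
    '
}

# Usage on the devkit:
#   hugepage_total_kb < /proc/meminfo
```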
Enter this command to disable nouveau:
$ cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF
Regenerate the kernel initramfs and reboot the system:
$ sudo update-initramfs -u
$ sudo reboot
Enter these commands to install prerequisite packages:
$ sudo apt-get update
$ sudo apt-get install -y build-essential linux-headers-$(uname -r) dkms unzip linuxptp pv
This section describes how to install Mellanox OFED and the Mellanox Firmware Tools (MFT), a set of firmware management tools consisting of mst, mlxburn, flint, and debug utilities. MFT is used in a later step to update the firmware image and configure NIC settings.
# Install MOFED
$ export OFED_VERSION=23.07-0.5.0.0
$ export UBUNTU_VERSION=22.04
$ wget http://www.mellanox.com/downloads/ofed/MLNX_OFED-$OFED_VERSION/MLNX_OFED_LINUX-$OFED_VERSION-ubuntu$UBUNTU_VERSION-x86_64.tgz
$ tar xvf MLNX_OFED_LINUX-$OFED_VERSION-ubuntu$UBUNTU_VERSION-x86_64.tgz
$ cd MLNX_OFED_LINUX-$OFED_VERSION-ubuntu$UBUNTU_VERSION-x86_64
$ sudo ./mlnxofedinstall --dpdk --without-mft --with-rshim --add-kernel-support --force --without-ucx-cuda --without-fw-update
$ sudo rmmod nv_peer_mem nvidia_peermem
$ sudo /etc/init.d/openibd restart
# Verify the installed MOFED version
$ ofed_info -s
MLNX_OFED_LINUX-23.07-0.5.0.0:
# Install Mellanox Firmware Tools
$ export MFT_VERSION=4.25.0-62
$ wget https://www.mellanox.com/downloads/MFT/mft-$MFT_VERSION-x86_64-deb.tgz
$ tar xvf mft-$MFT_VERSION-x86_64-deb.tgz
$ cd mft-$MFT_VERSION-x86_64-deb
$ sudo ./install.sh
# Verify the installed Mellanox firmware tool version
$ sudo mst version
mst, mft 4.25.0-62, built on Aug 03 2023, 12:15:13. Git SHA Hash: c14a8d9
$ sudo mst start
# check NIC PCIe bus addresses and network interface names
$ sudo mst status -v
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE MST PCI RDMA NET NUMA
ConnectX6DX(rev:0) /dev/mst/mt4125_pciconf0.1 b5:00.1 mlx5_1 net-ens6f1 0
ConnectX6DX(rev:0) /dev/mst/mt4125_pciconf0 b5:00.0 mlx5_0 net-ens6f0 0
Set the above PCIe bus addresses and network interface names in the bashrc:
$ echo "export MLX0PCIEADDR=b5:00.0" | tee -a ~/.bashrc
$ echo "export MLX0IFNAME=ens6f0" | tee -a ~/.bashrc
$ echo "export MLX1PCIEADDR=b5:00.1" | tee -a ~/.bashrc
$ echo "export MLX1IFNAME=ens6f1" | tee -a ~/.bashrc
$ source ~/.bashrc
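The export lines can also be derived from the mst status -v output, which avoids copying addresses by hand. The sketch below assumes the column order shown above (DEVICE_TYPE, MST, PCI, RDMA, NET, NUMA) and takes the port index from the PCI function number; mlx_exports is a hypothetical helper name.

```shell
# Reads `mst status -v` output on stdin and prints two export lines per
# ConnectX port: the PCIe address and the network interface name.
mlx_exports() {
    awk '
        $1 ~ /^ConnectX/ {
            pci = $3
            ifname = $5
            sub(/^net-/, "", ifname)   # "net-ens6f1" -> "ens6f1"
            port = pci
            sub(/^.*\./, "", port)     # "b5:00.1"    -> "1"
            printf "export MLX%sPCIEADDR=%s\n", port, pci
            printf "export MLX%sIFNAME=%s\n", port, ifname
        }
    '
}

# Usage on the devkit (review the output before appending to ~/.bashrc):
#   sudo mst status -v | mlx_exports
```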
Enter these commands to check the link status of port 0:
# Here is an example if port 0 is connected to another server via a
# 100GbE cable.
$ sudo mlxlink -d $MLX0PCIEADDR
Operational Info
----------------
State : Active
Physical state : LinkUp
Speed : 100G
Width : 4x
FEC : Standard RS-FEC - RS(528,514)
Loopback Mode : No Loopback
Auto Negotiation : ON
Supported Info
--------------
Enabled Link Speed (Ext.) : 0x000007f2 (100G_2X,100G_4X,50G_1X,50G_2X,40G,25G,10G,1G)
Supported Cable Speed (Ext.) : 0x000002f2 (100G_4X,50G_2X,40G,25G,10G,1G)
Troubleshooting Info
--------------------
Status Opcode : 0
Group Opcode : N/A
Recommendation : No issue was observed
Tool Information
----------------
Firmware Version : 22.37.1014
amBER Version : 2.17
MFT Version : mft 4.25.0-62
Run the following commands to install the CUDA driver. If the system has an older version installed, see Removing Old CUDA Driver to remove the old driver first.
The CUDA driver should be installed after MOFED.
# Check the installed CUDA driver version
$ apt list --installed | grep cuda-drivers
# Remove the driver if you have the older version installed.
# For example, cuda-drivers-520 was installed on the system.
$ sudo apt purge cuda-drivers-520
$ sudo apt autoremove
# Remove the driver if it was installed by runfile installer before.
$ sudo /usr/bin/nvidia-uninstall
# Install CUDA driver
$ wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
$ sudo sh cuda_12.2.0_535.54.03_linux.run --driver --silent
The full official instructions for installing Docker CE can be found here: https://docs.docker.com/engine/install/ubuntu/#install-docker-engine. The following instructions are one supported way of installing Docker CE:
The CUDA driver must be installed before the Docker CE and nvidia-container-toolkit installation will work correctly. It is recommended to install the CUDA driver before both Docker CE and the nvidia-container-toolkit.
$ sudo apt-get update
$ sudo apt-get install -y ca-certificates curl gnupg
$ sudo install -m 0755 -d /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
$ sudo chmod a+r /etc/apt/keyrings/docker.gpg
$ echo \
"deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update
$ sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
$ sudo docker run hello-world
Locate and follow the nvidia-container-toolkit install instructions.
Or use the following instructions as an alternate way to install the nvidia-container-toolkit. Version 1.14.1-1 is supported.
The CUDA driver must be installed before the Docker CE and nvidia-container-toolkit installation will work correctly. It is recommended to install the CUDA driver before both Docker CE and the nvidia-container-toolkit.
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
&& \
sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker
$ sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Enter these commands to configure PTP4L, assuming the ens6f0 NIC interface and CPU core 21 are used for PTP:
cat <<EOF | sudo tee /etc/ptp.conf
[global]
priority1 128
priority2 128
domainNumber 24
tx_timestamp_timeout 30
dscp_event 46
dscp_general 46
logging_level 6
verbose 1
use_syslog 0
logMinDelayReqInterval 1
[ens6f0]
logAnnounceInterval -3
announceReceiptTimeout 3
logSyncInterval -4
logMinDelayReqInterval -4
delay_mechanism E2E
network_transport L2
EOF
cat <<EOF | sudo tee /lib/systemd/system/ptp4l.service
[Unit]
Description=Precision Time Protocol (PTP) service
Documentation=man:ptp4l
[Service]
Restart=always
RestartSec=5s
Type=simple
ExecStart=taskset -c 21 /usr/sbin/ptp4l -f /etc/ptp.conf
[Install]
WantedBy=multi-user.target
EOF
$ sudo systemctl daemon-reload
$ sudo systemctl restart ptp4l.service
$ sudo systemctl enable ptp4l.service
Include ‘slaveOnly 1’ at the beginning of the [global] section of the ptp.conf file on the server that runs cuphycontroller, as shown below:
[global]
slaveOnly 1
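This edit can be scripted so that it is safe to run more than once. The following is a sketch that assumes GNU sed; add_slave_only is a hypothetical helper that takes the configuration file path as its argument.

```shell
# Insert "slaveOnly 1" on the line after the [global] header, unless a
# slaveOnly setting is already present in the file.
add_slave_only() {
    if ! grep -q '^slaveOnly' "$1"; then
        sed -i '/^\[global\]/a slaveOnly 1' "$1"
    fi
}

# On the devkit /etc/ptp.conf is owned by root, so the commands above
# have to run under sudo, followed by a restart of ptp4l.service.
```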
The server without the ‘slaveOnly 1’ configuration will become the master clock, as shown below:
$ sudo systemctl status ptp4l.service
• ptp4l.service - Precision Time Protocol (PTP) service
Loaded: loaded (/lib/systemd/system/ptp4l.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2022-02-03 22:41:35 UTC; 4min 47s ago
Docs: man:ptp4l
Main PID: 1112 (ptp4l)
Tasks: 1 (limit: 94582)
Memory: 904.0K
CGroup: /system.slice/ptp4l.service
└─1112 /usr/sbin/ptp4l -f /etc/ptp.conf
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[135.371]: selected local clock b8cef6.fffe.33fdee as best master
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[135.371]: assuming the grand master role
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[135.745]: selected local clock b8cef6.fffe.33fdee as best master
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[135.745]: assuming the grand master role
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[135.780]: port 1: link up
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[135.804]: port 1: FAULTY to LISTENING on INIT_COMPLETE
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[135.855]: port 1: new foreign master b8cef6.fffe.33fe16-1
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[136.105]: selected best master clock b8cef6.fffe.33fe16
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[136.105]: assuming the grand master role
Feb 03 22:41:44 devkit-01 taskset[1112]: ptp4l[136.105]: port 1: LISTENING to GRAND_MASTER on RS_GRAND_MASTER
The other server running cuphycontroller will become the secondary, follower clock, as shown below:
$ sudo systemctl status ptp4l.service
• ptp4l.service - Precision Time Protocol (PTP) service
Loaded: loaded (/lib/systemd/system/ptp4l.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2023-02-17 17:01:46 UTC; 37s ago
Docs: man:ptp4l
Main PID: 2225310 (ptp4l)
Tasks: 1 (limit: 598864)
Memory: 624.0K
CGroup: /system.slice/ptp4l.service
└─2225310 /usr/sbin/ptp4l -f /etc/ptp.conf
Feb 17 17:02:14 devkit-02 taskset[2225310]: ptp4l[1992342.927]: rms 6 max 9 freq -8277 +/- 5 delay 220 +/- 0
Feb 17 17:02:15 devkit-02 taskset[2225310]: ptp4l[1992343.927]: rms 4 max 8 freq -8265 +/- 5 delay 219 +/- 1
Feb 17 17:02:16 devkit-02 taskset[2225310]: ptp4l[1992344.927]: rms 4 max 7 freq -8260 +/- 4 delay 219 +/- 1
Feb 17 17:02:17 devkit-02 taskset[2225310]: ptp4l[1992345.927]: rms 4 max 7 freq -8268 +/- 5 delay 219 +/- 1
Feb 17 17:02:18 devkit-02 taskset[2225310]: ptp4l[1992346.927]: rms 4 max 11 freq -8268 +/- 6 delay 221 +/- 1
Feb 17 17:02:19 devkit-02 taskset[2225310]: ptp4l[1992347.927]: rms 4 max 9 freq -8259 +/- 5 delay 219 +/- 1
Feb 17 17:02:20 devkit-02 taskset[2225310]: ptp4l[1992348.927]: rms 9 max 17 freq -8280 +/- 10 delay 220 +/- 1
Feb 17 17:02:21 devkit-02 taskset[2225310]: ptp4l[1992349.927]: rms 5 max 13 freq -8282 +/- 6 delay 219 +/- 1
Feb 17 17:02:22 devkit-02 taskset[2225310]: ptp4l[1992350.927]: rms 6 max 9 freq -8270 +/- 8 delay 218 +/- 0
Feb 17 17:02:23 devkit-02 taskset[2225310]: ptp4l[1992351.927]: rms 4 max 9 freq -8269 +/- 6 delay 217 +/- 1
Enter the commands to turn off NTP:
$ sudo timedatectl set-ntp false
$ timedatectl
Local time: Thu 2022-02-03 22:30:58 UTC
Universal time: Thu 2022-02-03 22:30:58 UTC
RTC time: Thu 2022-02-03 22:30:58
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: no
NTP service: inactive
RTC in local TZ: no
Run PHC2SYS as service:
PHC2SYS is used to synchronize the system clock to the PTP hardware clock (PHC) on the NIC. Here are two examples of PHC2SYS configurations.
Example 1: Specify the network interface used for PTP and the system clock as the slave clock. This is the default mode when the devkit is configured as a gNB or RU emulator to run cuBB tests.
# If more than one instance is already running, kill the existing
# PHC2SYS sessions.
# Command used can be found in /lib/systemd/system/phc2sys.service
# Update the ExecStart line to the following
$ sudo nano /lib/systemd/system/phc2sys.service
[Unit]
Description=Synchronize system clock or PTP hardware clock (PHC)
Documentation=man:phc2sys
After=ntpdate.service
Requires=ptp4l.service
After=ptp4l.service
[Service]
Restart=always
RestartSec=5s
Type=simple
ExecStart=/bin/sh -c "taskset -c 21 /usr/sbin/phc2sys -s /dev/ptp$(ethtool -T ens6f0 | grep PTP | awk '{print $4}') -c CLOCK_REALTIME -n 24 -O 0 -R 256 -u 256"
[Install]
WantedBy=multi-user.target
Example 2: Synchronize time automatically according to the current ptp4l state and synchronize the system clock to the remote master. This configuration is usually performed when the devkit is configured as a gNB to run E2E test in the LLS-C3 topology.
# If more than one instance is already running, kill the existing
# PHC2SYS sessions.
# Command used can be found in /lib/systemd/system/phc2sys.service
# Update the ExecStart line to the following, the -a option will use the same interface as ptp4l and -r will synchronize the system clock
$ sudo nano /lib/systemd/system/phc2sys.service
[Unit]
Description=Synchronize system clock or PTP hardware clock (PHC)
Documentation=man:phc2sys
After=ntpdate.service
Requires=ptp4l.service
After=ptp4l.service
[Service]
Restart=always
RestartSec=5s
Type=simple
ExecStart=/usr/sbin/phc2sys -a -r -n 24 -R 256 -u 256
[Install]
WantedBy=multi-user.target
Once the PHC2SYS config file is changed, run the following:
$ sudo systemctl daemon-reload
$ sudo systemctl restart phc2sys.service
# Set to start automatically on reboot
$ sudo systemctl enable phc2sys.service
# check that the service is active and has converged to a low rms value (<30) and that the correct NIC has been selected (ens6f0):
$ sudo systemctl status phc2sys.service
● phc2sys.service - Synchronize system clock or PTP hardware clock (PHC)
Loaded: loaded (/lib/systemd/system/phc2sys.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2023-02-17 17:02:35 UTC; 7s ago
Docs: man:phc2sys
Main PID: 2225556 (phc2sys)
Tasks: 1 (limit: 598864)
Memory: 372.0K
CGroup: /system.slice/phc2sys.service
└─2225556 /usr/sbin/phc2sys -a -r -n 24 -R 256 -u 256
Feb 17 17:02:35 devkit-02 phc2sys[2225556]: [1992363.445] reconfiguring after port state change
Feb 17 17:02:35 devkit-02 phc2sys[2225556]: [1992363.445] selecting CLOCK_REALTIME for synchronization
Feb 17 17:02:35 devkit-02 phc2sys[2225556]: [1992363.445] selecting ens6f0 as the master clock
Feb 17 17:02:36 devkit-02 phc2sys[2225556]: [1992364.457] CLOCK_REALTIME rms 15 max 37 freq -19885 +/- 116 delay 1944 +/- 6
Feb 17 17:02:37 devkit-02 phc2sys[2225556]: [1992365.473] CLOCK_REALTIME rms 16 max 42 freq -19951 +/- 103 delay 1944 +/- 7
Feb 17 17:02:38 devkit-02 phc2sys[2225556]: [1992366.490] CLOCK_REALTIME rms 13 max 31 freq -19909 +/- 81 delay 1944 +/- 6
Feb 17 17:02:39 devkit-02 phc2sys[2225556]: [1992367.506] CLOCK_REALTIME rms 9 max 27 freq -19918 +/- 40 delay 1945 +/- 6
Feb 17 17:02:40 devkit-02 phc2sys[2225556]: [1992368.522] CLOCK_REALTIME rms 8 max 24 freq -19925 +/- 11 delay 1945 +/- 9
Feb 17 17:02:41 devkit-02 phc2sys[2225556]: [1992369.538] CLOCK_REALTIME rms 9 max 23 freq -19915 +/- 36 delay 1943 +/- 8
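The convergence check (rms below 30) can be automated by extracting the rms values from the log. This is a minimal sketch that assumes the log line format shown above; max_rms is a hypothetical helper name.

```shell
# Reads ptp4l or phc2sys log lines on stdin and prints the largest
# value that follows the word "rms" (0 if no such line is seen).
max_rms() {
    awk '
        {
            for (i = 1; i < NF; i++)
                if ($i == "rms" && $(i + 1) + 0 > max)
                    max = $(i + 1) + 0
        }
        END { printf "%d\n", max }
    '
}

# Usage on the devkit (checks the 20 most recent log lines):
#   worst=$(journalctl -u phc2sys.service -n 20 --no-pager | max_rms)
#   [ "$worst" -lt 30 ] && echo "phc2sys has converged"
```

Checking only recent lines matters because the first few entries after a restart typically show larger values while the servo settles.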
Verify if the system clock is synchronized:
$ timedatectl
Local time: Thu 2022-02-03 22:30:58 UTC
Universal time: Thu 2022-02-03 22:30:58 UTC
RTC time: Thu 2022-02-03 22:30:58
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
NTP service: inactive
RTC in local TZ: no
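For use in scripts, the relevant line of the timedatectl output can be tested directly. A small sketch, with clock_synchronized as a hypothetical helper name:

```shell
# Reads `timedatectl` output on stdin; succeeds (exit status 0) only if
# the system clock is reported as synchronized.
clock_synchronized() {
    grep -q '^[[:space:]]*System clock synchronized: yes'
}

# Usage on the devkit:
#   timedatectl | clock_synchronized && echo "system clock is synchronized"
```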
Create the directory /usr/local/bin and create the /usr/local/bin/nvidia.sh file to run the commands with every reboot.
The network interfaces here match those used above - ens6f0 and ens6f1.
Also, the command for “nvidia-smi lgc” expects just one GPU device (-i 0). This needs to be modified if the system uses more than one GPU.
cat <<EOF | sudo tee /usr/local/bin/nvidia.sh
#!/bin/bash
mst start
ifconfig ens6f0 up
ifconfig ens6f1 up
ethtool --set-priv-flags ens6f0 tx_port_ts on
ethtool --set-priv-flags ens6f1 tx_port_ts on
ethtool -A ens6f0 rx off tx off
ethtool -A ens6f1 rx off tx off
sysctl -w kernel.numa_balancing=0
nvidia-smi -pm 1
nvidia-smi -i 0 -lgc $(nvidia-smi -i 0 --query-supported-clocks=graphics --format=csv,noheader,nounits | sort -h | tail -n 1)
nvidia-smi -mig 0
modprobe nvidia-peermem
echo -1 > /proc/sys/kernel/sched_rt_runtime_us
EOF
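For systems with more than one GPU, the clock-locking line has to be repeated per device. The sketch below only prints the commands for review and does not execute them; lgc_commands is a hypothetical helper name, and it simply reuses the single-GPU command from the script above with a different index.

```shell
# Reads GPU indices on stdin (one per line) and prints one clock-locking
# command per GPU. Each printed command looks up that GPU's highest
# supported graphics clock at the time it is run.
lgc_commands() {
    while read -r idx; do
        [ -n "$idx" ] || continue
        printf 'nvidia-smi -i %s -lgc $(nvidia-smi -i %s --query-supported-clocks=graphics --format=csv,noheader,nounits | sort -h | tail -n 1)\n' "$idx" "$idx"
    done
}

# Usage on the devkit (lists the installed GPU indices, prints commands):
#   nvidia-smi --query-gpu=index --format=csv,noheader | lgc_commands
```

The printed lines can then replace the single "-i 0" line in nvidia.sh.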
Create a system service file to be loaded after network interfaces are up.
cat <<EOF | sudo tee /lib/systemd/system/nvidia.service
[Unit]
After=network.target
[Service]
ExecStart=/usr/local/bin/nvidia.sh
[Install]
WantedBy=default.target
EOF
Then set the file permissions, reload the systemd daemon, enable the service, restart the service when installing for the first time, and check the status:
sudo chmod 744 /usr/local/bin/nvidia.sh
sudo chmod 664 /lib/systemd/system/nvidia.service
sudo systemctl daemon-reload
sudo systemctl enable nvidia.service
sudo systemctl restart nvidia.service
systemctl status nvidia.service
The output of the last command should look like this:
aerial@devkit:~$ systemctl status nvidia.service
○ nvidia.service
Loaded: loaded (/lib/systemd/system/nvidia.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Tue 2023-09-19 19:19:23 UTC; 1 week 0 days ago
Main PID: 1307 (code=exited, status=0/SUCCESS)
CPU: 784ms
Sep 19 19:19:22 devkit nvidia.sh[713963]: Create devices
Sep 19 19:19:22 devkit nvidia.sh[713963]: Unloading MST PCI module (unused) - Success
Sep 19 19:19:23 devkit nvidia.sh[714843]: kernel.numa_balancing = 0
Sep 19 19:19:23 devkit nvidia.sh[714844]: Persistence mode is already Enabled for GPU 00000000:B6:00.0.
Sep 19 19:19:23 devkit nvidia.sh[714844]: All done.
Sep 19 19:19:23 devkit nvidia.sh[714849]: GPU clocks set to "(gpuClkMin 1410, gpuClkMax 1410)" for GPU 00000000:B6:00.0
Sep 19 19:19:23 devkit nvidia.sh[714849]: All done.
Sep 19 19:19:23 devkit nvidia.sh[714850]: Disabled MIG Mode for GPU 00000000:B6:00.0
Sep 19 19:19:23 devkit nvidia.sh[714850]: All done.
Sep 19 19:19:23 devkit systemd[1]: nvidia.service: Deactivated successfully.
This step is optional. Matlab is not required to generate TV files if you use the Aerial Python mcore module. See the Generating TV and Launch Pattern Files section in the cuBB Quick Start Guide.
Refer to https://www.mathworks.com/downloads/ for downloading and installing Matlab. Follow the established IT process for the license and installation at your site.
Matlab is used to run the test-vector generator script. This can be run on any machine that has a graphical display and a graphical UI capable operating system that Matlab supports. The generated test-vector files can then be copied to the cuBB server.
If you would like to run the test-vector generator Matlab script on the same Ubuntu server machine that runs cuBB, then Matlab should be run in console mode: matlab -nosplash -nodesktop
The following Matlab components are required:
Matlab 2020b or later
Matlab licenses:
MATLAB
Communications Toolbox
DSP System Toolbox
Signal Processing Toolbox
Fixed-Point Designer (optional): calls the half function to accelerate testing/simulation; can be disabled by setting SimCtrl.fp16AlgoSel = 1
Parallel Computing Toolbox (optional): accelerates testing/simulation automatically
5G Toolbox: not required for TV generation (can be disabled by setting Chan.use5Gtoolbox = 0 in /nr_matlab/config/cfgChan.m); required for waveform compliance test and performance simulation