Aerial CUDA-Accelerated RAN 24-1

Troubleshooting

This page documents solutions to common issues that you might encounter.

Normally, the hugepages settings are updated through the /etc/default/grub configuration file. However, depending on the operating system version, these changes may be overwritten by another configuration file: /etc/grub.
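To see which file actually sets hugepages, it can help to search every candidate GRUB configuration file and compare the result with the parameters of the running kernel. The following is a minimal sketch, not an official Aerial tool: the function name find_hugepages_settings is hypothetical, and the paths in the usage comment are assumptions that may differ on your operating system version.

```shell
# Print every line that mentions "hugepages" in the given files or
# directories (searched recursively). Paths that do not exist are skipped.
find_hugepages_settings() {
    local path
    for path in "$@"; do
        [ -e "$path" ] || continue
        # -r: recurse into directories, -H: show file name, -n: show line number
        grep -rHn 'hugepages' "$path" 2>/dev/null || true
    done
}

# Example usage (paths are assumptions; adjust them for your system):
#   find_hugepages_settings /etc/default/grub /etc/default/grub.d
#   cat /proc/cmdline            # parameters the running kernel booted with
#   grep -i huge /proc/meminfo   # hugepages currently reserved
```

If more than one file sets hugepages, the values in /proc/cmdline show which setting took effect at boot.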

If an older version of the CUDA Toolkit and driver is installed on the system, run the following commands to remove it:

sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*" "*nvidia*"
sudo apt-get autoremove
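After the removal, you may want to confirm that no CUDA or NVIDIA packages remain. The sketch below filters dpkg -l output for packages that are still in the installed ("ii") state. The function name list_leftover_gpu_packages is hypothetical, and the name pattern is an assumption that covers only packages with "cuda" or "nvidia" in their names.

```shell
# Read `dpkg -l` style output on stdin and print the names of packages that
# are still installed (state "ii") and mention cuda or nvidia in their name.
list_leftover_gpu_packages() {
    awk '$1 == "ii" && tolower($2) ~ /cuda|nvidia/ { print $2 }'
}

# Example usage:
#   dpkg -l | list_leftover_gpu_packages
```

An empty result suggests that the purge removed the matching packages.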

You may see the following apt update error if the system time is incorrect:

E: Release file for https://download.docker.com/linux/ubuntu/dists/focal/InRelease is not valid yet (invalid for another 2d 10h 51min 11s). Updates for this repository will not be applied.

Run the following commands to set the date and time via NTP once (this will not enable the NTP service):

sudo apt-get install ntpdate
sudo ntpdate -s pool.ntp.org
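The apt error shown earlier reports how far in the future the release file appears to be, which is a lower bound on how far the system clock is behind. The following sketch converts that duration to seconds. The function name apt_clock_skew_seconds is hypothetical, and the parsing assumes the "Nd Nh Nmin Ns" wording seen in the example message.

```shell
# Read apt output on stdin. For each "invalid for another ..." message,
# print the reported duration converted to seconds.
apt_clock_skew_seconds() {
    sed -n 's/.*invalid for another \(.*\))\..*/\1/p' |
    awk '{
        total = 0
        for (i = 1; i <= NF; i++) {
            n = $i + 0                      # leading number of the field
            if      ($i ~ /d$/)   total += n * 86400
            else if ($i ~ /h$/)   total += n * 3600
            else if ($i ~ /min$/) total += n * 60
            else if ($i ~ /s$/)   total += n
        }
        print total
    }'
}

# Example usage:
#   sudo apt-get update 2>&1 | apt_clock_skew_seconds
```

For the example message above (2d 10h 51min 11s), the function prints 211871.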

The Ubuntu 22.04 server installer partitions the whole disk but creates only a 200GB logical volume. This is what you will see on a newly installed devkit:

# Devkit has 1TB SSD but default lv uses only 200GB
lsblk
NAME                      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0                       7:0    0  55.5M  1 loop /snap/core18/2246
loop1                       7:1    0  55.5M  1 loop /snap/core18/2253
loop2                       7:2    0  67.3M  1 loop /snap/lxd/21545
loop3                       7:3    0  67.2M  1 loop /snap/lxd/21835
loop4                       7:4    0  61.9M  1 loop /snap/core20/1242
loop5                       7:5    0  61.9M  1 loop /snap/core20/1169
loop6                       7:6    0  32.5M  1 loop /snap/snapd/13640
loop7                       7:7    0  42.2M  1 loop /snap/snapd/14066
sda                         8:0    0 894.3G  0 disk
├─sda1                      8:1    0   512M  0 part /boot/efi
├─sda2                      8:2    0     1G  0 part /boot
└─sda3                      8:3    0 892.8G  0 part
  └─ubuntu--vg-ubuntu--lv 253:0    0   200G  0 lvm  /

The following commands resize the logical volume to use the entire disk, then resize the file system to use the entire logical volume.

# Test mode first
sudo lvresize -t -v -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv

# Remove -t if test mode succeeds
sudo lvresize -v -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv

lsblk
NAME                      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0                       7:0    0  55.5M  1 loop /snap/core18/2246
loop1                       7:1    0  55.5M  1 loop /snap/core18/2253
loop2                       7:2    0  67.3M  1 loop /snap/lxd/21545
loop3                       7:3    0  67.2M  1 loop /snap/lxd/21835
loop4                       7:4    0  61.9M  1 loop /snap/core20/1242
loop5                       7:5    0  61.9M  1 loop /snap/core20/1169
loop6                       7:6    0  32.5M  1 loop /snap/snapd/13640
loop7                       7:7    0  42.2M  1 loop /snap/snapd/14066
sda                         8:0    0 894.3G  0 disk
├─sda1                      8:1    0   512M  0 part /boot/efi
├─sda2                      8:2    0     1G  0 part /boot
└─sda3                      8:3    0 892.8G  0 part
  └─ubuntu--vg-ubuntu--lv 253:0    0 892.8G  0 lvm  /

# Resize file system
sudo resize2fs -p /dev/mapper/ubuntu--vg-ubuntu--lv

df -h -T
Filesystem                        Type      Size  Used Avail Use% Mounted on
udev                              devtmpfs   39G     0   39G   0% /dev
tmpfs                             tmpfs     9.4G  2.0M  9.4G   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv ext4      878G   77G  764G  10% /
tmpfs                             tmpfs      47G     0   47G   0% /dev/shm
tmpfs                             tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs                             tmpfs      47G     0   47G   0% /sys/fs/cgroup
/dev/sda2                         ext4      976M  460M  450M  51% /boot
/dev/loop0                        squashfs   56M   56M     0 100% /snap/core18/2246
/dev/sda1                         vfat      511M  5.3M  506M   2% /boot/efi
/dev/loop1                        squashfs   56M   56M     0 100% /snap/core18/2253
/dev/loop5                        squashfs   62M   62M     0 100% /snap/core20/1169
/dev/loop2                        squashfs   68M   68M     0 100% /snap/lxd/21545
/dev/loop4                        squashfs   62M   62M     0 100% /snap/core20/1242
/dev/loop6                        squashfs   33M   33M     0 100% /snap/snapd/13640
/dev/loop3                        squashfs   68M   68M     0 100% /snap/lxd/21835
/dev/loop7                        squashfs   43M   43M     0 100% /snap/snapd/14066
overlay                           overlay   878G   77G  764G  10% /var/lib/docker/overlay2/851cbfd83b022a24f61fb0f87a007c56da8065a7528f6b661bf45d3d65ccc787/merged
tmpfs                             tmpfs     9.4G  4.0K  9.4G   1% /run/user/1000
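Before resizing, it can be useful to estimate how much of each partition is not yet covered by a logical volume. The sketch below reads machine-readable lsblk output and reports, per partition, the size difference between the partition and the logical volumes on it, in GiB. The function name report_unallocated_lvm_space is hypothetical, and the calculation is a rough assumption that ignores LVM metadata overhead.

```shell
# stdin: output of `lsblk -b -r -n -o NAME,SIZE,TYPE,PKNAME`
# stdout: "<partition> <GiB not covered by logical volumes>" for every
#         partition that hosts at least one LVM logical volume.
report_unallocated_lvm_space() {
    awk '
        $3 == "part" { part_size[$1] = $2 }   # remember each partition size
        $3 == "lvm"  { used[$4] += $2 }       # sum LV sizes per parent partition
        END {
            for (p in used)
                if (p in part_size)
                    printf "%s %.1f\n", p, (part_size[p] - used[p]) / (1024 * 1024 * 1024)
        }'
}

# Example usage:
#   lsblk -b -r -n -o NAME,SIZE,TYPE,PKNAME | report_unallocated_lvm_space
```

A value near zero after the resize indicates that the logical volume now spans the partition. The VFree column of sudo vgs reports the free space as LVM itself sees it.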

Use the sudo lshw -c network |grep -i 'product\|bus info\|name\|serial' command to find the bus address and MAC address of each NIC on the system. Here is an example:

$ sudo lshw -c network |grep -i 'product\|bus info\|name\|serial'
       product: I210 Gigabit Network Connection
       bus info: pci@0000:05:00.0
       logical name: eno1
       serial: 18:c0:4d:79:49:b6
       product: I210 Gigabit Network Connection
       bus info: pci@0000:06:00.0
       logical name: enp6s0
       serial: 18:c0:4d:79:49:b7
       product: MT2892 Family [ConnectX-6 Dx]
       bus info: pci@0000:b5:00.0
       logical name: ens6f0
       serial: b8:ce:f6:33:fd:ee
       product: MT2892 Family [ConnectX-6 Dx]
       bus info: pci@0000:b5:00.1
       logical name: ens6f1
       serial: b8:ce:f6:33:fd:ef
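When a system has several NICs, it can be easier to read the same information as one line per interface. The following sketch reformats the filtered lshw output into "name bus-address MAC product" rows. The function name summarize_nics is hypothetical, and the parsing assumes the fields arrive in the order shown in the example (product, bus info, logical name, serial).

```shell
# stdin: filtered lshw output, as produced by the command above
# stdout: one line per NIC: <logical name> <PCI address> <MAC> <product>
summarize_nics() {
    awk '{
        line = $0
        sub(/^[ \t]+/, "", line)            # drop leading indentation
        idx = index(line, ": ")
        if (idx == 0) next                  # skip lines without "key: value"
        key = substr(line, 1, idx - 1)
        val = substr(line, idx + 2)
        if      (key == "product")      product = val
        else if (key == "bus info")     { bus = val; sub(/^pci@/, "", bus) }
        else if (key == "logical name") name = val
        else if (key == "serial") {
            print name, bus, val, product   # serial closes one NIC record
            name = bus = product = ""
        }
    }'
}

# Example usage:
#   sudo lshw -c network | grep -i 'product\|bus info\|name\|serial' | summarize_nics
```

For the third NIC in the example above, this prints: ens6f0 0000:b5:00.0 b8:ce:f6:33:fd:ee MT2892 Family [ConnectX-6 Dx]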

© Copyright 2024, NVIDIA. Last updated on Jul 15, 2024.