Troubleshooting
This page documents solutions to common issues that you might encounter.
Normally the hugepages settings are updated through the /etc/default/grub configuration file, as
described earlier. However, depending on the operating system version, those changes
may be overwritten by another configuration file: /etc/grub.
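To check whether the hugepages settings actually took effect, you can compare what the running kernel was booted with against what GRUB is configured to pass. The following is a minimal sketch and not part of the original procedure: `hugepage_params` is a hypothetical helper name, and the sketch assumes the settings use the standard `default_hugepagesz=`, `hugepagesz=`, and `hugepages=` kernel parameters.

```shell
# Hypothetical helper: print only the hugepage-related kernel parameters
# found in a kernel command line read from standard input.
hugepage_params() {
    tr ' ' '\n' | grep -E '^(default_hugepagesz|hugepagesz|hugepages)='
}

# Parameters the running kernel was actually booted with.
if [ -r /proc/cmdline ]; then
    hugepage_params < /proc/cmdline || echo "no hugepage parameters on the running kernel"
fi

# Parameters GRUB is configured to pass on the next boot.
if [ -r /etc/default/grub ]; then
    grep -E '^GRUB_CMDLINE_LINUX' /etc/default/grub || true
fi
```

If the two outputs still disagree after regenerating the GRUB configuration and rebooting, another configuration file is probably overriding /etc/default/grub.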
Run the following commands to remove the CUDA Toolkit and driver if an older version is already installed on the system:
            
            sudo apt-get --purge remove "*cublas*" "*cufft*" "*curand*" "*cusolver*" "*cusparse*" "*npp*" "*nvjpeg*" "cuda*" "nsight*" "*nvidia*"
sudo apt-get autoremove
        
    
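After the purge, it can be useful to confirm that nothing was left behind. The sketch below is one possible check, not part of the original procedure: `leftover_cuda_packages` is a hypothetical helper name, and its pattern simply mirrors the package globs in the removal command above as substring matches.

```shell
# Hypothetical helper: read `dpkg -l` output on standard input and print
# the names of packages that are still installed (status "ii") and whose
# names match the patterns used in the purge command.
leftover_cuda_packages() {
    awk '$1 == "ii" && $2 ~ /cublas|cufft|curand|cusolver|cusparse|npp|nvjpeg|cuda|nsight|nvidia/ { print $2 }'
}

# Only query the package database where dpkg is available.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | leftover_cuda_packages
fi
```

Empty output suggests that no matching packages remain installed.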
You may see the following apt update error if the system time is incorrect:
            
            E: Release file for https://download.docker.com/linux/ubuntu/dists/focal/InRelease is not valid yet (invalid for another 2d 10h 51min 11s).
Updates for this repository will not be applied.
        
    
Run the following commands to set the date and time via NTP once (this does not enable the NTP service):
            
            sudo apt-get install ntpdate
sudo ntpdate -s pool.ntp.org
        
    
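The duration reported in the apt error indicates that the system clock is behind by at least that amount. As a worked example, the sketch below converts such a duration to seconds. `duration_to_seconds` is a hypothetical helper name, and the sketch assumes the `Nd Nh Nmin Ns` format shown in the error message above, with no leading zeros in the numbers.

```shell
# Hypothetical helper: convert a duration such as "2d 10h 51min 11s"
# (the format shown in the apt error) into a number of seconds.
duration_to_seconds() {
    total=0
    for part in $1; do
        case "$part" in
            *min) total=$((total + ${part%min} * 60)) ;;
            *d)   total=$((total + ${part%d} * 86400)) ;;
            *h)   total=$((total + ${part%h} * 3600)) ;;
            *s)   total=$((total + ${part%s})) ;;
        esac
    done
    echo "$total"
}

# 2*86400 + 10*3600 + 51*60 + 11
duration_to_seconds "2d 10h 51min 11s"   # prints 211871
```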
The Ubuntu 20.04 server installer partitions the whole disk but creates only a 200GB logical volume. This is what you will see on a newly installed devkit:
            
            # Devkit has 1TB SSD but default lv uses only 200GB
lsblk
NAME                      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0                       7:0    0  55.5M  1 loop /snap/core18/2246
loop1                       7:1    0  55.5M  1 loop /snap/core18/2253
loop2                       7:2    0  67.3M  1 loop /snap/lxd/21545
loop3                       7:3    0  67.2M  1 loop /snap/lxd/21835
loop4                       7:4    0  61.9M  1 loop /snap/core20/1242
loop5                       7:5    0  61.9M  1 loop /snap/core20/1169
loop6                       7:6    0  32.5M  1 loop /snap/snapd/13640
loop7                       7:7    0  42.2M  1 loop /snap/snapd/14066
sda                         8:0    0 894.3G  0 disk
├─sda1                      8:1    0   512M  0 part /boot/efi
├─sda2                      8:2    0     1G  0 part /boot
└─sda3                      8:3    0 892.8G  0 part
  └─ubuntu--vg-ubuntu--lv 253:0    0   200G  0 lvm  /
        
    
The following commands resize the logical volume to use the entire disk, then resize the filesystem to use the entire logical volume.
            
            # Test mode first
sudo lvresize -t -v -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv
# Remove -t if test mode succeeds
sudo lvresize -v -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv
lsblk
NAME                      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0                       7:0    0  55.5M  1 loop /snap/core18/2246
loop1                       7:1    0  55.5M  1 loop /snap/core18/2253
loop2                       7:2    0  67.3M  1 loop /snap/lxd/21545
loop3                       7:3    0  67.2M  1 loop /snap/lxd/21835
loop4                       7:4    0  61.9M  1 loop /snap/core20/1242
loop5                       7:5    0  61.9M  1 loop /snap/core20/1169
loop6                       7:6    0  32.5M  1 loop /snap/snapd/13640
loop7                       7:7    0  42.2M  1 loop /snap/snapd/14066
sda                         8:0    0 894.3G  0 disk
├─sda1                      8:1    0   512M  0 part /boot/efi
├─sda2                      8:2    0     1G  0 part /boot
└─sda3                      8:3    0 892.8G  0 part
  └─ubuntu--vg-ubuntu--lv 253:0    0 892.8G  0 lvm  /
# Resize filesystem
sudo resize2fs -p /dev/mapper/ubuntu--vg-ubuntu--lv
df -h -T
Filesystem                        Type      Size  Used Avail Use% Mounted on
udev                              devtmpfs   39G     0   39G   0% /dev
tmpfs                             tmpfs     9.4G  2.0M  9.4G   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv ext4      878G   77G  764G  10% /
tmpfs                             tmpfs      47G     0   47G   0% /dev/shm
tmpfs                             tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs                             tmpfs      47G     0   47G   0% /sys/fs/cgroup
/dev/sda2                         ext4      976M  460M  450M  51% /boot
/dev/loop0                        squashfs   56M   56M     0 100% /snap/core18/2246
/dev/sda1                         vfat      511M  5.3M  506M   2% /boot/efi
/dev/loop1                        squashfs   56M   56M     0 100% /snap/core18/2253
/dev/loop5                        squashfs   62M   62M     0 100% /snap/core20/1169
/dev/loop2                        squashfs   68M   68M     0 100% /snap/lxd/21545
/dev/loop4                        squashfs   62M   62M     0 100% /snap/core20/1242
/dev/loop6                        squashfs   33M   33M     0 100% /snap/snapd/13640
/dev/loop3                        squashfs   68M   68M     0 100% /snap/lxd/21835
/dev/loop7                        squashfs   43M   43M     0 100% /snap/snapd/14066
overlay                           overlay   878G   77G  764G  10% /var/lib/docker/overlay2/851cbfd83b022a24f61fb0f87a007c56da8065a7528f6b661bf45d3d65ccc787/merged
tmpfs                             tmpfs     9.4G  4.0K  9.4G   1% /run/user/1000
        
    
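The resize steps above can also be chained so that the real resize runs only if the test-mode run succeeds. The following is a minimal sketch under stated assumptions: `resize_root_lv` and the `RUN` variable are hypothetical names, and the filesystem is assumed to be ext4 (as in the `df` output above), since `resize2fs` applies only to ext2/ext3/ext4. By default the function only prints the commands it would run.

```shell
# Hypothetical wrapper around the resize steps. By default it only prints
# the commands (RUN defaults to "echo"); set RUN to an empty value to
# execute them for real.
resize_root_lv() {
    lv="$1"
    run="${RUN-echo}"
    # Test mode first, then the real resize, then grow the filesystem.
    # Each step runs only if the previous one succeeded.
    $run sudo lvresize -t -v -l +100%FREE "$lv" &&
    $run sudo lvresize -v -l +100%FREE "$lv" &&
    $run sudo resize2fs -p "$lv"
}

# Dry run: prints the three commands without changing anything.
resize_root_lv /dev/mapper/ubuntu--vg-ubuntu--lv
```

To apply the changes, call it as `RUN= resize_root_lv /dev/mapper/ubuntu--vg-ubuntu--lv` after reviewing the dry-run output.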
Use the lspci command to find the bus address of the NIC, then use the lshw command to find
the interface name from the bus address. Finally, use the ip a command to find the MAC address
from the interface name. Here is an example:
            
            $ lspci | grep -i ether
04:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
04:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
31:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
31:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
4b:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
4b:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
98:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
98:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
        
    
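As a possible shortcut, sysfs exposes the interface name and MAC address directly under the PCI device directory. The sketch below is an alternative to the documented steps, not a replacement for them: `mac_for_pci` is a hypothetical helper name, and it assumes the bus address from lspci belongs to PCI domain 0000.

```shell
# Hypothetical helper: print "<interface> <MAC>" for every network
# interface that sysfs lists under a PCI address such as 0000:4b:00.0.
# The optional second argument overrides the sysfs root (useful for testing).
mac_for_pci() {
    bdf="$1"
    root="${2:-/sys/bus/pci/devices}"
    for dev in "$root/$bdf"/net/*; do
        # Skip the unexpanded glob when the device has no network interface.
        [ -r "$dev/address" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/address")"
    done
}

# Example call using a bus address from the lspci output above,
# prefixed with PCI domain 0000 (an assumption).
mac_for_pci 0000:4b:00.0
```

The function prints nothing if the address does not exist or is not a network device.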
            
            $ lshw -c network -businfo
Bus info          Device          Class          Description
============================================================
pci@0000:04:00.0  eno8303         network        NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
pci@0000:04:00.1  eno8403         network        NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
pci@0000:31:00.0  eno12399np0     network        MT27800 Family [ConnectX-5]
pci@0000:31:00.1  eno12409np1     network        MT27800 Family [ConnectX-5]
pci@0000:4b:00.0  ens3f0np0       network        MT2892 Family [ConnectX-6 Dx]
pci@0000:4b:00.1  ens3f1np1       network        MT2892 Family [ConnectX-6 Dx]
pci@0000:98:00.0  ens6f0np0       network        MT2892 Family [ConnectX-6 Dx]
pci@0000:98:00.1  ens6f1np1       network        MT2892 Family [ConnectX-6 Dx]
$ ip a