Prerequisites for arm64 / aarch64 DGX Systems#

Ensure the prerequisites for arm64 / aarch64 DGX Systems are met, including the following:

  • NVIDIA Driver

  • CUDA

  • Docker

  • NVIDIA Container Toolkit

Introduction#

If your compute host has an arm64 / aarch64 CPU architecture, such as a DGX GB200 Compute Tray, follow the steps below to install the NVIDIA Driver, CUDA, Docker, and the NVIDIA Container Toolkit.

The steps below follow NVIDIA DGX OS 7 User Guide: Installing the GPU Driver.

  • These steps are customized for the DGX GB200 Compute Tray, which

    • has 2x Grace CPUs (arm64 / aarch64)

    • has 4x Blackwell GPUs

    • runs one OS image

    • uses the DGX Software Stack

  • This workflow was verified with

    • Ubuntu 24.04

    • Linux kernel version 6.8.0-1044-nvidia-64k

Installation Steps#

  1. Check state of NVIDIA (GPU) Driver and related tools

    • Check the running driver version

      nvidia-smi
      
    • Example successful output is shown below.

      +-----------------------------------------------------------------------------------------+
      | NVIDIA-SMI 580.105.08             Driver Version: 580.105.08     CUDA Version: 13.0     |
      +-----------------------------------------+------------------------+----------------------+
      | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
      |                                         |                        |               MIG M. |
      |=========================================+========================+======================|
      |   0  NVIDIA GB200                   On  |   00000008:01:00.0 Off |                    0 |
      | N/A   29C    P0            130W / 1200W |       0MiB / 189471MiB |      0%      Default |
      |                                         |                        |             Disabled |
      +-----------------------------------------+------------------------+----------------------+
      |   1  NVIDIA GB200                   On  |   00000009:01:00.0 Off |                    0 |
      | N/A   29C    P0            127W / 1200W |       0MiB / 189471MiB |      0%      Default |
      |                                         |                        |             Disabled |
      +-----------------------------------------+------------------------+----------------------+
      |   2  NVIDIA GB200                   On  |   00000018:01:00.0 Off |                    0 |
      | N/A   30C    P0            127W / 1200W |       0MiB / 189471MiB |      0%      Default |
      |                                         |                        |             Disabled |
      +-----------------------------------------+------------------------+----------------------+
      |   3  NVIDIA GB200                   On  |   00000019:01:00.0 Off |                    0 |
      | N/A   30C    P0            139W / 1200W |       0MiB / 189471MiB |      0%      Default |
      |                                         |                        |             Disabled |
      +-----------------------------------------+------------------------+----------------------+
      
      +-----------------------------------------------------------------------------------------+
      | Processes:                                                                              |
      |  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
      |        ID   ID                                                               Usage      |
      |=========================================================================================|
      |  No running processes found                                                             |
      +-----------------------------------------------------------------------------------------+
      
    • If the running driver version is 580+, skip to Step 9.

    • If nvidia-smi fails (for example, with output like Command 'nvidia-smi' not found), proceed to Step 2.
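    • The decision above can be scripted. The sketch below is illustrative (the `version_ge` helper is not part of the DGX tooling); it relies on the standard `nvidia-smi --query-gpu=driver_version` query and on `sort -V` for natural version ordering.

```shell
#!/bin/sh
# version_ge A B: succeed (exit 0) when version string A >= B,
# using sort -V for natural version ordering.
version_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # Query the running driver version (the first GPU is enough).
  driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
  if version_ge "$driver" "580"; then
    echo "Driver $driver is 580+; skip to Step 9."
  else
    echo "Driver $driver is older than 580; continue with Step 2."
  fi
else
  echo "nvidia-smi not found; proceed to Step 2."
fi
```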

  2. Confirm that the OS sees NVIDIA GPUs

    • Run the command below, and look for NVIDIA entries.

      sudo lshw -class display -json | jq '.[] | select(.description=="3D controller")'
      
    • Product-specific information may be visible with

      sudo lshw -class system -json | jq '.[0]'
      
  3. Verify that your Linux distribution, kernel version, and gcc version match the verified configuration:

    • Use the output from the commands below to find your system’s Linux distribution, kernel version, and gcc version, respectively.

      . /etc/os-release && echo "$PRETTY_NAME"   # for Linux distribution
      uname -r  # for kernel version
      gcc --version  # for gcc version
      
    • See example output:

      # Example output for . /etc/os-release && echo "$PRETTY_NAME"
      Ubuntu 24.04.2 LTS
      
      # Example output for uname -r
      6.8.0-1044-nvidia-64k
      
      # Example output for gcc --version
      gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
      Copyright (C) 2023 Free Software Foundation, Inc.
      This is free software; see the source for copying conditions.  There is NO
      warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
      
  4. Update Linux kernel version if needed

    • For a GB200 system, this workflow was verified with kernel version 6.8.0-1044-nvidia-64k.

    • If your system has kernel version 6.8.0-1043-nvidia-64k or 6.8.0-1044-nvidia-64k, go to Step 5.

    • If your system has a different kernel version, configure grub (GRand Unified Bootloader) so that your system starts up with the verified kernel version.

      a. Update the grub default menu entry

      • In the file /etc/default/grub, set the variable GRUB_DEFAULT to the verified kernel version

      sudo sed --in-place=.bak \
        '/^[[:space:]]*GRUB_DEFAULT=/c\GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 6.8.0-1044-nvidia-64k"' \
        /etc/default/grub
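      If your installed kernel's menu-entry title differs, you can list the titles present in the generated grub config before setting GRUB_DEFAULT. This is a minimal sketch (the `list_menuentries` helper is ours, not a grub tool); GRUB_DEFAULT takes a submenu title and an inner title joined by '>', for example Advanced options for Ubuntu>Ubuntu, with Linux 6.8.0-1044-nvidia-64k.

```shell
#!/bin/sh
# list_menuentries FILE: print the menuentry/submenu titles found in a
# generated grub configuration file.
list_menuentries() {
  awk -F"'" '/menuentry |submenu /{print $2}' "$1"
}

list_menuentries /boot/grub/grub.cfg 2>/dev/null \
  || echo "/boot/grub/grub.cfg not readable on this host"
```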
      

      b. Verify that /etc/default/grub is updated

      cat /etc/default/grub
      

      c. Update grub and reboot

      sudo update-grub
      sudo reboot
      
  5. Remove NVIDIA libraries to avoid version conflicts

    • See reference Removing the Driver

    • Check for NVIDIA libraries, using the command

      ls /usr/lib/aarch64-linux-gnu/ | grep -i nvidia
      
    • If the output from the command above is not empty, run the command below.

      sudo apt remove --autoremove --purge -Vy \
        cuda-compat\* \
        cuda-drivers\*  \
        libnvidia-cfg1\* \
        libnvidia-compute\* \
        libnvidia-decode\* \
        libnvidia-encode\* \
        libnvidia-extra\* \
        libnvidia-fbc1\* \
        libnvidia-gl\* \
        libnvidia-gpucomp\* \
        libnvidia-nscq\* \
        libnvsdm\* \
        libxnvctrl\* \
        nvidia-dkms\* \
        nvidia-driver\* \
        nvidia-fabricmanager\* \
        nvidia-firmware\* \
        nvidia-headless\* \
        nvidia-imex\* \
        nvidia-kernel\* \
        nvidia-modprobe\* \
        nvidia-open\* \
        nvidia-persistenced\* \
        nvidia-settings\* \
        nvidia-xconfig\* \
        xserver-xorg-video-nvidia\*
      
    • Ignore errors for non-matching patterns.
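    • After removal, the same check can confirm that nothing is left behind. A scripted version might look like the sketch below (the `count_nvidia_libs` helper name is ours):

```shell
#!/bin/sh
# count_nvidia_libs DIR: print how many entries in DIR mention "nvidia"
# (case-insensitive); prints 0 when DIR is missing or clean.
count_nvidia_libs() {
  ls "$1" 2>/dev/null | grep -ci nvidia || true
}

n="$(count_nvidia_libs /usr/lib/aarch64-linux-gnu)"
if [ "$n" -eq 0 ]; then
  echo "No NVIDIA libraries remain."
else
  echo "$n NVIDIA entries still present; re-run the apt remove command."
fi
```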

  6. Download package repositories and install DGX tools

    a. Download and unpack ARM64-specific packages

    curl https://repo.download.nvidia.com/baseos/ubuntu/noble/arm64/dgx-repo-files.tgz | sudo tar xzf - -C /
    

    b. Update local APT database

    sudo apt update
    

    c. Install DGX system tools

    sudo apt install -y nvidia-system-core
    sudo apt install -y nvidia-system-utils
    sudo apt install -y nvidia-system-extra
    

    d. Install linux-tools for your Linux kernel

    sudo apt install -y linux-tools-nvidia-64k
    

    e. Install the NVIDIA peermem loader package

    sudo apt install -y nvidia-peermem-loader
    
  7. Install GPU Driver

    • For your system and architecture, such as GB200 and arm64, follow the steps as described in https://docs.nvidia.com/dgx/dgx-os-7-user-guide/installing_on_ubuntu.html#installing-the-gpu-driver

    a. Do not update the Linux kernel version

    b. Pin the driver version

    sudo apt install nvidia-driver-pinning-580
    

    c. Install the open GPU kernel module

    sudo apt install --allow-downgrades \
      nvidia-driver-580-open \
      libnvidia-nscq \
      nvidia-modprobe \
      nvidia-imex \
      datacenter-gpu-manager-4-cuda13 \
      nv-persistence-mode
    
    • Ignore build errors for modules built for 6.14.0-1015-nvidia-64k

    d. Enable the persistence daemon

    sudo systemctl enable nvidia-persistenced nvidia-dcgm nvidia-imex
    

    e. Reboot

    sudo reboot
    
  8. After reboot, repeat Step 1 to check NVIDIA Driver and related tools

  9. Install Docker and the NVIDIA Container Toolkit

    • Follow the instructions at Installing Docker and the NVIDIA Container Toolkit

    • Ignore build errors for modules built for 6.14.0-1015-nvidia-64k

    • Verify the NVIDIA Driver, Docker, NVIDIA Container Toolkit stack

      sudo docker run --rm --gpus=all nvcr.io/nvidia/cuda:12.6.2-base-ubuntu24.04 nvidia-smi
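      To compare the GPU count seen inside the container against the host, note that `nvidia-smi -L` prints one line per GPU. A minimal sketch (the `gpu_count` helper name is ours):

```shell
#!/bin/sh
# gpu_count: count "GPU N:" lines in `nvidia-smi -L` style output on stdin.
gpu_count() {
  grep -c '^GPU [0-9][0-9]*:' || true
}

# On a DGX GB200 Compute Tray this should report 4, on the host and
# inside the container alike.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L | gpu_count
else
  echo "nvidia-smi not found on this host"
fi
```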
      
  10. To run docker as a non-root user, see Manage Docker as non-root user

  11. Verify the NVIDIA Driver, CUDA, Docker, NVIDIA Container Toolkit, Torch stack

    a. Log into the NVIDIA Container Registry, using your NGC key as the password.

    docker login nvcr.io --username '$oauthtoken'
    

    b. Run

    sudo docker run --rm --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
      nvcr.io/nvidia/pytorch:25.12-py3 \
      python -c \
    "import torch, pynvml;
    pynvml.nvmlInit();
    print('Driver:', pynvml.nvmlSystemGetDriverVersion());
    print('CUDA:', torch.version.cuda);
    print('GPU count:', torch.cuda.device_count())"
    

    The sudo prefix can be omitted if you completed Step 10, 'To run docker as a non-root user'.

    • Example output is

      =============
      == PyTorch ==
      =============
      
      NVIDIA Release 25.12 (build 245654591)
      PyTorch Version 2.10.0a0+b4e4ee8
      Container image Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
      Copyright (c) 2014-2024 Facebook Inc.
      Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
      Copyright (c) 2012-2014 Deepmind Technologies    (Koray Kavukcuoglu)
      Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
      Copyright (c) 2011-2013 NYU                      (Clement Farabet)
      Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
      Copyright (c) 2006      Idiap Research Institute (Samy Bengio)
      Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
      Copyright (c) 2015      Google Inc.
      Copyright (c) 2015      Yangqing Jia
      Copyright (c) 2013-2016 The Caffe contributors
      All rights reserved.
      
      Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
      
      GOVERNING TERMS: The software and materials are governed by the NVIDIA Software License Agreement
      (found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/)
      and the Product-Specific Terms for NVIDIA AI Products
      (found at https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/).
      
      NOTE: CUDA Forward Compatibility mode ENABLED.
        Using CUDA 13.1 driver version 590.44.01 with kernel driver version 580.105.08.
        See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.
      
      Driver: 580.105.08
      CUDA: 13.1
      GPU count: 4
      
    • If the reported GPU count matches your system (4 for a DGX GB200 Compute Tray), your system is verified for

      • (a) GPU setup

      • (b) NVIDIA Driver

      • (c) CUDA

      • (d) Docker

      • (e) NVIDIA Container Toolkit