## Prerequisites for arm64 / aarch64 DGX Systems
Ensure the prerequisites for arm64 / aarch64 DGX Systems are met, including the following:

- NVIDIA Driver
- CUDA
- Docker
- NVIDIA Container Toolkit
## Introduction
If your compute host has an arm64 / aarch64 CPU architecture, such as a DGX GB200 Compute Tray, follow the installation steps below.
The steps below follow NVIDIA DGX OS 7 User Guide: Installing the GPU Driver.
These steps are customized for the DGX GB200 Compute Tray, which:

- has 2x Grace CPUs (arm64 / aarch64)
- has 4x Blackwell GPUs
- runs one OS image
- uses the DGX Software Stack
This workflow was verified with:

- Ubuntu 24.04
- Linux kernel version `6.8.0-1044-nvidia-64k`
## Installation Steps
### Step 1: Check the state of the NVIDIA (GPU) driver and related tools

Check the running driver version:

```shell
nvidia-smi
```
Example successful output is shown below.
```
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.105.08             Driver Version: 580.105.08     CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GB200                   On  |   00000008:01:00.0 Off |                    0 |
| N/A   29C    P0            130W / 1200W |       0MiB / 189471MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GB200                   On  |   00000009:01:00.0 Off |                    0 |
| N/A   29C    P0            127W / 1200W |       0MiB / 189471MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GB200                   On  |   00000018:01:00.0 Off |                    0 |
| N/A   30C    P0            127W / 1200W |       0MiB / 189471MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GB200                   On  |   00000019:01:00.0 Off |                    0 |
| N/A   30C    P0            139W / 1200W |       0MiB / 189471MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
```
If the running driver version is 580+, skip to Step 9.

If `nvidia-smi` fails, proceed to Step 2. Output like `Command 'nvidia-smi' not found` indicates failure.
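The branch above can be scripted. The sketch below is a minimal illustration, assuming `nvidia-smi --query-gpu=driver_version --format=csv,noheader` is available to print the bare version string; a hardcoded sample value stands in here so the logic can be followed offline.

```shell
# Sketch: automate the driver-version branch above.
# On a live system, capture the version with:
#   driver_version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
# A sample value stands in here for offline illustration.
driver_version="580.105.08"
major="${driver_version%%.*}"   # keep only the major component, e.g. 580

if [ "${major:-0}" -ge 580 ]; then
  result="skip to Step 9"
else
  result="continue with Step 2"
fi
echo "Driver $driver_version: $result"
```

On a real GB200 system, replace the hardcoded `driver_version` with the `nvidia-smi` query shown in the comment.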
### Step 2: Confirm that the OS sees NVIDIA GPUs

Run the command below, and look for NVIDIA entries.

```shell
sudo lshw -class display -json | jq '.[] | select(.description=="3D controller")'
```
Product-specific information may be visible with:

```shell
sudo lshw -class system -json | jq '.[0]'
```
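If `jq` is not installed, a plain-text variant of the same check is possible. The sketch below is a `jq`-free illustration that greps `lshw` output for NVIDIA entries; a captured sample fragment stands in for the live command so it can be followed offline.

```shell
# Sketch: jq-free GPU detection, grepping plain lshw output for NVIDIA entries.
# On a live system: nvidia_count=$(sudo lshw -class display 2>/dev/null | grep -ci nvidia)
# A sample captured fragment stands in here for offline illustration.
sample_lshw_output='product: GB200 [10DE:2941]
vendor: NVIDIA Corporation
description: 3D controller'
nvidia_count=$(printf '%s\n' "$sample_lshw_output" | grep -ci nvidia)
echo "NVIDIA entries found: $nvidia_count"
```

A count of zero would suggest the OS does not see the GPUs and the hardware or firmware setup should be checked before installing drivers.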
### Step 3: Verify that your Linux distribution, kernel version, and gcc version meet the requirements

Use the output from the commands below to find your system's Linux distribution, kernel version, and gcc version, respectively. Check these versions against the validated configurations at Table 3: Supported Linux Distributions.
```shell
. /etc/os-release && echo "$PRETTY_NAME"   # for Linux distribution
uname -r                                   # for kernel version
gcc --version                              # for gcc version
```
See example output:
```
# Example output for . /etc/os-release && echo "$PRETTY_NAME"
Ubuntu 24.04.2 LTS

# Example output for uname -r
6.8.0-1044-nvidia-64k

# Example output for gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```
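The kernel-version part of this check can be scripted. The sketch below compares the running kernel against the versions verified for this workflow; a hardcoded sample value stands in for `uname -r` so it can be followed offline.

```shell
# Sketch: check the kernel against the versions verified for this workflow.
# On a live system use: kernel=$(uname -r)
# A sample value stands in here for offline illustration.
kernel="6.8.0-1044-nvidia-64k"
case "$kernel" in
  6.8.0-1043-nvidia-64k|6.8.0-1044-nvidia-64k)
    kernel_status="verified for this workflow" ;;
  *)
    kernel_status="not verified; update GRUB as described below" ;;
esac
echo "$kernel: $kernel_status"
```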
### Step 4: Update the Linux kernel version if needed

For a GB200 system, this workflow was verified with kernel version `6.8.0-1044-nvidia-64k`.

If your system has kernel version `6.8.0-1043-nvidia-64k` or `6.8.0-1044-nvidia-64k`, go to Step 5. If your system has a different kernel version, configure GRUB (GRand Unified Bootloader) so that your system starts up with the verified kernel version.
a. Update the GRUB default menu entry

In the file `/etc/default/grub`, set the variable `GRUB_DEFAULT` to the verified kernel version:

```shell
sudo sed --in-place=.bak \
  '/^[[:space:]]*GRUB_DEFAULT=/c\GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 6.8.0-1044-nvidia-64k"' \
  /etc/default/grub
```
b. Verify that `/etc/default/grub` is updated:

```shell
cat /etc/default/grub
```

c. Update GRUB and reboot:

```shell
sudo update-grub
sudo reboot
```
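Before rebooting, the `GRUB_DEFAULT` edit can be double-checked mechanically. The sketch below is a minimal illustration; a temporary file with sample contents stands in for `/etc/default/grub` so the check can be followed offline.

```shell
# Sketch: verify GRUB_DEFAULT was rewritten as intended before rebooting.
# A temporary file stands in for /etc/default/grub for offline illustration;
# on a live system, point grub_file at /etc/default/grub instead.
grub_file=$(mktemp)
printf '%s\n' \
  'GRUB_TIMEOUT=0' \
  'GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 6.8.0-1044-nvidia-64k"' \
  > "$grub_file"

if grep -q '^GRUB_DEFAULT=.*6\.8\.0-1044-nvidia-64k' "$grub_file"; then
  grub_status="GRUB_DEFAULT points at the verified kernel"
else
  grub_status="GRUB_DEFAULT not set; re-run the sed command"
fi
echo "$grub_status"
rm -f "$grub_file"
```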
### Step 5: Remove NVIDIA libraries to avoid version conflicts

See the reference: Removing the Driver.

Check for NVIDIA libraries, using the command:

```shell
ls /usr/lib/aarch64-linux-gnu/ | grep -i nvidia
```

If the output from the command above is not empty, run the command below.

```shell
sudo apt remove --autoremove --purge -Vy \
  cuda-compat\* \
  cuda-drivers\* \
  libnvidia-cfg1\* \
  libnvidia-compute\* \
  libnvidia-decode\* \
  libnvidia-encode\* \
  libnvidia-extra\* \
  libnvidia-fbc1\* \
  libnvidia-gl\* \
  libnvidia-gpucomp\* \
  libnvidia-nscq\* \
  libnvsdm\* \
  libxnvctrl\* \
  nvidia-dkms\* \
  nvidia-driver\* \
  nvidia-fabricmanager\* \
  nvidia-firmware\* \
  nvidia-headless\* \
  nvidia-imex\* \
  nvidia-kernel\* \
  nvidia-modprobe\* \
  nvidia-open\* \
  nvidia-persistenced\* \
  nvidia-settings\* \
  nvidia-xconfig\* \
  xserver-xorg-video-nvidia\*
```
Ignore errors for non-matching patterns.
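After the removal, the library check can be repeated mechanically to confirm nothing is left behind. The sketch below is a minimal illustration; an empty temporary directory stands in for `/usr/lib/aarch64-linux-gnu` so it can be followed offline.

```shell
# Sketch: confirm the removal left no NVIDIA userspace libraries behind.
# An empty temporary directory stands in for /usr/lib/aarch64-linux-gnu here;
# on a live system, set libdir=/usr/lib/aarch64-linux-gnu instead.
libdir=$(mktemp -d)
leftovers=$(ls "$libdir" | grep -ci nvidia || true)

if [ "${leftovers:-0}" -eq 0 ]; then
  lib_status="no NVIDIA libraries remain"
else
  lib_status="$leftovers NVIDIA entries remain; re-run the apt remove command"
fi
echo "$lib_status"
rmdir "$libdir"
```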
### Step 6: Download package repositories and install DGX tools

For your system and architecture (GB200 and arm64), follow the steps as described in Installing DGX System Configurations and Tools. Accept default options when prompted. The recommended steps can be skipped.
a. Download and unpack ARM64-specific packages

```shell
curl https://repo.download.nvidia.com/baseos/ubuntu/noble/arm64/dgx-repo-files.tgz | sudo tar xzf - -C /
```
b. Update the local APT database

```shell
sudo apt update
```
c. Install DGX system tools

```shell
sudo apt install -y nvidia-system-core
sudo apt install -y nvidia-system-utils
sudo apt install -y nvidia-system-extra
```
d. Install `linux-tools` for your Linux kernel

```shell
sudo apt install -y linux-tools-nvidia-64k
```
e. Install the NVIDIA peermem loader package

```shell
sudo apt install -y nvidia-peermem-loader
```
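The sub-steps above can be gathered into one script. The sketch below is a hypothetical wrapper, not part of the DGX tooling: its `DRY_RUN` switch prints each command instead of executing it, so the sequence can be reviewed (and followed offline) before touching the system.

```shell
# Sketch: the package sub-steps above as one reviewable script (hypothetical).
# With DRY_RUN=1, run() echoes each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

DRY_RUN=1
plan=$(
  run sudo apt update
  run sudo apt install -y nvidia-system-core nvidia-system-utils nvidia-system-extra
  run sudo apt install -y linux-tools-nvidia-64k
  run sudo apt install -y nvidia-peermem-loader
)
printf '%s\n' "$plan"
```

Setting `DRY_RUN=0` (the default) would execute the same sequence for real.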
### Step 7: Install the GPU driver

For your system and architecture, such as GB200 and arm64, follow the steps as described in https://docs.nvidia.com/dgx/dgx-os-7-user-guide/installing_on_ubuntu.html#installing-the-gpu-driver
a. Do not update the Linux kernel version.

b. Pin the driver version:

```shell
sudo apt install nvidia-driver-pinning-580
```
c. Install the `open` GPU kernel module:

```shell
sudo apt install --allow-downgrades \
  nvidia-driver-580-open \
  libnvidia-nscq \
  nvidia-modprobe \
  nvidia-imex \
  datacenter-gpu-manager-4-cuda13 \
  nv-persistence-mode
```

d. Ignore build errors for modules built for `6.14.0-1015-nvidia-64k`.
e. Enable the persistence daemon and related services:

```shell
sudo systemctl enable nvidia-persistenced nvidia-dcgm nvidia-imex
```
f. Reboot:

```shell
sudo reboot
```

After the reboot, repeat Step 1 to check the NVIDIA driver and related tools.
### Step 8: Install Docker and the NVIDIA Container Toolkit

Follow the instructions at Installing Docker and the NVIDIA Container Toolkit.
Ignore build errors for modules built for `6.14.0-1015-nvidia-64k`.

### Step 9: Verify the NVIDIA Driver, Docker, NVIDIA Container Toolkit stack

```shell
sudo docker run --rm --gpus=all nvcr.io/nvidia/cuda:12.6.2-base-ubuntu24.04 nvidia-smi
```
To run Docker as a non-root user, see Manage Docker as a non-root user.
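The `nvidia-smi` banner printed by the container check can be parsed instead of read by eye. The sketch below pulls the driver and CUDA versions out of the banner line; a captured sample line stands in here, and on a live system the `docker run` output would be piped through the same `sed` commands.

```shell
# Sketch: extract driver and CUDA versions from the nvidia-smi banner line.
# A captured sample line stands in here for offline illustration.
banner="| NVIDIA-SMI 580.105.08    Driver Version: 580.105.08    CUDA Version: 13.0 |"

drv=$(printf '%s\n' "$banner"  | sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p')
cuda=$(printf '%s\n' "$banner" | sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p')
echo "driver=$drv cuda=$cuda"
```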
### Step 10: Verify the NVIDIA Driver, CUDA, Docker, NVIDIA Container Toolkit, Torch stack

a. Log in to the NVIDIA Container Registry, using your NGC key as the password.

```shell
docker login nvcr.io --username '$oauthtoken'
```
b. Run:

```shell
sudo docker run --rm --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  nvcr.io/nvidia/pytorch:25.12-py3 \
  python -c \
  "import torch, pynvml; pynvml.nvmlInit(); print('Driver:', pynvml.nvmlSystemGetDriverVersion()); print('CUDA:', torch.version.cuda); print('GPU count:', torch.cuda.device_count())"
```
The `sudo` prefix can be omitted if you completed the previous step, "To run Docker as a non-root user".

Example output is:
```
=============
== PyTorch ==
=============

NVIDIA Release 25.12 (build 245654591)
PyTorch Version 2.10.0a0+b4e4ee8

Container image Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2014-2024 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006 Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015 Google Inc.
Copyright (c) 2015 Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

GOVERNING TERMS: The software and materials are governed by the NVIDIA Software License Agreement
(found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/)
and the Product-Specific Terms for NVIDIA AI Products
(found at https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/).

NOTE: CUDA Forward Compatibility mode ENABLED.
  Using CUDA 13.1 driver version 590.44.01 with kernel driver version 580.105.08.
  See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.

Driver: 580.105.08
CUDA: 13.1
GPU count: 4
```
If the number of CUDA devices accessible is as expected, your system is verified for:
(a) GPU setup
(b) NVIDIA Driver
(c) CUDA
(d) Docker
(e) NVIDIA Container Toolkit
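The final "GPU count" line can also be checked mechanically against the four GPUs of a GB200 Compute Tray. The sketch below is a minimal illustration; a captured sample line stands in, and on a live system the `docker run` output would be piped through the same `sed` command.

```shell
# Sketch: check the 'GPU count' line against the expected four GB200 GPUs.
# A captured sample line stands in here for offline illustration.
expected_gpus=4
sample_output="GPU count: 4"

seen=$(printf '%s\n' "$sample_output" | sed -n 's/^GPU count: \([0-9]*\)$/\1/p')
if [ "${seen:-0}" -eq "$expected_gpus" ]; then
  gpu_status="all $expected_gpus GPUs visible"
else
  gpu_status="expected $expected_gpus GPUs, saw ${seen:-0}"
fi
echo "$gpu_status"
```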