Installing Software on Air-Gapped DGX A100 Systems

For security purposes, some installations require that systems be isolated from the internet or outside networks. Because most DGX A100 software updates are delivered over the network from NVIDIA servers, this section explains how to update a system when an over-the-network method is not an option. It also includes a process for installing Docker containers.

Installing NVIDIA DGX A100 Software

There are two ways to install DGX A100 software on an air-gapped DGX A100 system.

One method is to download the ISO image, copy it to removable media, and reimage the DGX A100 system from the media. This method is available only for software versions that are published as downloadable ISO images.

Alternatively, you can update the DGX A100 software by performing a network update from a local repository. This method is available only for software versions that are available for over-the-network updates.

Reimaging the System

This section describes how to reimage your DGX A100 system.

Caution

This process destroys all data and software customizations that you have made on the DGX A100 System. Be sure to back up any data that you want to preserve and push any Docker images that you want to keep to a trusted registry.

  1. Obtain the ISO image from NVIDIA Enterprise Support.

    1. Log in to the NVIDIA Enterprise Support site, and on the Announcements tab, locate the DGX OS Server image ISO file.

    2. Download the image ISO file.

  2. Refer to Restoring the DGX A100 Software Image for additional instructions.

Creating a Local Mirror of the NVIDIA and Canonical Repositories

The procedure below describes how to download all the necessary packages to create a mirror of the repositories that are needed to update NVIDIA DGX systems. The steps are specific to versions 4.0.X and 4.1.X, but they can be edited to work with other versions. For more information on DGX OS versions and the release notes available, visit https://docs.nvidia.com/dgx/dgx-os-server-release-notes/index.html#dgx-os-release-number-scheme. For more information on how to upgrade from versions 4.0.x to 4.1.x, review the release notes: https://docs.nvidia.com/dgx/pdf/DGX-OS-server-4.1-relnotes-update-guide.pdf.

Note

These procedures apply only to upgrades within the same major release, such as 4.x → 4.y. They do not support upgrades across major releases, such as 3.x → 4.x.
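
The rule in the note can be sketched as a small shell check. The function name same_major_release is hypothetical, and the sketch assumes version strings of the form major.minor.patch, such as 4.0.5.

```shell
# Hypothetical helper: succeeds only when both DGX OS version strings
# share the same major release number.
same_major_release() {
    current="$1"
    target="$2"
    # Keep only the text before the first dot, for example "4.0.5" -> "4".
    [ "${current%%.*}" = "${target%%.*}" ]
}
```

For example, `same_major_release 4.0.5 4.1.0` succeeds, while `same_major_release 3.1.8 4.1.0` fails, which signals an upgrade path that this procedure does not cover.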

Creating the Local Mirror

The instructions in this section are to be performed on a system with network access.

Here are the prerequisites:

  • A system running Ubuntu is needed to create the mirror because the procedure relies on several Ubuntu tools.

  • You must be logged in to the system installed with Ubuntu OS as an administrator user because this procedure requires sudo privileges.

  • The system must contain enough storage space to replicate the repositories to a file system. The space requirement could be as high as 250 GB.

  • An efficient way to move large amounts of data is needed, for example, shared storage in a DMZ, or portable USB drives that can be brought into the air-gapped area.

    The data will need to be moved to the systems that need to be updated. Make sure that any portable drives are formatted using ext4 or FAT32.
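
Once the storage device is mounted, the space prerequisite can be checked before the download starts. The following is a minimal sketch, not part of the documented procedure. The function name has_free_space is hypothetical, and the 250 GB figure in the usage note is the upper estimate quoted above.

```shell
# Hypothetical helper: succeeds when the file system that holds PATH has at
# least REQUIRED_GB gigabytes available.
has_free_space() {
    path="$1"
    required_gb="$2"
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((required_gb * 1024 * 1024)) ]
}
```

A possible use is `has_free_space /media/usb/repository 250 || echo "Not enough space for the mirror"`.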

  1. Make sure the storage device is attached to the system with network access and identify the mount point of the device.

    Example mount point used in these instructions: /media/usb/repository

  2. Install the apt-mirror package.

    $ sudo apt update
    $ sudo apt install apt-mirror
    
  3. Change the ownership of the target directory to the apt-mirror user in the apt-mirror group.

    $ sudo chown apt-mirror:apt-mirror /media/usb/repository
    

    The target directory must be owned by the user apt-mirror or the replication will not work.

  4. Configure the path of the destination directory in /etc/apt/mirror.list and use the included list of repositories below to retrieve the packages for both Ubuntu base OS and the NVIDIA DGX OS packages.

    DGX OS 5
    ############# config ##################
    #
    set base_path /media/usb/repository #/your/path/here
    #
    # set mirror_path $base_path/mirror
    # set skel_path $base_path/skel
    # set var_path $base_path/var
    # set cleanscript $var_path/clean.sh
    # set defaultarch <running host architecture>
    # set postmirror_script $var_path/postmirror.sh
    set run_postmirror 0
    set nthreads 20
    set _tilde 0
    #
    ############# end config ##############
    # Standard Canonical package repositories:
    deb http://security.ubuntu.com/ubuntu focal-security main multiverse universe restricted
    deb http://archive.ubuntu.com/ubuntu/ focal main multiverse universe restricted
    deb http://archive.ubuntu.com/ubuntu/ focal-updates main multiverse universe restricted
    #
    deb-i386 http://security.ubuntu.com/ubuntu focal-security main universe multiverse restricted
    deb-i386 http://archive.ubuntu.com/ubuntu/ focal main multiverse universe restricted
    deb-i386 http://archive.ubuntu.com/ubuntu/ focal-updates main multiverse universe restricted
    #
    # CUDA specific repositories:
    deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /
    #
    # DGX specific repositories:
    deb http://repo.download.nvidia.com/baseos/ubuntu/focal/x86_64/ focal common dgx
    deb http://repo.download.nvidia.com/baseos/ubuntu/focal/x86_64/ focal-updates common dgx
    #
    deb-i386 http://repo.download.nvidia.com/baseos/ubuntu/focal/x86_64/ focal common dgx
    deb-i386 http://repo.download.nvidia.com/baseos/ubuntu/focal/x86_64/ focal-updates common dgx
    #
    # Clean unused items
    clean http://archive.ubuntu.com/ubuntu
    clean http://security.ubuntu.com/ubuntu
    
    DGX OS 4
    ############# config ##################
    #
    set base_path /media/usb/repository #/your/path/here
    #
    # set mirror_path $base_path/mirror
    # set skel_path $base_path/skel
    # set var_path $base_path/var
    # set cleanscript $var_path/clean.sh
    # set defaultarch <running host architecture>
    # set postmirror_script $var_path/postmirror.sh
    set run_postmirror 0
    set nthreads 20
    set _tilde 0
    #
    ############# end config ##############
    # Standard Canonical package repositories:
    deb http://security.ubuntu.com/ubuntu bionic-security main
    deb http://security.ubuntu.com/ubuntu bionic-security universe
    deb http://security.ubuntu.com/ubuntu bionic-security multiverse
    deb http://archive.ubuntu.com/ubuntu/ bionic main multiverse universe
    deb http://archive.ubuntu.com/ubuntu/ bionic-updates main multiverse universe
    #
    deb-i386 http://security.ubuntu.com/ubuntu bionic-security main
    deb-i386 http://security.ubuntu.com/ubuntu bionic-security universe
    deb-i386 http://security.ubuntu.com/ubuntu bionic-security multiverse
    deb-i386 http://archive.ubuntu.com/ubuntu/ bionic main multiverse universe
    deb-i386 http://archive.ubuntu.com/ubuntu/ bionic-updates main multiverse universe
    #
    # DGX specific repositories:
    deb http://international.download.nvidia.com/dgx/repos/bionic bionic main restricted universe multiverse
    deb http://international.download.nvidia.com/dgx/repos/bionic bionic-updates main restricted universe multiverse
    deb http://international.download.nvidia.com/dgx/repos/bionic bionic-r418+cuda10.1 main multiverse restricted universe
    deb http://international.download.nvidia.com/dgx/repos/bionic bionic-r450+cuda11.0 main multiverse restricted universe
    #
    deb-i386 http://international.download.nvidia.com/dgx/repos/bionic bionic main restricted universe multiverse
    deb-i386 http://international.download.nvidia.com/dgx/repos/bionic bionic-updates main restricted universe multiverse
    # Only for DGX OS 4.1.0
    deb-i386 http://international.download.nvidia.com/dgx/repos/bionic bionic-r418+cuda10.1 main multiverse restricted universe
    # Clean unused items
    clean http://archive.ubuntu.com/ubuntu
    clean http://security.ubuntu.com/ubuntu
    
  5. Run apt-mirror and wait for it to finish downloading content.

    This can take a long time, depending on the network connection speed.

    $ sudo apt-mirror
    
  6. Eject the removable storage with all packages.

    $ sudo eject /media/usb/repository
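
Before running apt-mirror, it can be worth scanning /etc/apt/mirror.list for lines that wrapped or lost their leading directive during copy and paste, because such lines are easy to miss in a long file. The following is a minimal sketch under stated assumptions. The function name unrecognized_mirror_lines is hypothetical, and the set of directives it accepts (set, deb, deb-<arch>, deb-src, clean, skip-clean) is what this sketch recognizes, not an authoritative grammar for the file.

```shell
# Hypothetical helper: print every line of a mirror.list-style file that is
# neither a comment, nor blank, nor introduced by a directive this sketch knows.
unrecognized_mirror_lines() {
    # First drop comments and blank lines, then drop lines that start with a
    # known directive followed by whitespace. Whatever remains is suspicious.
    grep -Ev '^[[:space:]]*(#|$)' "$1" |
        grep -Ev '^[[:space:]]*(set|deb(-[a-z0-9]+)?|clean|skip-clean)[[:space:]]' || true
}
```

Running `unrecognized_mirror_lines /etc/apt/mirror.list` prints nothing when every line is well formed. A line that begins directly with a URL, for example, is printed so that it can be corrected.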
    

Configuring the Target Air-Gapped DGX OS 4 System

The instructions in this section are to be performed on the target air-gapped DGX system.

Here are the prerequisites:

  • The target air-gapped DGX system is installed, has gone through the first boot process, and is ready to be updated with the latest packages.

  • The USB storage device on which the mirrors were created is attached to the target DGX system.

    There are other ways to transfer the data that are not covered in this document as they will depend on the data center policies for the air-gapped environment.

  1. Mount the storage device on the air-gapped system to /media/usb/repository for consistency.

  2. Configure apt to use the local file system as the repository by editing the file /etc/apt/sources.list so that its repository lines read as follows.

    deb file:///media/usb/repository/mirror/security.ubuntu.com/ubuntu bionic-security main
    deb file:///media/usb/repository/mirror/security.ubuntu.com/ubuntu bionic-security universe
    deb file:///media/usb/repository/mirror/security.ubuntu.com/ubuntu bionic-security multiverse
    deb file:///media/usb/repository/mirror/archive.ubuntu.com/ubuntu/ bionic main multiverse universe
    deb file:///media/usb/repository/mirror/archive.ubuntu.com/ubuntu/ bionic-updates main multiverse universe
    
  3. Configure apt to use the NVIDIA DGX OS packages in the file /etc/apt/sources.list.d/dgx.list.

    deb file:///media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic main multiverse restricted universe
    
  4. If present, remove the file /etc/apt/sources.list.d/docker.list. It is no longer needed, and removing it eliminates error messages during the update process.

  5. (For DGX OS Release 4.1 and later only) Configure apt to use the NVIDIA DGX OS packages in the file /etc/apt/sources.list.d/dgx-bionic-r418-cuda10-1-repo.list.

    $ echo "deb file:///media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic/ bionic-r418+cuda10.1 main multiverse restricted universe" | sudo tee /etc/apt/sources.list.d/dgx-bionic-r418-cuda10-1-repo.list
    
  6. (For DGX OS Release 4.5 and later only) If you want to use the R450 NVIDIA graphics driver and CUDA Toolkit 11.0, configure apt to use the NVIDIA DGX OS packages in the file /etc/apt/sources.list.d/dgx-bionic-r450-cuda11-0-repo.list.

    $ echo "deb file:///media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic/ bionic-r450+cuda11.0 main multiverse restricted universe" | sudo tee /etc/apt/sources.list.d/dgx-bionic-r450-cuda11-0-repo.list
    

    Note

    If you want to continue using earlier releases, for example the R418 NVIDIA graphics driver and CUDA Toolkit 10.1, omit this step.

  7. Edit the file /etc/apt/preferences.d/nvidia to update the Pin parameter as follows.

    Package: *
    #Pin: origin international.download.nvidia.com
    Pin: release o=apt-pin-parameter-air-gap
    Pin-Priority: 600
    
  8. Update the apt repository and confirm there are no errors.

    $ sudo apt update
    

    Output from this command is similar to the following example.

    Get:1 file:/media/usb/repository/mirror/security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
    Get:1 file:/media/usb/repository/mirror/security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
    Get:2 file:/media/usb/repository/mirror/archive.ubuntu.com/ubuntu bionic InRelease [242 kB]
    Get:2 file:/media/usb/repository/mirror/archive.ubuntu.com/ubuntu bionic InRelease [242 kB]
    Get:3 file:/media/usb/repository/mirror/archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
    Get:4 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r418+cuda10.1 InRelease [13.0 kB]
    Get:5 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r450+cuda11.0 InRelease [7070 B]
    Get:5 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r450+cuda11.0 InRelease [7070 B]
    Get:6 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic InRelease [13.1 kB]
    Get:3 file:/media/usb/repository/mirror/archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
    Get:4 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r418+cuda10.1 InRelease [13.0 kB]
    Get:6 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic InRelease [13.1 kB]
    Hit:7 https://download.docker.com/linux/ubuntu bionic InRelease
    Get:8 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r418+cuda10.1/multiverse amd64 Packages [10.1 kB]
    Get:9 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r450+cuda11.0/multiverse amd64 Packages [17.4 kB]
    Get:10 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r418+cuda10.1/restricted amd64 Packages [10.3 kB]
    Get:11 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r450+cuda11.0/restricted amd64 Packages [26.4 kB]
    Get:12 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic-r418+cuda10.1/restricted i386 Packages [516 B]
    Get:13 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic/multiverse amd64 Packages [44.5 kB]
    Get:14 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic/multiverse i386 Packages [8,575 B]
    Get:15 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic/restricted i386 Packages [745 B]
    Get:16 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic/restricted amd64 Packages [8,379 B]
    Get:17 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic/universe amd64 Packages [2,946 B]
    Get:18 file:/media/usb/repository/mirror/international.download.nvidia.com/dgx/repos/bionic bionic/universe i386 Packages [496 B]
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    249 packages can be upgraded. Run 'apt list --upgradable' to see them.
    $
    
  9. Upgrade the system using the newly configured local repositories.

    $ sudo apt full-upgrade
    

    If you configured apt to use the NVIDIA DGX OS packages in the file /etc/apt/sources.list.d/dgx-bionic-r450-cuda11-0-repo.list, the NVIDIA graphics driver is upgraded to the R450 driver and the package sources are updated to obtain future updates from the R450 driver repositories.

  10. (For DGX OS Release 4.5 and later only) If you configured apt to use the NVIDIA DGX OS packages in the file /etc/apt/sources.list.d/dgx-bionic-r450-cuda11-0-repo.list and want to use CUDA Toolkit 11.0, install it.

    $ sudo apt install cuda-toolkit-11-0
    

    Note

    If you did not configure apt to use the NVIDIA DGX OS packages in the file /etc/apt/sources.list.d/dgx-bionic-r450-cuda11-0-repo.list, omit this step. Without that repository, an attempt to install CUDA Toolkit 11.0 fails.
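
The file:// entries used in this procedure follow directly from the http:// entries in mirror.list: apt-mirror stores each repository under the mirror directory of the base path, in a subdirectory named after the host and path of the original URL. The following is a minimal sketch of that mapping. The function name to_local_sources is hypothetical, and the sketch assumes a base path that contains no characters special to sed, such as | or &.

```shell
# Hypothetical helper: read mirror.list-style lines on standard input and
# print the equivalent local apt source lines for the given base path.
to_local_sources() {
    base_path="$1"
    # Keep only plain "deb" lines and replace the http:// scheme with a
    # file:// path under <base_path>/mirror/. Other lines are not printed.
    sed -n "s|^deb http://|deb file://${base_path}/mirror/|p"
}
```

For example, piping `deb http://security.ubuntu.com/ubuntu bionic-security main` through `to_local_sources /media/usb/repository` produces `deb file:///media/usb/repository/mirror/security.ubuntu.com/ubuntu bionic-security main`.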

Configuring the Target Air-Gapped DGX OS 5 System

The instructions in this section are to be performed on the target air-gapped DGX system.

The following are the prerequisites.

  • The target air-gapped DGX system is installed, has gone through the first boot process, and is ready to be updated with the latest packages.

  • The USB storage device on which the mirrors were created is attached to the target DGX system.

    There are other ways to transfer the data that are not covered in this document as they will depend on the data center policies for the air-gapped environment.

  1. Mount the storage device on the air-gapped system to /media/usb/repository for consistency.

  2. Configure apt to use the local file system as the repository by editing the file /etc/apt/sources.list so that its repository lines read as follows.

    deb file:///media/usb/repository/mirror/security.ubuntu.com/ubuntu focal-security main multiverse universe restricted
    deb file:///media/usb/repository/mirror/archive.ubuntu.com/ubuntu/ focal main multiverse universe restricted
    deb file:///media/usb/repository/mirror/archive.ubuntu.com/ubuntu/ focal-updates main multiverse universe restricted
    
  3. Configure apt to use the NVIDIA DGX OS packages in the /etc/apt/sources.list.d/dgx.list file.

    deb file:///media/usb/repository/mirror/repo.download.nvidia.com/baseos/ubuntu/focal/x86_64/ focal common dgx
    deb file:///media/usb/repository/mirror/repo.download.nvidia.com/baseos/ubuntu/focal/x86_64/ focal-updates common dgx
    
  4. Configure apt to use the NVIDIA CUDA packages in the /etc/apt/sources.list.d/cuda-compute-repo.list file.

    deb file:///media/usb/repository/mirror/developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /
    
  5. Update the apt repository.

    $ sudo apt update
    

    Output from this command is similar to the following example.

    Get:1 file:/media/usb/repository/mirror/security.ubuntu.com/ubuntu focal-security InRelease [107 kB]
    Get:2 file:/media/usb/repository/mirror/archive.ubuntu.com/ubuntu focal InRelease [265 kB]
    Get:3 file:/media/usb/repository/mirror/archive.ubuntu.com/ubuntu focal-updates InRelease [111 kB]
    Get:4 file:/media/usb/repository/mirror/developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease
    Get:5 file:/media/usb/repository/mirror/repo.download.nvidia.com/baseos/ubuntu/focal/x86_64 focal InRelease [12.5 kB]
    Get:6 file:/media/usb/repository/mirror/repo.download.nvidia.com/baseos/ubuntu/focal/x86_64 focal-updates InRelease [12.4 kB]
    Get:7 file:/media/usb/repository/mirror/developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release [697 B]
    Get:8 file:/media/usb/repository/mirror/developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release.gpg [836 B]
    Reading package lists... Done
    
  6. Upgrade the system using the newly configured local repositories.

    $ sudo apt full-upgrade
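
If apt update reports errors, a common cause is a file:// entry that points to a directory that does not exist on the mounted device. The following is a minimal sketch of a check for that case. The function name missing_file_sources is hypothetical, and the sketch assumes simple source lines of the form "deb file://PATH suite components" without bracketed options.

```shell
# Hypothetical helper: print the directory of every "deb file://..." entry in
# the given apt source files that does not exist on the local file system.
missing_file_sources() {
    # Field 2 of a "deb file://..." line is the URI; strip the scheme to get a path.
    awk '$1 == "deb" && $2 ~ /^file:\/\// { sub(/^file:\/\//, "", $2); print $2 }' "$@" |
        while IFS= read -r dir; do
            [ -d "$dir" ] || printf '%s\n' "$dir"
        done
}
```

A possible use is `missing_file_sources /etc/apt/sources.list /etc/apt/sources.list.d/*.list`, which prints nothing when every local repository path is present.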
    

Installing Docker Containers

This method applies to Docker containers hosted on the NVIDIA NGC Container Registry, and requires that you have an active NGC account.

  1. On a system with internet access, log in to the NGC Container Registry by entering the following command and credentials.

    $ docker login nvcr.io
    Username: $oauthtoken
    Password: apikey
    
  2. Type $oauthtoken exactly as shown for the Username.

    This is a special username that enables API key authentication. In place of apikey, paste in the API Key text that you obtained from the NGC website.

  3. Enter the docker pull command, specifying the image registry, image repository, and tag:

    $ docker pull nvcr.io/nvidia/repository:tag
    
  4. Verify the image is on your system using docker images.

    $ docker images
    
  5. Save the Docker image as an archive.

    $ docker save nvcr.io/nvidia/repository:tag > framework.tar
    
  6. Transfer the image to the air-gapped system using removable media such as a USB flash drive.

  7. On the air-gapped system, load the Docker image from the archive.

    $ docker load -i framework.tar
    
  8. Verify the image is on your system.

    $ docker images
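
When several images need to be transferred, the save step can be repeated with one archive per image. The following is a minimal sketch, not part of the documented procedure. The function names archive_name_for and save_images are hypothetical, and the file naming scheme is only one possible convention. The sketch uses the -o option of docker save, which writes the archive to a file instead of standard output.

```shell
# Hypothetical helper: derive an archive file name from an image reference by
# replacing "/" and ":" with "_", so that the name is safe on FAT32 and ext4.
archive_name_for() {
    printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# Hypothetical helper: save each listed image to its own archive in DEST_DIR.
# Stops at the first image that fails to save.
save_images() {
    dest_dir="$1"
    shift
    for image in "$@"; do
        docker save -o "$dest_dir/$(archive_name_for "$image")" "$image" || return 1
    done
}
```

For example, `archive_name_for nvcr.io/nvidia/repository:tag` prints `nvcr.io_nvidia_repository_tag.tar`. On the air-gapped system, each archive can then be loaded with docker load -i as in step 7.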