Updating the Software

These instructions explain how to update the DGX OS server software through an internet connection to the NVIDIA public repository. The process updates a DGX system image to the latest versions of the entire DGX software stack, including the drivers.

Perform the updates using commands on the DGX server console.

Preparing for Software Update

Connecting to the DGX Server Console

Connect to the DGX server console using a direct connection or a remote connection through the BMC.

Note

SSH can be used to perform the update. However, if the Ethernet port is configured for DHCP, the IP address might change when the DGX server reboots during the update, causing the connection to be lost. If this happens, connect using either a direct connection or through the BMC to continue the update process.

Warning

Connect directly to the DGX server console if the DGX is connected to a 172.17.xx.xx subnet. DGX OS Server software installs Docker CE, which uses the 172.17.xx.xx subnet by default for Docker containers. If the DGX server is on the same subnet, you will not be able to establish a network connection to the DGX server. Refer to the appropriate DGX-1 or DGX-2 User Guide for instructions on how to change the default Docker network settings after performing the update.
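
Before starting, it can help to check whether any address on the server falls in the Docker default range. The following is a minimal sketch, not an NVIDIA-provided tool: the function name in_docker_default_subnet is hypothetical, and it assumes the default range is 172.17.0.0/16 as described in the warning above.

```shell
#!/bin/sh
# Hypothetical helper (not part of DGX OS): succeeds (exit status 0) if the
# given IPv4 address lies in 172.17.0.0/16, the Docker default subnet.
in_docker_default_subnet() {
    case "$1" in
        172.17.*.*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example usage (assumes the iproute2 "ip" tool is installed). The docker0
# bridge is skipped because it is expected to use this range.
#   ip -4 -o addr show | awk '$2 != "docker0" {print $4}' | cut -d/ -f1 |
#   while read -r addr; do
#       if in_docker_default_subnet "$addr"; then
#           echo "WARNING: $addr overlaps the Docker default subnet"
#       fi
#   done
```

If the check prints a warning, use a direct console connection for the update as the warning above advises.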

Direct Connection

  1. Connect a display to the VGA connector and a keyboard to any one of the USB ports.

  2. Power on the DGX server.

Remote Connection through the BMC

Refer to the appropriate user guide (DGX-1 or DGX-2) for instructions on establishing a remote connection to the BMC.

Verifying the DGX Server Connection to the Repositories

Before attempting to perform the update, verify that the DGX server network connection can access the public repositories and that the connection is not blocked by a firewall or proxy.

On DGX-1 Systems If Upgrading from Version 2.x

Enter the following on the DGX-1 system:

wget -O f1-changelogs http://changelogs.ubuntu.com/meta-release-lts
wget -O f2-archive http://archive.ubuntu.com/ubuntu/dists/xenial/Release
wget -O f3-usarchive http://us.archive.ubuntu.com/ubuntu/dists/xenial/Release
wget -O f4-security http://security.ubuntu.com/ubuntu/dists/xenial/Release
wget -O f5-download https://download.docker.com/linux/ubuntu/dists/xenial/Release
wget -O f6-international http://international.download.nvidia.com/dgx/repos/dists/xenial/Release

All the ``wget`` commands should be successful and there should be six files in the directory with non-zero content.

On DGX-2 and DGX-1 Systems

Enter the following on the DGX system:

wget -O f1-changelogs http://changelogs.ubuntu.com/meta-release-lts
wget -O f2-archive http://archive.ubuntu.com/ubuntu/dists/bionic/Release
wget -O f3-usarchive http://us.archive.ubuntu.com/ubuntu/dists/bionic/Release
wget -O f4-security http://security.ubuntu.com/ubuntu/dists/bionic/Release
wget -O f5-international http://international.download.nvidia.com/dgx/repos/bionic/dists/bionic/Release
wget -O f6-international http://international.download.nvidia.com/dgx/repos/bionic/dists/bionic-r418+cuda10.1/Release
wget -O f7-international http://international.download.nvidia.com/dgx/repos/bionic/dists/bionic-r450+cuda11.0/Release

All the wget commands should be successful and there should be seven files in the directory with non-zero content.
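
The non-zero content check can be scripted rather than inspected by eye. The sketch below is illustrative only: the function name report_empty_downloads is hypothetical, and the file names passed to it are the output names used in the wget commands above.

```shell
#!/bin/sh
# Hypothetical helper: prints the name of every file that is missing or
# empty, and returns non-zero if at least one such file was found.
report_empty_downloads() {
    status=0
    for f in "$@"; do
        # -s is true only if the file exists and its size is greater than zero
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}

# Example usage, run in the directory where the wget commands were entered:
#   report_empty_downloads f1-changelogs f2-archive f3-usarchive f4-security \
#       f5-international f6-international f7-international \
#       && echo "all downloads have content"
```

A file reported as missing or empty suggests that the corresponding repository is blocked by a firewall or proxy.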

Update Path Instructions

Follow the instructions corresponding to your current DGX OS server software.

  • Updating from Release 4.1 and later

Follow the instructions at Updating from Release 4.1 and later.

  • Updating from Release 4.0 (Version 4.0.1 or later only)

Follow the instructions at Updating from 4.0.1 (or Later).

  • Updating from Release 3.1

Follow the instructions at Updating from Release 3.1.

  • Updating from Release 2.x

    1. Update from Release 2.x to the latest Release 3.1 as described in the DGX OS 3.1.8 Release Notes.

    2. Update from Release 3.1 by following the instructions at Updating from Release 3.1.
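
The choice of update path can be sketched as a small lookup on the installed version. This is an illustration under stated assumptions, not an NVIDIA tool: the function names are hypothetical, and it assumes the installed version is recorded as DGX_SWBUILD_VERSION in /etc/dgx-release. Confirm that location on your system before relying on it.

```shell
#!/bin/sh
# Hypothetical helper. ASSUMPTION: the DGX OS version is stored as
# DGX_SWBUILD_VERSION in /etc/dgx-release (another file may be passed as $1).
dgx_version() {
    sed -n 's/^DGX_SWBUILD_VERSION="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "${1:-/etc/dgx-release}"
}

# Hypothetical helper: maps a version string such as "4.0.4" to a short label
# for the matching update path in the list above.
update_path_for() {
    IFS=. read -r major minor patch _rest <<EOF
$1
EOF
    minor=${minor:-0}
    patch=${patch:-0}
    case "$major.$minor" in
        2.*) echo "2.x" ;;             # update to the latest 3.1 first
        3.1) echo "3.1" ;;
        4.0) if [ "$patch" -ge 1 ]; then
                 echo "4.0.1-or-later"
             else
                 echo "reimage"        # 4.0.0 cannot be updated over the network
             fi ;;
        4.*) echo "4.1-or-later" ;;
        *)   echo "unknown" ;;
    esac
}

# Example usage:
#   update_path_for "$(dgx_version)"
```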

Updating from Release 4.1 and Later

See the section Connecting to the DGX Server Console for guidance on connecting to the console to perform the update.

Note

These instructions update all software for which updates are available from your configured software sources, including applications that you installed yourself. If you want to prevent an application from being updated, you can instruct the Ubuntu package manager to keep the current version. For more information, see Introduction to Holding Packages on the Ubuntu Community Help Wiki.
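
Holding a package is done with the standard apt-mark command. The wrapper below is a hedged sketch: the function name hold_packages and the DRY_RUN variable are hypothetical conveniences, and my-app is a placeholder package name.

```shell
#!/bin/sh
# Hypothetical wrapper around "apt-mark hold". With DRY_RUN=1 it only prints
# the command it would run, so the effect can be reviewed first.
hold_packages() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sudo apt-mark hold $*"
    else
        sudo apt-mark hold "$@"
    fi
}

# Example usage ("my-app" is a placeholder package name):
#   DRY_RUN=1 hold_packages my-app   # prints the command without running it
#   hold_packages my-app             # places the hold
#   apt-mark showhold                # lists packages currently on hold
#   sudo apt-mark unhold my-app      # releases the hold after the update
```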

Update Instructions

  1. If you have not already done so, verify that your DGX system can access the public repositories as explained in Verifying the DGX Server Connection to the Repositories.

    Note

    R418 package/repository users: NVIDIA strongly recommends that all users migrate to the R450 branch because R418 has reached end-of-life support. To upgrade, run the following commands:

    sudo apt update
    sudo apt install -y dgx-bionic-r450+cuda11.0-repo
    
  2. Update the list of available packages and their versions.

    sudo apt update

  3. Review the packages that will be updated.

    sudo apt full-upgrade -s

    To prevent an application from being updated, instruct the Ubuntu package manager to keep the current version. See Introduction to Holding Packages.

  4. Upgrade to version 4.14.0.

    sudo apt full-upgrade

    • Answer any questions that appear.

    • Most questions require a Yes or No response. When asked to select the grub configuration to use, select the current one on the system.

    • Other questions will depend on what other packages were installed before the update and how those packages interact with the update.

    • If a message appears indicating that nvidia-docker.service failed to start, you can disregard it and continue with the next step. The service will start normally at that time.

  5. Reboot the system.
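
For the review step, the simulation output can be summarized before committing to the upgrade. This is a minimal sketch: the function name count_planned_installs is hypothetical, it uses apt-get -s dist-upgrade as the scriptable counterpart of apt full-upgrade -s, and it assumes each planned install or upgrade appears as a line beginning with "Inst ".

```shell
#!/bin/sh
# Hypothetical helper: counts the packages that a simulated upgrade would
# install or upgrade. Reads the simulation output on standard input.
# ASSUMPTION: each planned install/upgrade is a line starting with "Inst ".
count_planned_installs() {
    # grep -c prints 0 but exits non-zero when nothing matches, so mask that
    grep -c '^Inst ' || true
}

# Example usage (simulation only, nothing is changed on the system):
#   apt-get -s dist-upgrade | count_planned_installs
```

A count of 0 suggests the system is already up to date with the configured repositories.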

Recovering from an Interrupted or Failed Update

If the script is interrupted during the update, such as from a loss of power or loss of network connection, then restore power or restore the network connection, whichever caused the interruption.

If the system encounters a kernel panic after you restore power and reboot the DGX-2, you will not be able to perform the over-the-network update. You will need to re-image the DGX-2 with the latest image (see the DGX-2 User Guide for instructions) and then perform the network update.

If you are successfully returned to the Linux command line, continue following the instructions from step 2 of the Updating from Release 4.1 and Later update instructions.

Updating from 4.0.1 (or Later)

For Release 4.0, only updates from versions 4.0.1 and later are supported with these instructions. To update from version 4.0.0, you must re-image the system.

See the section “Connecting to the DGX Server Console” for guidance on connecting to the console to perform the update.

Note

These instructions update all software for which updates are available from your configured software sources, including applications that you installed yourself. If you want to prevent an application from being updated, you can instruct the Ubuntu package manager to keep the current version. For more information, see Introduction to Holding Packages on the Ubuntu Community Help Wiki.

Update Instructions

  1. If you have not already done so, verify that your DGX system can access the public repositories as explained in Verifying the DGX Server Connection to the Repositories.

  2. Update the list of available packages and their versions.

    sudo apt update

  3. Install the 4.1.0 components from the repository.

    sudo apt install -y dgx-bionic-r418+cuda10.1-repo

  4. (Optional) To move to the R450 package, issue the following command. Skip this step to stay with the R418 package.

    sudo apt install -y dgx-bionic-r450+cuda11.0-repo

  5. Update the new list of packages and their versions.

    sudo apt update

  6. Review the packages that will be updated.

    sudo apt full-upgrade -s

    To prevent an application from being updated, instruct the Ubuntu package manager to keep the current version. See Introduction to Holding Packages.

  7. Upgrade to version 4.14.0.

    sudo apt full-upgrade

    • Answer any questions that appear.

    • Most questions require a Yes or No response. When asked to select the grub configuration to use, select the current one on the system.

    • Other questions will depend on what other packages were installed before the update and how those packages interact with the update.

    • If a message appears indicating that nvidia-docker.service failed to start, you can disregard it and continue with the next step. The service will start normally at that time.

  8. Reboot the system.
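
To see which driver branch the installed repository package selects (R418 or R450), the branch can be read from the package name. The sketch below is illustrative: the function name driver_branch_of is hypothetical, and it assumes package names follow the dgx-bionic-<branch>+cuda<version>-repo pattern used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: extracts the driver branch (for example "r450") from
# a repository package name such as dgx-bionic-r450+cuda11.0-repo.
# Prints nothing for names that do not follow that pattern.
driver_branch_of() {
    echo "$1" | sed -n 's/^dgx-bionic-\(r[0-9][0-9]*\)+cuda.*-repo$/\1/p'
}

# Example usage: list the branch of each matching package known to dpkg.
#   dpkg-query -W -f='${Package}\n' 'dgx-bionic-*-repo' |
#   while read -r pkg; do driver_branch_of "$pkg"; done
```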

Recovering from an Interrupted or Failed Update

If the script is interrupted during the update, such as from a loss of power or loss of network connection, then restore power or restore the network connection, whichever caused the interruption.

If the system encounters a kernel panic after you restore power and reboot the DGX-2, you will not be able to perform the over-the-network update. You will need to re-image the DGX-2 with the latest image (see the DGX-2 User Guide for instructions) and then perform the network update.

If you are successfully returned to the Linux command line, continue following the instructions from step 2 of the Updating from 4.0.1 (or Later) update instructions.

Updating from 3.1.x

See the section “Connecting to the DGX Server Console” for guidance on connecting to the console to perform the update.


CAUTION: These instructions update all software for which updates are available from your configured software sources, including applications that you installed yourself. If you want to prevent an application from being updated, you can instruct the Ubuntu package manager to keep the current version. For more information, see Introduction to Holding Packages on the Ubuntu Community Help Wiki.

Update Instructions

  1. If you have not already done so, verify that your DGX-1 system can access the public repositories as explained in Verifying the DGX Server Connection to the Repositories.

  2. Update the list of available packages and their versions.

    sudo apt update

  3. Install any updates.

    sudo apt -y full-upgrade

  4. Install dgx-release-upgrade.

    sudo apt install -y dgx-release-upgrade

  5. Begin the update process.

    sudo dgx-release-upgrade

If you are using a proxy server, add the ``-E`` option to ``sudo`` to preserve your proxy environment variables.

Example:

sudo -E dgx-release-upgrade
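
Because sudo -E can only preserve variables that are exported in the calling shell, it may be worth confirming which proxy variables are actually exported before starting. This is a minimal sketch: the function name exported_proxy_vars is hypothetical, and the set of variable names checked is an assumption about common proxy settings.

```shell
#!/bin/sh
# Hypothetical helper: prints the name of each exported proxy variable.
# ASSUMPTION: the proxy is configured through http_proxy, https_proxy, or
# no_proxy (in either lower or upper case).
exported_proxy_vars() {
    # "env" lists only exported variables, which are the ones sudo -E can keep
    env | grep -E -i '^(http|https|no)_proxy=' | cut -d= -f1
    return 0
}

# Example usage:
#   exported_proxy_vars
#   sudo -E dgx-release-upgrade
```

If the helper prints nothing while a proxy is required, export the variables in the current shell first.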

After starting the update process, respond to the presented options as follows:

  • Press y if you are logged in to the DGX server remotely through secure shell (SSH) and are asked if you want to continue running under SSH.

Continue running under SSH?

This session appears to be running under ssh. It is not recommended to perform a upgrade over ssh currently because in case of failure it is harder to recover.

If you continue, an additional ssh daemon will be started at port ‘1022’.

Do you want to continue?

Continue [yN]

An additional sshd daemon is started.

  • Press Enter in response to the following message.

Starting additional sshd

To make recovery in case of failure easier, an additional sshd will be started on port ‘1022’. If anything goes wrong with the running ssh you can still connect to the additional one.

If you run a firewall, you may need to temporarily open this port. As this is potentially dangerous it’s not done automatically. You can open the port with e.g.:

‘iptables -I INPUT -p tcp --dport 1022 -j ACCEPT’

To continue please press [ENTER]

  • Press Enter in response to the message warning you that third-party sources are disabled.

Third party sources disabled

Some third party entries in your sources.list were disabled. You can re-enable them after the upgrade with the ‘software-properties’ tool or your package manager.

To continue please press [ENTER]

  • Press N if prompted about dgx.list configuration choices.

Configuration file ‘/etc/apt/sources.list.d/dgx.list’

==> Modified (by you or by a script) since installation.

==> Package distributor has shipped an updated version.

What would you like to do about it ? Your options are:

Y or I : install the package maintainer’s version

N or O : keep your currently-installed version

D : show the differences between the versions

Z : start a shell to examine the situation

The default action is to keep your current version.

*** dgx.list (Y/I/N/O/D/Z) [default=N] ?

  • When prompted to resolve other configuration files, evaluate the changes before accepting the package maintainer’s version, keeping the local version, or manually resolving the difference. You are also asked to confirm that you want to remove obsolete packages.

  • At the prompt to confirm starting the upgrade, press Y to begin.

Do you want to start the upgrade?

Installing the upgrade can take several hours. Once the download has finished, the process cannot be canceled.

Continue [yN] Details [d]

  • Press Y to proceed with the final reboot.

System upgrade is complete.

Restart required

To finish the upgrade, a restart is required.

If you select ‘y’ the system will be restarted.

Continue [yN]

After this reboot, the update process will take several minutes to perform some final installation steps.

Your system is now updated to the latest DGX OS 4 release.

  • (Optional) Follow the instructions at Updating from Release 4.1 and Later if you want to install the R450 driver package.