Version EL8-22.08

Note

If your system is running a version earlier than EL8-22.05, you need to update the keys on the system. Refer to Rotating the GPG Key for more information about how to rotate the keys.

The DGX Software for Red Hat Enterprise Linux 8, EL8-22.08, is available.

EL8-22.08 supports all DGX products: DGX A100/A800, DGX-2, DGX-1, DGX Station, DGX Station A100, and DGX Station A800.

Change Highlights

March 2023

  • Updated NVIDIA Drivers:

    • R525 NVIDIA GPU Driver to 525.125.06

    • R515 NVIDIA GPU Driver to 515.105.01

    • R470 NVIDIA GPU Driver to 470.182.03

    • R450 NVIDIA GPU Driver to 450.236.01

  • Updated NVSM to 22.09.08

  • Updated DCGM to 2.4.8

  • Updated Docker to 23.0.0

  • Updated DLFW to 22.12

  • Updated NGC CLI to 3.17.0

  • Updated NVIDIA Container Toolkit:

    • libnvidia-container-tools to 1.12.0-1

    • libnvidia-container1 to 1.12.0-1

    • nvidia-container-toolkit to 1.12.0-1

December 19, 2022

  • Updated R515 NVIDIA GPU Driver to 515.86.01

  • Updated NVSM to 22.09.03

Important

The NVIDIA GPU driver branch R515 (and future releases) requires a newer NSCQ version. Before installing R515, or if you intend to downgrade later from branch R515 or newer to R510 or older, refer to Chapter 10 for additional instructions.
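The rule in the notice above can be sketched as a small Python check. This is a minimal illustration, not part of the DGX tooling: the function names are hypothetical, and the sketch assumes that the additional NSCQ instructions apply when a change crosses the R515 boundary in either direction, or when R515 or newer is installed with no earlier branch present.

```python
from typing import Optional

# First driver branch that requires the newer NSCQ version (from the notice above).
NSCQ_BOUNDARY = 515


def branch_number(branch: str) -> int:
    """Convert a branch label such as 'R515' or '515' to the integer 515."""
    return int(branch.strip().upper().lstrip("R"))


def needs_nscq_steps(target: str, current: Optional[str] = None) -> bool:
    """Return True when moving to `target` calls for the extra NSCQ instructions.

    Assumption: the steps apply when the change crosses the R515 boundary,
    or when R515 or newer is installed and no current branch is given.
    """
    new = branch_number(target)
    if current is None:
        return new >= NSCQ_BOUNDARY
    old = branch_number(current)
    # True only if exactly one side of the change is at or above the boundary.
    return (old >= NSCQ_BOUNDARY) != (new >= NSCQ_BOUNDARY)
```

For example, `needs_nscq_steps("R470", current="R515")` flags a downgrade across the boundary, while a move from R450 to R470 does not.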

November 22, 2022

  • Updated R470 NVIDIA GPU Driver to 470.161.03

  • Updated R450 NVIDIA GPU Driver to 450.216.04

October 14, 2022

  • Added GPUDirect Storage 1.0

Note

When upgrading DGX OS, the system remains on the installed GPU driver branch. For example, the GPU driver branch on the system does not automatically switch from R450 to R470. Refer to the Changing your GPU branch section of the DGX OS User Guide for instructions on switching GPU driver branches.

  • Updated R470 NVIDIA GPU Driver to 470.129.06

  • Updated R450 NVIDIA GPU Driver to 450.203.03

  • Updated NCCL to 2.15.1

  • Updated DCGM to 2.4.7

  • Updated NVSM to 22.06.02

  • Updated Docker-ce to 20.10.18

  • Updated MIG Configuration Tool to 0.4.3
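Because an upgrade keeps the system on its installed GPU driver branch, it can help to record the branch before and after updating. The following Python sketch derives the branch label from a driver version string and reads the installed version through `nvidia-smi`. The helper names are hypothetical, and the mapping assumes the branch label is "R" plus the first component of the version, as in the versions listed in these notes.

```python
import subprocess


def driver_branch(version: str) -> str:
    """Map a driver version such as '470.129.06' to its branch label 'R470'."""
    major = version.strip().split(".")[0]
    if not major.isdigit():
        raise ValueError(f"unexpected driver version: {version!r}")
    return f"R{major}"


def installed_driver_version() -> str:
    """Ask nvidia-smi for the driver version; the first GPU's line is enough."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()[0].strip()
```

On a system with the driver loaded, `driver_branch(installed_driver_version())` reports the current branch. The value should be the same before and after the update unless the branch was switched deliberately.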

Software Contents

The following table provides version information for software included in the DGX Software Stack for Red Hat Enterprise Linux 8.

Note

Unlike the DGX OS shipped with the NVIDIA DGX system, the DGX software stack for Red Hat does not include the Mellanox OpenFabrics Enterprise Distribution (MLNX_OFED) for Linux. When using MLNX_OFED with Red Hat, ensure you install a supported MLNX_OFED kernel version to avoid incompatibilities with the Red Hat distribution kernel.

Refer to the DGX Software for Red Hat Enterprise Linux 8 Installation Guide for instructions.

Contents of the Repositories

Component | Version | Additional Information
--- | --- | ---
OS | RHEL 8.7 |
Kernel | 4.18.0-372.13.1 or later |
GPU Driver | 525.105.17; 515.105.01; 470.182.03; 450.236.01 | Refer to the NVIDIA Data Center GPU documentation
CUDA Toolkit | 11.4 | Note: The CUDA Toolkit is installed only on DGX Stations and is optional for DGX servers. Refer also to the latest CUDA Release Notes for driver compatibility information.
NCCL | 2.15.1 |
cuDNN | 8.4.1 |
DCGM | 2.4.7 |
Mellanox OFED | 5.4-3.7.5.0 or 5.8-3.0.7.0 | DGX A100 Systems with ConnectX-7 use v5.4. DGX A100 Systems with ConnectX-6 use v5.8. DGX-1 and DGX-2 with ConnectX-4 or ConnectX-5 use v5.8. The drivers are compatible with RHEL 8.8. Refer to the DGX Software for Red Hat Enterprise Linux 8 Installation Guide for installation information. For information about LTS software versions for related networking components, refer to the Networking Long-Term Support Releases page.
MLNX FW | ConnectX-7: 28.34.4000; ConnectX-6: 20.35.3006; ConnectX-5: 16.35.3006; ConnectX-4: 12.28.2006 | The firmware is compatible with RHEL 8.8.
NVSM | 22.09.08 | Refer to the NVIDIA System Management documentation
Docker Engine | 23.0 | Refer to the Docker Engine documentation
DLFW | 22.12 |
NGC CLI | 3.17.0 | Refer to the NGC CLI documentation
NVIDIA Container Toolkit | 1.12 | Includes the following packages: libnvidia-container-tools 1.12.0-1; libnvidia-container1 1.12.0-1; nvidia-container-toolkit 1.12.0-1; nvidia-docker2 2.11.0
GPUDirect Storage (GDS) | 1.0 | Refer to the GDS documentation
MIG Configuration Tool | nvidia-mig-manager 0.4.3 | Refer to the NVIDIA mig-parted GitHub pages (project and deployments)
nvipmitool | 1.0.6.0 |
nvidia-peer-memory, nvidia-peer-memory-dkms | 1.3.0 |
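The kernel requirement in the table ("4.18.0-372.13.1 or later") can be checked with a short Python sketch. The helper names are hypothetical. The comparison uses only the leading numeric components of the release string, which is a simplification of full RPM version ordering.

```python
import re

# Minimum kernel listed in the table above.
MIN_KERNEL = "4.18.0-372.13.1"


def kernel_key(release: str) -> tuple:
    """Return the leading numeric fields of a kernel release as a tuple.

    '4.18.0-372.13.1.el8_6.x86_64' becomes (4, 18, 0, 372, 13, 1);
    parsing stops at the first non-numeric field such as 'el8_6'.
    """
    parts = []
    for token in re.split(r"[.-]", release.strip()):
        if not token.isdigit():
            break
        parts.append(int(token))
    return tuple(parts)


def kernel_is_supported(release: str, minimum: str = MIN_KERNEL) -> bool:
    """Return True when `release` is at or above the minimum kernel version."""
    return kernel_key(release) >= kernel_key(minimum)
```

Passing the output of `platform.release()` (or `uname -r`) to `kernel_is_supported` gives a quick check on the running system.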

Compatibility

NVIDIA has validated and tested DGX Software version EL8-22.08 on the following systems:

  • Linux Distribution and kernel:

    • Red Hat Enterprise Linux 8.6

    • Rocky Linux 8

    • Kernel 4.18.0-372.13.1

  • NVIDIA DGX systems

    • NVIDIA DGX A100/A800 with Red Hat Enterprise Linux 8.6 and Rocky Linux 8

    • NVIDIA DGX-2 with Red Hat Enterprise Linux 8.6 and Rocky Linux 8

    • NVIDIA DGX-1 (V100) with Red Hat Enterprise Linux 8.6 and Rocky Linux 8

    • NVIDIA DGX Station with Red Hat Enterprise Linux 8.6 and Rocky Linux 8

    • NVIDIA DGX Station A100 with Red Hat Enterprise Linux 8.6 and Rocky Linux 8

    • NVIDIA DGX Station A800 with Red Hat Enterprise Linux 8.6 and Rocky Linux 8

  • 22.08 Deep Learning Framework containers

  • NVIDIA GPUDirect Storage v1.0. Refer to the GDS documentation for additional information.

  • MLNX OFED version 5.4-3.5.8.0

  • ConnectX Firmware: refer to the Contents of the Repositories table above

Update Instructions

See the section Installing and Updating the Software for instructions.