DGX OS 5 / Ubuntu 20.04
- Introduction
- Release Guidance
- Release Notes
- Current Software Versions
- DGX OS Releases
- Update History
- DGX OS 5.5 Release: April 2023
- Update: November 22, 2022
- Update: October 14, 2022
- DGX OS 5.4 Release: August 8, 2022
- Update: June 7, 2022
- Update: May 17, 2022
- DGX OS 5.3 Release: April 28, 2022
- DGX OS 5.2 Release: February 17, 2022
- Update: December 14, 2021
- Update: October 26, 2021
- DGX OS 5.1 Release: August 26, 2021
- Update: June 30, 2021
- Update: June 20, 2021
- Update: June 2, 2021
- Update: May 27, 2021
- Update: May 6, 2021
- Update: April 20, 2021
- Update: April 13, 2021
- Update: March 30, 2021
- Update: March 2, 2021
- Update: February 23, 2021
- Update: January 20, 2021
- Update: December 11, 2020
- Update: October 31, 2020 (DGX OS 5.0 Release)
- DGX OS ISO Releases
- Initial Setup
- Reimaging
- Installing on Ubuntu
- Prerequisites
- Installation Considerations
- Installing Ubuntu
- Installing the DGX Software Stack
- Installing DGX System Configurations and Tools
- Configuring Data Drives
- Installing the GPU Driver
- Installing the Mellanox OpenFabrics Enterprise Distribution (MLNX_OFED)
- Installing Docker and the NVIDIA Container Toolkit
- Installing the NVIDIA System Management (NVSM) Tool [Recommended]
- Additional Software Installed By DGX OS
- Next Steps and Additional Information
- Upgrading
- System Configurations
- Additional Software
- Upgrading the System
- Checking the Currently Installed Driver Branch
- Determining the New Available Driver Branches
- Upgrading Your GPU Branch
- CUDA Compatibility Matrix and Forward Compatibility
- Checking the Currently Installed CUDA Toolkit Release
- Installing or Upgrading the CUDA Toolkit
- Configuring IOMMU
- Installing GPUDirect Storage
- Installing nvidia_peermem
- Installing nvidia-gds
- Known Issues
- Known Issue Overview
- Known Issues Details
- DGX A800 Station/Server: DCGM Diagnostics may return Skip - All
- DGX A800 Station/Server: mig-parted config
- Regression of CUDA application startup performance
- NVSM Stress Test Logs Do Not Contain Summary Information
- nvidia-release-upgrade May Report That Not All Updates Have Been Installed and Exit
- Duplicate EFI Variable May Cause efibootmgr to Fail
- Erroneous Insufficient Power Error May Occur for PCIe Slots
- AMD Crypto Coprocessor is not Supported
- nvsm show alerts Reports NVSwitch PCIe Link Width is Degraded
- nvsm show health Reports Firmware as Not Authenticated
- Running NGC Containers Older than 20.10 May Produce “Incompatible MOFED Driver” Message
- System May Slow Down When Using mpirun
- Forced Reboot Hangs the OS
- Applications that call the cuCtxCreate API Might Experience a Performance Drop
- NVIDIA Desktop Shortcuts Not Updated After a DGX OS Release Upgrade
- Unable to Set a Separate/xinerama Mode through the xorg.conf File or through nvidia-settings
- Known Limitations Details
- No RAID Partition Created After ISO Install
- System Services Startup Messages Appear Upon Completion of First-Boot Setup
- [DGX A100]: Hot-plugging of Storage Drives not Supported
- [DGX A100]: Syslog Contains Numerous “SM LID is 0, maybe no SM is running” Error Messages
- [DGX-2]: Serial Over LAN Does not Work After Cold Resetting the BMC
- [DGX-2]: Some BMC Dashboard Quick Links Appear Erroneously
- [DGX-2]: Applications Cannot be Run Immediately Upon Powering on the DGX-2
- [DGX-1]: Script Cannot Recreate RAID Array After Re-inserting a Known Good SSD
- [DGX Station A100] Suspend and Power Button Section Appears in Power Settings
- [DGX-2] NVSM Does not Detect Downgraded GPU PCIe Link
- Resolved Issues Details
Appendices
- DGX OS Connectivity Requirements
- Downgrading ConnectX Firmware
- DGX Software Stack
- PXE Boot Setup
- Overview of the PXE Server
- Setting Up dnsmasq for DHCP
- Configure the FTP Side
- Configuring the HTTP File Directory and ISO Image
- Configure your DHCP Server
- Configuring the syslinux Bootloader
- Live Boot Parameters
- Setup GRUB PXE Server
- Optional: Configure the NVIDIA ConnectX cards to PXE boot
- Query the UEFI PXE ROM State
- Enable the UEFI PXE ROM
- Curtin Customizations
- Customizing the Curtin YAML
- Boot the DGX System over PXE
- Reference: Other IPMI Boot Options
- Air-Gapped Installations
- Cloud-init Configuration File