System Configurations

This section provides information about less common configuration options once a system has been installed.

Refer also to DGX OS Connectivity Requirements for a list of network ports used by various services.

Network Configuration

This section provides information about how you can configure the network in your DGX system.

Configuring Network Proxies

If your network needs to use a proxy server, you need to set up configuration files to ensure the DGX system communicates through the proxy.

For the OS and Most Applications

Here is some information about configuring the network for the OS and other applications.

Edit the /etc/environment file and add the following proxy addresses to the file, below the PATH line.

http_proxy="http://<username>:<password>@<host>:<port>/"
ftp_proxy="ftp://<username>:<password>@<host>:<port>/"
https_proxy="https://<username>:<password>@<host>:<port>/"
no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"
HTTP_PROXY="http://<username>:<password>@<host>:<port>/"
FTP_PROXY="ftp://<username>:<password>@<host>:<port>/"
HTTPS_PROXY="https://<username>:<password>@<host>:<port>/"
NO_PROXY="localhost,127.0.0.1,localaddress,.localdomain.com"

Where username and password are optional.

For example, for the HTTP proxy (both the uppercase and lowercase versions must be set):

http_proxy="http://myproxy.server.com:8080/"
HTTP_PROXY="http://myproxy.server.com:8080/"
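
To confirm that the new settings are active in a shell session, you can reload the file and list the proxy variables. This is a minimal sketch that assumes /etc/environment was edited as shown above:

# Reload /etc/environment in the current shell and list the proxy variables
set -a; source /etc/environment; set +a
env | grep -i proxy

New login sessions pick up the values from /etc/environment automatically.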

For the apt Package Manager

Here is some information about configuring the network for the apt package manager.

Edit or create the /etc/apt/apt.conf.d/myproxy proxy configuration file and include the following lines:

Acquire::http::proxy "http://<username>:<password>@<host>:<port>/";
Acquire::ftp::proxy "ftp://<username>:<password>@<host>:<port>/";
Acquire::https::proxy "https://<username>:<password>@<host>:<port>/";

For example:

Acquire::http::proxy "http://myproxy.server.com:8080/";
Acquire::ftp::proxy "ftp://myproxy.server.com:8080/";
Acquire::https::proxy "https://myproxy.server.com:8080/";
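
To confirm that apt uses the proxy, you can dump its effective configuration; a minimal sketch:

# Show the proxy settings that apt will use
apt-config dump | grep -i proxy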

Configuring ConnectX from InfiniBand to Ethernet

Many DGX Systems are equipped with NVIDIA ConnectX network controllers and are typically used for cluster communications. By default, the controllers are configured as InfiniBand ports. Optionally, you can configure the ports for Ethernet.

Before or after you reconfigure the port, make sure that the network switch that is connected to the port is also reconfigured to Ethernet or that the port is connected to a different switch that is configured for Ethernet.

The code samples in the following sections show the mlxconfig command. The mlxconfig command applies to hosts that use the MLNX_OFED drivers. If your host uses the Inbox OFED drivers, then substitute the mstconfig command.

You can determine if your host uses the MLNX_OFED drivers by running the sudo nvidia-manage-ofed.py -s command. If the output indicates package names beneath the Mellanox OFED Packages Installed: field, then the MLNX_OFED drivers are installed.

Determining the Current Port Configuration

Perform the following steps to determine the current configuration for the port.

  • Query the devices:

    sudo mlxconfig -e query | egrep -e Device\|LINK_TYPE
    

    The following example shows the output for one of the port devices on an NVIDIA DGX A100 system. The output shows the device path and the default, current, and next-boot configurations, which are all set to IB(1).

    Device #9:
    Device type:    ConnectX6
    Device:         0000:e1:00.0
    *        LINK_TYPE_P1                  IB(1)           IB(1)          IB(1)
    *        LINK_TYPE_P2                  IB(1)           IB(1)          IB(1)
    
    • IB(1) indicates the port is configured for InfiniBand.

    • ETH(2) indicates the port is configured for Ethernet.
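
    To limit the query to a single controller, you can pass the device path from the output to the -d option; a minimal sketch that uses the example PCI address:

    sudo mlxconfig -e -d e1:00.0 query | grep LINK_TYPE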

Determine the device path bus numbers for the slot number of the port that you want to configure. Refer to the user guide for your DGX system for more information.

Configuring the Port

  1. Use the mlxconfig command with the set LINK_TYPE_P<x> argument for each port you want to configure.

    The following sample command sets port 1 of the controller with PCI ID e1:00.0 to Ethernet (2):

    sudo mlxconfig -y -d e1:00.0 set LINK_TYPE_P1=2
    

    The following example output is from an NVIDIA DGX A100 System.

    Device #1:
    ----------
    
    Device type:    ConnectX6
    Name:           MCX653106A-HDA_Ax
    Description:    ConnectX-6 VPI adapter card; HDR IB (200Gb/s) and 200GbE; dual-port QSFP56; PCIe4.0 x16; tall bracket; ROHS R6
    Device:         e1:00.0
    
    Configurations:                                      Next Boot       New
             LINK_TYPE_P1                                IB(1)           ETH(2)
    
     Apply new Configuration? (y/n) [n] : y
    Applying... Done!
    -I- Please reboot machine to load new configurations.
    

    Here is an example that sets port 2 to Ethernet:

    sudo mlxconfig -y -d e1:00.0 set LINK_TYPE_P2=2
    
  2. (Optional) Run mlxconfig again to confirm the change:

    sudo mlxconfig -e query | egrep -e Device\|LINK_TYPE
    

    In the following output, LINK_TYPE_P2 is set to ETH(2) for the next boot. The output shows the device path and the default, current, and next-boot configurations.

    ...
    Device #9:
    Device type:    ConnectX6
    Device:         0000:e1:00.0
    *        LINK_TYPE_P1                  IB(1)           IB(1)          IB(1)
    *        LINK_TYPE_P2                  IB(1)           IB(1)          ETH(2)
    
  3. Perform an AC power cycle on the system for the change to take effect.

    Wait for the operating system to boot.
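
    After the system boots, you can optionally confirm that the change took effect; a minimal sketch that reuses the example PCI address from the previous steps:

    sudo mlxconfig -e -d e1:00.0 query | grep LINK_TYPE

    The current and next-boot values for the port should now show ETH(2).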

Docker Configuration

Docker uses environment variables to access the NGC container registry through a proxy, so the proxy settings must also be made available to the Docker daemon.

For best practice recommendations on configuring proxy environment variables for Docker, refer to Control Docker with systemd.
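
The following sketch shows the general pattern that the Docker documentation describes: a systemd drop-in file that passes the proxy environment variables to the Docker daemon. The file name and the proxy address are examples; adjust them for your site.

sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf <<'EOF'
[Service]
Environment="HTTP_PROXY=http://myproxy.server.com:8080/"
Environment="HTTPS_PROXY=http://myproxy.server.com:8080/"
Environment="NO_PROXY=localhost,127.0.0.1"
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker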

Preparing the DGX System to be Used With Docker

Some initial setup of the DGX system is required to ensure that users have the required privileges to run Docker containers and to prevent IP address conflicts between Docker and the DGX system.

Enabling Users To Run Docker Containers

To prevent the docker daemon from running without protection against escalation of privileges, the Docker software requires sudo privileges to run containers. Meeting this requirement involves enabling users who will run Docker containers to run commands with sudo privileges.

You should ensure that only users whom you trust and who are aware of the potential risks to the DGX system of running commands with sudo privileges can run Docker containers.

Before you allow multiple users to run commands with sudo privileges, consult your IT department to determine whether you might be violating your organization’s security policies. For the security implications of enabling users to run Docker containers, see Docker daemon attack surface.

You can enable users to run the Docker containers in one of the following ways:

  • Add each user as an administrator user with sudo privileges.

  • Add each user as a standard user without sudo privileges and then add the user to the docker group.

This approach is inherently insecure because any user who can send commands to the docker engine can escalate privilege and run root-user operations.

To add an existing user to the docker group, run this command:

sudo usermod -aG docker user-login-id

Where user-login-id is the login ID of the existing user that you are adding to the docker group.
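
For example, to add a hypothetical user jdoe to the docker group and confirm the membership:

sudo usermod -aG docker jdoe
# The user must log out and log back in before the new group membership takes effect
groups jdoe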

Configuring Docker IP Addresses

To ensure that your DGX system can access the network interfaces for Docker containers, Docker should be configured to use a subnet distinct from other network resources used by the DGX system.

By default, Docker uses the 172.17.0.0/16 subnet. Consult your network administrator to find out which IP addresses are used by your network. If your network does not conflict with the default Docker IP address range, no changes are needed and you can skip this section. However, if your network uses the addresses in this range for the DGX system, you should change the default Docker network addresses.

You can change the default Docker network addresses by modifying the /etc/docker/daemon.json file or modifying the /etc/systemd/system/docker.service.d/docker-override.conf file. These instructions provide an example of modifying the /etc/systemd/system/docker.service.d/docker-override.conf to override the default Docker network addresses.

  1. Open the docker-override.conf file for editing.

    sudo vi /etc/systemd/system/docker.service.d/docker-override.conf
    [Service]
    ExecStart=
    ExecStart=/usr/bin/dockerd -H fd:// -s overlay2
    LimitMEMLOCK=infinity
    LimitSTACK=67108864
    
  2. Make the following changes, adding the --bip and --fixed-cidr options and setting the correct bridge IP address and IP address ranges for your network.

    Consult your IT administrator for the correct addresses.

    [Service]
    ExecStart=
    ExecStart=/usr/bin/dockerd -H fd:// -s overlay2 --bip=192.168.127.1/24 --fixed-cidr=192.168.127.128/25
    LimitMEMLOCK=infinity
    LimitSTACK=67108864
    
  3. Save and close the /etc/systemd/system/docker.service.d/docker-override.conf file.

  4. Reload the systemctl daemon.

    sudo systemctl daemon-reload
    
  5. Restart Docker.

    sudo systemctl restart docker
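
    After Docker restarts, you can optionally confirm that the bridge uses the new address range; a minimal sketch that assumes the example addresses shown above:

    # The docker0 bridge should report the new --bip address
    ip addr show docker0
    # The default bridge network should report the new subnet
    docker network inspect bridge | grep -i subnet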
    

Connectivity Requirements for NGC Containers

To run NVIDIA NGC containers from the NGC container registry, your network must be able to access the NGC container registry at nvcr.io.

To verify the connection to nvcr.io, run:

wget https://nvcr.io/v2

You should see output that confirms the connection, followed by a 401 Unauthorized error:

--2018-08-01 19:42:58-- https://nvcr.io/v2
Resolving nvcr.io (nvcr.io)... 52.8.131.152, 52.9.8.8
Connecting to nvcr.io (nvcr.io)|52.8.131.152|:443... connected.
HTTP request sent, awaiting response... 401 Unauthorized

Configuring Static IP Addresses for the Network Ports

Here are the steps to configure static IP addresses for network ports.

During the initial boot set up process for your DGX system, one of the steps was to configure static IP addresses for a network interface. If you did not configure the addresses at that time, you can configure the static IP addresses from the Ubuntu command line using the following instructions.

Note

If you are connecting to the DGX console remotely, connect by using the BMC remote console. If you connect using SSH, your connection will be lost when you complete the final step. Also, if you encounter issues with the configuration file, the BMC connection will help with troubleshooting.

If you cannot remotely access the DGX system, connect a display with a 1440x900 or lower resolution, and a keyboard directly to the DGX system.

  1. Determine the port designation that you want to configure, based on the physical Ethernet port that you have connected to your network.


    Refer to Configuring Network Proxies for the port designation of the connection that you want to configure.

  2. Edit the network configuration YAML file, /etc/netplan/01-netcfg.yaml, and make the following edits.

    Note

    Ensure that your file matches the following sample and uses spaces, not tabs.

    # This file describes the network interfaces available on your system
    # For more information, see netplan(5).
    
    network:
      version: 2
      renderer: networkd
      ethernets:
        enp226s0:
          dhcp4: no
          addresses:
            - 10.10.10.2/24
          routes:
            - to: default
              via: 10.10.10.1
          nameservers:
            addresses: [ 8.8.8.8 ]
    

    Consult your network administrator for your site-specific values such as network, gateway, and nameserver addresses. Replace enp226s0 with the designations that you determined in the preceding step.

  3. Save the file.

  4. Apply the changes.

    sudo netplan apply
    

Note

If errors are reported or you are not returned to the command-line prompt, see Changes, errors, and bugs in the Ubuntu Server Guide for more information.
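
After the changes are applied, you can confirm the new address and default route from the command line; a minimal sketch that uses the example interface name from the sample file:

ip addr show enp226s0
ip route show default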

Managing CPU Mitigations

DGX OS software includes security updates to mitigate CPU speculative side-channel vulnerabilities. These mitigations can decrease the performance of deep learning and machine learning workloads.

If your DGX system installation incorporates other measures to mitigate these vulnerabilities, such as measures at the cluster level, you can disable the CPU mitigations for individual DGX nodes and increase performance.

Determining the CPU Mitigation State of the DGX System

Here is information about how you can determine the CPU mitigation state of your DGX system.

If you do not know whether CPU mitigations are enabled or disabled, run the following command:

cat /sys/devices/system/cpu/vulnerabilities/*

CPU mitigations are enabled when the output consists of multiple lines prefixed with Mitigation:.

For example:

KVM: Mitigation: Split huge pages
Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
Mitigation: Clear CPU buffers; SMT vulnerable
Mitigation: PTI
Mitigation: Speculative Store Bypass disabled via prctl and seccomp
Mitigation: usercopy/swapgs barriers and __user pointer sanitization
Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling
Mitigation: Clear CPU buffers; SMT vulnerable

CPU mitigations are disabled if the output consists of multiple lines prefixed with Vulnerable. For example:

KVM: Vulnerable
Mitigation: PTE Inversion; VMX: vulnerable
Vulnerable; SMT vulnerable
Vulnerable
Vulnerable
Vulnerable: user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerable, IBPB: disabled, STIBP: disabled
Vulnerable
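
Because the command prints only the status lines, it can help to show each vulnerability file name next to its status; a minimal sketch:

# Print each vulnerability name alongside its mitigation status
grep -H . /sys/devices/system/cpu/vulnerabilities/*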

Disabling CPU Mitigations

Here are the steps to disable CPU mitigations.

Caution: Performing the following instructions will disable the CPU mitigations provided by the DGX OS software.

  1. Install the nv-mitigations-off package.

    sudo apt install nv-mitigations-off -y
    
  2. Reboot the system.

  3. Verify that the CPU mitigations are disabled.

    cat /sys/devices/system/cpu/vulnerabilities/*
    

The output should include several Vulnerable lines. See Determining the CPU Mitigation State of the DGX System for example output.

Re-enabling CPU Mitigations

Here are the steps to enable CPU mitigations again.

  1. Remove the nv-mitigations-off package.

    sudo apt purge nv-mitigations-off
    
  2. Reboot the system.

  3. Verify that the CPU mitigations are enabled.

    cat /sys/devices/system/cpu/vulnerabilities/*
    

The output should include several Mitigation: lines. See Determining the CPU Mitigation State of the DGX System for example output.

Managing the DGX Crash Dump Feature

This section provides information about managing the DGX Crash Dump feature. You can use the script that is included in the DGX OS to manage this feature.

Using the Script

Here are commands that help you complete the necessary tasks with the script.

  • To enable only dmesg crash dumps, run:

    /usr/sbin/nvidia-kdump-config enable-dmesg-dump
    

    This option reserves memory for the crash kernel.

  • To enable both dmesg and vmcore crash dumps, run:

    /usr/sbin/nvidia-kdump-config enable-vmcore-dump
    

    This option reserves memory for the crash kernel.

  • To disable crash dumps, run:

    /usr/sbin/nvidia-kdump-config disable
    

This option disables the use of kdump and ensures that no memory is reserved for the crash kernel.
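
To check the current crash dump configuration, you can query the standard Ubuntu kdump tooling; a minimal sketch that assumes the kdump-tools package is installed:

# Show whether a crash kernel is loaded and where dumps are written
sudo kdump-config show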

Connecting to Serial Over LAN

You can connect to serial over a LAN.

Warning

This applies only to systems that have the BMC.

While dumping vmcore, the BMC screen console goes blank approximately 11 minutes after the crash dump is started. To view the console output during the crash dump, connect to serial over LAN as follows:

ipmitool -I lanplus -H <BMC_IP_address> -U <username> -P <password> sol activate
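
To close the session from another shell, the same placeholders apply:

ipmitool -I lanplus -H <BMC_IP_address> -U <username> -P <password> sol deactivate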

Filesystem Quotas

Here is some information about filesystem quotas.

When running NGC containers, you might need to limit the amount of disk space that is used on a filesystem to avoid filling up the partition. Refer to How to Set Filesystem Quotas on Ubuntu 18.04 for information about setting filesystem quotas on Ubuntu 18.04 and later.
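
As a brief sketch of the workflow that the article describes (the quota package, mount options, and the /raid mount point are assumptions; adjust them for your system):

sudo apt install quota                  # quota management tools
# Add usrquota (and optionally grpquota) to the mount options in /etc/fstab, then:
sudo mount -o remount /raid
sudo quotacheck -ucgm /raid             # create and initialize the quota files
sudo quotaon -v /raid
sudo edquota -u <username>              # set per-user block and inode limits
sudo repquota -s /raid                  # report current usage and limits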

Running Workloads on Systems with Mixed Types of GPUs

The DGX Station A100 comes equipped with four high performance NVIDIA A100 GPUs and one DGX Display GPU. The NVIDIA A100 GPU is used to run high performance and AI workloads, and the DGX Display card is used to drive a high-quality display on a monitor.

When running applications on this system, it is important to identify the best method to launch applications and workloads so that the high-performance NVIDIA A100 GPUs are used. You can achieve this with Docker containers, on bare metal, or by using Multi-Instance GPU (MIG), as described in the following sections.

When you log into the system and check which GPUs are available, you find the following:

nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)
GPU 2: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 3: DGX Display (UUID: GPU-91b9d8c8-e2b9-6264-99e0-b47351964c52)
GPU 4: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)

A total of five GPUs are listed by nvidia-smi. This is because nvidia-smi is including the DGX Display GPU that is used to drive the monitor and high-quality graphics output.

When running an application or workload, the DGX Display GPU can get in the way because it does not have direct NVLink connectivity, sufficient memory, or the performance characteristics of the NVIDIA A100 GPUs that are installed on the system. As a result, you should ensure that the correct GPUs are being used.

Running with Docker Containers

On DGX OS, this is the simplest method because Docker has already been configured to identify the high-performance NVIDIA A100 GPUs and assign them to the container.

A simple test is to run a small container with the --gpus all flag and then run nvidia-smi inside the container. The output shows that only the high-performance GPUs are available to the container:

docker run --gpus all --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)
GPU 2: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 3: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)

This step will also work when the --gpus n flag is used, where n can be 1, 2, 3, or 4. These values represent the number of GPUs that should be assigned to that container. For example:

docker run --gpus 2 --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)

In this example, Docker selected the first two GPUs to run the container, but if the device option is used, you can specify which GPUs to use:

docker run --gpus '"device=GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5,GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf"' --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 1: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)

In this example, the two GPUs that were not used earlier are now assigned to run on the container.

Running on Bare Metal

To run applications by using the four high performance GPUs, the CUDA_VISIBLE_DEVICES variable must be specified before you run the application.

Note

This method does not use containers.

CUDA orders the GPUs by performance, so GPU 0 will be the highest performing GPU, and the last GPU will be the slowest GPU.

Warning

If the CUDA_DEVICE_ORDER variable is set to PCI_BUS_ID, this ordering is overridden.
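
For example, to force PCI bus ordering for a run (the application name is a placeholder):

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0,1,2,3 ./my_application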

In the following example, a CUDA application that comes with CUDA samples is run. In the output, GPU 0 is the fastest in a DGX Station A100, and GPU 4 (DGX Display GPU) is the slowest:

sudo apt install cuda-samples-11-2
cd /usr/local/cuda-11.2/samples/1_Utilities/p2pBandwidthLatencyTest
sudo make
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../common/inc  -m64    --threads
0 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37
-gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52
-gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61
-gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75
-gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86
-gencode arch=compute_86,code=compute_86 -o p2pBandwidthLatencyTest.o -c p2pBandwidthLatencyTest.cu
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda/bin/nvcc -ccbin g++   -m64
-gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37
-gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52
-gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61
-gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75
-gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86
-gencode arch=compute_86,code=compute_86 -o p2pBandwidthLatencyTest p2pBandwidthLatencyTest.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
mkdir -p ../../bin/x86_64/linux/release
cp p2pBandwidthLatencyTest ../../bin/x86_64/linux/release
lab@ro-dvt-058-80gb:/usr/local/cuda-11.2/samples/1_Utilities/p2pBandwidthLatencyTest cd /usr/local/cuda-11.2/samples/bin/x86_64/linux/release
lab@ro-dvt-058-80gb:/usr/local/cuda-11.2/samples/bin/x86_64/linux/release ./p2pBandwidthLatencyTest
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, Graphics Device, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Device: 1, Graphics Device, pciBusID: 47, pciDeviceID: 0, pciDomainID:0
Device: 2, Graphics Device, pciBusID: 81, pciDeviceID: 0, pciDomainID:0
Device: 3, Graphics Device, pciBusID: c2, pciDeviceID: 0, pciDomainID:0
Device: 4, DGX Display, pciBusID: c1, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=0 CAN Access Peer Device=2
Device=0 CAN Access Peer Device=3
Device=0 CANNOT Access Peer Device=4
Device=1 CAN Access Peer Device=0
Device=1 CAN Access Peer Device=2
Device=1 CAN Access Peer Device=3
Device=1 CANNOT Access Peer Device=4
Device=2 CAN Access Peer Device=0
Device=2 CAN Access Peer Device=1
Device=2 CAN Access Peer Device=3
Device=2 CANNOT Access Peer Device=4
Device=3 CAN Access Peer Device=0
Device=3 CAN Access Peer Device=1
Device=3 CAN Access Peer Device=2
Device=3 CANNOT Access Peer Device=4
Device=4 CANNOT Access Peer Device=0
Device=4 CANNOT Access Peer Device=1
Device=4 CANNOT Access Peer Device=2
Device=4 CANNOT Access Peer Device=3

Note

In case a device doesn’t have P2P access to other one, it falls back to normal memcopy procedure. So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
     D\D     0     1     2     3     4
     0       1     1     1     1     0
     1       1     1     1     1     0
     2       1     1     1     1     0
     3       1     1     1     1     0
     4       0     0     0     0     1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1      2      3      4
     0 1323.03  15.71  15.37  16.81  12.04
     1  16.38 1355.16  15.47  15.81  11.93
     2  16.25  15.85 1350.48  15.87  12.06
     3  16.14  15.71  16.80 1568.78  11.75
     4  12.61  12.47  12.68  12.55 140.26
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D     0      1      2      3      4
     0 1570.35  93.30  93.59  93.48  12.07
     1  93.26 1583.08  93.55  93.53  11.93
     2  93.44  93.58 1584.69  93.34  12.05
     3  93.51  93.55  93.39 1586.29  11.79
     4  12.68  12.54  12.75  12.51 140.26
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1      2      3      4
     0 1588.71  19.60  19.26  19.73  16.53
     1  19.59 1582.28  19.85  19.13  16.43
     2  19.53  19.39 1583.88  19.61  16.58
     3  19.51  19.11  19.58 1592.76  15.90
     4  16.36  16.31  16.39  15.80 139.42
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D     0      1      2      3      4
     0 1590.33 184.91 185.37 185.45  16.46
     1 185.04 1587.10 185.19 185.21  16.37
     2 185.15 185.54 1516.25 184.71  16.47
     3 185.55 185.32 184.86 1589.52  15.71
     4  16.26  16.28  16.16  15.69 139.43
P2P=Disabled Latency Matrix (us)
   GPU     0      1      2      3      4
     0   3.53  21.60  22.22  21.38  12.46
     1  21.61   2.62  21.55  21.65  12.34
     2  21.57  21.54   2.61  21.55  12.40
     3  21.57  21.54  21.58   2.51  13.00
     4  13.93  12.41  21.42  21.58   1.14

   CPU     0      1      2      3      4
     0   4.26  11.81  13.11  12.00  11.80
     1  11.98   4.11  11.85  12.19  11.89
     2  12.07  11.72   4.19  11.82  12.49
     3  12.14  11.51  11.85   4.13  12.04
     4  12.21  11.83  12.11  11.78   4.02
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU     0      1      2      3      4
     0   3.79   3.34   3.34   3.37  13.85
     1   2.53   2.62   2.54   2.52  12.36
     2   2.55   2.55   2.61   2.56  12.34
     3   2.58   2.51   2.51   2.53  14.39
     4  19.77  12.32  14.75  21.60   1.13

   CPU     0      1      2      3      4
     0   4.27   3.63   3.65   3.59  13.15
     1   3.62   4.22   3.61   3.62  11.96
     2   3.81   3.71   4.35   3.73  12.15
     3   3.64   3.61   3.61   4.22  12.06
     4  12.32  11.92  13.30  12.03   4.05

Note

The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

The example above shows the peer-to-peer bandwidth and latency test across all five GPUs, including the DGX Display GPU. The application also shows that there is no peer-to-peer connectivity between any GPU and GPU 4. This indicates that GPU 4 should not be used for high-performance workloads.

Run the example one more time by using the CUDA_VISIBLE_DEVICES variable, which limits the number of GPUs that the application can see.

Note

All GPUs can communicate with all other peer devices.

 CUDA_VISIBLE_DEVICES=0,1,2,3 ./p2pBandwidthLatencyTest
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, Graphics Device, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Device: 1, Graphics Device, pciBusID: 47, pciDeviceID: 0, pciDomainID:0
Device: 2, Graphics Device, pciBusID: 81, pciDeviceID: 0, pciDomainID:0
Device: 3, Graphics Device, pciBusID: c2, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=0 CAN Access Peer Device=2
Device=0 CAN Access Peer Device=3
Device=1 CAN Access Peer Device=0
Device=1 CAN Access Peer Device=2
Device=1 CAN Access Peer Device=3
Device=2 CAN Access Peer Device=0
Device=2 CAN Access Peer Device=1
Device=2 CAN Access Peer Device=3
Device=3 CAN Access Peer Device=0
Device=3 CAN Access Peer Device=1
Device=3 CAN Access Peer Device=2

Note

In case a device doesn’t have P2P access to other one, it falls back to normal memcopy procedure. So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
     D\D     0     1     2     3
     0       1     1     1     1
     1       1     1     1     1
     2       1     1     1     1
     3       1     1     1     1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1      2      3
     0 1324.15  15.54  15.62  15.47
     1  16.55 1353.99  15.52  16.23
     2  15.87  17.26 1408.93  15.91
     3  16.33  17.31  18.22 1564.06
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D     0      1      2      3
     0 1498.08  93.30  93.53  93.48
     1  93.32 1583.08  93.54  93.52
     2  93.55  93.60 1583.08  93.36
     3  93.49  93.55  93.28 1576.69
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1      2      3
     0 1583.08  19.92  20.47  19.97
     1  20.74 1586.29  20.06  20.22
     2  20.08  20.59 1590.33  20.01
     3  20.44  19.92  20.60 1589.52
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D     0      1      2      3
     0 1592.76 184.88 185.21 185.30
     1 184.99 1589.52 185.19 185.32
     2 185.28 185.30 1585.49 185.01
     3 185.45 185.39 184.84 1587.91
P2P=Disabled Latency Matrix (us)
   GPU     0      1      2      3
     0   2.38  21.56  21.61  21.56
     1  21.70   2.34  21.54  21.56
     2  21.55  21.56   2.41  21.06
     3  21.57  21.34  21.56   2.39

   CPU     0      1      2      3
     0   4.22  11.99  12.71  12.09
     1  11.86   4.09  12.00  11.71
     2  12.52  11.98   4.27  12.24
     3  12.22  11.75  12.19   4.25
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU     0      1      2      3
     0   2.32   2.57   2.55   2.59
     1   2.55   2.32   2.59   2.52
     2   2.59   2.56   2.41   2.59
     3   2.57   2.55   2.56   2.40

   CPU     0      1      2      3
     0   4.24   3.57   3.72   3.81
     1   3.68   4.26   3.75   3.63
     2   3.79   3.75   4.34   3.71
     3   3.72   3.64   3.66   4.32

Note

The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

For bare metal applications, the UUID can also be specified in the CUDA_VISIBLE_DEVICES variable as shown below:

CUDA_VISIBLE_DEVICES=GPU-0f2dff15-7c85-4320-da52-d3d54755d182,GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5 ./p2pBandwidthLatencyTest

The GPU specification is longer because of the nature of UUIDs, but this is the most precise way to pin specific GPUs to the application.

Using Multi-Instance GPUs

Multi-Instance GPUs (MIG) is available on NVIDIA A100 GPUs. If MIG is enabled on the GPUs, and if the GPUs have already been partitioned, then applications can be limited to run on these devices.

This works for both Docker containers and for bare metal using the CUDA_VISIBLE_DEVICES variable, as shown in the examples below. For instructions on how to configure and use MIG, refer to the NVIDIA Multi-Instance GPU User Guide.

Identify the MIG instances that will be used. Here is the output from a system that has GPU 0 partitioned into seven MIG instances:

nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
  MIG 1g.10gb Device 0: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/7/0)
  MIG 1g.10gb Device 1: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/8/0)
  MIG 1g.10gb Device 2: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/9/0)
  MIG 1g.10gb Device 3: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/11/0)
  MIG 1g.10gb Device 4: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/12/0)
  MIG 1g.10gb Device 5: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/13/0)
  MIG 1g.10gb Device 6: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/14/0)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)
GPU 2: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 3: DGX Display (UUID: GPU-91b9d8c8-e2b9-6264-99e0-b47351964c52)
GPU 4: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)

In Docker, specify the MIG UUID from this output. In the following example, GPU 0, Device 0 is selected.

If you are running on DGX Station A100, restart the nv-docker-gpus and docker system services any time MIG instances are created, destroyed, or modified, by running the following commands:

sudo systemctl restart nv-docker-gpus; sudo systemctl restart docker

nv-docker-gpus has to be restarted on DGX Station A100 because this service is used to mask the available GPUs that can be used by Docker. When the GPU architecture changes, the service needs to be refreshed.

docker run --gpus '"device=MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/7/0"' --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
  MIG 1g.10gb Device 0: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/7/0)

On bare metal, specify the MIG instances:

Note

This application measures communication across GPUs, so the bandwidth and latency results are not meaningful with only one MIG instance.

The purpose of this example is to show how to run an application on specific GPUs or MIG instances, as the following steps illustrate.

  1. Go to the following directory:

    cd /usr/local/cuda-11.2/samples/bin/x86_64/linux/release
    
  2. Run the p2pBandwidthLatencyTest:

    CUDA_VISIBLE_DEVICES=MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/7/0 ./p2pBandwidthLatencyTest
    [P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
    Device: 0, Graphics Device MIG 1g.10gb, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
    

Note

In case a device doesn’t have P2P access to other one, it falls back to normal memcopy procedure. So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
     D\D     0
     0       1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0
     0 176.20
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D     0
     0 187.87
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0
     0 190.77
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D     0
     0 190.53
P2P=Disabled Latency Matrix (us)
   GPU     0
     0   3.57

   CPU     0
     0   4.07
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU     0
     0   3.55

   CPU     0
     0   4.07

Note

The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Updating the containerd Override File for MIG Configurations

When you add MIG instances, the containerd override file is not automatically updated, so the new MIG instances are not added to the allow list. When the DGX Station A100 starts, after the nv-docker-gpus service runs, a containerd override file is created in the /etc/systemd/system/containerd.service.d/ directory.

Note

This file blocks Docker from using the display GPU on the DGX Station A100.

Here is an example of an override file:

[Service]
DeviceAllow=/dev/nvidia1
DeviceAllow=/dev/nvidia2
DeviceAllow=/dev/nvidia3
DeviceAllow=/dev/nvidia4
DeviceAllow=/dev/nvidia-caps/nvidia-cap1
DeviceAllow=/dev/nvidia-caps/nvidia-cap2
DeviceAllow=/dev/nvidiactl
DeviceAllow=/dev/nvidia-modeset
DeviceAllow=/dev/nvidia-uvm
DeviceAllow=/dev/nvidia-uvm-tools

The service can only add devices of which it is aware. To ensure that your new MIG instances are added to the allow list, complete the following steps:

  1. To refresh the override file, run the following commands:

    sudo systemctl restart nv-docker-gpus
    
    sudo systemctl restart docker
    
  2. Verify that your new MIG instances are now allowed in the containers, for example by checking the DeviceAllow entries in the updated override file or by listing the MIG devices with nvidia-smi from inside a container. The following truncated nvidia-smi output shows the MIG device with no running processes:

    |       ID ID Dev |                 BAR1-Usage |  SM    Unc| CE ENC DEC OFA JPG    |
    |                  |                      |        ECC|                       |
    |==================+======================+===========+=======================|
    | 0      0   0   0 |      0MiB / 81252MiB |  98      0|  7   0   5   1   1    |
    |                  |      1MiB / 13107... |           |                       |
    +------------------+----------------------+-----------+-----------------------+
    | Processes:                                                                  |
    | GPU   GI     CI     PID Type    Process name    GPU Memory                    |
    |       ID     ID     Usage                                                   |
    |=============================================================================|
    | No running processes found                                                  |
    

Data Storage Configuration

By default, the DGX system includes several drives in a RAID 0 configuration. These drives are intended for application caching, so you must set up your own network storage (for example, NFS) for long-term data storage.

Using Data Storage for NFS Caching

This section provides information about how you can use data storage for NFS caching.

The DGX systems use cachefilesd to manage NFS caching.

  • Ensure that you have an NFS server with one or more exports with data that will be accessed by the DGX system.

  • Ensure that there is network access between the DGX system and the NFS server.

Using cachefilesd

Here are the steps that describe how you can mount the NFS on the DGX system, and how you can cache the NFS by using the DGX SSDs for improved performance.

  1. Configure an NFS mount for the DGX system.

    1. Edit the filesystem tables configuration.

      sudo vi /etc/fstab
      
    2. Add a new line for the NFS mount, using /mnt as the local mount point.

      <nfs_server>:<export_path> /mnt nfs rw,noatime,rsize=32768,wsize=32768,nolock,tcp,intr,fsc,nofail 0 0
      

      Here, /mnt is used as an example mount point.

      • Contact your Network Administrator for the correct values for <nfs_server> and <export_path>.

      • The nfs arguments presented here are a list of recommended values based on typical use cases. However, fsc must always be included because that argument specifies the use of FS-Cache.

    3. Save the changes.

  2. Verify that the NFS server is reachable.

    ping <nfs_server>
    

    Use the server IP address or the server name that was provided by your network administrator.

  3. Mount the NFS export.

    sudo mount /mnt
    

    /mnt is an example mount point.

  4. Verify that caching is enabled.

    cat /proc/fs/nfsfs/volumes
    
    In the output, find FSC=yes.

    The NFS will be automatically mounted and cached on the DGX system in subsequent reboot cycles.
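
    You can also confirm that the cachefilesd service is running and that FS-Cache is collecting statistics; a minimal sketch:

    systemctl status cachefilesd
    cat /proc/fs/fscache/stats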

Disabling cachefilesd

Here is some information about how to disable cachefilesd.

If you do not want to use cachefilesd, you can disable it by performing the following steps:

  1. Stop the cachefilesd service:

    sudo systemctl stop cachefilesd
    
  2. Disable the cachefilesd service permanently:

    sudo systemctl disable cachefilesd
    

Changing the RAID Configuration for Data Drives

Here is information that describes how to change the RAID configuration for your data drives.

Warning

You must have a minimum of two drives to complete these tasks.

From the factory, the RAID level of the DGX RAID array is RAID 0. This level provides the maximum storage capacity, but it does not provide redundancy. If one SSD in the array fails, the data that is stored on the array is lost. If you are willing to accept reduced capacity in return for a level of protection against drive failure, you can change the level of the RAID array to RAID 5.

Note

If you change the RAID level from RAID 0 to RAID 5, the total storage capacity of the RAID array is reduced.

Before you change the RAID level of the DGX RAID array, back up the data on the array that you want to preserve. When you change the RAID level of the DGX RAID array, the data that is stored on the array is erased.

You can use the configure_raid_array.py custom script, which is installed on the system, to change the level of the RAID array without unmounting the RAID volume.

  • To change the RAID level to RAID 5, run the following command:

    sudo configure_raid_array.py -m raid5
    

    After you change the RAID level to RAID 5, the RAID array is rebuilt. Although a RAID array that is being rebuilt is online and ready to be used, a check on the health of the DGX system reports the status of the RAID volume as unhealthy. The time required to rebuild the RAID array depends on the workload on the system. For example, on an idle system, the rebuild might be completed in 30 minutes.

  • To change the RAID level to RAID 0, run the following command:

    sudo configure_raid_array.py -m raid0
    

To confirm that the RAID level was changed, run the lsblk command. The entry in the TYPE column for each drive in the RAID array indicates the RAID level of the array.
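
The data drives are managed as a Linux software RAID (md) array, so you can also watch the array state and rebuild progress; a minimal sketch:

lsblk                     # the TYPE column shows raid0 or raid5 for the array
cat /proc/mdstat          # shows the array state and rebuild progress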

Running NGC Containers

This section provides information about how to run NGC containers with your DGX system.

Obtaining an NGC Account

Here is some information about how you can obtain an NGC account.

NVIDIA NGC provides simple access to GPU-optimized software for deep learning, machine learning, and high-performance computing (HPC). An NGC account grants you access to these tools and gives you the ability to set up a private registry to manage your customized software.

If you are the organization administrator for your DGX system purchase, work with NVIDIA Enterprise Support to set up an NGC enterprise account. Refer to the NGC Private Registry User Guide for more information about getting an NGC enterprise account.

Running NGC Containers with GPU Support

To obtain the best performance when running NGC containers on DGX systems, you can use one of the following methods to provide GPU support for Docker containers:

  • Native GPU support (included in Docker 19.03 and later)

  • NVIDIA Container Runtime for Docker

This is in the nvidia-docker2 package.

The recommended method for DGX OS 6 is native GPU support. To run GPU-enabled containers, use the --gpus option with docker run.

Here is an example that uses all GPUs:

docker run --gpus all …

Here is an example that uses 2 GPUs:

docker run --gpus 2 ...

Here is an example that uses specific GPUs:

docker run --gpus '"device=1,2"' ...
docker run --gpus '"device=UUID-ABCDEF-

Refer to Running Containers for more information about running NGC containers on MIG devices.