System Configurations
This section provides information about less common configuration options once a system has been installed.
Refer also to DGX OS Connectivity Requirements for a list of network ports used by various services.
Network Configuration
This section provides information about how you can configure the network in your DGX system.
Configuring Network Proxies
If your network needs to use a proxy server, you need to set up configuration files to ensure the DGX system communicates through the proxy.
For the OS and Most Applications
To configure proxies for the OS and most applications, edit the /etc/environment file and add the following proxy addresses below the PATH line:
http_proxy="http://<username>:<password>@<host>:<port>/"
ftp_proxy="ftp://<username>:<password>@<host>:<port>/"
https_proxy="https://<username>:<password>@<host>:<port>/"
no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"
HTTP_PROXY="http://<username>:<password>@<host>:<port>/"
FTP_PROXY="ftp://<username>:<password>@<host>:<port>/"
HTTPS_PROXY="https://<username>:<password>@<host>:<port>/"
NO_PROXY="localhost,127.0.0.1,localaddress,.localdomain.com"
Here, username and password are optional.
For example, for the HTTP proxy (both the uppercase and lowercase versions must be changed):
http_proxy="http://myproxy.server.com:8080/"
HTTP_PROXY="http://myproxy.server.com:8080/"
For the apt Package Manager
To configure proxies for the apt package manager, edit or create the /etc/apt/apt.conf.d/myproxy proxy configuration file and include the following lines:
Acquire::http::proxy "http://<username>:<password>@<host>:<port>/";
Acquire::ftp::proxy "ftp://<username>:<password>@<host>:<port>/";
Acquire::https::proxy "https://<username>:<password>@<host>:<port>/";
For example:
Acquire::http::proxy "http://myproxy.server.com:8080/";
Acquire::ftp::proxy "ftp://myproxy.server.com:8080>/";
Acquire::https::proxy "https://myproxy.server.com:8080/";
Configuring ConnectX from InfiniBand to Ethernet
Many DGX Systems are equipped with NVIDIA ConnectX network controllers and are typically used for cluster communications. By default, the controllers are configured as InfiniBand ports. Optionally, you can configure the ports for Ethernet.
Before or after you reconfigure the port, make sure that the network switch that is connected to the port is also reconfigured to Ethernet or that the port is connected to a different switch that is configured for Ethernet.
The code samples in the following sections show the mlxconfig command, which applies to hosts that use the MLNX_OFED drivers. If your host uses the Inbox OFED drivers, substitute the mstconfig command. You can determine whether your host uses the MLNX_OFED drivers by running the sudo nvidia-manage-ofed.py -s command. If the output lists package names beneath the Mellanox OFED Packages Installed: field, the MLNX_OFED drivers are installed.
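For reference, mstconfig accepts the same query and set syntax as mlxconfig, so the commands in the following sections translate directly; a sketch, assuming the PCI device ID used in the examples below:
sudo mstconfig -d e1:00.0 query
sudo mstconfig -d e1:00.0 set LINK_TYPE_P1=2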
Determining the Current Port Configuration
Perform the following steps to determine the current configuration for the port.
Query the devices:
sudo mlxconfig -e query | egrep -e Device\|LINK_TYPE
The following example shows the output for one of the port devices on an NVIDIA DGX A100 system. The output shows the device path and the default, current, and next boot configuration, which are all set to IB(1):
Device #9:
Device type: ConnectX6
Device: 0000:e1:00.0
* LINK_TYPE_P1 IB(1) IB(1) IB(1)
* LINK_TYPE_P2 IB(1) IB(1) IB(1)
IB(1) indicates the port is configured for InfiniBand.
ETH(2) indicates the port is configured for Ethernet.
Determine the device path bus numbers for the slot number of the port that you want to configure. Refer to the following documents for more information:
DGX H100/H200 Network Ports in the NVIDIA DGX H100/H200 System User Guide.
DGX A100 Network Ports in the NVIDIA DGX A100 System User Guide.
Configuring the Port
Use the mlxconfig command with the set LINK_TYPE_P<x> argument for each port that you want to configure. The following sample command sets port 1 of the controller with PCI ID e1:00.0 to Ethernet (2):
sudo mlxconfig -y -d e1:00.0 set LINK_TYPE_P1=2
The following example output is from an NVIDIA DGX A100 System.
Device #1:
----------
Device type: ConnectX6
Name: MCX653106A-HDA_Ax
Description: ConnectX-6 VPI adapter card; HDR IB (200Gb/s) and 200GbE; dual-port QSFP56; PCIe4.0 x16; tall bracket; ROHS R6
Device: e1:00.0
Configurations: Next Boot New
LINK_TYPE_P1 ETH(2) IB(1)
Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
Here is an example that sets port 2 to Ethernet:
sudo mlxconfig -y -d e1:00.0 set LINK_TYPE_P2=2
(Optional) Run mlxconfig again to confirm the change:
sudo mlxconfig -e query | egrep -e Device\|LINK_TYPE
In the following output, port LINK_TYPE_P2 is set to ETH(2) for the next boot. The output shows the device path and the default, current, and next boot configuration.
...
Device #9:
Device type: ConnectX6
Device: 0000:e1:00.0
* LINK_TYPE_P1 IB(1) IB(1) IB(1)
* LINK_TYPE_P2 IB(1) IB(1) ETH(2)
Perform an AC power cycle on the system for the change to take effect.
Wait for the operating system to boot.
Docker Configuration
To ensure that Docker can access the NGC container registry through a proxy, Docker uses environment variables.
For best practice recommendations on configuring proxy environment variables for Docker, refer to Control Docker with systemd.
Preparing the DGX System to be Used With Docker
Some initial setup of the DGX system is required to ensure that users have the required privileges to run Docker containers and to prevent IP address conflicts between Docker and the DGX system.
Enabling Users To Run Docker Containers
To prevent the docker
daemon from running without protection against
escalation of privileges, the Docker software requires sudo
privileges
to run containers. Meeting this requirement involves enabling users who
will run Docker containers to run commands with sudo
privileges.
You should ensure that only users whom you trust and who are aware of
the potential risks to the DGX system of running commands with sudo
privileges can run Docker containers.
Before you allow multiple users to run commands with sudo
privileges,
consult your IT department to determine whether you might be violating
your organization’s security policies. For the security implications of
enabling users to run Docker containers, see Docker daemon attack
surface.
You can enable users to run the Docker containers in one of the following ways:
Add each user as an administrator user with sudo privileges.
Add each user as a standard user without sudo privileges and then add the user to the docker group. This approach is inherently insecure because any user who can send commands to the Docker engine can escalate privileges and run root-user operations.
To add an existing user to the docker group, run this command:
sudo usermod -aG docker user-login-id
Where user-login-id is the user login ID of the existing user that you are adding to the docker group.
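As a concrete sketch with a hypothetical user named jsmith, the following commands add the user to the group and verify the result (group membership is normally picked up at the next login; newgrp applies it to the current shell):
sudo usermod -aG docker jsmith
newgrp docker
# The user should now be able to reach the Docker daemon
docker run --rm hello-world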
Configuring Docker IP Addresses
To ensure that your DGX system can access the network interfaces for Docker containers, Docker should be configured to use a subnet distinct from other network resources used by the DGX system.
By default, Docker uses the 172.17.0.0/16
subnet. Consult your network
administrator to find out which IP addresses are used by your network. If your
network does not conflict with the default Docker IP address range, no changes
are needed and you can skip this section. However, if your network uses the
addresses in this range for the DGX system, you should change the default Docker
network addresses.
You can change the default Docker network addresses by modifying the /etc/docker/daemon.json file or the /etc/systemd/system/docker.service.d/docker-override.conf file. These instructions modify the docker-override.conf file to override the default Docker network addresses; a daemon.json equivalent is sketched below.
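For reference, a minimal daemon.json sketch with the equivalent settings, assuming the same example addresses that are used in the steps below:
{
  "bip": "192.168.127.1/24",
  "fixed-cidr": "192.168.127.128/25"
}
After editing /etc/docker/daemon.json, restart Docker with sudo systemctl restart docker for the change to take effect.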
Open the docker-override.conf file for editing:
sudo vi /etc/systemd/system/docker.service.d/docker-override.conf
The file contents are as follows:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -s overlay2
LimitMEMLOCK=infinity
LimitSTACK=67108864
Add the --bip and --fixed-cidr options to the dockerd command line, setting the correct bridge IP address and IP address ranges for your network. Consult your IT administrator for the correct addresses.
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -s overlay2 --bip=192.168.127.1/24 --fixed-cidr=192.168.127.128/25
LimitMEMLOCK=infinity
LimitSTACK=67108864
Save and close the /etc/systemd/system/docker.service.d/docker-override.conf file.
Reload the systemd configuration:
sudo systemctl daemon-reload
Restart Docker.
sudo systemctl restart docker
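To confirm that Docker picked up the new addresses, you can inspect the default bridge network; a quick check (the exact output format may vary by Docker version):
docker network inspect bridge | grep -i subnet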
Connectivity Requirements for NGC Containers
To run NVIDIA NGC containers from the NGC container registry, your network must be able to access the following URLs:
http://repo.download.nvidia.com/baseos/ubuntu/jammy/x86_64/ (You can access this URL only by using apt-get, not in a browser.)
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/
To verify the connection to nvcr.io, run:
wget https://nvcr.io/v2
You should see the connection succeed, followed by a 401 error:
--2018-08-01 19:42:58-- https://nvcr.io/v2
Resolving nvcr.io (nvcr.io)... 52.8.131.152, 52.9.8.8
Connecting to nvcr.io (nvcr.io)|52.8.131.152|:443... connected.
HTTP request sent, awaiting response... 401 Unauthorized
Configuring Static IP Addresses for the Network Ports
During the initial boot setup process for your DGX system, one of the steps was to configure static IP addresses for a network interface. If you did not configure the addresses at that time, you can configure static IP addresses from the Ubuntu command line by using the following instructions.
Note
If you are connecting to the DGX console remotely, connect by using the BMC remote console. If you connect using SSH, your connection will be lost when you complete the final step. Also, if you encounter issues with the configuration file, the BMC connection will help with troubleshooting.
If you cannot remotely access the DGX system, connect a display with a 1440x900 or lower resolution, and a keyboard directly to the DGX system.
Determine the port designation that you want to configure, based on the physical Ethernet port that you have connected to your network.
For the port designation of the connection that you want to configure, refer to the network ports documentation for your system, such as DGX H100/H200 Network Ports or DGX A100 Network Ports in the corresponding system user guide.
Edit the network configuration YAML file, /etc/netplan/01-netcfg.yaml, and make the following edits.
Note
Ensure that your file is identical to the following sample and uses spaces, not tabs.
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    enp226s0:
      dhcp4: no
      addresses:
        - 10.10.10.2/24
      routes:
        - to: default
          via: 10.10.10.1
      nameservers:
        addresses: [ 8.8.8.8 ]
Consult your network administrator for your site-specific values, such as the network, gateway, and nameserver addresses, and replace enp226s0 with the port designation that you determined in the preceding step.
Save the file.
Apply the changes.
sudo netplan apply
Note
If you are not returned to the command-line prompt after you apply the changes, see Changes, errors, and bugs in the Ubuntu Server Guide for more information.
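If you are configuring the system remotely and want a safety net, netplan can also apply the configuration provisionally and roll it back automatically unless you confirm it:
sudo netplan try
# Press ENTER within the timeout to keep the new configuration; otherwise it reverts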
Managing CPU Mitigations
DGX OS software includes security updates to mitigate CPU speculative side-channel vulnerabilities. These mitigations can decrease the performance of deep learning and machine learning workloads.
If your DGX system installation incorporates other measures to mitigate these vulnerabilities, such as measures at the cluster level, you can disable the CPU mitigations for individual DGX nodes and increase performance.
Determining the CPU Mitigation State of the DGX System
Here is information about how you can determine the CPU mitigation state of your DGX system.
If you do not know whether CPU mitigations are enabled or disabled, run the following command:
cat /sys/devices/system/cpu/vulnerabilities/*
CPU mitigations are enabled when the output consists of multiple
lines prefixed with Mitigation:
.
For example:
KVM: Mitigation: Split huge pages
Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
Mitigation: Clear CPU buffers; SMT vulnerable
Mitigation: PTI
Mitigation: Speculative Store Bypass disabled via prctl and seccomp
Mitigation: usercopy/swapgs barriers and __user pointer sanitization
Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling
Mitigation: Clear CPU buffers; SMT vulnerable
CPU mitigations are disabled if the output consists of multiple
lines prefixed with Vulnerable
.
KVM: Vulnerable
Mitigation: PTE Inversion; VMX: vulnerable
Vulnerable; SMT vulnerable
Vulnerable
Vulnerable
Vulnerable: user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerable, IBPB: disabled, STIBP: disabled
Vulnerable
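Because the cat command does not show which vulnerability file each line comes from, it can be easier to print the file names alongside their contents; a small convenience sketch:
grep . /sys/devices/system/cpu/vulnerabilities/*
# Prints each line as <vulnerability-file>:<status>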
Disabling CPU Mitigations
Here are the steps to disable CPU mitigations.
Caution: Performing the following instructions will disable the CPU mitigations provided by the DGX OS software.
Install the nv-mitigations-off package:
sudo apt install nv-mitigations-off -y
Reboot the system.
Verify that the CPU mitigations are disabled.
cat /sys/devices/system/cpu/vulnerabilities/*
The output should include several Vulnerable lines. See Determining the CPU Mitigation State of the DGX System for example output.
Re-enabling CPU Mitigations
Here are the steps to enable CPU mitigations again.
Remove the nv-mitigations-off package:
sudo apt purge nv-mitigations-off
Reboot the system.
Verify that the CPU mitigations are enabled.
cat /sys/devices/system/cpu/vulnerabilities/*
The output should include several Mitigation: lines. See Determining the CPU Mitigation State of the DGX System for example output.
Managing the DGX Crash Dump Feature
This section provides information about managing the DGX Crash Dump feature. You can use the script that is included in the DGX OS to manage this feature.
Using the Script
Here are commands that help you complete the necessary tasks with the script.
To enable only dmesg crash dumps, run:
/usr/sbin/nvidia-kdump-config enable-dmesg-dump
This option reserves memory for the crash kernel.
To enable both dmesg and vmcore crash dumps, run:
/usr/sbin/nvidia-kdump-config enable-vmcore-dump
This option reserves memory for the crash kernel.
To disable crash dumps, run:
/usr/sbin/nvidia-kdump-config disable
This option disables the use of kdump
and ensures that no memory is
reserved for the crash kernel.
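The nvidia-kdump-config script builds on the standard Ubuntu kdump tooling, so, assuming the kdump-tools package is present, you can inspect the resulting state with:
kdump-config show
# Reports whether kdump is ready and the memory reserved for the crash kernel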
Connecting to Serial Over LAN
You can connect to serial over a LAN.
Warning
This applies only to systems that have the BMC.
While dumping vmcore
, the BMC screen console goes blank
approximately 11 minutes after the crash dump is started. To view the
console output during the crash dump, connect to serial over LAN as
follows:
ipmitool -I lanplus -H <bmc-ip-address> -U <username> -P <password> sol activate
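When you finish, the session can be closed with the matching deactivate command, using the same BMC address and credentials:
ipmitool -I lanplus -H <bmc-ip-address> -U <username> -P <password> sol deactivate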
Filesystem Quotas
When running NGC containers, you might need to limit the amount of disk space that is used on a filesystem to avoid filling up the partition. Refer to How to Set Filesystem Quotas on Ubuntu 18.04 for instructions that also apply to later Ubuntu releases.
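As an illustrative sketch only: assuming quotas are enabled on the target filesystem (for example, mounted with the usrquota option) and using a hypothetical user jsmith and mount point /raid, a per-user limit can be set with the standard quota tools:
# Soft limit of about 10 GiB, hard limit of about 12 GiB (values are 1 KiB blocks)
sudo setquota -u jsmith 10485760 12582912 0 0 /raid
# Review the limits that are now in effect
sudo quota -u jsmith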
Running Workloads on Systems with Mixed Types of GPUs
The DGX Station A100 comes equipped with four high performance NVIDIA A100 GPUs and one DGX Display GPU. The NVIDIA A100 GPU is used to run high performance and AI workloads, and the DGX Display card is used to drive a high-quality display on a monitor.
When running applications on this system, it is important to identify the best method to launch applications and workloads so that the high-performance NVIDIA A100 GPUs are used. You can achieve this in one of the ways described in the following sections.
When you log into the system and check which GPUs are available, you find the following:
nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)
GPU 2: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 3: DGX Display (UUID: GPU-91b9d8c8-e2b9-6264-99e0-b47351964c52)
GPU 4: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)
A total of five GPUs are listed by nvidia-smi
. This is because
nvidia-smi
is including the DGX Display GPU that is used to drive
the monitor and high-quality graphics output.
When running an application or workload, the DGX Display GPU can get in the way because it does not have direct NVLink connectivity, sufficient memory, or the performance characteristics of the NVIDIA A100 GPUs that are installed on the system. As a result, you should ensure that the correct GPUs are being used.
Running with Docker Containers
On DGX OS, this method is the simplest because Docker has already been configured to identify the high-performance NVIDIA A100 GPUs and assign them to the container.
A simple test is to run a small container with the --gpus all flag and then run nvidia-smi inside the running container. The output shows that only the high-performance GPUs are available to the container:
docker run --gpus all --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)
GPU 2: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 3: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)
This step will also work when the --gpus n
flag is used, where n
can be 1, 2, 3, or 4
. These values represent the number of GPUs that
should be assigned to that container. For example:
docker run --gpus 2 --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)
In this example, Docker selected the first two GPUs to run the
container, but if the device
option is used, you can specify which
GPUs to use:
docker run --gpus '"device=GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5,GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf"' --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 1: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)
In this example, the two GPUs that were not used earlier are now assigned to run on the container.
Running on Bare Metal
To run applications by using the four high performance GPUs, the
CUDA_VISIBLE_DEVICES
variable must be specified before you run the
application.
Note
This method does not use containers.
CUDA orders the GPUs by performance, so GPU 0
will be the highest
performing GPU, and the last GPU will be the slowest GPU.
Warning
If the CUDA_DEVICE_ORDER variable is set to PCI_BUS_ID, this ordering is overridden.
In the following example, a CUDA application that comes with CUDA
samples is run. In the output, GPU 0
is the fastest in a DGX Station
A100, and GPU 4
(DGX Display GPU) is the slowest:
sudo apt install cuda-samples-11-2
cd /usr/local/cuda-11.2/samples/1_Utilities/p2pBandwidthLatencyTest
sudo make
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../common/inc -m64 --threads
0 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37
-gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52
-gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61
-gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75
-gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86
-gencode arch=compute_86,code=compute_86 -o p2pBandwidthLatencyTest.o -c p2pBandwidthLatencyTest.cu
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda/bin/nvcc -ccbin g++ -m64
-gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37
-gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52
-gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61
-gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75
-gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86
-gencode arch=compute_86,code=compute_86 -o p2pBandwidthLatencyTest p2pBandwidthLatencyTest.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
mkdir -p ../../bin/x86_64/linux/release
cp p2pBandwidthLatencyTest ../../bin/x86_64/linux/release
lab@ro-dvt-058-80gb:/usr/local/cuda-11.2/samples/1_Utilities/p2pBandwidthLatencyTest$ cd /usr/local/cuda-11.2/samples/bin/x86_64/linux/release
lab@ro-dvt-058-80gb:/usr/local/cuda-11.2/samples/bin/x86_64/linux/release$ ./p2pBandwidthLatencyTest
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, Graphics Device, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Device: 1, Graphics Device, pciBusID: 47, pciDeviceID: 0, pciDomainID:0
Device: 2, Graphics Device, pciBusID: 81, pciDeviceID: 0, pciDomainID:0
Device: 3, Graphics Device, pciBusID: c2, pciDeviceID: 0, pciDomainID:0
Device: 4, DGX Display, pciBusID: c1, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=0 CAN Access Peer Device=2
Device=0 CAN Access Peer Device=3
Device=0 CANNOT Access Peer Device=4
Device=1 CAN Access Peer Device=0
Device=1 CAN Access Peer Device=2
Device=1 CAN Access Peer Device=3
Device=1 CANNOT Access Peer Device=4
Device=2 CAN Access Peer Device=0
Device=2 CAN Access Peer Device=1
Device=2 CAN Access Peer Device=3
Device=2 CANNOT Access Peer Device=4
Device=3 CAN Access Peer Device=0
Device=3 CAN Access Peer Device=1
Device=3 CAN Access Peer Device=2
Device=3 CANNOT Access Peer Device=4
Device=4 CANNOT Access Peer Device=0
Device=4 CANNOT Access Peer Device=1
Device=4 CANNOT Access Peer Device=2
Device=4 CANNOT Access Peer Device=3
Note
In case a device doesn’t have P2P access to other one, it falls back to normal memcopy procedure. So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.
P2P Connectivity Matrix
D\D 0 1 2 3 4
0 1 1 1 1 0
1 1 1 1 1 0
2 1 1 1 1 0
3 1 1 1 1 0
4 0 0 0 0 1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1 2 3 4
0 1323.03 15.71 15.37 16.81 12.04
1 16.38 1355.16 15.47 15.81 11.93
2 16.25 15.85 1350.48 15.87 12.06
3 16.14 15.71 16.80 1568.78 11.75
4 12.61 12.47 12.68 12.55 140.26
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
D\D 0 1 2 3 4
0 1570.35 93.30 93.59 93.48 12.07
1 93.26 1583.08 93.55 93.53 11.93
2 93.44 93.58 1584.69 93.34 12.05
3 93.51 93.55 93.39 1586.29 11.79
4 12.68 12.54 12.75 12.51 140.26
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1 2 3 4
0 1588.71 19.60 19.26 19.73 16.53
1 19.59 1582.28 19.85 19.13 16.43
2 19.53 19.39 1583.88 19.61 16.58
3 19.51 19.11 19.58 1592.76 15.90
4 16.36 16.31 16.39 15.80 139.42
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
D\D 0 1 2 3 4
0 1590.33 184.91 185.37 185.45 16.46
1 185.04 1587.10 185.19 185.21 16.37
2 185.15 185.54 1516.25 184.71 16.47
3 185.55 185.32 184.86 1589.52 15.71
4 16.26 16.28 16.16 15.69 139.43
P2P=Disabled Latency Matrix (us)
GPU 0 1 2 3 4
0 3.53 21.60 22.22 21.38 12.46
1 21.61 2.62 21.55 21.65 12.34
2 21.57 21.54 2.61 21.55 12.40
3 21.57 21.54 21.58 2.51 13.00
4 13.93 12.41 21.42 21.58 1.14
CPU 0 1 2 3 4
0 4.26 11.81 13.11 12.00 11.80
1 11.98 4.11 11.85 12.19 11.89
2 12.07 11.72 4.19 11.82 12.49
3 12.14 11.51 11.85 4.13 12.04
4 12.21 11.83 12.11 11.78 4.02
P2P=Enabled Latency (P2P Writes) Matrix (us)
GPU 0 1 2 3 4
0 3.79 3.34 3.34 3.37 13.85
1 2.53 2.62 2.54 2.52 12.36
2 2.55 2.55 2.61 2.56 12.34
3 2.58 2.51 2.51 2.53 14.39
4 19.77 12.32 14.75 21.60 1.13
CPU 0 1 2 3 4
0 4.27 3.63 3.65 3.59 13.15
1 3.62 4.22 3.61 3.62 11.96
2 3.81 3.71 4.35 3.73 12.15
3 3.64 3.61 3.61 4.22 12.06
4 12.32 11.92 13.30 12.03 4.05
Note
The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
The example above shows the peer-to-peer bandwidth and latency test
across all five GPUs, including the DGX Display GPU. The application
also shows that there is no peer-to-peer connectivity between any GPU
and GPU 4
. This indicates that GPU 4
should not be used for
high-performance workloads.
Run the example one more time by using the CUDA_VISIBLE_DEVICES
variable, which limits the number of GPUs that the application can see.
Note
All GPUs can communicate with all other peer devices.
CUDA_VISIBLE_DEVICES=0,1,2,3 ./p2pBandwidthLatencyTest
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, Graphics Device, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Device: 1, Graphics Device, pciBusID: 47, pciDeviceID: 0, pciDomainID:0
Device: 2, Graphics Device, pciBusID: 81, pciDeviceID: 0, pciDomainID:0
Device: 3, Graphics Device, pciBusID: c2, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=0 CAN Access Peer Device=2
Device=0 CAN Access Peer Device=3
Device=1 CAN Access Peer Device=0
Device=1 CAN Access Peer Device=2
Device=1 CAN Access Peer Device=3
Device=2 CAN Access Peer Device=0
Device=2 CAN Access Peer Device=1
Device=2 CAN Access Peer Device=3
Device=3 CAN Access Peer Device=0
Device=3 CAN Access Peer Device=1
Device=3 CAN Access Peer Device=2
Note
In case a device doesn’t have P2P access to other one, it falls back to normal memcopy procedure. So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.
P2P Connectivity Matrix
D\D 0 1 2 3
0 1 1 1 1
1 1 1 1 1
2 1 1 1 1
3 1 1 1 1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1 2 3
0 1324.15 15.54 15.62 15.47
1 16.55 1353.99 15.52 16.23
2 15.87 17.26 1408.93 15.91
3 16.33 17.31 18.22 1564.06
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
D\D 0 1 2 3
0 1498.08 93.30 93.53 93.48
1 93.32 1583.08 93.54 93.52
2 93.55 93.60 1583.08 93.36
3 93.49 93.55 93.28 1576.69
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1 2 3
0 1583.08 19.92 20.47 19.97
1 20.74 1586.29 20.06 20.22
2 20.08 20.59 1590.33 20.01
3 20.44 19.92 20.60 1589.52
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
D\D 0 1 2 3
0 1592.76 184.88 185.21 185.30
1 184.99 1589.52 185.19 185.32
2 185.28 185.30 1585.49 185.01
3 185.45 185.39 184.84 1587.91
P2P=Disabled Latency Matrix (us)
GPU 0 1 2 3
0 2.38 21.56 21.61 21.56
1 21.70 2.34 21.54 21.56
2 21.55 21.56 2.41 21.06
3 21.57 21.34 21.56 2.39
CPU 0 1 2 3
0 4.22 11.99 12.71 12.09
1 11.86 4.09 12.00 11.71
2 12.52 11.98 4.27 12.24
3 12.22 11.75 12.19 4.25
P2P=Enabled Latency (P2P Writes) Matrix (us)
GPU 0 1 2 3
0 2.32 2.57 2.55 2.59
1 2.55 2.32 2.59 2.52
2 2.59 2.56 2.41 2.59
3 2.57 2.55 2.56 2.40
CPU 0 1 2 3
0 4.24 3.57 3.72 3.81
1 3.68 4.26 3.75 3.63
2 3.79 3.75 4.34 3.71
3 3.72 3.64 3.66 4.32
Note
The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
For bare metal applications, the UUID can also be specified in the CUDA_VISIBLE_DEVICES variable as shown below:
CUDA_VISIBLE_DEVICES=GPU-0f2dff15-7c85-4320-da52-d3d54755d182,GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5 ./p2pBandwidthLatencyTest
The GPU specification is longer because of the nature of UUIDs, but this is the most precise way to pin specific GPUs to the application.
Using Multi-Instance GPUs
Multi-Instance GPU (MIG) is available on NVIDIA A100 GPUs. If MIG is enabled on the GPUs, and if the GPUs have already been partitioned, then applications can be limited to run on these devices.
This works both for Docker containers and for bare metal using the CUDA_VISIBLE_DEVICES variable, as shown in the examples below. For instructions on how to configure and use MIG, refer to the NVIDIA Multi-Instance GPU User Guide.
Identify the MIG instances that will be used. Here is the output from a system that has GPU 0 partitioned into seven MIG instances:
nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
MIG 1g.10gb Device 0: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/7/0)
MIG 1g.10gb Device 1: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/8/0)
MIG 1g.10gb Device 2: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/9/0)
MIG 1g.10gb Device 3: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/11/0)
MIG 1g.10gb Device 4: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/12/0)
MIG 1g.10gb Device 5: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/13/0)
MIG 1g.10gb Device 6: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/14/0)
GPU 1: Graphics Device (UUID: GPU-0f2dff15-7c85-4320-da52-d3d54755d182)
GPU 2: Graphics Device (UUID: GPU-dc598de6-dd4d-2f43-549f-f7b4847865a5)
GPU 3: DGX Display (UUID: GPU-91b9d8c8-e2b9-6264-99e0-b47351964c52)
GPU 4: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf)
In Docker, enter the MIG UUID from this output, in which GPU 0
and
Device 0
have been selected.
If you are running on DGX Station A100, restart the nv-docker-gpus
and docker system services any time MIG instances are created, destroyed
or modified by running the following:
sudo systemctl restart nv-docker-gpus; sudo systemctl restart docker
The nv-docker-gpus service has to be restarted on DGX Station A100 because this service is used to mask the available GPUs that can be used by Docker. When the GPU configuration changes, the service needs to be refreshed.
docker run --gpus '"device=MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/7/0"' --rm -it ubuntu nvidia-smi -L
GPU 0: Graphics Device (UUID: GPU-269d95f8-328a-08a7-5985-ab09e6e2b751)
MIG 1g.10gb Device 0: (UUID: MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/7/0)
On bare metal, specify the MIG instances:
Note
This application measures communication across GPUs, so the bandwidth and latency results are not meaningful with only one MIG instance. The purpose of this example is to illustrate how to use specific GPUs with applications, as illustrated below.
Go to the following directory:
cd /usr/local/cuda-11.2/samples/bin/x86_64/linux/release
Run the p2pBandwidthLatencyTest:
CUDA_VISIBLE_DEVICES=MIG-GPU-269d95f8-328a-08a7-5985-ab09e6e2b751/7/0 ./p2pBandwidthLatencyTest
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, Graphics Device MIG 1g.10gb, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Note
In case a device doesn’t have P2P access to other one, it falls back to normal memcopy procedure. So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.
P2P Connectivity Matrix
D\D 0
0 1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0
0 176.20
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
D\D 0
0 187.87
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0
0 190.77
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
D\D 0
0 190.53
P2P=Disabled Latency Matrix (us)
GPU 0
0 3.57
CPU 0
0 4.07
P2P=Enabled Latency (P2P Writes) Matrix (us)
GPU 0
0 3.55
CPU 0
0 4.07
Note
The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
Updating the containerd Override File for MIG configurations
When you add MIG instances, the containerd override file does not automatically
get updated, and the new MIG instances that you add will not be added to the
allow file. When DGX Station A100 starts, after the nv-docker-gpus service runs,
a containerd override file is created in the /etc/systemd/system/containerd.service.d/ directory.
Note
This file blocks Docker from using the display GPU on the DGX Station A100.
Here is an example of an override file:
[Service]
DeviceAllow=/dev/nvidia1
DeviceAllow=/dev/nvidia2
DeviceAllow=/dev/nvidia3
DeviceAllow=/dev/nvidia4
DeviceAllow=/dev/nvidia-caps/nvidia-cap1
DeviceAllow=/dev/nvidia-caps/nvidia-cap2
DeviceAllow=/dev/nvidiactl
DeviceAllow=/dev/nvidia-modeset
DeviceAllow=/dev/nvidia-uvm
DeviceAllow=/dev/nvidia-uvm-tools
The service can only add devices of which it is aware. To ensure that your new MIG instances are added to the allow list, complete the following steps:
To refresh the override file, run the following commands:
sudo systemctl restart nv-docker-gpus
sudo systemctl restart docker
Verify that your new MIG instances are now allowed in the containers by confirming that the refreshed override file includes DeviceAllow= entries for the new MIG devices, in addition to the entries shown in the example above.
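A quick way to perform this check, assuming the override file resides in the directory named above, is to list the allowed devices directly; a sketch:
grep DeviceAllow /etc/systemd/system/containerd.service.d/*.conf
# New MIG instances appear as additional /dev/nvidia-caps/* entries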
Data Storage Configuration
By default, the DGX system includes several drives in a RAID 0 configuration. These drives are intended for application caching, so you must set up your own NFS drives for long-term data storage.
Using Data Storage for NFS Caching
This section provides information about how you can use data storage for NFS caching.
The DGX systems use cachefilesd
to manage NFS caching.
Ensure that you have an NFS server with one or more exports with data that will be accessed by the DGX system.
Ensure that there is network access between the DGX system and the NFS server.
Using cachefilesd
Here are the steps that describe how you can mount the NFS on the DGX system, and how you can cache the NFS by using the DGX SSDs for improved performance.
Configure an NFS mount for the DGX system:
Edit the filesystem tables configuration:
sudo vi /etc/fstab
Add a new line for the NFS mount, using the local /mnt mount point:
<nfs_server>:<export_path> /mnt nfs rw,noatime,rsize=32768,wsize=32768,nolock,tcp,intr,fsc,nofail 0 0
Here, /mnt is used as an example mount point. Contact your network administrator for the correct values for <nfs_server> and <export_path>. The NFS arguments presented here are a list of recommended values based on typical use cases. However, fsc must always be included because that argument specifies using FS-Cache.
Save the changes.
Verify that the NFS server is reachable.
ping <nfs_server>
Use the server IP address or the server name that was provided by your network administrator.
Mount the NFS export:
sudo mount /mnt
Here, /mnt is an example mount point.
Verify that caching is enabled:
cat /proc/fs/nfsfs/volumes
In the output, find FSC=yes. The NFS export will be automatically mounted and cached on the DGX system in subsequent reboot cycles.
Disabling cachefilesd
Here is some information about how to disable cachefilesd.
If you do not want cachefilesd enabled, disable it by completing the following steps:
Stop the cachefilesd service:
sudo systemctl stop cachefilesd
Disable the cachefilesd service permanently:
sudo systemctl disable cachefilesd
Changing the RAID Configuration for Data Drives
Here is information that describes how to change the RAID configuration for your data drives.
Warning
You must have a minimum of two drives to complete these tasks.
From the factory, the RAID level of the DGX RAID array is RAID 0. This level provides the maximum storage capacity, but it does not provide redundancy. If one SSD in the array fails, the data that is stored on the array is lost. If you are willing to accept reduced capacity in return for a level of protection against drive failure, you can change the level of the RAID array to RAID 5.
Note
If you change the RAID level from RAID 0 to RAID 5, the total storage capacity of the RAID array is reduced.
Before you change the RAID level of the DGX RAID array, back up the data on the array that you want to preserve. When you change the RAID level of the DGX RAID array, the data that is stored on the array is erased.
You can use the configure_raid_array.py
custom script, which is
installed on the system to change the level of the RAID array without
unmounting the RAID volume.
To change the RAID level to RAID 5, run the following command:
sudo configure_raid_array.py -m raid5
After you change the RAID level to RAID 5, the RAID array is rebuilt. Although a RAID array that is being rebuilt is online and ready to be used, a check on the health of the DGX system reports the status of the RAID volume as unhealthy. The time required to rebuild the RAID array depends on the workload on the system. For example, on an idle system, the rebuild might be completed in 30 minutes.
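The rebuild progress can be monitored while the array resynchronizes; assuming the array is managed by Linux software RAID (mdadm), as on DGX systems, a sketch:
watch cat /proc/mdstat
# The resync line shows percent complete and an estimated time to finish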
To change the RAID level to RAID 0, run the following command:
sudo configure_raid_array.py -m raid0
To confirm that the RAID level was changed, run the lsblk
command. The
entry in the TYPE
column for each drive in the RAID array indicates
the RAID level of the array.
Running NGC Containers
This section provides information about how to run NGC containers with your DGX system.
Obtaining an NGC Account
Here is some information about how you can obtain an NGC account.
NVIDIA NGC provides simple access to GPU-optimized software for deep learning, machine learning, and high-performance computing (HPC). An NGC account grants you access to these tools and gives you the ability to set up a private registry to manage your customized software.
If you are the organization administrator for your DGX system purchase, work with NVIDIA Enterprise Support to set up an NGC enterprise account. Refer to the NGC Private Registry User Guide for more information about getting an NGC enterprise account.
Running NGC Containers with GPU Support
To obtain the best performance when running NGC containers on DGX systems, you can use one of the following methods to provide GPU support for Docker containers:
Native GPU support (included in Docker 19.03 and later)
NVIDIA Container Runtime for Docker
This is in the nvidia-docker2
package.
The recommended method for DGX OS 6 is native GPU support. To run
GPU-enabled containers, run docker run --gpus
.
Here is an example that uses all GPUs:
docker run --gpus all …
Here is an example that uses 2 GPUs:
docker run --gpus 2 …
Here is an example that uses specific GPUs:
docker run --gpus '"device=1,2"' ...
docker run --gpus '"device=UUID-ABCDEF-
Refer to Running Containers for more information about running NGC containers on MIG devices.