Simultaneous iGPU and dGPU
Your NVIDIA IGX Orin™ Developer Kit has an NVIDIA Ampere integrated GPU (iGPU) and PCIe slots for an additional discrete GPU (dGPU). Typically you configure your Dev Kit in either iGPU mode or dGPU mode. For details, see GPU Configurations.
This documentation describes an advanced configuration that enables you to use the integrated GPU (iGPU) and a discrete GPU (dGPU) simultaneously (simultaneous mode). In simultaneous mode, you use Docker containers to run applications on the iGPU.
You can leverage simultaneous mode to offload less compute-intensive workloads from the dGPU to the iGPU, reserving the dGPU for your more compute-intensive workloads. The following table shows the capabilities that each GPU supports in simultaneous mode.
| Capability          | dGPU | iGPU       |
|---------------------|------|------------|
| Compute             | Yes  | Yes        |
| Graphics (headless) | Yes  | Yes        |
| Display             | Yes  | No         |
| RDMA                | Yes  | No         |
| Video               | Yes  | Not tested |
The IGX Orin developer kit supports simultaneous mode for the following software releases:
IGX SW 1.0 GA (L4T r36.3) in dGPU mode
IGX SW 1.0 DP (L4T r36.1) in dGPU mode
For initial testing, and for the examples in this documentation, you can use a simple container, such as ubuntu:22.04.
For your production work, use an iGPU-specific container that has a GPU-specific compute stack and acceleration. For a list of IGX iGPU-specific containers, see Containers.
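If the ubuntu:22.04 test container is not already on your system, you can pull it before running the examples. This optional step assumes the developer kit has network access to Docker Hub.

docker pull ubuntu:22.04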
Enable Access
To enable simultaneous access to the iGPU and dGPU, do the following.
Run the following command, one time only.
sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh configure
Warning: On IGX SW 1.0 DP (L4T r36.1), libnvdla_compiler.so is missing from this configuration for the iGPU, which can lead to the following errors when you try to use TensorRT on the iGPU:
libnvinfer.so: undefined reference to `nvdla::...`
As a workaround, run the following:
lib="/usr/lib/aarch64-linux-gnu/tegra/libnvdla_compiler.so" sudo cp $lib "/nvidia/igpu_on_dgpu/root/$lib" sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh generate_cdi
(Optional) Run the following command to show a list of other useful commands.
l4t-igpu-container-on-dgpu-host-config.sh --help
(Optional) If you no longer want to use simultaneous mode, revert the configuration by running the following command.
sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh clean
Standard iGPU Access
After you run the one-time configuration in the previous section, any bare-metal application uses the dGPU by default. To use the iGPU, run the application in a Docker container and use the following flags:
--runtime=nvidia — This ensures the NVIDIA container runtime is used.
-e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 — This exposes the iGPU driver libraries in the container.
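For example, the following command uses both flags to run nvidia-smi on the iGPU from the generic ubuntu:22.04 container. This is the same pattern used in the verification examples later in this documentation.

docker run --rm \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
  ubuntu:22.04 \
  nvidia-smi --query-gpu=name --format=csv,noheader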
If you have Docker 25.0 or later and NVIDIA Container Toolkit 1.14.5 or later, you can opt in to Docker’s native Container Device Interface (CDI) support. In this case, you can run the application in a Docker container and use the following flag:
--device=nvidia.com/igpu=0
To opt in to Docker’s native CDI support, run the following commands.
sudo nvidia-ctk runtime configure --runtime=docker --cdi.enabled
sudo systemctl restart docker
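With CDI support enabled, the same iGPU check can be expressed with the --device flag alone. This is a minimal sketch based on the flag described above; adapt the container image and command to your own workload.

docker run --rm \
  --device=nvidia.com/igpu=0 \
  ubuntu:22.04 \
  nvidia-smi --query-gpu=name --format=csv,noheader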
Advanced iGPU Access
You can attempt to run a bare-metal application on your iGPU by setting your library path before you run your application. To set your library path, run the following command.
LD_LIBRARY_PATH=/nvidia/igpu_on_dgpu/root/usr/lib/aarch64-linux-gnu/tegra <executable>
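For example, assuming you have already built the deviceQuery sample described later in this documentation, a bare-metal run against the iGPU libraries might look like the following. This is a sketch only; not every application behaves correctly with this method.

# Hypothetical bare-metal run of the deviceQuery sample against the iGPU driver libraries
LD_LIBRARY_PATH=/nvidia/igpu_on_dgpu/root/usr/lib/aarch64-linux-gnu/tegra ./deviceQuery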
Setting the library path can result in unexpected behavior. If unexpected behavior occurs, use the standard iGPU access method instead.
Verify iGPU and dGPU Access
To verify that both the iGPU and dGPU are accessible, do the following.
Run the following command, which runs on the dGPU by default.
nvidia-smi --query-gpu=name --format=csv,noheader
You should see output similar to the following, which shows the dGPU.
NVIDIA RTX A6000
Run the following command in a container on the iGPU.
Note: This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

docker run --rm \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
  ubuntu:22.04 \
  nvidia-smi --query-gpu=name --format=csv,noheader
You should see output similar to the following, which shows the iGPU.
Orin (nvgpu)
In this example, you use the Device Query CUDA sample application to get information about the iGPU and the dGPU.
Clone and build the device query sample application.
git clone --depth 1 --branch v12.2 https://github.com/NVIDIA/cuda-samples.git
cd cuda-samples/Samples/1_Utilities/deviceQuery
make
Get the device information for the dGPU by running the following command. It runs on the dGPU by default.
./deviceQuery
You should see output similar to the following.
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA RTX 6000 Ada Generation"
  CUDA Driver Version / Runtime Version          12.2 / 12.2
  CUDA Capability Major/Minor version number:    8.9
  Total amount of global memory:                 48436 MBytes (50789154816 bytes)
  (142) Multiprocessors, (128) CUDA Cores/MP:    18176 CUDA Cores
  GPU Max Clock rate:                            2505 MHz (2.50 GHz)
  ...
Get the device information for the iGPU by running the application in a container on the iGPU with the following command.
Note: This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

docker run --rm -it --init \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
  -v ./deviceQuery:/opt/deviceQuery \
  ubuntu:22.04 \
  /opt/deviceQuery
You should see output similar to the following.
/opt/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Orin"
  CUDA Driver Version / Runtime Version          12.2 / 12.2
  CUDA Capability Major/Minor version number:    8.7
  Total amount of global memory:                 54792 MBytes (57453330432 bytes)
  (016) Multiprocessors, (128) CUDA Cores/MP:    2048 CUDA Cores
  GPU Max Clock rate:                            1185 MHz (1.18 GHz)
  ...
In this example, you run a GPU-accelerated application on the iGPU and a dGPU simultaneously. You run the Matrix Multiplication CUDA sample application, and you use a matrix size larger than the default so that the application runs longer.
Clone and build the matrix multiplication sample application.
git clone --depth 1 --branch v12.2 https://github.com/NVIDIA/cuda-samples.git
cd cuda-samples/Samples/0_Introduction/matrixMul
make
Run the matrix multiplication application. It runs on the dGPU by default.
./matrixMul -wA=6400 -hA=3200 -wB=3200 -hB=6400
You should see output similar to the following.
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Ada" with compute capability 8.9

MatrixA(6400,3200), MatrixB(3200,6400)
Computing result using CUDA Kernel...
...
In a separate terminal, run the matrix multiplication application in a container on the iGPU.
Note: This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

docker run --rm -it --init \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
  -v ./matrixMul:/opt/matrixMul \
  ubuntu:22.04 \
  /opt/matrixMul -wA=6400 -hA=3200 -wB=3200 -hB=6400
You should see output similar to the following.
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Ampere" with compute capability 8.7

MatrixA(6400,3200), MatrixB(3200,6400)
Computing result using CUDA Kernel...
...
Monitor GPU Utilization
While you run this example, you can monitor the utilization of both GPUs in separate terminals.
To monitor iGPU utilization, run the following command.
# Monitor continuously
watch -n 0.25 'tegrastats --interval 1 | head -n1 | sed -E "s|.*(GR3D_FREQ [0-9]+).*|\1%|"'

# -- OR --

# One time
# tegrastats --interval 1 | head -n1 | sed -E "s|.*(GR3D_FREQ [0-9]+).*|\1%|"
To monitor dGPU utilization, run the following command.
# Monitor continuously
watch -n 0.25 'nvidia-smi --query-gpu=utilization.gpu --format=csv'

# -- OR --

# One time
# nvidia-smi --query-gpu=utilization.gpu --format=csv
The following image shows both the iGPU and dGPU being accessed and used at the same time.
Image top left — dGPU utilization
Image top right — dGPU matrixMul
Image bottom left — iGPU utilization
Image bottom right — iGPU matrixMul