Simultaneous iGPU and dGPU#

Your NVIDIA IGX Orin™ Developer Kit has an NVIDIA Ampere integrated GPU (iGPU) and PCIe slots for an additional discrete GPU (dGPU). Typically, you configure your Developer Kit in either iGPU mode or dGPU mode. For details, see GPU Configurations.

This documentation describes an advanced configuration that enables you to use the iGPU and the dGPU simultaneously (simultaneous mode). In simultaneous mode, you use Docker containers to run applications on the iGPU.

You can leverage simultaneous mode to offload less compute-intensive workloads from the dGPU to the iGPU, reserving the dGPU for your more compute-intensive workloads. The following table shows the capabilities that are supported by each GPU in simultaneous mode.

Capability            dGPU   iGPU
--------------------  -----  ----------
Compute               Yes    Yes
Graphics (headless)   Yes    Yes
Display               Yes    No
RDMA                  Yes    No
Video                 Yes    Not tested

Prerequisites#

The IGX Orin Developer Kit supports simultaneous mode for the following software releases (a release check sketch follows this list):

  • IGX SW 1.0 GA (L4T r36.3) in dGPU mode

  • IGX SW 1.0 DP (L4T r36.1) in dGPU mode
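To confirm which L4T release your system is running, one option on L4T-based images is to read /etc/nv_tegra_release; treat this path and the exact output format as an assumption about your image rather than a guaranteed interface.

# Print the L4T release string (assumed present on L4T-based IGX OS images)
cat /etc/nv_tegra_release
# Expect output similar to: # R36 (release), REVISION: 3.0, ...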

Docker Containers#

In simultaneous mode, you use Docker containers to run applications on the iGPU. For initial testing, and for the examples in this documentation, you can use a simple container, such as ubuntu:22.04.

For your production work, use an iGPU-specific container that includes the iGPU compute stack and acceleration libraries. For a list of IGX iGPU-specific containers, see Containers.
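For the examples that follow, you can pull the test container ahead of time. This is a minimal sketch, assuming Docker is already installed and configured on your Developer Kit:

# Pull the simple test container used in the examples in this documentation
docker pull ubuntu:22.04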

Configure Simultaneous Mode#

Enable Access#

To enable simultaneous access to the iGPU and dGPU, do the following. A sketch for verifying the resulting CDI configuration follows this procedure.

  1. Run the following command, one time only.

    sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh configure
    

    Warning

    On IGX SW 1.0 DP (L4T r36.1), the libnvdla_compiler.so library is missing from this configuration for the iGPU, which can lead to errors like the following when you try to use TensorRT on the iGPU:

    libnvinfer.so: undefined reference to `nvdla::...`
    

    As a workaround, run the following:

    lib="/usr/lib/aarch64-linux-gnu/tegra/libnvdla_compiler.so"
    sudo cp "$lib" "/nvidia/igpu_on_dgpu/root/$lib"
    sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh generate_cdi
    
  2. (Optional) Run the following command to show a list of other useful commands.

    l4t-igpu-container-on-dgpu-host-config.sh --help
    
  3. (Optional) If you no longer want to use simultaneous mode, to revert, run the following.

    sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh clean
    
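After you run the one-time configuration, you can optionally confirm that a CDI specification exposing the iGPU was generated. This is a minimal check, assuming the nvidia-ctk CLI from the NVIDIA Container Toolkit is available on your system:

# List the registered CDI devices; the iGPU should appear as nvidia.com/igpu=0
nvidia-ctk cdi list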

Standard iGPU Access#

After you run the one-time configuration in the previous section, any bare-metal application uses the dGPU by default. To use the iGPU, run the application in a Docker container with the following flags (a minimal example follows the list):

  • --runtime=nvidia — This ensures the nvidia container runtime is used.

  • -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 — This exposes the iGPU driver libraries in the container.
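For example, the following minimal sketch uses both flags with the test container to list the iGPU; a fuller verification appears in Verify iGPU and dGPU Access below.

docker run --rm \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
  ubuntu:22.04 \
  nvidia-smi -L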

Tip

If you have Docker 25.0 or later and NVIDIA Container Toolkit 1.14.5 or later, you can opt in to Docker’s native Container Device Interface (CDI) support. In this case, you can run the application in a Docker container with only the following flag:

  • --device=nvidia.com/igpu=0
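To confirm that your installed versions meet these minimums, you can check them as follows (exact version strings vary):

docker --version        # expect Docker version 25.0 or later
nvidia-ctk --version    # expect NVIDIA Container Toolkit CLI version 1.14.5 or later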

To opt in to Docker’s native CDI support, run the following commands.

sudo nvidia-ctk runtime configure --runtime=docker --cdi.enabled
sudo systemctl restart docker
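With CDI enabled, the earlier iGPU check can be run with the --device flag alone. This is a minimal sketch:

docker run --rm \
  --device=nvidia.com/igpu=0 \
  ubuntu:22.04 \
  nvidia-smi -L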

Advanced iGPU Access#

You can attempt to run a bare-metal application on the iGPU by setting the library path before you run your application. To set the library path and run your application, use the following command.

LD_LIBRARY_PATH=/nvidia/igpu_on_dgpu/root/usr/lib/aarch64-linux-gnu/tegra <executable>

Warning

Setting the library path can result in unexpected behavior. If unexpected behavior occurs, use the standard iGPU access method instead.

Verify iGPU and dGPU Access#

To verify that both the iGPU and dGPU are accessible, do the following.

  1. Run the following command, which runs on the dGPU by default.

    nvidia-smi --query-gpu=name --format=csv,noheader
    

    You should see output similar to the following, which shows the dGPU.

    NVIDIA RTX A6000
    
  2. Run the following command in a container on the iGPU.

    Note

    This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

    docker run --rm \
      --runtime=nvidia \
      -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
      ubuntu:22.04 \
      nvidia-smi --query-gpu=name --format=csv,noheader
    

    You should see output similar to the following, which shows the iGPU.

    Orin (nvgpu)
    

Example of Device Query#

In this example, you use the Device Query CUDA sample application to get information about the iGPU and the dGPU.

  1. Clone and build the device query sample application.

    git clone --depth 1 --branch v12.2 https://github.com/NVIDIA/cuda-samples.git
    cd cuda-samples/Samples/1_Utilities/deviceQuery
    make

  2. Get the device information for the dGPU by running the following command. It runs on the dGPU by default.

    ./deviceQuery
    

    You should see output similar to the following.

    ./deviceQuery Starting...

     CUDA Device Query (Runtime API) version (CUDART static linking)

    Detected 1 CUDA Capable device(s)

    Device 0: "NVIDIA RTX 6000 Ada Generation"
      CUDA Driver Version / Runtime Version          12.2 / 12.2
      CUDA Capability Major/Minor version number:    8.9
      Total amount of global memory:                 48436 MBytes (50789154816 bytes)
      (142) Multiprocessors, (128) CUDA Cores/MP:    18176 CUDA Cores
      GPU Max Clock rate:                            2505 MHz (2.50 GHz)
      ...
    
  3. Get the device information for the iGPU using the following command to run the application in a container on the iGPU.

    Note

    This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

    docker run --rm -it --init \
      --runtime=nvidia \
      -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
      -v ./deviceQuery:/opt/deviceQuery \
      ubuntu:22.04 \
      /opt/deviceQuery
    

    You should see output similar to the following.

    /opt/deviceQuery Starting...

     CUDA Device Query (Runtime API) version (CUDART static linking)

    Detected 1 CUDA Capable device(s)

    Device 0: "Orin"
      CUDA Driver Version / Runtime Version          12.2 / 12.2
      CUDA Capability Major/Minor version number:    8.7
      Total amount of global memory:                 54792 MBytes (57453330432 bytes)
      (016) Multiprocessors, (128) CUDA Cores/MP:    2048 CUDA Cores
      GPU Max Clock rate:                            1185 MHz (1.18 GHz)
      ...
    

Example of Accelerated Compute#

In this example, you run a GPU-accelerated application on the iGPU and a dGPU simultaneously. You run the Matrix Multiplication CUDA sample application, and you use a matrix size larger than the default so that the application runs longer.

  1. Clone and build the matrix multiplication sample application.

    git clone --depth 1 --branch v12.2 https://github.com/NVIDIA/cuda-samples.git
    cd cuda-samples/Samples/0_Introduction/matrixMul
    make
    
  2. Run the matrix multiplication application. It runs on the dGPU by default.

    ./matrixMul -wA=6400 -hA=3200 -wB=3200 -hB=6400
    

    You should see output similar to the following.

    [Matrix Multiply Using CUDA] - Starting...
    GPU Device 0: "Ada" with compute capability 8.9

    MatrixA(6400,3200), MatrixB(3200,6400)
    Computing result using CUDA Kernel...
    ...
    
  3. In a separate terminal, run the matrix multiplication application in a container on the iGPU.

    Note

    This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

    docker run --rm -it --init \
      --runtime=nvidia \
      -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
      -v ./matrixMul:/opt/matrixMul \
      ubuntu:22.04 \
      /opt/matrixMul -wA=6400 -hA=3200 -wB=3200 -hB=6400
    

    You should see output similar to the following.

    [Matrix Multiply Using CUDA] - Starting...
    GPU Device 0: "Ampere" with compute capability 8.7

    MatrixA(6400,3200), MatrixB(3200,6400)
    Computing result using CUDA Kernel...
    ...
    

Monitor GPU Utilization#

While you run this example, you can monitor the utilization of both GPUs in separate terminals.

  • To monitor iGPU utilization, run the following command.

    # Monitor continuously
    watch -n 0.25 'tegrastats --interval 1 | head -n1 | sed -E "s|.*(GR3D_FREQ [0-9]+).*|\1%|"'

    # -- OR --

    # One time
    # tegrastats --interval 1 | head -n1 | sed -E "s|.*(GR3D_FREQ [0-9]+).*|\1%|"
    
  • To monitor dGPU utilization, run the following command.

    # Monitor continuously
    watch -n 0.25 'nvidia-smi --query-gpu=utilization.gpu --format=csv'

    # -- OR --

    # One time
    # nvidia-smi --query-gpu=utilization.gpu --format=csv
    

The following image shows both the iGPU and dGPU being accessed and used at the same time.

  • Image top-left — dGPU utilization

  • Image top-right — dGPU matrixMul output

  • Image bottom-left — iGPU utilization

  • Image bottom-right — iGPU matrixMul output

Matrix Multiplication on iGPU and dGPU