Simultaneous iGPU and dGPU


Your NVIDIA IGX Orin™ Developer Kit has an NVIDIA Ampere integrated GPU (iGPU) and PCIe slots for an additional discrete GPU (dGPU). Typically you configure your Dev Kit in either iGPU mode or dGPU mode. For details, see GPU Configurations.

This documentation describes an advanced configuration that enables you to use the iGPU and a dGPU simultaneously (simultaneous mode). In simultaneous mode, you use Docker containers to run applications on the iGPU.

You can leverage simultaneous mode to offload less compute-intensive workloads from the dGPU to the iGPU, reserving the dGPU for your more compute-intensive workloads. The following table shows the capabilities that each GPU supports in simultaneous mode.

Capability            dGPU    iGPU
----------            ----    ----
Compute               Yes     Yes
Graphics (headless)   Yes     Yes
Display               Yes     No
RDMA                  Yes     No
Video                 Yes     Not tested

The IGX Orin Developer Kit supports simultaneous mode for the following software releases:

  • IGX SW 1.0 GA (L4T r36.3) in dGPU mode

  • IGX SW 1.0 DP (L4T r36.1) in dGPU mode

For initial testing, and for the examples in this documentation, you can use a simple container, such as ubuntu:22.04.

For your production work, use an iGPU-specific container that has a GPU-specific compute stack and acceleration. For a list of IGX iGPU-specific containers, see Containers.

Enable Access

To enable simultaneous access to the iGPU and dGPU, do the following.

  1. Run the following command, one time only.

    sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh configure

    Warning

    On IGX SW 1.0 DP (L4T r36.1), the libnvdla_compiler.so library is missing from this configuration for the iGPU, which can lead to the following errors when you try to use TensorRT on the iGPU:

    libnvinfer.so: undefined reference to `nvdla::...`

    As a workaround, run the following:

    lib="/usr/lib/aarch64-linux-gnu/tegra/libnvdla_compiler.so"
    sudo cp $lib "/nvidia/igpu_on_dgpu/root/$lib"
    sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh generate_cdi

  2. (Optional) Run the following command to show a list of other useful commands.

    l4t-igpu-container-on-dgpu-host-config.sh --help

  3. (Optional) If you no longer want to use simultaneous mode, run the following command to revert the configuration.

    sudo /opt/nvidia/l4t-igpu-container-on-dgpu-host-config/l4t-igpu-container-on-dgpu-host-config.sh clean

Standard iGPU Access

After you run the one-time configuration in the previous section, any bare-metal application uses the dGPU by default. To use the iGPU, run the application in a Docker container with the following flags, as shown in the example after this list:

  • --runtime=nvidia — This ensures that the NVIDIA container runtime is used.

  • -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 — This exposes the iGPU driver libraries in the container.
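
For example, the following is a minimal smoke test that lists the iGPU from the stock ubuntu:22.04 container. It assumes that the NVIDIA container runtime mounts nvidia-smi into the container, as in the verification steps later in this guide.

docker run --rm \
    --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
    ubuntu:22.04 \
    nvidia-smi -L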

Tip

If you have Docker version 25.0 or later and NVIDIA Container Toolkit 1.14.5 or later, you can opt in to Docker's native Container Device Interface (CDI) support. In this case, you can run the application in a Docker container and use the following flag:

  • --device=nvidia.com/igpu=0

To opt in to Docker's native CDI support, run the following commands. A usage example follows the commands.

sudo nvidia-ctk runtime configure --runtime=docker --cdi.enabled
sudo systemctl restart docker
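
With CDI enabled, a minimal sketch of the same smoke test that uses Docker's native CDI flag instead of the NVIDIA runtime is:

docker run --rm \
    --device=nvidia.com/igpu=0 \
    ubuntu:22.04 \
    nvidia-smi -L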

Advanced iGPU Access

You can attempt to run a bare-metal application on your iGPU by setting your library path before you run your application. To set your library path, run the following command.

LD_LIBRARY_PATH=/nvidia/igpu_on_dgpu/root/usr/lib/aarch64-linux-gnu/tegra <executable>
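
For example, a bare-metal run of the deviceQuery sample (built in the examples below) against the iGPU driver libraries might look like the following sketch; it is not guaranteed to work for every application.

LD_LIBRARY_PATH=/nvidia/igpu_on_dgpu/root/usr/lib/aarch64-linux-gnu/tegra ./deviceQuery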

Warning

Setting the library path can result in unexpected behavior. If unexpected behavior occurs, use the standard iGPU access method instead.

Verify iGPU and dGPU Access

To verify that both the iGPU and dGPU are accessible, do the following.

  1. Run the following command, which runs on the dGPU by default.

    nvidia-smi --query-gpu=name --format=csv,noheader

    You should see output similar to the following, which shows the dGPU.

    NVIDIA RTX A6000

  2. Run the following command in a container on the iGPU.

    Note

    This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

    docker run --rm \
        --runtime=nvidia \
        -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
        ubuntu:22.04 \
        nvidia-smi --query-gpu=name --format=csv,noheader

    You should see output similar to the following, which shows the iGPU.

    Orin (nvgpu)
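
As a convenience, you can run both checks back to back from a single shell. The following sketch simply combines the two commands above; the first command prints the dGPU name and the second prints the iGPU name.

# dGPU query: runs on bare metal, uses the dGPU by default
nvidia-smi --query-gpu=name --format=csv,noheader

# iGPU query: runs in a container with the iGPU exposed
docker run --rm \
    --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
    ubuntu:22.04 \
    nvidia-smi --query-gpu=name --format=csv,noheader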

Example: Device Query

In this example, you use the Device Query CUDA sample application to get information about the iGPU and the dGPU.

  1. Clone and build the device query sample application.

    git clone --depth 1 --branch v12.2 https://github.com/NVIDIA/cuda-samples.git
    cd cuda-samples/Samples/1_Utilities/deviceQuery
    make


  2. Get the device information for the dGPU by running the following command. It runs on the dGPU by default.

    ./deviceQuery

    You should see output similar to the following.

    ./deviceQuery Starting...

     CUDA Device Query (Runtime API) version (CUDART static linking)

    Detected 1 CUDA Capable device(s)

    Device 0: "NVIDIA RTX 6000 Ada Generation"
      CUDA Driver Version / Runtime Version          12.2 / 12.2
      CUDA Capability Major/Minor version number:    8.9
      Total amount of global memory:                 48436 MBytes (50789154816 bytes)
      (142) Multiprocessors, (128) CUDA Cores/MP:    18176 CUDA Cores
      GPU Max Clock rate:                            2505 MHz (2.50 GHz)
    ...

  3. Get the device information for the iGPU by running the application in a container with the following command.

    Note

    This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

    docker run --rm -it --init \
        --runtime=nvidia \
        -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
        -v ./deviceQuery:/opt/deviceQuery \
        ubuntu:22.04 \
        /opt/deviceQuery

    You should see output similar to the following.

    /opt/deviceQuery Starting...

     CUDA Device Query (Runtime API) version (CUDART static linking)

    Detected 1 CUDA Capable device(s)

    Device 0: "Orin"
      CUDA Driver Version / Runtime Version          12.2 / 12.2
      CUDA Capability Major/Minor version number:    8.7
      Total amount of global memory:                 54792 MBytes (57453330432 bytes)
      (016) Multiprocessors, (128) CUDA Cores/MP:    2048 CUDA Cores
      GPU Max Clock rate:                            1185 MHz (1.18 GHz)
    ...

Example: Simultaneous Matrix Multiplication

In this example, you run a GPU-accelerated application on the iGPU and the dGPU simultaneously. You run the Matrix Multiplication CUDA sample application with a matrix size larger than the default so that the application runs longer. A single-shell variant of this procedure appears after the steps.

  1. Clone and build the matrix multiplication sample application.

    git clone --depth 1 --branch v12.2 https://github.com/NVIDIA/cuda-samples.git
    cd cuda-samples/Samples/0_Introduction/matrixMul
    make

  2. Run the matrix multiplication application. It runs on the dGPU by default.

    ./matrixMul -wA=6400 -hA=3200 -wB=3200 -hB=6400

    You should see output similar to the following.

    [Matrix Multiply Using CUDA] - Starting...
    GPU Device 0: "Ada" with compute capability 8.9

    MatrixA(6400,3200), MatrixB(3200,6400)
    Computing result using CUDA Kernel...
    ...

  3. In a separate terminal, run the matrix multiplication application in a container on the iGPU.

    Note

    This example uses the ubuntu:22.04 container on the iGPU. Use an iGPU-specific container for your production work. For details, see Docker Containers.

    docker run --rm -it --init \
        --runtime=nvidia \
        -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
        -v ./matrixMul:/opt/matrixMul \
        ubuntu:22.04 \
        /opt/matrixMul -wA=6400 -hA=3200 -wB=3200 -hB=6400

    You should see output similar to the following.

    [Matrix Multiply Using CUDA] - Starting...
    GPU Device 0: "Ampere" with compute capability 8.7

    MatrixA(6400,3200), MatrixB(3200,6400)
    Computing result using CUDA Kernel...
    ...
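
If you prefer to launch both runs from a single shell instead of two terminals, you can run the dGPU workload in the background, as in the following sketch. It reuses the commands from the steps above; -it is omitted because no interactive terminal is needed.

# Run on the dGPU in the background
./matrixMul -wA=6400 -hA=3200 -wB=3200 -hB=6400 &

# Run on the iGPU in a container in the foreground
docker run --rm --init \
    --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=nvidia.com/igpu=0 \
    -v ./matrixMul:/opt/matrixMul \
    ubuntu:22.04 \
    /opt/matrixMul -wA=6400 -hA=3200 -wB=3200 -hB=6400

# Wait for the background dGPU run to finish
wait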

Monitor GPU Utilization

While you run this example, you can monitor the utilization of both GPUs in separate terminals.

  • To monitor iGPU utilization, run the following command.

    # Monitor continuously
    watch -n 0.25 'tegrastats --interval 1 | head -n1 | sed -E "s|.*(GR3D_FREQ [0-9]+).*|\1%|"'

    # -- OR --

    # One time
    # tegrastats --interval 1 | head -n1 | sed -E "s|.*(GR3D_FREQ [0-9]+).*|\1%|"

  • To monitor dGPU utilization, run the following command.

    # Monitor continuously
    watch -n 0.25 'nvidia-smi --query-gpu=utilization.gpu --format=csv'

    # -- OR --

    # One time
    # nvidia-smi --query-gpu=utilization.gpu --format=csv

The following image shows both the iGPU and dGPU being accessed and used at the same time.

  • Top left — dGPU utilization

  • Top right — dGPU matrixMul

  • Bottom left — iGPU utilization

  • Bottom right — iGPU matrixMul

[Image: matmul_igpudgpu.png]
