NVIDIA NIM for LLMs uses Docker containers under the hood. Each NIM runs in its own Docker container, which can be configured in several ways. The following is a full reference of the ways to configure a NIM container.

GPU Selection

Passing --gpus all to docker run is acceptable in homogeneous environments with one or more GPUs of the same model.

In heterogeneous environments with a mix of GPUs (for example, an A6000 plus a GeForce display GPU), workloads should run only on compute-capable GPUs. Expose specific GPUs inside the container using either:

the --gpus flag (for example, --gpus='"device=1"')

the environment variable NVIDIA_VISIBLE_DEVICES (for example, -e NVIDIA_VISIBLE_DEVICES=1)
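
The two options above can be sketched as a small dry-run script that prints the corresponding docker run commands instead of executing them, so it can be tried on a machine without Docker or a GPU. The image name is a hypothetical placeholder, not a real NIM image, and the function names are illustrative only. The --runtime=nvidia flag in the second command is an assumption: it presumes the NVIDIA runtime is registered with Docker but is not the default runtime.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints two ways to expose a single GPU to a container.
# IMAGE is a hypothetical placeholder; substitute the NIM image you actually use.
IMAGE="${IMAGE:-my-registry.example/my-nim:latest}"
GPU_ID="${GPU_ID:-1}"

# Option 1: the --gpus flag. The inner double quotes are part of the value,
# so the whole value is wrapped in single quotes for the shell.
gpus_flag_command() {
  printf "docker run --rm --gpus '\"device=%s\"' %s\n" "$1" "$2"
}

# Option 2: the NVIDIA_VISIBLE_DEVICES environment variable.
# Assumption: the NVIDIA runtime must be requested explicitly with --runtime=nvidia.
visible_devices_command() {
  printf 'docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=%s %s\n' "$1" "$2"
}

gpus_flag_command "$GPU_ID" "$IMAGE"
visible_devices_command "$GPU_ID" "$IMAGE"
```

Copy either printed command into a terminal to run it for real. Set GPU_ID to a comma-separated list (for example, GPU_ID=0,2) to expose more than one device.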

The device IDs to pass to either option are listed in the output of nvidia-smi -L:

GPU 0: Tesla H100 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
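
As a rough illustration of turning that listing into a device list, the sketch below filters nvidia-smi -L style lines and prints the indices of the GPUs to keep. The helper name select_gpu_ids is hypothetical, and excluding GPUs by matching "GeForce" in the product name is only an assumed heuristic for this example, not an official way to detect compute-capable devices. The sample input is the listing above, so the script runs without a GPU.

```shell
#!/usr/bin/env bash
# Reads `nvidia-smi -L` style lines on stdin and prints a comma-separated list
# of the GPU indices whose line does NOT match the exclude pattern.
# Assumption: display GPUs can be recognized by a substring of their name.
select_gpu_ids() {
  local exclude="${1:-GeForce}"
  grep -v -- "$exclude" \
    | sed -n 's/^GPU \([0-9][0-9]*\):.*/\1/p' \
    | paste -sd, -
}

# Demo using the sample listing instead of a live `nvidia-smi -L` call.
select_gpu_ids GeForce <<'EOF'
GPU 0: Tesla H100 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
EOF
```

On a real host, the same function can be fed directly, as in nvidia-smi -L | select_gpu_ids GeForce, and the result passed to --gpus or NVIDIA_VISIBLE_DEVICES. Check the selected indices by hand before relying on them.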

Refer to the NVIDIA Container Toolkit documentation for more instructions.