Configuring a NIM
NVIDIA NIM for NV-CLIP (NV-CLIP NIM) uses Docker containers. Each NIM has its own Docker container and there are several ways to configure it. Use the following reference information to configure a NIM container.
You can use the `--gpus all` argument to `docker run` in homogeneous environments with one or more of the same GPU.

In heterogeneous environments with a combination of GPUs, such as an A6000 Ada and a GeForce display GPU, you should only run workloads on compute-capable GPUs. Use either of the following options to identify specific GPUs:
- The `--gpus` flag. For example: `--gpus='"device=1"'`.
- The environment variable `CUDA_VISIBLE_DEVICES`. For example: `-e CUDA_VISIBLE_DEVICES=1`.
Use the `nvidia-smi -L` command to get the device ID(s) to use as input(s). This command should return information similar to the following:

```
GPU 0: Tesla H100 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
```
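As a sketch, you could extract the index of a target GPU from `nvidia-smi -L`-style output and pass it to `docker run`. The GPU names and UUIDs below are illustrative placeholders, not taken from your system:

```shell
# Illustrative only: parse nvidia-smi -L style output to find the index
# of a datacenter GPU (matched here by the placeholder name "H100"),
# then select it via --gpus or CUDA_VISIBLE_DEVICES.
SMI_OUTPUT='GPU 0: Tesla H100 (UUID: GPU-xxxx)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-yyyy)'

# Take the first line mentioning H100 and keep only its numeric index.
DEVICE_ID=$(printf '%s\n' "$SMI_OUTPUT" | grep -m1 'H100' | sed 's/^GPU \([0-9]*\):.*/\1/')
echo "$DEVICE_ID"   # -> 0

# Either of the following (not executed here) would then pin the workload:
#   docker run --gpus="\"device=$DEVICE_ID\"" ...
#   docker run -e CUDA_VISIBLE_DEVICES=$DEVICE_ID ...
```

On a real machine, replace `SMI_OUTPUT` with `$(nvidia-smi -L)` and adjust the `grep` pattern to match your compute GPU.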
Refer to the NVIDIA Container Toolkit documentation for further information.
You can pass the following environment variables to a NIM as an argument to the `docker run` command. For example: `docker run -e NGC_API_KEY=$NGC_API_KEY`, where `$NGC_API_KEY` is the value of your NGC API key:
| ENV | Required? | Default | Notes |
|---|---|---|---|
| `NGC_API_KEY` | Yes | None | You must set this variable to the value of your personal NGC API key. |
| `NIM_CACHE_PATH` | No | `/opt/nim/.cache` | Location (in container) where the container caches model artifacts. |
| `NIM_LOG_LEVEL` | No | `DEFAULT` | Log level of the NV-CLIP NIM service. Possible values are `DEFAULT`, `TRACE`, `DEBUG`, `INFO`, `WARNING`, `ERROR`, and `CRITICAL`. The effect of `DEBUG`, `INFO`, `WARNING`, `ERROR`, and `CRITICAL` is described in the Python 3 logging docs. `TRACE` enables printing of diagnostic information for debugging purposes in the Triton server and in uvicorn. When `NIM_LOG_LEVEL` is `DEFAULT`, all log levels are set to `INFO`, except for the Triton server log level, which is set to `ERROR`. |
| `NIM_HTTP_API_PORT` | No | `8000` | Publish the NIM service on the prescribed port inside the container. Make sure to adjust the port passed to the `-p`/`--publish` flag of `docker run` to reflect that value. For example: `-p $NIM_HTTP_API_PORT:$NIM_HTTP_API_PORT`. The left-hand side of `:` is your host address:port and does NOT have to match `$NIM_HTTP_API_PORT`. The right-hand side of the `:` is the port inside the container, which MUST match `NIM_HTTP_API_PORT` (or `8000` if not set). |
| `NIM_MANIFEST_PROFILE` | No | None | Choose the manifest profile ID from Profile Selection for your GPU. By default, NIM chooses the non-optimized profile ID. |
| `NIM_ENABLE_OTEL` | No | `1` | Set this flag to `0` to disable OpenTelemetry instrumentation in NIMs. |
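Putting the variables above together, a minimal launch might look like the following sketch. The image tag is a placeholder (not taken from this page), and the command is composed into a variable and echoed so you can inspect it before running it yourself:

```shell
# Placeholder image reference -- replace with the actual NV-CLIP NIM
# image name and tag from NGC.
IMG="nvcr.io/nim/example/nvclip:latest"

# Compose the docker run command; echoed rather than executed so it can
# be reviewed first. NGC_API_KEY is passed through from the host shell.
CMD="docker run -it --rm --gpus all \
  -e NGC_API_KEY=\$NGC_API_KEY \
  -e NIM_LOG_LEVEL=INFO \
  -e NIM_HTTP_API_PORT=8000 \
  -p 8000:8000 \
  $IMG"
echo "$CMD"
```

Note that the `-p 8000:8000` publish flag matches the `NIM_HTTP_API_PORT` value, per the table above.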
The following table lists the paths inside the container to mount local paths.

| Container path | Required? | Notes | Docker argument example |
|---|---|---|---|
| `/opt/nim/.cache` (or `NIM_CACHE_PATH` if present) | Not required, but if this volume is not mounted, the container does a fresh download of the model each time it is brought up. | This is the directory within which models are downloaded inside the container. It is very important that this directory can be accessed from inside the container; this can be achieved by adding the option `-u $(id -u)` to the `docker run` command. For example, to use `~/.cache/nim` as the host machine directory for caching models, first run `mkdir -p ~/.cache/nim` before running the `docker run ...` command. | `-v ~/.cache/nim:/opt/nim/.cache -u $(id -u)` |
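The cache mount described above can be sketched as follows. The image tag is again a placeholder, and the `docker run` command is echoed for inspection rather than executed:

```shell
# Create the host-side cache directory first so it exists and the
# container user can write to it.
mkdir -p ~/.cache/nim

IMG="nvcr.io/nim/example/nvclip:latest"   # placeholder image tag

# Run as the current host user (-u) so files written to the mounted
# cache are owned by you; echoed rather than executed here.
CMD="docker run -it --rm --gpus all \
  -u $(id -u) \
  -v ~/.cache/nim:/opt/nim/.cache \
  -e NGC_API_KEY=\$NGC_API_KEY \
  $IMG"
echo "$CMD"
```

On subsequent launches the same mount lets the container reuse the previously downloaded model artifacts instead of fetching them again.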