Configuration
NeMo Text Retriever NIMs use Docker containers under the hood. Each NIM is its own Docker container, and there are several ways to configure it. The remainder of this section describes the various ways to configure a NIM container.
GPU Selection
Passing --gpus all to docker run is acceptable in homogeneous environments with one or more of the same GPU.
In heterogeneous environments with a combination of GPUs, such as an A6000 + a GeForce display GPU, workloads should only run on compute-capable GPUs. Expose specific GPUs inside the container using either:
- the --gpus flag (ex: --gpus="device=1")
- the environment variable NVIDIA_VISIBLE_DEVICES (ex: -e NVIDIA_VISIBLE_DEVICES=1)
The device ID(s) to use as input(s) are listed in the output of nvidia-smi -L:
GPU 0: Tesla H100 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
Refer to the NVIDIA Container Toolkit documentation for more instructions.
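As a rough illustration of the selection step above, the following sketch derives a device list from nvidia-smi -L style output and prints the two flag forms. It only prints the flags and does not start a container. The helper name compute_gpus and the rule "skip any GPU whose name contains GeForce" are assumptions made for this example; adapt the filter to your own hardware.

```shell
# Sample `nvidia-smi -L` output, copied from the listing above.
# On a real host, use instead: listing="$(nvidia-smi -L)"
listing='GPU 0: Tesla H100 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)'

# Hypothetical helper: keep the indices of GPUs whose name does not contain
# "GeForce" (an assumed heuristic for "display GPU"), joined with commas.
compute_gpus() {
  printf '%s\n' "$1" \
    | grep -v 'GeForce' \
    | sed -E 's/^GPU ([0-9]+):.*/\1/' \
    | paste -sd, -
}

devices="$(compute_gpus "$listing")"

# Print the two ways of exposing only those GPUs to the container.
# Note: for a comma-separated list, the Docker documentation shows the
# --gpus value with an extra layer of quoting, e.g. --gpus '"device=0,2"'.
echo "--gpus \"device=${devices}\""
echo "-e NVIDIA_VISIBLE_DEVICES=${devices}"
```

Either printed flag can then be added to the docker run command for the NIM.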
Environment Variables
The following table describes the environment variables that can be passed into a NIM as a -e argument added to a docker run command:
| ENV | Required? | Default | Notes |
|---|---|---|---|
| | Yes | None | You must set this variable to the value of your personal NGC API key. |
| | No | | Location (in container) where the container caches model artifacts. |
| | No | | Set to |
| | No | | Log level of NeMo Text Retriever NIM. Possible values are DEFAULT, DEBUG, INFO, WARNING, ERROR, and CRITICAL. The effect of DEBUG, INFO, WARNING, ERROR, and CRITICAL mostly follows the Python 3 logging documentation. |
| | No | | Publish the NIM service to the prescribed port inside the container. Make sure to adjust the port passed to the |
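A minimal sketch of how such -e arguments might be combined into one docker run command is shown below. It only builds and prints the command. The variable names NGC_API_KEY, NIM_LOG_LEVEL, and NIM_HTTP_API_PORT are assumptions (they do not appear in the table above), and the image name is a placeholder; check them against the documentation for your NIM before use.

```shell
# ASSUMED variable names: NGC_API_KEY, NIM_LOG_LEVEL, NIM_HTTP_API_PORT.
# Placeholder image name: replace <nim-image> with the image you are running.
export NGC_API_KEY="${NGC_API_KEY:-<your-ngc-api-key>}"
port=9000
image="<nim-image>"

# "-e NAME" with no value forwards the exported host variable as-is.
# The port set inside the container is mirrored in the -p mapping so that
# the published port and the service port stay in agreement.
cmd="docker run --rm --gpus all"
cmd="$cmd -e NGC_API_KEY"
cmd="$cmd -e NIM_LOG_LEVEL=INFO"
cmd="$cmd -e NIM_HTTP_API_PORT=${port}"
cmd="$cmd -p ${port}:${port}"
cmd="$cmd ${image}"

# Dry run: print the command instead of executing it.
echo "$cmd"
```

Keeping the port in a single shell variable is one simple way to avoid changing the in-container port without also changing the published port.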
Volumes
The following table describes the paths inside the container into which local paths can be mounted.
| Container path | Required | Notes | Docker argument example |
|---|---|---|---|
| | Not required, but if this volume is not mounted, the container will do a fresh download of the model each time it is brought up. | This is the directory within which models are downloaded inside the container. It is very important that this directory can be accessed from inside the container. This can be achieved by adding the option | |
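The mount described above might be prepared as in the following sketch, which creates a host-side cache directory and prints the corresponding docker run arguments. The host path, the LOCAL_NIM_CACHE variable, and the use of -u with the current user ID to make the directory writable from inside the container are assumptions for illustration. The container path is left as a placeholder to be taken from the table.

```shell
# Hypothetical host-side cache directory; any writable path works.
local_cache="${LOCAL_NIM_CACHE:-$HOME/.cache/nim}"
mkdir -p "$local_cache"

# Placeholder: substitute the container path listed in the table above.
container_cache="<container-cache-path>"

# -v mounts the host directory at the container path.
# -u runs the container process as the current host user, which is one
# common way (assumed here) to let it write to the mounted directory.
volume_args="-v ${local_cache}:${container_cache} -u $(id -u)"

# Dry run: print the arguments instead of starting a container.
echo "docker run ${volume_args} ..."
```

With the directory mounted this way, model artifacts downloaded on the first start remain on the host and can be reused on later starts.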