Configuration

The MSA Search NIM runs in Docker containers. Each NIM has its own Docker container, and there are several ways to configure it. The sections below describe how to configure a NIM container.

GPU Selection

By default, Docker can use all available GPUs on the system when it starts with the NVIDIA Container Runtime:

docker run --runtime=nvidia ...

In environments with a combination of GPUs, you can expose only specific GPUs inside the container using either:

  • The --gpus flag. For example, docker run --gpus='"device=1"' ...

  • The environment variable NVIDIA_VISIBLE_DEVICES. For example, to expose only device 1, pass -e NVIDIA_VISIBLE_DEVICES=1. To expose GPU IDs 1 and 4, pass -e NVIDIA_VISIBLE_DEVICES=1,4.

The device IDs to use as inputs are listed in the output of nvidia-smi -L:

GPU 0: Tesla H100 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)

Refer to the NVIDIA Container Toolkit documentation for more instructions.
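
As a rough illustration of the two options above, the following shell sketch lists the GPUs on the host and assembles a docker run command restricted to GPU IDs 1 and 4. The image name in IMAGE is a placeholder rather than a real image reference, and the script only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Placeholder image reference (hypothetical); substitute your MSA Search NIM image.
IMAGE="<your-nim-image>"

# GPU IDs to expose, taken from the indices printed by `nvidia-smi -L`.
GPU_IDS="1,4"

# List the GPUs on this host when nvidia-smi is available.
# `|| true` tolerates hosts where the tool exists but the driver is not working.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || true
fi

# Option 1: the NVIDIA_VISIBLE_DEVICES environment variable.
ENV_ARG="NVIDIA_VISIBLE_DEVICES=${GPU_IDS}"
echo "docker run --runtime=nvidia -e ${ENV_ARG} ${IMAGE}"

# Option 2: the --gpus flag. The inner double quotes keep the
# comma-separated device list together as a single value.
GPUS_ARG="--gpus='\"device=${GPU_IDS}\"'"
echo "docker run ${GPUS_ARG} ${IMAGE}"
```

Either printed command can be pasted into a terminal once the placeholder image has been replaced.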

Environment Variables

The following table describes the environment variables that can be passed into a NIM as a -e argument added to a docker run command:

| ENV | Required? | Default | Notes |
|---|---|---|---|
| NGC_API_KEY | Yes | None | You must set this variable to the value of your personal NGC API key. |
| NIM_CACHE_PATH | No | /opt/nim/.cache | Location (in container) where the container caches model artifacts. |
| NIM_HTTP_API_PORT | No | 8000 | Publishes the NIM service on the specified port inside the container. Adjust the port passed to the -p/--publish flag of docker run to match (for example, -p $NIM_HTTP_API_PORT:$NIM_HTTP_API_PORT). The left-hand side of the : is your host address:port and does NOT have to match $NIM_HTTP_API_PORT. The right-hand side of the : is the port inside the container, which MUST match NIM_HTTP_API_PORT (or 8000 if not set). Supported endpoints are /v1/license (returns the license information), /v1/metadata (returns metadata including asset information, license information, model information, and version), and /v1/metrics (exposes Prometheus metrics via an ASGI app endpoint). |
| NIM_DISABLE_GPU_SERVER | No | True | Disables the GPU Server for MMSeqs2. The GPU Server improves performance when using compatible databases. If you are using compatible databases, you can enable the GPU Server with NIM_DISABLE_GPU_SERVER=False. |
| NIM_GLOBAL_MAX_MSA_DEPTH | No | 10,000 | Sets a hard limit on the number of returned MSA sequences. |
| NIM_LOG_LEVEL | No | INFO | Sets the level of logging detail in the container’s logs. Available options are DEBUG, INFO, WARNING, and ERROR. |
| MODEL_PATH | No | Unset | Enables a hard override of the NIM’s model path. Users should generally not need this variable, but it can be useful when deploying to cloud services that use alternative methods for model caching. |
| NIM_MAX_WAIT_FOR_GPU_ACQUISITION | No | 300 | The time in seconds that a request waits before failing if it cannot acquire a GPU. On heavily utilized, underprovisioned instances, a longer timeout may reduce failures at the expense of longer queueing times. |
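
As a minimal sketch of how several of these variables fit together, the script below assembles a docker run command that forwards NGC_API_KEY from the calling shell, sets the container port and log level, and publishes the service on a different host port. The image name and the host port 8080 are assumptions chosen for illustration, and the script prints the command instead of running it.

```shell
#!/bin/sh
# Placeholder image reference (hypothetical); substitute your MSA Search NIM image.
IMAGE="<your-nim-image>"

# Port the service listens on inside the container (NIM_HTTP_API_PORT).
NIM_HTTP_API_PORT=8000
# Host port chosen for this example; it does not have to match the container port.
HOST_PORT=8080

# host:container mapping for -p; the right-hand side must equal NIM_HTTP_API_PORT.
PUBLISH="${HOST_PORT}:${NIM_HTTP_API_PORT}"

# Build the command piece by piece. `-e NGC_API_KEY` with no value tells
# docker to forward the variable from the calling shell, so the key itself
# never appears in the printed command.
CMD="docker run --runtime=nvidia"
CMD="${CMD} -e NGC_API_KEY"
CMD="${CMD} -e NIM_HTTP_API_PORT=${NIM_HTTP_API_PORT}"
CMD="${CMD} -e NIM_LOG_LEVEL=DEBUG"
CMD="${CMD} -p ${PUBLISH}"
CMD="${CMD} ${IMAGE}"
echo "${CMD}"

# Once the container is up, the metadata endpoint would be reachable
# from the host through the published port:
METADATA_URL="http://localhost:${HOST_PORT}/v1/metadata"
echo "curl ${METADATA_URL}"
```

Keeping the host and container ports in separate variables makes it harder to accidentally change one side of the -p mapping without the other.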

Volumes

The following table describes the paths inside the container into which local paths can be mounted.

| Container path | Required | Notes | Docker argument example |
|---|---|---|---|
| /opt/nim/.cache (or NIM_CACHE_PATH if present) | Not required, but if this volume is not mounted, the container downloads the model afresh each time it is brought up. | This is the directory into which models are downloaded inside the container. This directory must be accessible from inside the container, which can be achieved by setting the permissions of the local directory to read-write-execute (777). For example, to use ~/.cache/nim as the host machine directory for caching models, first run mkdir -p ~/.cache/nim, then chmod 777 ~/.cache/nim before running the docker run command. | -v ~/.cache/nim:/opt/nim/.cache |
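
The cache setup described above can be sketched as a short script: it creates the host directory, opens its permissions, and prints a docker run command that mounts it at the default container cache path. The image name is a placeholder, and LOCAL_NIM_CACHE and VOLUME_ARG are variable names introduced here only for illustration.

```shell
#!/bin/sh
# Placeholder image reference (hypothetical); substitute your MSA Search NIM image.
IMAGE="<your-nim-image>"

# Host directory used to persist downloaded models between container runs.
LOCAL_NIM_CACHE="${HOME}/.cache/nim"

# Create the directory and make it readable, writable, and traversable
# from inside the container.
mkdir -p "${LOCAL_NIM_CACHE}"
chmod 777 "${LOCAL_NIM_CACHE}"

# host_path:container_path mapping for -v; /opt/nim/.cache is the default
# cache location unless NIM_CACHE_PATH is set to something else.
VOLUME_ARG="${LOCAL_NIM_CACHE}:/opt/nim/.cache"

# Print the command for review instead of running it.
echo "docker run --runtime=nvidia -e NGC_API_KEY -v ${VOLUME_ARG} ${IMAGE}"
```

If NIM_CACHE_PATH is overridden, the right-hand side of the -v mapping would need to change to the same path.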