Configuration#

Riva NMT NIM uses docker containers under the hood. Each NIM has its own Docker container and there are several ways to configure it. The remainder of this section describes how to configure a NIM container.

GPU Selection#

Passing --gpus all to docker run is not supported in environments with two or more GPUs.

In environments with a combination of GPUs, expose specific GPU inside the container using either:

the --gpus flag (ex: --gpus='"device=1"')
the environment variable NVIDIA_VISIBLE_DEVICES (ex: -e NVIDIA_VISIBLE_DEVICES=1)

The device ID(s) to use as input(s) are listed in the output of nvidia-smi -L:

GPU 0: Tesla H100 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)

Refer to the NVIDIA Container Toolkit documentation for more instructions.

Shared memory flag#

Tokenization uses Triton’s Python backend capabilities that scales with the number of CPU cores available. You may need to increase the available shared memory given to the microservice container.

Example providing 8g of shared memory:

docker run ... --shm-size=8g ...

Environment Variables#

The following table describes the environment variables that can be passed into a NIM as a -e argument added to a docker run command:

General#

NGC_API_KEY

Your personal NGC API key. Required.

NIM_CACHE_PATH

The location in the container where the container caches model artifacts. If this volume is not mounted, the container does a fresh download of the model every time the container starts.

Default: /opt/nim/.cache

NIM_HTTP_API_PORT

Publish the NIM HTTP service to the prescribed port inside the container. Make sure to adjust the port passed to the -p/--publish flag of docker run to reflect that (ex: -p $NIM_HTTP_API_PORT:$NIM_HTTP_API_PORT). The left-hand side of this : is your host address:port, and does NOT have to match with $NIM_HTTP_API_PORT. The right-hand side of the : is the port inside the container which MUST match NIM_HTTP_API_PORT or (or 9000 if not set). Supported endpoints are as follows:

/v1/license: Returns the license information
/v1/metadata: Returns metadata including asset information, license information, model information, and version
/v1/metrics: Exposes Prometheus metrics via an ASGI app endpoint

Default: 9000

NIM_GRPC_API_PORT

Publish the Riva TTS service over GRPC to the prescribed port inside the container. Make sure to adjust the port passed to the -p/--publish flag of docker run to reflect that (ex: -p $NIM_GRPC_API_PORT:$NIM_GRPC_API_PORT). The left-hand side of this : is your host address:port, and does NOT have to match with $NIM_GRPC_API_PORT. The right-hand side of the : is the port inside the container which MUST match NIM_GRPC_API_PORT (or 50051 if not set).

Default: 50051

NIM_HTTP_TRITON_PORT

Port number for Triton HTTP endpoint. This is used internally and need not be exposed to the host.

Default: 8000

NIM_GRPC_TRITON_PORT

Port number for Triton gRPC endpoint. This is used internally and need not be exposed to the host.

Default: 8001

NIM_TRITON_METRICS_PORT

Port number for Triton metrics endpoint. This is used internally and need not be exposed to the host.

Default: 8002

SSL / HTTPS#

These variables control SSL/TLS configuration for secure connections to and from the NIM service, and secure model downloads. Use these variables to enable HTTPS, configure certificates, and manage proxy and certificate authority settings.

NIM_SSL_MODE

Specify a value to enable SSL/TLS in served endpoints. The possible values are as follows:

"DISABLED": No HTTPS
"TLS": HTTPS with only server-side TLS (client certificate not required). TLS requires NIM_SSL_CERTS_PATH and NIM_SSL_KEY_PATH to be set.
"MTLS": HTTPS with mTLS (client certificate required). MTLS requires NIM_SSL_CERTS_PATH, NIM_SSL_KEY_PATH, NIM_SSL_CLIENT_KEY_PATH, NIM_SSL_CLIENT_CERT_PATH and NIM_SSL_CA_CERTS_PATH to be set.

Default: "DISABLED"

NIM_SSL_KEY_PATH

The path to the server’s TLS private key file (required for TLS HTTPS). It’s used to decrypt incoming messages and sign outgoing ones. Required if NIM_SSL_MODE is enabled.

Default: None

NIM_SSL_CERT_PATH

The path to the server’s certificate file (required for TLS HTTPS). It contains the public key and server identification information. Required if NIM_SSL_MODE is enabled. Default: None

NIM_SSL_CA_CERTS_PATH

The path to the CA (Certificate Authority) certificate. Required if NIM_SSL_MODE="MTLS".

Default: None

NIM_SSL_CLIENT_KEY_PATH

The path to the client’s TLS private key file (required for TLS HTTPS). Required if NIM_SSL_MODE=mtls is enabled.

Default: None

NIM_SSL_CLIENT_CERT_PATH

The path to the client’s certificate file (required for TLS HTTPS). Required if NIM_SSL_MODE=mtls is enabled.

Default: None

Local Model Cache#

These variables manage location of the model cache. Downloaded models and custom profiles are saved into the cache. Mount a cache volume and provide path to the volume in the NIM_CACHE_PATH environment variable.

NIM_CACHE_PATH: The location in the container where the container caches model artifacts. If this volume is not mounted, the container does a fresh download of the model every time the container starts. Default: /opt/nim/.cache

Volumes#

These settings define how to mount local file system paths into the NIM container.

/opt/nim/.cache

This is the default directory where models are downloaded and cached inside the container. Mount a directory from your host machine to this path to preserve the cache between container runs. If this volume is not mounted, the container will download the model every time it starts. You can customize this path with the NIM_CACHE_PATH environment variable.

For example, to use ~/.cache/nim on your host machine as the cache directory:

Create the directory on your host: mkdir -p ~/.cache/nim
Mount the directory by running the docker run command with the -v and -u options: docker run ... -v ~/.cache/nim:/opt/nim/.cache -u $(id -u)