Configuring the NIM
The following environment variables can be used to configure the NIM at runtime:
ENV | Required | Default | Notes |
---|---|---|---|
NGC_API_KEY | Yes | None | Your NGC API key for model access. |
NIM_HTTP_API_PORT | No | 8000 | The port for the HTTP API server. |
USE_CUDA_IPC | No | Auto-detected | Enables CUDA IPC for inter-process communication. If not set explicitly, the value is auto-detected. |
ENABLE_CUDA_MPS | No | 0 | Allows CUDA to use the Multi-Process Service (MPS). On some systems and configurations, this can improve performance. Values |
NIM_ENABLE_OTEL | No | True | Enables OpenTelemetry. |
CUDA_VISIBLE_DEVICES | No | All | A comma-separated list of GPU IDs to use. |
NIM_CACHE_PATH | No | /opt/nim/.cache | The location inside the NIM container used for caching. |
NIM_LOG_LEVEL | No | DEFAULT | Controls NIM logging verbosity. Supported levels are DEFAULT, DEBUG, INFO, WARNING, ERROR, CRITICAL, and TRACE. |
NIM_TRITON_LOG_VERBOSE | No | 0 | Controls logging verbosity of the Triton server component. Supported values are 0 through 5, with 0 the least verbose and 5 the most verbose. |
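The constraints in the table can be checked before a container is launched. The following is a minimal sketch, not part of the NIM itself: the function name `validate_nim_env` is hypothetical, and only the allowed values (log levels, the 0 to 5 Triton verbosity range, the default port) come from the table. The port range check is an ordinary TCP assumption.

```python
# Allowed values taken from the table; the helper itself is illustrative only.
LOG_LEVELS = {"DEFAULT", "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL", "TRACE"}
TRITON_VERBOSITY = {"0", "1", "2", "3", "4", "5"}


def validate_nim_env(env):
    """Return a list of problems found in `env`, a dict mapping names to string values."""
    problems = []

    # NGC_API_KEY is the only required variable.
    if not env.get("NGC_API_KEY"):
        problems.append("NGC_API_KEY is required")

    # NIM_HTTP_API_PORT defaults to 8000; when set it must be a valid TCP port.
    port = env.get("NIM_HTTP_API_PORT", "8000")
    if not (port.isascii() and port.isdigit() and 1 <= int(port) <= 65535):
        problems.append(f"NIM_HTTP_API_PORT is not a valid port: {port!r}")

    # NIM_LOG_LEVEL must be one of the documented level names.
    level = env.get("NIM_LOG_LEVEL", "DEFAULT")
    if level not in LOG_LEVELS:
        problems.append(f"NIM_LOG_LEVEL is not a supported level: {level!r}")

    # NIM_TRITON_LOG_VERBOSE must be an integer from 0 to 5.
    verbose = env.get("NIM_TRITON_LOG_VERBOSE", "0")
    if verbose not in TRITON_VERBOSITY:
        problems.append(f"NIM_TRITON_LOG_VERBOSE must be 0-5, got {verbose!r}")

    return problems
```

Calling `validate_nim_env(dict(os.environ))` from a launch script would surface a missing key or a mistyped log level before any image is pulled.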
Additional Runtime Variables:
--tmpfs /tmp/ram:rw,size=2g
: Container runtime flag that mounts /tmp/ram as a tmpfs backed by host memory (capped at 2 GiB here), which may improve speed on some systems.
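As a rough illustration of how the environment variables and the tmpfs mount fit together, the sketch below assembles a `docker run` argument list. It assumes Docker's `--tmpfs <path>:<options>` syntax; the function name `docker_run_args`, the `--rm` and `--gpus all` flags, the port publishing, and the image name are assumptions for illustration and are not taken from this page. `NGC_API_KEY` is passed by name only, so Docker inherits its value from the calling shell and the key never appears on the command line.

```python
import shlex


def docker_run_args(image, env, tmpfs_size="2g"):
    """Build an argv list for `docker run` with NIM settings and a RAM-backed /tmp/ram."""
    args = ["docker", "run", "--rm", "--gpus", "all"]

    # Docker's --tmpfs takes <container path>:<mount options>.
    args += ["--tmpfs", f"/tmp/ram:rw,size={tmpfs_size}"]

    # Name-only form: the value is inherited from the caller's environment.
    args += ["-e", "NGC_API_KEY"]

    # Non-secret settings are passed as NAME=value, sorted for a stable command line.
    for name in sorted(env):
        args += ["-e", f"{name}={env[name]}"]

    # Publish the HTTP API port (8000 unless NIM_HTTP_API_PORT overrides it).
    port = env.get("NIM_HTTP_API_PORT", "8000")
    args += ["-p", f"{port}:{port}"]

    args.append(image)
    return args


if __name__ == "__main__":
    # The image name below is a placeholder, not a real registry path.
    argv = docker_run_args("example.invalid/nim-image:tag", {"NIM_LOG_LEVEL": "INFO"})
    print(shlex.join(argv))
```

Building the command as a list and quoting it with `shlex.join` only for display avoids shell-quoting mistakes when the list is handed to `subprocess.run`.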
Host caching via LOCAL_NIM_CACHE
To persist model downloads across container restarts and align with other NIMs, bind mount a host cache directory using LOCAL_NIM_CACHE (see Quickstart for example commands). The directory you mount on the host becomes the in-container /opt/nim/.cache path, which is also the value of NIM_CACHE_PATH.
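The bind mount described above can be sketched as a small helper that resolves the host directory and produces the matching `-v` argument. This is an assumption-laden illustration: the function name `cache_mount_args` and the fallback directory `~/.cache/nim` are hypothetical, and only LOCAL_NIM_CACHE and /opt/nim/.cache come from the text.

```python
import os
from pathlib import Path

# In-container cache path named in the text (the NIM_CACHE_PATH default).
CONTAINER_CACHE = "/opt/nim/.cache"


def cache_mount_args(environ=None, fallback="~/.cache/nim"):
    """Return ['-v', '<host dir>:/opt/nim/.cache'], creating the host directory if needed."""
    environ = os.environ if environ is None else environ

    # LOCAL_NIM_CACHE names the host-side directory; the fallback is an assumption.
    host_dir = Path(environ.get("LOCAL_NIM_CACHE") or fallback).expanduser().resolve()

    # Create the directory up front so the mount source is owned by the current user.
    host_dir.mkdir(parents=True, exist_ok=True)

    return ["-v", f"{host_dir}:{CONTAINER_CACHE}"]
```

The returned pair can be appended to a `docker run` argument list; because the host directory outlives the container, models downloaded on the first start are reused on later starts.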
Notes on input resolution and variants
The current release exposes the model as cosmos-embed1 and bundles the 224p variant, which produces 256-dimensional embeddings. Input videos using supported codecs and sizes are automatically resized by the NIM. Variant selection is not configurable in this release.