Configuration#
NIM LLM offers several configuration options using environment variables to control caching, logging, and model profiles.
Export the API key#
To use your API key when starting the NIM container, you must make it available as an environment variable.
Export NGC API Key
Export the variable in your shell (temporary), replacing VALUE with your actual API key:
export NGC_API_KEY=VALUE
Persist the variable (optional):
If using bash:
echo "export NGC_API_KEY=$NGC_API_KEY" >> ~/.bashrc
If using zsh:
echo "export NGC_API_KEY=$NGC_API_KEY" >> ~/.zshrc
Verify the variable is set:
echo "$NGC_API_KEY"
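Once exported, the key can be forwarded into the container's environment with Docker's -e flag. A minimal sketch, where the image name and tag are placeholders rather than a specific release:

```shell
# Sketch: forward the host's NGC_API_KEY into the container.
# The image name below is a placeholder; substitute your NIM image.
docker run --rm -e NGC_API_KEY nvcr.io/nim/<org>/<model>:latest
```

Passing -e NGC_API_KEY without a value copies the variable from your current shell, so the key itself never appears in the command line or shell history.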
Export Hugging Face Access Token
Complete the steps in the Model-Specific NIM tab to export your NGC Personal API key.
Export the variable in your shell (temporary), replacing <token-value> with your actual token:
export HF_TOKEN="<token-value>"
Persist it (optional, for future terminals):
If using bash:
echo 'export HF_TOKEN="<token-value>"' >> ~/.bashrc
If using zsh:
echo 'export HF_TOKEN="<token-value>"' >> ~/.zshrc
Verify the variable is set:
echo "$HF_TOKEN"
Note
If you want to serve a pre-downloaded local model or a private cloud model instead of downloading one from Hugging Face, you do not need a Hugging Face access token. Refer to Model Downloads for your workflow.
Important
For enhanced security, consider storing your token in a file and retrieving it as needed with cat. Alternatively, consider using a password manager.
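The file-based approach mentioned above can be sketched as follows; the path and token value here are illustrative only:

```shell
# Store the token in a file with owner-only permissions instead of
# typing it into the shell (path and token value are illustrative).
TOKEN_DIR="$(mktemp -d)"
printf '%s' "nvapi-example-key" > "$TOKEN_DIR/ngc_api_key"
chmod 600 "$TOKEN_DIR/ngc_api_key"

# Retrieve it only when needed with cat, so the secret never sits
# in your shell history or profile files.
export NGC_API_KEY="$(cat "$TOKEN_DIR/ngc_api_key")"
echo "key length: ${#NGC_API_KEY}"   # prints "key length: 17"
```
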
Model Cache and Source#
Local Cache
An essential variable to configure on your host system is the cache path directory. This directory is mapped from the host machine into the container; assets (for example, model weights) are downloaded to this host directory and persist across container restarts. Configuring a local cache is highly recommended because it avoids re-downloading large model files each time the container restarts. You can give the environment variable that holds the cache path any name you like.
Create the cache directory and export an environment variable:
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p $LOCAL_NIM_CACHE
# Optionally add sticky bit to avoid issues writing to the cache if the container is running as a different user
chmod -R a+rwxt $LOCAL_NIM_CACHE
When you start the NIM container, you must map this directory to the container's internal cache path (/opt/nim/.cache) using a Docker volume mount:
-v "$LOCAL_NIM_CACHE:/opt/nim/.cache"
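Putting the pieces together, a container launch with the cache mounted might look like the following sketch; the image name, tag, and port are placeholders, so adjust them for your deployment:

```shell
# Sketch: start a NIM container with the host cache mounted.
# Image name/tag and port are placeholders, not a specific release.
docker run -it --rm \
  --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/<org>/<model>:latest
```

On the first run, the model assets land in $LOCAL_NIM_CACHE on the host; on subsequent runs the container finds them already in /opt/nim/.cache and skips the download.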
Tip
To make this setting permanent across terminal sessions, you can add export LOCAL_NIM_CACHE=~/.cache/nim to your ~/.bashrc or ~/.zshrc profile.
Custom Model Source
After you have set up your local cache, you can configure a model-free NIM to download a model from Hugging Face. To do this, set the appropriate environment variable to point at the model source; the downloaded files are stored in your cache directory.
Specify a Hugging Face model directly using the hf:// prefix.
export NIM_MODEL_PATH="hf://openai/gpt-oss-20b"
The NIM automatically downloads the model from Hugging Face and caches it in your configured cache directory ($LOCAL_NIM_CACHE). When running the container, mount this cache directory to a path inside the container using the -v flag.
For example, -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" tells Docker: “Take the folder at $LOCAL_NIM_CACHE on my host machine, and make it available inside the container at the path /opt/nim/.cache.”
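As a sketch, a launch command for the Hugging Face source might combine the variables above like this; the image name is a placeholder, and HF_TOKEN authorizes the download:

```shell
# Sketch: download and serve a Hugging Face model via hf://.
# The image name/tag below is a placeholder for your NIM image.
export NIM_MODEL_PATH="hf://openai/gpt-oss-20b"
docker run -it --rm \
  --gpus all \
  -e HF_TOKEN \
  -e NIM_MODEL_PATH \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/<org>/<model>:latest
```
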
Note
If you plan to use a model source other than Hugging Face, such as a pre-downloaded local model or private cloud storage, the URI and path will be different. Refer to model-download-sources for instructions.
Advanced Configurations#
For production deployments or specific organizational requirements, you can configure additional environment variables. NIM LLM supports advanced settings such as:
TLS/SSL Configuration (NIM_SSL_MODE, NIM_SSL_KEY_PATH, NIM_SSL_CERTS_PATH, NIM_SSL_CA_CERTS_PATH): Control whether SSL/TLS is enabled for API connections, and specify the locations of SSL private keys, certificates, and CA certificates.
Unified Structured Logging and Verbosity (NIM_JSONL_LOGGING, NIM_LOG_LEVEL): Enable structured JSONL log output and set the logging verbosity level for debugging or monitoring.
Manual Model Profile Overrides (NIM_MODEL_PROFILE): Manually select or override the model execution profile, allowing control over parameters like precision or hardware acceleration.
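As an illustration only, these settings are passed at launch like any other environment variable. Every value below is a placeholder; the accepted values for each variable are documented in Advanced Configurations:

```shell
# Sketch: advanced settings passed as environment variables.
# All values are illustrative placeholders, not recommended settings.
docker run -it --rm \
  --gpus all \
  -e NIM_SSL_MODE=<mode> \
  -e NIM_SSL_KEY_PATH=/etc/nim/ssl/server.key \
  -e NIM_SSL_CERTS_PATH=/etc/nim/ssl/server.crt \
  -e NIM_JSONL_LOGGING=1 \
  -e NIM_LOG_LEVEL=<level> \
  -e NIM_MODEL_PROFILE=<profile-id> \
  nvcr.io/nim/<org>/<model>:latest
```

Note that paths such as /etc/nim/ssl/server.key refer to locations inside the container, so certificate files must also be mounted in with -v.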
For more details on these advanced settings, refer to Advanced Configurations.