Hardware Requirements#
This NIM offers optimized performance through a custom vLLM backend for a limited set of GPUs.
The following are the minimum required specifications for supported hardware components:
| Requirement | Specification |
|---|---|
| CPU | AMD64, ARM64 |
| GPU | Refer to |
Software Requirements#
Minimum required versions for supported software components.
| Requirement | Specification |
|---|---|
| Operating System | Ubuntu 22.04 LTS or later recommended |
| Container Toolkit | 1.14.0 or later |
| CUDA SDK | 12.9 or later |
| GPU Driver | 580 or later |
| Docker | 24.0 or later |
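The minimums in the table can be compared against installed versions with a small helper. The following is a minimal sketch, not part of NIM: the function names `version_ge` and `check_min` are hypothetical, and the comparison assumes GNU `sort -V` (version sort), which ships with coreutils on Ubuntu.

```shell
# version_ge ACTUAL MINIMUM: succeeds when ACTUAL >= MINIMUM.
# Assumption: GNU coreutils `sort -V` is available on the host.
version_ge() {
    local actual="$1" minimum="$2"
    # Version-sort both values; if MINIMUM comes out first, ACTUAL is new enough.
    [ "$(printf '%s\n%s\n' "$minimum" "$actual" | sort -V | head -n 1)" = "$minimum" ]
}

# check_min NAME ACTUAL MINIMUM: prints a one-line verdict for one component.
check_min() {
    if version_ge "$2" "$3"; then
        echo "OK $1 $2 (minimum $3)"
    else
        echo "FAIL $1 $2 (minimum $3)"
        return 1
    fi
}

# Example calls (left commented out so the sketch has no side effects):
# check_min Docker "$(docker version --format '{{.Server.Version}}')" 24.0
# check_min Driver "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)" 580
```

Version sort is used instead of a plain string or numeric comparison so that, for example, 12.10 is treated as newer than 12.9.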
Operating System#
While other Linux distributions can be compatible with NIM, they have not been officially validated.
We recommend using Ubuntu 22.04 LTS or later for the best experience.
CUDA SDK#
Install CUDA SDK by following the CUDA installation guide for Linux.
GPU Drivers#
Install the NVIDIA GPU drivers by following the NVIDIA Driver Installation Guide.
Docker#
Docker is required to run the containerized NIM services.
Install Docker Engine for your Linux distribution by following the Docker Engine installation guide.
Verify that the Docker daemon is running and that your user can execute docker commands without sudo. Add your user to the docker group if needed:
sudo groupadd docker
sudo usermod -aG docker $USER
Log out and back in for the group change to take effect.
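After logging back in, you can confirm that the group change took effect. This is a small sketch, assuming the group is named `docker` as in the steps above; the helper name `in_group` is hypothetical.

```shell
# in_group GROUP [GROUP_LIST]: succeeds when GROUP is one of the space-separated names.
# GROUP_LIST defaults to the current user's groups as reported by `id -nG`.
in_group() {
    local wanted="$1" list="${2:-$(id -nG)}" g
    for g in $list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Example call (commented out so the sketch has no side effects):
# in_group docker && echo "docker group is active" || echo "log out and back in first"
```

If the check fails in a shell that was opened before the `usermod` call, the session is most likely still using the old group list.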
Container Toolkit#
The NVIDIA Container Toolkit enables Docker containers to access the host GPU.
Install the toolkit by following the NVIDIA Container Toolkit installation guide.
Configure Docker to use the NVIDIA runtime by following the Docker configuration steps.
Restart the Docker daemon after configuration:
sudo systemctl restart docker
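Once the daemon has restarted, it can be useful to confirm that Docker lists an `nvidia` runtime before pulling any image. The sketch below assumes the runtime is registered under the name `nvidia` (the name used by `--runtime=nvidia` later on this page) and reads the runtime map from `docker info`; the helper name `has_nvidia_runtime` is hypothetical.

```shell
# has_nvidia_runtime [RUNTIMES_JSON]: succeeds when the JSON map of runtimes has an "nvidia" key.
# RUNTIMES_JSON defaults to the output of `docker info --format '{{json .Runtimes}}'`.
has_nvidia_runtime() {
    local runtimes="${1:-$(docker info --format '{{json .Runtimes}}')}"
    # Match "nvidia" only as a key, not as part of a path such as "nvidia-container-runtime".
    printf '%s' "$runtimes" | grep -q '"nvidia"[[:space:]]*:'
}

# Example call (commented out so the sketch has no side effects):
# has_nvidia_runtime && echo "nvidia runtime registered" || echo "nvidia runtime missing"
```

If the runtime is missing, revisit the Docker configuration steps and restart the daemon again.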
NIM Container Access#
To download and deploy NIM containers, you need one of the following:
A free NVIDIA Developer Program membership.
An NVIDIA AI Enterprise license. To request a free 90-day evaluation license, refer to Ways to Get Started With NVIDIA AI Enterprise and Activate Your NVIDIA AI Enterprise License.
Generate Access Credentials#
An NGC Personal API Key is required to access NVIDIA NIM containers and models hosted on NGC.
Generate the Personal API Key on the Setup API Keys page.
When creating the Personal API key, select at least NGC Catalog from the Services Included list. You can also include additional services if you want to use the same key for other purposes.
Warning
Legacy API keys are not supported by NIM. Always use a Personal API Key.
Verify NVIDIA Runtime Access#
To ensure that your setup is correct, run the following command:
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
This command should produce output similar to the following, where you can confirm the driver version, the CUDA version, and the available GPUs.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05 Driver Version: 580.95.05 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA H100 80GB HBM3 On | 00000000:1B:00.0 Off | 0 |
| N/A 36C P0 112W / 700W | 78489MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
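To check the banner against the Software Requirements without reading it by eye, the two version fields can be extracted with `sed`. This is a sketch that assumes the banner keeps the `Driver Version:` and `CUDA Version:` labels shown above; the helper names `smi_field` and `driver_major_ok` are hypothetical.

```shell
# smi_field BANNER_LINE LABEL: prints the dotted number that follows "LABEL:" in the line.
smi_field() {
    printf '%s\n' "$1" | sed -n "s/.*$2: *\([0-9][0-9.]*\).*/\1/p"
}

# driver_major_ok VERSION [MINIMUM]: succeeds when the major part of VERSION is >= MINIMUM.
# MINIMUM defaults to 580, the GPU driver minimum listed under Software Requirements.
driver_major_ok() {
    local major="${1%%.*}"
    [ "$major" -ge "${2:-580}" ]
}

# Example using the banner line from the sample output:
banner='| NVIDIA-SMI 580.95.05 Driver Version: 580.95.05 CUDA Version: 12.9 |'
echo "driver: $(smi_field "$banner" 'Driver Version')"
echo "cuda:   $(smi_field "$banner" 'CUDA Version')"
```

On a live system the banner line could be obtained with something like `nvidia-smi | grep 'Driver Version'`.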
Export NGC API Key#
Export the variable in your shell (temporary), replacing <VALUE> with your actual API key:
export NGC_API_KEY=<VALUE>
Persist the variable (optional):
If using bash:
echo "export NGC_API_KEY=$NGC_API_KEY" >> ~/.bashrc
If using zsh:
echo "export NGC_API_KEY=$NGC_API_KEY" >> ~/.zshrc
Verify the variable is set:
echo "$NGC_API_KEY"
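Two details of the steps above are worth handling with care: running the `echo ... >>` line more than once appends duplicate entries to the profile, and `echo "$NGC_API_KEY"` prints the full secret to the terminal. The sketch below shows one possible way to address both. The helper names `persist_export` and `mask_secret` are hypothetical, and the sketch assumes the key contains no single quotes.

```shell
# persist_export NAME VALUE RCFILE: appends `export NAME='VALUE'` to RCFILE
# unless RCFILE already has an export line for NAME.
# Assumption: VALUE contains no single-quote characters.
persist_export() {
    local name="$1" value="$2" rcfile="$3"
    [ -n "$value" ] || { echo "$name is empty; nothing written" >&2; return 1; }
    if grep -q "^export $name=" "$rcfile" 2>/dev/null; then
        echo "$name already present in $rcfile"
    else
        printf "export %s='%s'\n" "$name" "$value" >> "$rcfile"
        echo "$name added to $rcfile"
    fi
}

# mask_secret VALUE: prints the first four characters and the total length only.
mask_secret() {
    printf '%.4s...(%d chars)\n' "$1" "${#1}"
}

# Example calls (commented out so the sketch does not touch your profile):
# persist_export NGC_API_KEY "$NGC_API_KEY" ~/.bashrc
# mask_secret "$NGC_API_KEY"
```

Printing a masked value is usually enough to confirm that the variable is set without leaving the key in terminal scrollback.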
Model Cache#
NIM downloads model weights to a cache on the host that you mount into the container. Artifacts persist across restarts, so you do not pull the full model on every run.
Local Cache#
The most important setting on your host system is the path of the cache directory. This directory is mounted from the host machine into the container; assets (for example, model weights) are downloaded to it and persist across container restarts. Configuring a local cache is highly recommended because it avoids re-downloading large model files on subsequent container starts. The environment variable that holds the cache path can have any name; the example below uses LOCAL_NIM_CACHE.
Create the cache directory and export an environment variable:
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
# Optionally make the cache world-writable with the sticky bit set, to avoid write failures if the container runs as a different user
chmod -R a+rwxt "$LOCAL_NIM_CACHE"
When you start the NIM container, you must map your host machine’s local cache directory ($LOCAL_NIM_CACHE) to the container’s internal cache path (/opt/nim/.cache) using a Docker volume mount, such as -v "$LOCAL_NIM_CACHE:/opt/nim/.cache". This mapping ensures that the large model weights downloaded by the container are saved to your host machine. Because containers are ephemeral, any data stored only inside the container is lost when it stops. By using a volume mount, subsequent container runs detect the existing model files in your local cache and skip the lengthy download process, allowing the NIM to start up faster.
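Before starting the container, a quick preflight check can show whether the host directory exists, is writable, and already holds files from an earlier run. This is an illustrative sketch only: the helper name `cache_status` and its status words are hypothetical, and a non-empty directory only suggests that a previous download is present.

```shell
# cache_status DIR: prints one of: missing, not-writable, empty, populated.
# Returns non-zero when the directory cannot be used as a cache.
cache_status() {
    local dir="$1"
    [ -d "$dir" ] || { echo "missing"; return 1; }
    [ -w "$dir" ] || { echo "not-writable"; return 1; }
    if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "populated"   # files exist, so a later run may skip the download
    else
        echo "empty"       # first run: expect a full model download
    fi
}

# Example call (commented out so the sketch has no side effects):
# cache_status "$LOCAL_NIM_CACHE"
```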
Cache Directory Permissions#
The NIM container runs as a non-root user with GID 0 (root group). The cache directory on your host must be writable by GID 0:
export NIM_CACHE_PATH=/tmp/nim-cache
mkdir -p "$NIM_CACHE_PATH"
sudo chgrp -R 0 "$NIM_CACHE_PATH"
sudo chmod -R g+rwX "$NIM_CACHE_PATH"
Run the container with the cache mounted:
docker run --gpus all \
-v "$NIM_CACHE_PATH:/opt/nim/.cache" \
...
To run as a custom user (e.g., your host user), pass -u <uid>:0:
docker run --gpus all -u $(id -u):0 \
-v "$NIM_CACHE_PATH:/opt/nim/.cache" \
...
Important
When using -u <uid>, you must include :0 to set GID 0 (e.g., -u $(id -u):0). The container’s writable directories are group-owned by GID 0. Without it, the container will fail with PermissionError when writing to cache, config, or log paths.
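The ownership requirement can be verified on the host before the first run. The following sketch assumes GNU `stat` (standard on Ubuntu), whose `%g` and `%A` formats print the numeric group ID and the symbolic mode; the helper names `gid0_writable` and `check_cache_dir` are hypothetical.

```shell
# gid0_writable GID MODE: succeeds when GID is 0 and MODE (symbolic, for example
# drwxrwxr-x) grants the group read, write, and execute.
gid0_writable() {
    [ "$1" = "0" ] || return 1
    case "$2" in
        # Characters 5-7 are the group permissions; "s" appears when setgid is also set.
        ????rw[xs]*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_cache_dir DIR: applies the test to a real directory.
# Assumption: GNU stat; the unquoted substitution splits into the GID and the mode.
check_cache_dir() {
    gid0_writable $(stat -c '%g %A' "$1")
}

# Example call (commented out so the sketch has no side effects):
# check_cache_dir "$NIM_CACHE_PATH" && echo "cache is writable by GID 0"
```

A failing check usually means the `chgrp` or `chmod` step above was skipped or was applied to a different path.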
Tip
To make this setting permanent across terminal sessions, you can add export LOCAL_NIM_CACHE=~/.cache/nim to your ~/.bashrc or ~/.zshrc profile.
Docker Login#
To pull the NIM container image from NGC, first authenticate with the NVIDIA Container Registry with the following command:
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
Use the literal string $oauthtoken as the username and the value of NGC_API_KEY as the password. The $oauthtoken username is a special name indicating that you authenticate with an API key rather than a username and password.
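The single quotes around `'$oauthtoken'` in the login command matter: with double quotes, the shell would try to expand a variable named `oauthtoken` and pass an empty username. The short demonstration below illustrates the difference; the variable names `literal` and `expanded` are chosen for this example only.

```shell
oauthtoken=""            # stands in for the usual case: no such variable is defined
literal='$oauthtoken'    # single quotes: the text is passed through unchanged
expanded="$oauthtoken"   # double quotes: the shell substitutes the (empty) variable

printf 'single-quoted: [%s]\n' "$literal"
printf 'double-quoted: [%s]\n' "$expanded"
# prints:
# single-quoted: [$oauthtoken]
# double-quoted: []
```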
For more information on performance benchmarks, refer to the Performance Explorer.