# Air Gap Deployment
NVIDIA NIM for Vision Language Models (VLMs) supports serving models in an air gap system (also known as an air-walled, air-gapped, or disconnected network). In an air gap system, you can run a NIM with no internet connection and no connection to the NGC registry.
Before you use this documentation, review all prerequisites and instructions in Get Started with NIM and in Serving Models from Local Assets.
For air gap deployment, use the offline cache option described in the following section.
## Air Gap Deployment (Offline Cache Option)
NIM supports serving models in an air gap system. If NIM detects a previously loaded profile in the cache, it serves that profile from the cache.
### Prerequisites
Before deploying in an air gap environment, you need to download the model profiles on a system with internet access. Follow these steps:
**Step 1: Set up environment variables on the connected system**
```shell
# Choose a container name for bookkeeping
export CONTAINER_NAME=vlm-download-container

# Set your NIM image name (replace with actual values from NGC)
export IMG_NAME=<your-nim-image-name>

# Choose a path on your system to cache the downloaded models
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
```
**Step 2: Download models to the cache on the connected system**
First, launch the NIM container to access the download utilities:
```shell
docker run -it --rm --name=$CONTAINER_NAME \
  --runtime=nvidia \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u $(id -u) \
  $IMG_NAME bash
```
Inside the container, list available profiles and download the one you need:
```shell
# List the available model profiles
list-model-profiles

# Download the profile that matches your air-gapped target (replace with the actual profile ID)
download-to-cache --profile 09e2f8e68f78ce94bf79d15b40a21333cea5d09dbe01ede63f6c957f4fcfab7b
```
Exit the container after the download completes:
```shell
exit
```
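Before moving on, you can sanity-check the populated cache from the host. This is a minimal sketch; the exact directory layout under the cache root varies by NIM version, so treat the listing as informational:

```shell
# List the cached files; the layout under the cache root varies by NIM version.
ls -R "$LOCAL_NIM_CACHE"

# Report the total cache size so you can plan the transfer media.
du -sh "$LOCAL_NIM_CACHE"
```

If the cache is empty or far smaller than the model you expect, re-run `download-to-cache` before transferring.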
## Air-Gapped System Deployment
**Step 3: Transfer the cache to the air-gapped system**
Copy the entire cache directory (`$LOCAL_NIM_CACHE`) from the connected system to your air-gapped system, for example by using removable media or another approved transfer mechanism.
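One way to package the cache for transfer is a single compressed archive, which preserves the directory structure. The paths below (`nim-cache.tar.gz`, `~/transferred-cache`) are hypothetical examples to adapt to your environment:

```shell
# On the connected system: pack the cache into one archive (hypothetical archive name).
tar -czf nim-cache.tar.gz -C "$LOCAL_NIM_CACHE" .

# Move nim-cache.tar.gz to the air-gapped system by approved media
# (for example, a vetted USB drive or a one-way transfer appliance).

# On the air-gapped system: unpack into a staging directory (hypothetical path).
mkdir -p ~/transferred-cache
tar -xzf nim-cache.tar.gz -C ~/transferred-cache
```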
**Step 4: Set up environment variables on the air-gapped system**
```shell
# Choose a container name for bookkeeping
export CONTAINER_NAME=vlm-airgap-container

# Set your NIM image name (same as used for the download)
export IMG_NAME=<your-nim-image-name>

# Path to the cache directory on the air-gapped system
export AIR_GAP_NIM_CACHE=~/.cache/air-gap-nim-cache

# Create the directory and copy in the transferred cache
mkdir -p "$AIR_GAP_NIM_CACHE"
cp -r <path-to-transferred-cache>/* "$AIR_GAP_NIM_CACHE"
```
**Step 5: Launch NIM on the air-gapped system**
```shell
# Replace the NIM_MODEL_PROFILE value with the profile ID you downloaded
docker run -it --rm --name=$CONTAINER_NAME \
  --runtime=nvidia \
  --gpus all \
  --shm-size=16GB \
  -e NIM_MODEL_PROFILE=09e2f8e68f78ce94bf79d15b40a21333cea5d09dbe01ede63f6c957f4fcfab7b \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache" \
  -u $(id -u) \
  -p 8000:8000 \
  $IMG_NAME
```
## Verification
Once the NIM is running on the air-gapped system, verify the deployment by checking the available models:
```shell
curl -X GET 'http://0.0.0.0:8000/v1/models'
```
You should see output similar to what’s described in the Run Inference section of the Get Started guide.
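Beyond listing models, you can send a test inference request. The following is a sketch of an OpenAI-compatible chat completion call for a VLM; the model ID and image URL are placeholders, so substitute an ID returned by `/v1/models` and an image reachable from within the air-gapped network:

```shell
# Hypothetical request; replace the model ID with one returned by /v1/models
# and the image URL with one reachable inside the air-gapped network.
curl -X POST 'http://0.0.0.0:8000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "<model-id-from-/v1/models>",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image."},
          {"type": "image_url", "image_url": {"url": "<image-url>"}}
        ]
      }
    ],
    "max_tokens": 64
  }'
```

A JSON response containing a `choices` array indicates the profile was served successfully from the offline cache.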