Air-Gap Deployment

Air-gap deployment lets you run a NIM on a system with no internet connection, including no access to remote model registries such as NGC or Hugging Face Hub.

This guide describes how to prepare, transfer, and run NIM LLM in an air-gapped environment.

Overview

Air-gap deployment follows a two-phase workflow:

  1. Network-connected phase: On a machine with internet access and credentials, download model assets and prepare them for transfer.

  2. Air-gapped phase: On the isolated machine, mount the pre-staged model assets and run the NIM container with no outbound network access and no API keys.

Between these phases, you transfer the model assets to the air-gapped system using any allowed channel (archive and copy, scp or rsync, physical media, and so on).

Important

In the air-gapped phase, do not set NGC_API_KEY or HF_TOKEN. The NIM must load all model assets from local storage only.
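One way to enforce this rule is a small pre-flight check in the launch script on the air-gapped host. The sketch below is illustrative only: `check_no_credentials` is a hypothetical helper name, not part of NIM, and it simply refuses to continue when either variable is set to a non-empty value.

```shell
# Pre-flight check for the air-gapped host (hypothetical helper, not a NIM command).
# Returns non-zero if NGC_API_KEY or HF_TOKEN is set to a non-empty value.
check_no_credentials() {
  status=0
  if [ -n "${NGC_API_KEY:-}" ]; then
    echo "error: NGC_API_KEY is set; unset it before the air-gapped run" >&2
    status=1
  fi
  if [ -n "${HF_TOKEN:-}" ]; then
    echo "error: HF_TOKEN is set; unset it before the air-gapped run" >&2
    status=1
  fi
  return "$status"
}

# Example: check_no_credentials && docker run ...
```

Chaining the check in front of the launch command keeps a stray exported key from reaching the container.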

Regular NIM and Model-Free NIM

NIM LLM 2.0 supports two deployment modes, and the air-gap workflow differs between them.

Regular (Model-Specific) NIM

  • The NIM container ships with a model manifest that defines the model and one or more profiles (tensor parallel size, backend, and so on).

  • Model weights are not shipped inside the container. At startup, NIM loads the model from the cache if it is present, or downloads it when the system has network access and credentials.

  • For air-gap, you must pre-stage the model using download-to-cache (and optionally create-model-store) during the connected phase, then transfer the cache or model store to the air-gapped system.

Model-Free NIM

  • The model is configured at runtime by the user (using NIM_MODEL_PATH or the positional vLLM model CLI argument). The container does not ship a fixed model manifest.

  • When NIM_MODEL_PATH points to a local path (for example, /mnt/models/my-llama), NIM serves directly from that path. No download occurs, so if the model directory is already present on the air-gapped system, you do not need to use download-to-cache or create-model-store.
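Before launching a model-free NIM offline, it can help to confirm that the host directory you intend to mount exists and is not empty. A minimal sketch, assuming a hypothetical helper named `require_model_dir` (not a NIM command); it does not validate the model format, only that something is there to serve.

```shell
# Hypothetical helper: fail early if the model directory is missing or empty.
require_model_dir() {  # usage: require_model_dir <host-model-dir>
  model_dir=$1
  if [ ! -d "$model_dir" ]; then
    echo "error: $model_dir is not a directory" >&2
    return 1
  fi
  # ls -A also lists dotfiles; empty output means the directory has no entries.
  if [ -z "$(ls -A "$model_dir")" ]; then
    echo "error: $model_dir is empty" >&2
    return 1
  fi
  echo "ok: $model_dir is present and non-empty"
}

# Example: require_model_dir /mnt/models/my-llama
```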

Quick Reference

The following table summarizes the model source location at runtime and how it is accessed:

| NIM Type | Model Source at Runtime | Download at Startup? | Air-Gap: Need download-to-cache or create-model-store? |
| --- | --- | --- | --- |
| Regular NIM | Baked-in manifest (references remote source) | Yes, if not in cache | Yes: pre-stage cache or model store |
| Model-free NIM | Local path | No | No: ensure local path is available in air-gap |

Network-Connected Phase

On a machine with internet access, use the NIM container to download model assets and optionally create a model store directory.

Download a Model to Cache

The download-to-cache command downloads selected or default model profiles to the NIM cache.

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  "$IMG_NAME" \
  download-to-cache -p "<PROFILE_HASH>"

You can also download all profiles with --all, or let NIM auto-select a profile based on the available hardware by omitting the -p flag.

Tip

Run list-model-profiles to discover available profiles and their hashes before downloading.
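The discovery and download steps can be wrapped in two small functions, sketched below. The function names and the `DOCKER` override are hypothetical and exist only so the commands can be previewed with `DOCKER=echo`. The flags passed to `list-model-profiles` mirror the download command above and are an assumption, so adjust them to your image.

```shell
# DOCKER is a hypothetical override; set DOCKER=echo to print commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Print the profiles (and their hashes) that the image defines.
list_profiles() {
  "$DOCKER" run --rm --gpus all -e NGC_API_KEY "$IMG_NAME" list-model-profiles
}

# Download one profile by hash, or let NIM auto-select when no hash is given.
download_profile() {  # usage: download_profile [profile-hash]
  if [ -n "${1:-}" ]; then
    "$DOCKER" run --rm --gpus all -e NGC_API_KEY \
      -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
      "$IMG_NAME" download-to-cache -p "$1"
  else
    "$DOCKER" run --rm --gpus all -e NGC_API_KEY \
      -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
      "$IMG_NAME" download-to-cache
  fi
}
```

A typical sequence is `list_profiles`, then `download_profile <hash>` with a hash copied from the listing.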

Create a Model Store

The create-model-store command extracts files from a cached model profile and creates a properly formatted model directory. If the profile is not already cached, the command downloads it first.

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -v "$MODEL_REPO:/model-repo" \
  "$IMG_NAME" \
  create-model-store -p "<PROFILE_HASH>" -m /model-repo

The resulting directory is ready for transfer to the air-gapped system.

Transfer Model Assets

Transfer the model store directory (or cache) from the connected machine to the air-gapped environment. Common methods include:

  • Archive and copy: Create a tar or zip archive, transfer it through an allowed channel, then extract it on the air-gapped host.

  • scp or rsync: Use direct directory transfer when a one-way or out-of-band SSH path is available. rsync supports incremental synchronization.

  • Physical media: Copy the directory or archive onto portable storage (for example, a USB drive), physically move it to the air-gapped environment, then extract it.
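The archive-and-copy method can be sketched as a pair of helpers that pack the directory with a checksum on the connected side and verify it before extracting on the air-gapped side. The function names are hypothetical, and the sketch assumes GNU `tar` and `sha256sum` are available on both hosts. It uses an uncompressed archive on the assumption that model weights gain little from further compression.

```shell
# Pack a model store (or cache) directory and record a SHA-256 checksum.
pack_assets() {  # usage: pack_assets <source-dir> <archive.tar>
  src_dir=$1
  archive=$2
  # -C stores paths relative to the parent of the source directory.
  tar -cf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
  # Write the checksum next to the archive so both files travel together.
  ( cd "$(dirname "$archive")" &&
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Verify the checksum, then extract into the destination directory.
unpack_assets() {  # usage: unpack_assets <archive.tar> <dest-dir>
  archive=$1
  dest_dir=$2
  ( cd "$(dirname "$archive")" &&
    sha256sum -c "$(basename "$archive").sha256" ) || return 1
  mkdir -p "$dest_dir" && tar -xf "$archive" -C "$dest_dir"
}

# Connected host:   pack_assets /path/to/model-repository /transfer/model-repository.tar
# Air-gapped host:  unpack_assets /transfer/model-repository.tar /path/to
```

Verifying before extraction means a truncated or corrupted copy is caught before the container ever tries to load it.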

Air-Gapped Phase

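On the air-gapped system, mount the pre-staged model directory and run the NIM container. The container image itself must already be available on the air-gapped side (loaded locally or served from an internal registry), because remote registries are unreachable.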

Use a Model Store

Run the following commands to use a model store:

export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repository
export NIM_SERVED_MODEL_NAME=my-model

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PATH=/model-repo \
  -e NIM_SERVED_MODEL_NAME \
  -v "$MODEL_REPO:/model-repo" \
  "$IMG_NAME"

Important

Do not set NGC_API_KEY or HF_TOKEN in this phase.

Use the Offline Cache

If you prefer to transfer the cache rather than a model store, mount the cache and specify the profile as shown in the following commands:

export CONTAINER_NAME=nim-airgap
export AIR_GAP_NIM_CACHE=/path/to/transferred-cache
export PROFILE_HASH="<PROFILE_HASH>"

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PROFILE="$PROFILE_HASH" \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache" \
  "$IMG_NAME"

Model-Free NIM with a Local Path

When using a model-free NIM with a local model path, no download occurs at startup. If the model directory is already on the air-gapped system, you can run without download-to-cache or create-model-store and without any API keys:

export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repo

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_SERVED_MODEL_NAME=my-model \
  -e NIM_MODEL_PATH=/model-repo \
  -v "$MODEL_REPO:/model-repo" \
  "$IMG_NAME"

Proxy and Certificate Configuration

If the network-connected phase runs behind a corporate or outbound proxy, you must configure the standard proxy environment variables so that downloads are routed correctly.

HTTP/HTTPS Proxy

| Variable | Description |
| --- | --- |
| HTTP_PROXY | Proxy server for HTTP connections |
| HTTPS_PROXY | Proxy server for HTTPS connections |
| NO_PROXY | Comma-separated list of hostnames, domains, or IPs that should bypass the proxy |

Note

In the air-gapped phase, there is no outbound traffic, so proxy variables are not needed.
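As an illustration for the network-connected phase, the proxy variables can be exported on the host and passed through to the container by name. This sketch assumes the placeholder proxy address `http://proxy-host:8888` and an example `NO_PROXY` list. The wrapper `download_via_proxy` is hypothetical, and the `DOCKER` override exists only so the command can be previewed with `DOCKER=echo`. Passing `-e VAR` without a value tells Docker to copy the value from the host environment.

```shell
# Example values only; replace with your proxy endpoint and bypass list.
export HTTP_PROXY=http://proxy-host:8888
export HTTPS_PROXY=http://proxy-host:8888
export NO_PROXY=localhost,127.0.0.1

# DOCKER is a hypothetical override; set DOCKER=echo to print the command.
DOCKER="${DOCKER:-docker}"

# Download a profile with the host's proxy settings forwarded into the container.
download_via_proxy() {  # usage: download_via_proxy <profile-hash>
  "$DOCKER" run --rm --gpus all \
    -e HTTP_PROXY -e HTTPS_PROXY -e NO_PROXY \
    -e NGC_API_KEY \
    -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
    "$IMG_NAME" download-to-cache -p "$1"
}
```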

CA Certificate Injection

When outbound HTTPS traffic goes through a TLS-inspecting proxy, you must inject the corporate Certificate Authority (CA) certificate so that TLS verification succeeds. Set SSL_CERT_FILE to point to the certificate file inside the container.

docker run --rm --gpus all \
  -v /path/to/corporate-ca.pem:/opt/nim/proxy-ca.pem:ro \
  -e SSL_CERT_FILE=/opt/nim/proxy-ca.pem \
  -e HTTP_PROXY=http://proxy-host:8888 \
  -e HTTPS_PROXY=http://proxy-host:8888 \
  -e NGC_API_KEY \
  "$IMG_NAME" \
  download-to-cache -p "<PROFILE_HASH>"

Note

CA certificate injection is only relevant during the network-connected phase. After the deployment is fully air-gapped, there is no outbound HTTPS traffic and no CA configuration is needed.
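A missing or malformed certificate file can cause TLS verification to fail, so a quick sanity check before mounting it may save a failed download. The following is a rough sketch: `check_ca_file` is a hypothetical helper, and it only confirms that the file is readable and contains a PEM certificate header, not that the certificate is valid or trusted.

```shell
# Hypothetical helper: basic sanity check of a PEM CA bundle before mounting it.
check_ca_file() {  # usage: check_ca_file <path-to-ca.pem>
  ca_file=$1
  if [ ! -r "$ca_file" ]; then
    echo "error: cannot read $ca_file" >&2
    return 1
  fi
  # -F matches the header literally; -- stops grep from parsing it as options.
  if ! grep -qF -- '-----BEGIN CERTIFICATE-----' "$ca_file"; then
    echo "error: $ca_file does not look like a PEM certificate" >&2
    return 1
  fi
  echo "ok: $ca_file contains a PEM certificate"
}

# For a deeper look, if OpenSSL is installed:
#   openssl x509 -in /path/to/corporate-ca.pem -noout -subject -enddate
```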

Environment Variables

The following environment variables are relevant to air-gap deployment:

| Variable | Description |
| --- | --- |
| NIM_MODEL_PATH | Path to the local model repository or a remote URI. In air-gapped mode, set to the local mount path. |
| NIM_CACHE_PATH | Directory path for model cache storage. Default: ~/.cache/nim. |
| NIM_MODEL_PROFILE | Profile ID to use. Required when running from cache without a model store. |
| NIM_SERVED_MODEL_NAME | Name used for the served model in API responses. |
| NGC_API_KEY | NGC API key for downloading models. Do not set in air-gapped phase. |
| HF_TOKEN | Hugging Face token for downloading models. Do not set in air-gapped phase. |
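For the air-gapped run, these settings can be collected in a Docker env file so that the launch command stays short and credentials are left out by construction. A minimal sketch, assuming the container path `/model-repo` and the name `my-model` used elsewhere in this guide; `write_airgap_env` and the file name `nim-airgap.env` are hypothetical.

```shell
# Hypothetical helper: write the air-gapped settings to a Docker env file.
write_airgap_env() {  # usage: write_airgap_env <env-file>
  cat > "$1" <<'EOF'
# Air-gapped phase: credential variables are intentionally absent from this file.
NIM_MODEL_PATH=/model-repo
NIM_SERVED_MODEL_NAME=my-model
EOF
}

# Example:
#   write_airgap_env nim-airgap.env
#   docker run --rm --gpus all --shm-size=16GB -p 8000:8000 \
#     --env-file nim-airgap.env -v "$MODEL_REPO:/model-repo" "$IMG_NAME"
```

Docker reads one `VAR=value` pair per line from the file passed to `--env-file` and ignores lines that start with `#`.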

Examples

Use the commands in the following examples to deploy a NIM in an air-gapped environment for different scenarios.

Download and Create a Model Store

Network-connected phase:

export NGC_API_KEY=your_api_key
export IMG_NAME="${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.0"
export PROFILE_HASH="<your_profile_hash>"
export LOCAL_NIM_CACHE="${LOCAL_NIM_CACHE:-$(pwd)/.cache}"
export MODEL_REPO=/path/to/model-repository
mkdir -p "$MODEL_REPO" "$LOCAL_NIM_CACHE"

# Download to cache
docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  "$IMG_NAME" \
  download-to-cache -p "$PROFILE_HASH"

# Create model store
docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -v "$MODEL_REPO:/model-repo" \
  "$IMG_NAME" \
  create-model-store -p "$PROFILE_HASH" -m /model-repo

Transfer the model store to the air-gapped system, then run:

export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repository
export NIM_SERVED_MODEL_NAME=my-model

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PATH=/model-repo \
  -e NIM_SERVED_MODEL_NAME \
  -v "$MODEL_REPO:/model-repo" \
  "$IMG_NAME"

Use the Offline Cache

Network-connected phase:

# Assumes NGC_API_KEY and IMG_NAME are already exported, as in the previous example.
export LOCAL_NIM_CACHE="${LOCAL_NIM_CACHE:-$HOME/.cache/nim}"
export PROFILE_HASH="<your_profile_hash>"
mkdir -p "$LOCAL_NIM_CACHE"

docker run --rm --gpus all \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -e NGC_API_KEY \
  "$IMG_NAME" \
  download-to-cache -p "$PROFILE_HASH"

Transfer the cache to the air-gapped system, then run:

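export AIR_GAP_NIM_CACHE=/path/to/transferred-cache
export CONTAINER_NAME=nim-airgap
# PROFILE_HASH is not carried over from the connected machine; set it again here.
export PROFILE_HASH="<your_profile_hash>"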

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PROFILE="$PROFILE_HASH" \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache" \
  "$IMG_NAME"

Model-Free NIM with a Local Path (No Download Needed)

If the model directory is already on the air-gapped system (for example, copied using secure transfer), you can run directly without any prior download steps:

export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repo

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_SERVED_MODEL_NAME=my-model \
  -e NIM_MODEL_PATH=/model-repo \
  -v "$MODEL_REPO:/model-repo" \
  "$IMG_NAME"

Network-Connected Phase with Proxy and CA Certificate

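For environments that require an HTTP/HTTPS proxy and a custom CA certificate during the download phase, pass the proxy variables and the mounted certificate to the container. This example starts a model-free NIM that pulls the model from Hugging Face Hub through a proxy reachable on the Docker host at port 8888: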

docker run --rm --gpus all \
  --add-host=host.docker.internal:host-gateway \
  -v ~/.mitmproxy/corporate-ca.pem:/opt/nim/proxy-ca.pem:ro \
  -e NIM_SERVER_PORT=8000 \
  -e NIM_SERVED_MODEL_NAME=my-model \
  -e NIM_MODEL_PATH=hf://TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  -e HTTP_PROXY=http://host.docker.internal:8888 \
  -e HTTPS_PROXY=http://host.docker.internal:8888 \
  -e SSL_CERT_FILE=/opt/nim/proxy-ca.pem \
  "$IMG_NAME"

Note

After the model is pre-staged and transferred, the air-gapped phase does not require proxy or CA certificate configuration.