Air-Gap Deployment#

Air-gap deployment lets you run a NIM without an internet connection, including without access to remote model registries such as NGC or Hugging Face Hub. This guide describes how to prepare, transfer, and run NIM LLM in an air-gapped environment.

Air-gap deployment follows a two-phase workflow:

Network-connected phase: On a machine with internet access and credentials, download model assets and prepare them for transfer.
Air-gapped phase: On the isolated machine, mount the pre-staged model assets and run the NIM container with no outbound network access and no API keys.

Transfer the model assets between these phases using any allowed channel, such as an archive copy, scp, rsync, or physical media.

Important

In the air-gapped phase, do not set NGC_API_KEY or HF_TOKEN. The NIM must load all model assets from local storage only.

Choose a Deployment Modality#

NIM LLM supports two deployment modalities. Your choice affects how you prepare model assets for an air-gapped deployment.

Model-Specific NIM#

For model-specific NIM, keep the following in mind:

The NIM container ships with a model manifest that defines the model and one or more profiles (tensor parallel size, backend, and so on).
Use a model-specific container image for this workflow, such as ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3.
Model weights are not shipped inside the container. At startup, NIM loads the model from cache if present, or downloads it when the system has network and credentials.
For an air-gapped deployment, you must pre-stage the model using download-to-cache (and optionally create-model-store) during the network-connected phase, then transfer the cache or model store to the air-gapped system.

Model-Free NIM#

For model-free NIM, keep the following in mind:

The model is configured at runtime by the user (using NIM_MODEL_PATH or the positional vLLM model CLI argument). The container does not ship a fixed model manifest.
Use the model-free container image for this workflow, such as ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.3.
When NIM_MODEL_PATH points to a local path (for example, /mnt/models/my-llama), NIM serves directly from that path. No download occurs, so if the model directory is already present on the air-gapped system, you do not need to use download-to-cache or create-model-store.
When NIM_MODEL_PATH is a remote URI (for example, ngc://org/team/model:tag), NIM generates a runtime manifest on the first deployment and automatically persists it inside NIM_CACHE_PATH. On subsequent restarts, including in strict air-gap environments, NIM finds the cached manifest in the same cache volume and skips regeneration, making no outbound network or authentication calls. No extra environment variables are required; mounting the same persistent volume (PVC) at NIM_CACHE_PATH is sufficient.

Quick Reference#

The following table summarizes the model source location at runtime, the container image to use, and how the model is accessed:

NIM Type	Container Image	`NIM_MODEL_PATH` Value	Download at Startup?	Air-Gap Support
Model-specific NIM	`${NIM_LLM_MODEL_SPECIFIC_IMAGE}`	(not used)	Yes, if not in cache	Yes, after you pre-stage the cache or model store
Model-free NIM	`${NIM_LLM_MODEL_FREE_IMAGE}`	Local path (e.g. `/mnt/model`)	No	Yes, if the local path is mounted
Model-free NIM	`${NIM_LLM_MODEL_FREE_IMAGE}`	Remote URI (e.g. `ngc://...`), first deploy	Yes, and generates the manifest	No, the first deploy requires network access
Model-free NIM	`${NIM_LLM_MODEL_FREE_IMAGE}`	Remote URI (e.g. `ngc://...`), redeploy with manifest on PVC	No	Yes, the manifest is reused from the PVC with no auth calls

Network-Connected Phase#

On a machine with internet access, use the NIM container to download model assets and optionally create a model store directory.

Use a container image that matches your workflow:

For model-specific NIM, use ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3.
For model-free NIM, use ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.3.

Download a Model to Cache#

The download-to-cache command downloads selected or default model profiles to the NIM cache.

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3 \
  download-to-cache -p <PROFILE_HASH>

You can also download all profiles with --all, or let NIM auto-select a profile based on the available hardware by omitting the -p flag.
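
The three selection modes can be sketched as one small wrapper. This is an illustration only: the function name `nim_download` and the `DOCKER` and `NIM_IMAGE` variables are hypothetical conveniences rather than part of NIM, and `NIM_IMAGE` stands for whichever image reference you use (for example, `${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3`). Only the `download-to-cache` command and its `-p` and `--all` options come from this guide.

```shell
# Hypothetical helper: wraps the download-to-cache invocations described above.
# Set DOCKER=echo to preview the command instead of running it.
nim_download() {
  # $1: a profile hash, the literal --all, or nothing to let NIM auto-select.
  case "${1:-}" in
    "")    set -- download-to-cache ;;            # omit -p: auto-select a profile
    --all) set -- download-to-cache --all ;;      # download every profile
    *)     set -- download-to-cache -p "$1" ;;    # download one profile by hash
  esac
  "${DOCKER:-docker}" run --rm --gpus all \
    -e NGC_API_KEY \
    -v "${LOCAL_NIM_CACHE:?set LOCAL_NIM_CACHE}:/opt/nim/.cache" \
    "${NIM_IMAGE:?set NIM_IMAGE}" \
    "$@"
}
```

With `LOCAL_NIM_CACHE` and `NIM_IMAGE` exported, `nim_download --all` fetches every profile, and `nim_download` with no argument leaves profile selection to NIM.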

Tip

Run list-model-profiles to discover available profiles and their hashes before downloading.
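
A minimal sketch of that invocation follows. The function name `nim_list_profiles` and the `DOCKER` and `NIM_IMAGE` variables are hypothetical; the sketch assumes that `list-model-profiles` is passed as the container command in the same way as `download-to-cache`, and it makes no assumption about the format of the output.

```shell
# Hypothetical helper: runs the list-model-profiles command named in the tip above.
# Set DOCKER=echo to preview the command instead of running it.
nim_list_profiles() {
  "${DOCKER:-docker}" run --rm --gpus all \
    "${NIM_IMAGE:?set NIM_IMAGE}" \
    list-model-profiles
}
```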

Create a Model Store#

The create-model-store command extracts files from a cached model profile and creates a properly formatted model directory. If the profile is not already cached, the command downloads it first.

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3 \
  create-model-store -p <PROFILE_HASH> -m /model-repo

The resulting directory is ready for transfer to the air-gapped system.

Transfer Model Assets#

Transfer the model store directory (or cache) from the connected machine to the air-gapped environment. Common methods include:

Archive and copy: Create a tar or zip archive, transfer it through an allowed channel, then extract it on the air-gapped host.
scp or rsync: Use direct directory transfer when a one-way or out-of-band SSH path is available. rsync supports incremental synchronization.
Physical media: Copy the directory or archive onto portable storage (for example, a USB drive), physically move it to the air-gapped environment, then extract it.
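
The archive-and-copy method can be sketched as follows, with a checksum added so that corruption in transit is detected before extraction. The function names and the `.sha256` sidecar file are illustrative choices, not NIM conventions, and the sketch assumes GNU `tar` and `sha256sum` are available on both hosts.

```shell
# Hypothetical helpers for the archive-and-copy method; all names are illustrative.

# pack_assets SRC_DIR ARCHIVE: archive SRC_DIR and write ARCHIVE.sha256 beside it.
pack_assets() {
  src_dir=$1 archive=$2
  tar -cf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
  # Record the checksum with a relative file name so it verifies from any location.
  ( cd "$(dirname "$archive")" &&
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# unpack_assets ARCHIVE DEST_DIR: verify the checksum, then extract into DEST_DIR.
unpack_assets() {
  archive=$1 dest_dir=$2
  ( cd "$(dirname "$archive")" &&
    sha256sum -c "$(basename "$archive").sha256" ) || return 1
  mkdir -p "$dest_dir" && tar -xf "$archive" -C "$dest_dir"
}
```

Run `pack_assets` on the connected machine, move both the archive and its `.sha256` file through the allowed channel, then run `unpack_assets` on the air-gapped host.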

Air-Gapped Phase#

On the air-gapped system, mount the pre-staged model directory and run the NIM container.

Use a Model Store#

Run the following commands to use a model store:

Note

In a model-specific NIM workflow, use the same image type that you used to create the model store.

Set the runtime variables:

export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repository
export NIM_SERVED_MODEL_NAME=my-model

Start the NIM container:

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PATH=/model-repo \
  -e NIM_SERVED_MODEL_NAME \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3

Important

Do not set NGC_API_KEY or HF_TOKEN in this phase.
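
One way to enforce this rule is a preflight check in the launch script. The sketch below is a hypothetical helper, not a NIM feature: it inspects the host shell environment, which matters only if your `docker run` command forwards those variables with `-e`.

```shell
# Hypothetical preflight check: fail if download credentials are present.
assert_no_credentials() {
  status=0
  for name in NGC_API_KEY HF_TOKEN; do
    # eval reads the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "error: $name is set; unset it before the air-gapped phase" >&2
      status=1
    fi
  done
  return "$status"
}
```

Calling `assert_no_credentials || exit 1` before `docker run` stops the launch early instead of letting a stray key reach the container.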

Use Model-Free NIM with a Remote URI (PVC-Based Redeploy)#

If you use model-free NIM with NIM_MODEL_PATH set to a remote URI (for example, ngc://org/team/model:tag), NIM generates a runtime manifest on the first deployment and automatically saves a copy inside NIM_CACHE_PATH.

On subsequent restarts, including in a strict air-gap environment, NIM finds the cached manifest in the same cache volume and skips regeneration, making no outbound network or authentication calls. No additional environment variables are required.

Prerequisites:

The NIM cache from the first (network-connected) deployment must be on a persistent volume (PVC) that is re-mounted between deployments. Both model assets and the manifest are stored there.

Set the runtime variables:

export CONTAINER_NAME=nim-airgap-model-free
export LOCAL_NIM_CACHE=/path/to/persistent-nim-cache   # same PVC used on first deploy
export NIM_MODEL_PATH=ngc://org/team/model:tag

Start the NIM container:

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PATH \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.3

NIM detects the cached manifest in /opt/nim/.cache and starts without any outbound network access.

Important

Do not set NGC_API_KEY in the air-gapped phase. If the cache volume is not mounted or was not populated by a prior network-connected deployment, NIM attempts to regenerate the manifest and fails in a strict air-gap environment.

Tip

To force manifest regeneration (for example, after an upstream model update), delete nim_runtime_manifest.yaml from your persistent cache directory before restarting.
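
Both the presence check and the forced regeneration can be scripted. The helpers below are hypothetical; because this guide does not state where in the cache the manifest is stored, they search the whole cache directory by file name rather than assume a path.

```shell
# Hypothetical helpers; the manifest location inside the cache is not assumed.

# find_runtime_manifest CACHE_DIR: print manifest paths; fail if none exist.
find_runtime_manifest() {
  found=$(find "$1" -type f -name nim_runtime_manifest.yaml 2>/dev/null)
  [ -n "$found" ] && printf '%s\n' "$found"
}

# remove_runtime_manifest CACHE_DIR: delete the manifest to force regeneration.
remove_runtime_manifest() {
  find "$1" -type f -name nim_runtime_manifest.yaml -exec rm -f {} +
}
```

Running `find_runtime_manifest "$LOCAL_NIM_CACHE"` before an air-gapped restart confirms that the cache was populated. Run `remove_runtime_manifest` only where network access is available, since NIM then has to regenerate the manifest.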

Use the Offline Cache#

If you prefer to transfer the cache rather than a model store, mount the cache and specify the profile as shown in the following commands:

Note

In a model-specific NIM workflow, use the same image type that you used to populate the cache.

Set the runtime variables:

export CONTAINER_NAME=nim-airgap
export AIR_GAP_NIM_CACHE=/path/to/transferred-cache
export PROFILE_HASH=<PROFILE_HASH>

Start the NIM container:

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PROFILE="$PROFILE_HASH" \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3

Model-Free NIM with a Local Path#

When you use model-free NIM with a local model path, no download occurs at startup. If the model directory is already on the air-gapped system, you can run without download-to-cache or create-model-store and without any API keys.

Set the container name and model repository path:

export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repo

Run the NIM container with the local model path:

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_SERVED_MODEL_NAME=my-model \
  -e NIM_MODEL_PATH=/model-repo \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.3
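
Because no download can fill in a missing directory in this mode, it can help to check the host path before starting the container. The following is a minimal sketch with a hypothetical function name; it checks only that the directory exists and is not empty, and does not validate the model files themselves.

```shell
# Hypothetical preflight check for the host directory mounted at NIM_MODEL_PATH.
check_model_dir() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "error: $dir does not exist or is not a directory" >&2
    return 1
  fi
  # ls -A lists hidden entries too, so an empty result means nothing is there.
  if [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "error: $dir is empty or unreadable" >&2
    return 1
  fi
}
```

For example, `check_model_dir "$MODEL_REPO" || exit 1` placed before `docker run` reports a bad mount path before the container starts.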

Proxy and Certificate Configuration#

If the network-connected phase runs behind a corporate or outbound proxy, you must configure the standard proxy environment variables so that downloads are routed correctly.

HTTP/HTTPS Proxy#

Set the following standard environment variables to direct HTTP and HTTPS traffic through your organization’s proxy servers:

Variable	Description
`HTTP_PROXY`	Proxy server for HTTP connections
`HTTPS_PROXY`	Proxy server for HTTPS connections
`NO_PROXY`	Comma-separated list of hostnames, domains, or IPs that should bypass the proxy
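
As a sketch of how these variables can reach the container, the hypothetical helper below prints a `-e NAME` flag for each proxy variable that is set on the host. It relies on the standard Docker behavior that `-e NAME` without a value forwards the host's exported value; the proxy address in the comment is a placeholder.

```shell
# Hypothetical helper: print "-e NAME" for each proxy variable that is set,
# so that docker forwards the exported host value into the container.
proxy_env_flags() {
  flags=""
  for name in HTTP_PROXY HTTPS_PROXY NO_PROXY; do
    eval "value=\${$name:-}"
    [ -n "$value" ] && flags="$flags -e $name"
  done
  # Trim the leading space before printing.
  printf '%s\n' "${flags# }"
}

# Usage (left unquoted on purpose so the flags split into separate words):
#   export HTTPS_PROXY=http://proxy.example.com:3128   # placeholder address
#   docker run ... $(proxy_env_flags) ...
```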

Note

In the air-gapped phase, there is no outbound traffic, so proxy variables are not needed.

CA Certificate Injection#

When outbound HTTPS traffic goes through a TLS-inspecting proxy, or when downloading from a corporate registry whose TLS certificate is signed by an internal CA, you must inject the corporate Certificate Authority (CA) certificate so that TLS verification succeeds. Set SSL_CERT_FILE to point to a combined CA bundle inside the container.

Warning

Setting SSL_CERT_FILE replaces the container’s default trust store. If you point it at a file containing only your corporate CA, connections to public endpoints (such as api.ngc.nvidia.com) will fail because the public CAs are no longer trusted. Always create a combined bundle that includes the default CAs and your corporate CA.

Create a combined CA bundle (one-time, on the host):

To add your corporate CA without losing trust in public CAs, concatenate the container’s default bundle with your corporate CA certificate:

# Extract the default CA bundle from the container
docker run --rm --entrypoint bash \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3 \
  -c 'cat /etc/ssl/certs/ca-certificates.crt' > combined-ca-bundle.pem

# Append your corporate CA
cat /path/to/corporate-ca.pem >> combined-ca-bundle.pem
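
A quick sanity check on the result is to count the certificates in the bundle before and after appending. The helper name `count_certs` is hypothetical, and the check confirms only that certificate headers are present, not that the certificates are valid.

```shell
# Hypothetical check: count the PEM certificates in a bundle file.
count_certs() {
  # The -- stops option parsing so the leading dashes are read as the pattern.
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}
```

After the append step, `count_certs combined-ca-bundle.pem` should report the number of default CAs plus the number of certificates in your corporate CA file.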

Run with the combined bundle by mounting it into the container and referencing it in SSL_CERT_FILE and REQUESTS_CA_BUNDLE:

docker run --rm --gpus all \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -v ./combined-ca-bundle.pem:/etc/ssl/certs/custom-ca-bundle.pem:ro \
  -e SSL_CERT_FILE=/etc/ssl/certs/custom-ca-bundle.pem \
  -e REQUESTS_CA_BUNDLE=/etc/ssl/certs/custom-ca-bundle.pem \
  -e NGC_API_KEY \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3 \
  download-to-cache --all

If downloading through a proxy, also add -e HTTP_PROXY=... and -e HTTPS_PROXY=.... A proxy is not required for SSL_CERT_FILE to take effect.

Note

CA certificate injection is only relevant during the network-connected phase. After the deployment is fully air-gapped, there is no outbound HTTPS traffic and no CA configuration is needed.

These variables affect outbound TLS (model downloads). They are unrelated to the NIM_SSL_* variables, which configure inbound TLS at the nginx proxy layer. For more information, refer to Environment Variables: SSL and TLS.

Environment Variables#

The following environment variables are relevant to air-gap deployment:

Variable	Description
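`NIM_MODEL_PATH`	Path to the local model repository or a remote URI. In the air-gapped phase, set it to the local mount path, or keep the original remote URI when redeploying model-free NIM from a cache that already holds the manifest.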
`NIM_CACHE_PATH`	Directory path for model cache storage inside the container. Default: `/opt/nim/.cache`.
`NIM_MODEL_PROFILE`	Profile ID to use. Required when running from cache without a model store.
`NIM_SERVED_MODEL_NAME`	Name used for the served model in API responses.
`NGC_API_KEY`	NGC API key for downloading models. Do not set it in the air-gapped phase.
`HF_TOKEN`	Hugging Face token for downloading models. Do not set it in the air-gapped phase.

Examples#

Use the following examples to deploy a NIM in an air-gapped environment for different scenarios.

Download and Create a Model Store#

Use this workflow to create a model store on a network-connected system and then run NIM from that model store in the air-gapped environment.

In the network-connected phase, set the required variables and create the local directories:

export NGC_API_KEY=your_api_key
export PROFILE_HASH=<your_profile_hash>
export LOCAL_NIM_CACHE="${LOCAL_NIM_CACHE:-$(pwd)/.cache}"
export MODEL_REPO=/path/to/model-repository
mkdir -p "$MODEL_REPO" "$LOCAL_NIM_CACHE"

Download the model to the local cache:

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3 \
  download-to-cache -p "$PROFILE_HASH"

Create the model store:

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3 \
  create-model-store -p "$PROFILE_HASH" -m /model-repo

Transfer the model store to the air-gapped system.

On the air-gapped system, set the runtime variables:

export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repository
export NIM_SERVED_MODEL_NAME=my-model

Start NIM:

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PATH=/model-repo \
  -e NIM_SERVED_MODEL_NAME \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3

Use the Offline Cache#

Use this workflow when you want to transfer the NIM cache instead of creating a model store.

In the network-connected phase, set the required variables and create the cache directory:

export NGC_API_KEY=your_api_key
export LOCAL_NIM_CACHE="${LOCAL_NIM_CACHE:-$(pwd)/.cache}"
export PROFILE_HASH=<your_profile_hash>
mkdir -p "$LOCAL_NIM_CACHE"

Download the model to the cache:

docker run --rm --gpus all \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -e NGC_API_KEY \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3 \
  download-to-cache -p "$PROFILE_HASH"

Transfer the cache to the air-gapped system.

On the air-gapped system, set the runtime variables:

export AIR_GAP_NIM_CACHE=/path/to/transferred-cache
export CONTAINER_NAME=nim-airgap
export PROFILE_HASH=<your_profile_hash>

Start NIM:

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PROFILE="$PROFILE_HASH" \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.3

Model-Free NIM with a Local Path (No Download Needed)#

If the model directory is already on the air-gapped system (for example, copied using secure transfer), you can run directly without any prior download steps:

Set the runtime variables:

export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repo

Start NIM by using the local model path:

docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_SERVED_MODEL_NAME=my-model \
  -e NIM_MODEL_PATH=/model-repo \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.3

Network-Connected Phase with Proxy and CA Certificate#

For environments that require an HTTP/HTTPS proxy and custom CA certificates during the download phase, use a combined CA bundle (see CA Certificate Injection):

docker run --rm --gpus all \
  --add-host=host.docker.internal:host-gateway \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -v ./combined-ca-bundle.pem:/etc/ssl/certs/custom-ca-bundle.pem:ro \
  -e NIM_SERVER_PORT=8000 \
  -e NIM_SERVED_MODEL_NAME=my-model \
  -e NIM_MODEL_PATH=hf://TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  -e HTTP_PROXY=http://host.docker.internal:8888 \
  -e HTTPS_PROXY=http://host.docker.internal:8888 \
  -e SSL_CERT_FILE=/etc/ssl/certs/custom-ca-bundle.pem \
  -e REQUESTS_CA_BUNDLE=/etc/ssl/certs/custom-ca-bundle.pem \
  ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.3

Note

After the model is pre-staged and transferred, the air-gapped phase does not require proxy or CA certificate configuration.