Air-Gap Deployment#
Air-gap deployment lets you run a NIM without an internet connection, for example, with no connection to remote model registries such as NGC or Hugging Face Hub. This guide describes how to prepare, transfer, and run NIM LLM in an air-gapped environment.
Air-gap deployment follows a two-phase workflow:
Network-connected phase: On a machine with internet access and credentials, download model assets and prepare them for transfer.
Air-gapped phase: On the isolated machine, mount the pre-staged model assets and run the NIM container with no outbound network access and no API keys.
Transfer the model assets between these phases using any allowed channel, such as an archive copy, scp, rsync, or physical media.
Important
In the air-gapped phase, do not set NGC_API_KEY or HF_TOKEN. The NIM must load all model assets from local storage only.
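As a pre-flight guard for the rule above, a launch script can refuse to continue when either credential variable is present. The following is a minimal sketch; the function name assert_no_registry_credentials is hypothetical and is not part of NIM.

```shell
# Hypothetical helper (not part of NIM): return non-zero if a registry
# credential variable is set to a non-empty value in the current shell.
assert_no_registry_credentials() {
    rc=0
    if [ -n "${NGC_API_KEY:-}" ]; then
        echo "NGC_API_KEY is set; unset it before the air-gapped phase" >&2
        rc=1
    fi
    if [ -n "${HF_TOKEN:-}" ]; then
        echo "HF_TOKEN is set; unset it before the air-gapped phase" >&2
        rc=1
    fi
    return "$rc"
}
```

Calling assert_no_registry_credentials || exit 1 at the top of a launch script stops the script before docker run if a key leaked into the environment.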
Choose a Deployment Modality#
NIM LLM supports two deployment modalities. Your choice affects how you prepare model assets for an air-gapped deployment.
Model-Specific NIM#
For model-specific NIM, keep the following in mind:
The NIM container ships with a model manifest that defines the model and one or more profiles (tensor parallel size, backend, and so on).
Use a model-specific container image for this workflow, such as ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1.
Model weights are not shipped inside the container. At startup, NIM loads the model from cache if present, or downloads it when the system has network access and credentials.
For air-gap deployment, you must pre-stage the model using download-to-cache (and optionally create-model-store) during the connected phase, then transfer the cache or model store to the air-gapped system.
Model-Free NIM#
For model-free NIM, keep the following in mind:
The user configures the model at runtime (using NIM_MODEL_PATH or the positional vLLM model CLI argument). The container does not ship a fixed model manifest.
Use the model-free container image for this workflow, such as ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1.
When NIM_MODEL_PATH points to a local path (for example, /mnt/models/my-llama), NIM serves directly from that path. No download occurs, so if the model directory is already present on the air-gapped system, you do not need to use download-to-cache or create-model-store.
When NIM_MODEL_PATH is a remote URI (for example, ngc://org/team/model:tag), NIM generates a runtime manifest on the first deployment and automatically persists it inside NIM_CACHE_PATH. On subsequent restarts, including in strict air-gap environments, NIM finds the cached manifest in the same cache volume and skips regeneration, making no outbound network or authentication calls. No extra environment variables are required; mounting the same PVC at NIM_CACHE_PATH is sufficient.
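The local-path versus remote-URI distinction above determines whether the first start needs network access. The sketch below mirrors that decision in shell. It is illustrative only and is not NIM's own parsing logic; the function name model_path_kind is hypothetical, and treating any value that contains :// as a remote URI is an assumption based on the ngc:// example.

```shell
# Hypothetical helper: classify a NIM_MODEL_PATH value.
# Assumption: any value containing "://" is a remote URI; every other
# non-empty value is treated as a local path.
model_path_kind() {
    case "$1" in
        "")     echo "unset" ;;
        *://*)  echo "remote" ;;
        *)      echo "local" ;;
    esac
}

model_path_kind "/mnt/models/my-llama"        # prints "local"
model_path_kind "ngc://org/team/model:tag"    # prints "remote"
```

A remote result means the first deployment must run with network access and credentials so that the manifest and model assets land in the cache. A local result needs neither.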
Quick Reference#
The following table summarizes the model source location at runtime, the container image to use, and how the model is accessed:
| NIM Type | Container Image | NIM_MODEL_PATH | Download at Startup? | Air-Gap Support |
|---|---|---|---|---|
| Model-specific NIM | ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1 | (not used) | Yes, if not in cache | Yes; pre-stage cache or model store |
| Model-free NIM | ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1 | Local path | No | Yes; ensure local path is mounted |
| Model-free NIM | ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1 | Remote URI, first deployment | Yes, generates manifest | Requires network on first deploy |
| Model-free NIM | ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1 | Remote URI, redeploy with the same PVC | No | Yes; manifest reused from PVC, no auth calls |
Network-Connected Phase#
On a machine with internet access, use the NIM container to download model assets and optionally create a model store directory.
Use a container image that matches your workflow:
For model-specific NIM, use ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1.
For model-free NIM, use ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1.
Download a Model to Cache#
The download-to-cache command downloads selected or default model profiles to the NIM cache.
docker run -it --rm --gpus all \
-e NGC_API_KEY \
-v $LOCAL_NIM_CACHE:/opt/nim/.cache \
${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1 \
download-to-cache -p <PROFILE_HASH>
You can also download all profiles with --all, or let NIM auto-select a profile based on the
available hardware by omitting the -p flag.
Tip
Run list-model-profiles to discover available profiles and their hashes before downloading.
Create a Model Store#
The create-model-store command extracts files from a cached model profile and creates a
properly formatted model directory. If the profile is not already cached, the command downloads
it first.
docker run -it --rm --gpus all \
-e NGC_API_KEY \
-v $LOCAL_NIM_CACHE:/opt/nim/.cache \
-v $MODEL_REPO:/model-repo \
${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1 \
create-model-store -p <PROFILE_HASH> -m /model-repo
The resulting directory is ready for transfer to the air-gapped system.
Transfer Model Assets#
Transfer the model store directory (or cache) from the connected machine to the air-gapped environment. Common methods include:
Archive and copy: Create a tar or zip archive, transfer it through an allowed channel, then extract it on the air-gapped host.
scp or rsync: Use direct directory transfer when a one-way or out-of-band SSH path is available. rsync supports incremental synchronization.
Physical media: Copy the directory or archive onto portable storage (for example, a USB drive), physically move it to the air-gapped environment, then extract it.
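The archive-and-copy method can be paired with a checksum so that corruption during transfer is detected before NIM tries to load the model. The following is a sketch under stated assumptions: tar and sha256sum (GNU coreutils) are available on both hosts, and the function names pack_model_store and unpack_model_store are hypothetical helpers, not NIM commands.

```shell
# Hypothetical helpers (not NIM commands). Assumes tar and sha256sum exist
# on both the connected host and the air-gapped host.

# Connected host: archive <source_dir> and write <archive>.sha256 next to it.
pack_model_store() {   # usage: pack_model_store <source_dir> <archive.tar.gz>
    src=$1
    archive=$2
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" &&
    ( cd "$(dirname "$archive")" &&
      sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Air-gapped host: verify the checksum, then extract into <dest_dir>.
# Extraction is skipped if verification fails.
unpack_model_store() { # usage: unpack_model_store <archive.tar.gz> <dest_dir>
    archive=$1
    dest=$2
    ( cd "$(dirname "$archive")" &&
      sha256sum -c "$(basename "$archive").sha256" ) &&
    mkdir -p "$dest" &&
    tar -xzf "$archive" -C "$dest"
}
```

Transfer both the archive and its .sha256 file together. On the air-gapped host, the extracted directory is the one you mount into the container.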
Air-Gapped Phase#
On the air-gapped system, mount the pre-staged model directory and run the NIM container.
Use a Model Store#
Run the following commands to use a model store:
Note
Use the same image type that you used to create the model store in a model-specific NIM workflow.
Set the runtime variables:
export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repository
export NIM_SERVED_MODEL_NAME=my-model
Start the NIM container:
docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PATH=/model-repo \
  -e NIM_SERVED_MODEL_NAME \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1
Important
Do not set NGC_API_KEY or HF_TOKEN in this phase.
Use Model-Free NIM with a Remote URI (PVC-Based Redeploy)#
If you use model-free NIM with NIM_MODEL_PATH set to a remote URI (for example,
ngc://org/team/model:tag), NIM generates a runtime manifest on the first deployment and
automatically saves a copy inside NIM_CACHE_PATH.
On subsequent restarts — including in a strict air-gap environment — NIM finds the cached manifest in the same cache volume and skips regeneration, making no outbound network or authentication calls. No additional environment variables are required.
Prerequisites:
The NIM cache from the first (network-connected) deployment must be on a persistent volume (PVC) that is re-mounted between deployments. Both model assets and the manifest are stored there.
Set the runtime variables:
export CONTAINER_NAME=nim-airgap-model-free
export NIM_CACHE=/path/to/persistent-nim-cache  # same PVC used on first deploy
export NIM_MODEL_PATH=ngc://org/team/model:tag
Start the NIM container:
docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PATH \
  -v "$NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1
NIM detects the cached manifest in /opt/nim/.cache and starts without any outbound network access.
Important
Do not set NGC_API_KEY in the air-gap phase. If the cache volume is not mounted or was not
populated by a prior network-connected deployment, NIM will attempt to regenerate the manifest and
fail in a strict air-gap environment.
Tip
To force manifest regeneration (for example, after an upstream model update), delete
nim_runtime_manifest.yaml from your persistent cache directory before restarting.
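Because a missing manifest only surfaces as a startup failure in a strict air-gap environment, it can help to check the cache volume before launching the container. The sketch below is a hypothetical helper, not a NIM command. It uses the file name nim_runtime_manifest.yaml from this guide and searches the whole cache tree, because the exact location of the file inside the cache is not assumed.

```shell
# Hypothetical helper: succeed only if nim_runtime_manifest.yaml exists
# somewhere under the given cache directory on the host.
has_runtime_manifest() {   # usage: has_runtime_manifest <cache_dir>
    [ -d "$1" ] || return 1
    [ -n "$(find "$1" -type f -name nim_runtime_manifest.yaml 2>/dev/null | head -n 1)" ]
}
```

For example, running has_runtime_manifest "$NIM_CACHE" before docker run, and stopping when it fails, indicates that the volume was not populated by a prior network-connected deployment.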
Use the Offline Cache#
If you prefer to transfer the cache rather than a model store, mount the cache and specify the profile as shown in the following commands:
Note
Use the same image type that you used when populating the cache in a model-specific NIM workflow.
Set the runtime variables:
export CONTAINER_NAME=nim-airgap
export AIR_GAP_NIM_CACHE=/path/to/transferred-cache
export PROFILE_HASH=<PROFILE_HASH>
Start the NIM container:
docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PROFILE="$PROFILE_HASH" \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1
Model-Free NIM with a Local Path#
When using a model-free NIM with a local model path, no download occurs at startup.
If the model directory is already on the air-gapped system, you can run without
download-to-cache or create-model-store and without any API keys:
Set the container name and model repository path:
export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repo
Run the NIM container with the local model path:
docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_SERVED_MODEL_NAME=my-model \
  -e NIM_MODEL_PATH=/model-repo \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1
Proxy and Certificate Configuration#
If the network-connected phase runs behind a corporate or outbound proxy, you must configure the standard proxy environment variables so that downloads are routed correctly.
HTTP/HTTPS Proxy#
When running the network-connected phase behind a proxy, you may need to specify environment variables to direct HTTP and HTTPS traffic through your organization’s proxy servers. The following variables are commonly used for proxy configuration:
| Variable | Description |
|---|---|
| HTTP_PROXY | Proxy server for HTTP connections |
| HTTPS_PROXY | Proxy server for HTTPS connections |
| NO_PROXY | Comma-separated list of hostnames, domains, or IPs that should bypass the proxy |
Note
In the air-gapped phase, there is no outbound traffic so proxy variables are not needed.
CA Certificate Injection#
When outbound HTTPS traffic goes through a TLS-inspecting proxy, you must inject the
corporate Certificate Authority (CA) certificate so that TLS verification succeeds. Set SSL_CERT_FILE to point
to the certificate file inside the container.
docker run --rm --gpus all \
-v /path/to/corporate-ca.pem:/opt/nim/proxy-ca.pem:ro \
-e SSL_CERT_FILE=/opt/nim/proxy-ca.pem \
-e HTTP_PROXY=http://proxy-host:8888 \
-e HTTPS_PROXY=http://proxy-host:8888 \
-e NGC_API_KEY \
${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1 \
download-to-cache -p <PROFILE_HASH>
Note
CA certificate injection is only relevant during the network-connected phase. After the deployment is fully air-gapped, there is no outbound HTTPS traffic and no CA configuration is needed.
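When you bind-mount the certificate with -v, Docker creates an empty directory at the host path if the file does not exist, and SSL_CERT_FILE then points at a directory inside the container. A quick check on the host before running the container avoids that failure mode. The following is a minimal sketch; looks_like_pem_cert is a hypothetical helper name, and the check only confirms that the file is readable and contains a PEM certificate header. It does not validate the certificate chain.

```shell
# Hypothetical helper: succeed if the file is readable and contains a
# PEM certificate header. This is a sanity check, not chain validation.
looks_like_pem_cert() {   # usage: looks_like_pem_cert <file>
    [ -r "$1" ] && grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

For example, run looks_like_pem_cert /path/to/corporate-ca.pem and proceed to the download command only if it succeeds.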
Environment Variables#
The following environment variables are relevant to air-gap deployment:
| Variable | Description |
|---|---|
| NIM_MODEL_PATH | Path to the local model repository or a remote URI. In air-gapped mode, set to the local mount path. |
| NIM_CACHE_PATH | Directory path for model cache storage. The commands in this guide mount the cache at /opt/nim/.cache. |
| NIM_MODEL_PROFILE | Profile ID to use. Required when running from cache without a model store. |
| NIM_SERVED_MODEL_NAME | Name used for the served model in API responses. |
| NGC_API_KEY | NGC API key for downloading models. Do not set in air-gapped phase. |
| HF_TOKEN | Hugging Face token for downloading models. Do not set in air-gapped phase. |
Examples#
Use the following examples to deploy a NIM in an air-gapped environment for different scenarios.
Download and Create a Model Store#
Use this workflow to create a model store on a network-connected system and then run NIM from that model store in the air-gapped environment.
In the network-connected phase, set the required variables and create the local directories:
export NGC_API_KEY=your_api_key
export PROFILE_HASH=<your_profile_hash>
export LOCAL_NIM_CACHE="${LOCAL_NIM_CACHE:-$(pwd)/.cache}"
export MODEL_REPO=/path/to/model-repository
mkdir -p "$MODEL_REPO" "$LOCAL_NIM_CACHE"
Download the model to the local cache:
docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1 \
  download-to-cache -p "$PROFILE_HASH"
Create the model store:
docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1 \
  create-model-store -p "$PROFILE_HASH" -m /model-repo
Transfer the model store to the air-gapped system.
On the air-gapped system, set the runtime variables and start NIM:
export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repository
export NIM_SERVED_MODEL_NAME=my-model
Start NIM:
docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PATH=/model-repo \
  -e NIM_SERVED_MODEL_NAME \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1
Use the Offline Cache#
Use this workflow when you want to transfer the NIM cache instead of creating a model store.
In the network-connected phase, set the required variables and create the cache directory:
export LOCAL_NIM_CACHE="${LOCAL_NIM_CACHE:-$HOME/.cache/nim}"
export PROFILE_HASH=<your_profile_hash>
mkdir -p "$LOCAL_NIM_CACHE"
Download the model to the cache:
docker run --rm --gpus all \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -e NGC_API_KEY \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1 \
  download-to-cache -p "$PROFILE_HASH"
Transfer the cache to the air-gapped system.
On the air-gapped system, set the runtime variables and start NIM:
export AIR_GAP_NIM_CACHE=/path/to/transferred-cache
export CONTAINER_NAME=nim-airgap
export PROFILE_HASH=<your_profile_hash>
Start NIM:
docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_MODEL_PROFILE="$PROFILE_HASH" \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache" \
  ${NIM_LLM_MODEL_SPECIFIC_IMAGE}:2.0.1
Model-Free NIM with a Local Path (No Download Needed)#
If the model directory is already on the air-gapped system (for example, copied using secure transfer), you can run directly without any prior download steps:
Set the runtime variables:
export CONTAINER_NAME=nim-airgap
export MODEL_REPO=/path/to/model-repo
Start NIM by using the local model path:
docker run -it --rm --name="$CONTAINER_NAME" \
  --gpus all \
  --shm-size=16GB \
  -p 8000:8000 \
  -e NIM_SERVED_MODEL_NAME=my-model \
  -e NIM_MODEL_PATH=/model-repo \
  -v "$MODEL_REPO:/model-repo" \
  ${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1
Network-Connected Phase with Proxy and CA Certificate#
For environments that require an HTTP/HTTPS proxy and custom CA certificates during the download phase:
docker run --rm --gpus all \
--add-host=host.docker.internal:host-gateway \
-v ~/.mitmproxy/corporate-ca.pem:/opt/nim/proxy-ca.pem:ro \
-e NIM_SERVER_PORT=8000 \
-e NIM_SERVED_MODEL_NAME=my-model \
-e NIM_MODEL_PATH=hf://TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
-e HTTP_PROXY=http://host.docker.internal:8888 \
-e HTTPS_PROXY=http://host.docker.internal:8888 \
-e SSL_CERT_FILE=/opt/nim/proxy-ca.pem \
${NIM_LLM_MODEL_FREE_IMAGE}:2.0.1
Note
After the model is pre-staged and transferred, the air-gapped phase does not require proxy or CA certificate configuration.