Air Gap Deployment
About Air Gap Deployments
The microservice supports serving models in an air-gapped system, also known as an air wall, air gapping, or a disconnected network.
If the microservice detects a model profile in the model cache, it serves that profile from the cache rather than downloading the model profile again. By creating a model cache, you can run the microservice without an internet connection or a connection to the NGC registry.
Caching Model Profiles
On a machine with internet access and connectivity to the NGC registry, perform the following steps to cache model profiles:
Export your NGC API key as an environment variable:
$ export NGC_API_KEY="<nvapi-...>"
Create a directory for the model cache:
$ export AIR_GAP_NIM_CACHE=~/.cache/air-gap-nim-cache
$ mkdir -p "${AIR_GAP_NIM_CACHE}"
Optional: Determine a specific model profile to cache:
$ docker run --rm \
    --gpus=all --runtime=nvidia \
    -e NGC_API_KEY \
    -u $(id -u) \
    -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache/" \
    nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0 \
    list-model-profiles
Review the output to determine the model profiles to cache.
Download the model profiles to cache:
$ docker run --rm \
    --gpus=all --runtime=nvidia \
    -e NGC_API_KEY \
    -u $(id -u) \
    -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache/" \
    nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0 \
    download-to-cache -p <profile-one> [-p <profile-two> -p <...>]
For more information about the commands, refer to Utilities in the NIM for LLMs documentation.
Running on Air-Gapped Systems
After the container exits and the model profiles are downloaded, transfer the model cache directory, ~/.cache/air-gap-nim-cache, to a path that is accessible to the air-gapped system.
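The transfer method depends on your environment. As a minimal sketch, assuming GNU tar and sha256sum are available on both machines, the cache can be packed into a single archive with a checksum and verified after it is copied. The function names pack_cache and unpack_cache, the archive name, and the destination paths are hypothetical and are not part of the microservice.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of the microservice.

# pack_cache <cache-dir> <archive>
# Archive the cache directory and write a SHA-256 checksum file next to the archive.
pack_cache() {
  local cache_dir="$1" archive="$2"
  # -C stores only the final path component (for example, air-gap-nim-cache) in the archive.
  tar -czf "$archive" -C "$(dirname "$cache_dir")" "$(basename "$cache_dir")" || return 1
  # Run sha256sum from the archive's directory so the checksum file records a relative name.
  ( cd "$(dirname "$archive")" &&
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# unpack_cache <archive> <dest-dir>
# Verify the checksum, then extract the cache under dest-dir.
unpack_cache() {
  local archive="$1" dest="$2"
  ( cd "$(dirname "$archive")" &&
    sha256sum -c "$(basename "$archive").sha256" ) || return 1
  mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

For example, run pack_cache "$AIR_GAP_NIM_CACHE" /tmp/air-gap-nim-cache.tar.gz on the connected machine, copy the archive and its .sha256 file to the air-gapped system, and run unpack_cache on the copied archive with a destination directory of your choice. The transferred cache is then the air-gap-nim-cache directory under that destination.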
Perform the following steps on the air-gapped system to start the microservice and use the model cache:
Export an environment variable with the path to the model cache:
$ export LOCAL_NIM_CACHE=<path-to-transferred-model-cache>
Start the container:
$ docker run -d \
    --name nemoguard-jailbreakdetect \
    --gpus=all --runtime=nvidia \
    -u $(id -u) \
    -p 8000:8000 \
    -v "$LOCAL_NIM_CACHE:/opt/nim/.cache/" \
    -e NIM_MODEL_PROFILE=<profile-one> \
    nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0
This command differs from the usual command to start the microservice in the following ways:
The NGC_API_KEY environment variable is omitted because the microservice does not download model profiles.
Optional: The NIM_MODEL_PROFILE variable specifies the model profile to run.
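Because the air-gapped system cannot download a missing profile, it can help to check the transferred cache before running the docker run command shown earlier. The following is a minimal sketch of such a check. The function name check_cache is hypothetical, and the checks reflect assumptions about what the container needs rather than documented requirements.

```shell
#!/usr/bin/env bash
# Sketch only: check_cache is a hypothetical helper, not part of the microservice.

# check_cache <dir>
# Return non-zero when the directory is unlikely to work as a model cache.
check_cache() {
  local dir="$1"
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
  # The container runs with -u $(id -u), so the current user needs
  # read and traverse access to the mounted directory.
  if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
    echo "not accessible to the current user: $dir" >&2
    return 1
  fi
  # Assumption: an empty cache holds no profile to serve, and no
  # download is possible without network access.
  [ -n "$(ls -A "$dir")" ] || { echo "cache is empty: $dir" >&2; return 1; }
}
```

For example, run check_cache "$LOCAL_NIM_CACHE" after exporting the variable and start the container only if the function returns successfully.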