Air Gap Deployment

About Air Gap Deployments

The microservice supports serving models in an air-gapped system, also known as an air wall, air gapping, or a disconnected network.

If the microservice detects a model profile in the model cache, it serves that profile from the cache rather than downloading the profile again. By creating a model cache, you can run the microservice without an internet connection or a connection to the NGC registry.

Caching Model Profiles

On a machine with internet access and connectivity to the NGC registry, perform the following steps to cache model profiles:

Export your NGC API key as an environment variable:
```
$ export NGC_API_KEY="<nvapi-...>"
```

Create a directory for the model cache:

```
$ export AIR_GAP_NIM_CACHE=~/.cache/air-gap-nim-cache
$ mkdir -p "${AIR_GAP_NIM_CACHE}"
```

Optional: Determine a specific model profile to cache:

```
$ docker run --rm \
  --gpus=all --runtime=nvidia \
  -e NGC_API_KEY \
  -u $(id -u) \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache/" \
  nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0 \
  list-model-profiles
```

Review the output to determine the model profiles to cache.

Download the model profiles to the cache:

```
$ docker run --rm \
  --gpus=all --runtime=nvidia \
  -e NGC_API_KEY \
  -u $(id -u) \
  -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache/" \
  nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0 \
  download-to-cache -p <profile-one> [-p <profile-two> -p <...>]
```

For more information about the commands, refer to Utilities in the NIM for LLMs documentation.
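Once the profiles are downloaded, the cache directory has to reach the air-gapped system by whatever transfer mechanism your site allows. The following is a minimal sketch, assuming GNU `tar` and `sha256sum` are available on both machines. The helpers `pack_cache` and `unpack_cache` are hypothetical and not part of the microservice; they bundle the directory into one archive with a checksum so that a damaged copy can be detected before the container is started.

```shell
# Hypothetical helpers (not part of the microservice). Assumes GNU tar and sha256sum.

# pack_cache <cache-dir> <archive>
# Bundles the cache directory and writes <archive>.sha256 next to the archive.
# The archive path must be outside the cache directory.
pack_cache() {
  cache_dir="$1"
  archive="$2"
  tar -czf "$archive" -C "$cache_dir" . || return 1
  # Record the checksum with a bare file name so it still verifies after both files are moved.
  ( cd "$(dirname "$archive")" &&
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# unpack_cache <archive> <target-dir>
# Verifies the checksum, then extracts the archive into the target directory.
unpack_cache() {
  archive="$1"
  target_dir="$2"
  ( cd "$(dirname "$archive")" &&
    sha256sum -c "$(basename "$archive").sha256" ) || return 1
  mkdir -p "$target_dir" || return 1
  tar -xzf "$archive" -C "$target_dir"
}
```

For example, run `pack_cache "$AIR_GAP_NIM_CACHE" /tmp/air-gap-nim-cache.tar.gz` on the connected machine, copy the archive together with its `.sha256` file, and then run `unpack_cache` on the air-gapped system with the directory you intend to mount as the target.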

Running on Air-Gapped Systems

After the container exits and the model profiles are downloaded, transfer the model cache directory, ~/.cache/air-gap-nim-cache, to a path that is accessible to the air-gapped system. Perform the following steps on the air-gapped system to start the microservice and use the model cache:

Export an environment variable with the path to the model cache:
```
$ export LOCAL_NIM_CACHE=<path-to-transferred-model-cache>
```

Start the container:

```
$ docker run -d \
  --name nemoguard-jailbreakdetect \
  --gpus=all --runtime=nvidia \
  -u $(id -u) \
  -p 8000:8000 \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache/" \
  -e NIM_MODEL_PROFILE=<profile-one> \
  nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0
```

The key differences from the common command to start the microservice are as follows:

The NGC_API_KEY environment variable is omitted because the microservice does not download model profiles.
Optional: The NIM_MODEL_PROFILE variable specifies the model profile to run.
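Because the container runs as the current user (`-u $(id -u)`) and does not download anything, a missing, empty, or unreadable cache mount is worth ruling out before you start the container. The following is a minimal sketch of such a check. `check_cache_dir` is a hypothetical helper, not part of the microservice, and it inspects only the directory itself, not the validity of the profiles inside it.

```shell
# Hypothetical pre-flight helper (not part of the microservice).
# check_cache_dir <dir>: succeeds only if <dir> exists, is readable and
# searchable by the current user, and contains at least one entry.
check_cache_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "model cache not found: $dir" >&2
    return 1
  fi
  if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
    echo "model cache not accessible to user $(id -u): $dir" >&2
    return 1
  fi
  if [ -z "$(ls -A "$dir")" ]; then
    echo "model cache is empty: $dir" >&2
    return 1
  fi
  echo "model cache looks usable: $dir"
}
```

For example, `check_cache_dir "$LOCAL_NIM_CACHE"` returns a nonzero status and prints the reason if the transferred directory is missing or empty.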