Air Gap Deployment

About Air Gap Deployments

The microservice supports serving models in an air-gapped system, also known as an air wall, air gapping, or a disconnected network.

If the microservice detects a model profile in the model cache, it serves that profile from the cache rather than downloading the profile again. By creating a model cache in advance, you can run the microservice without an internet connection or a connection to the NGC registry.

Caching Model Profiles

On a machine with internet access and connectivity to the NGC registry, perform the following steps to cache model profiles:

  1. Export your NGC API key as an environment variable:

    $ export NGC_API_KEY="<nvapi-...>"
    
  2. Create a directory for the model cache:

    $ export AIR_GAP_NIM_CACHE=~/.cache/air-gap-nim-cache
    $ mkdir -p "${AIR_GAP_NIM_CACHE}"
    
  3. Optional: Determine a specific model profile to cache:

    $ docker run --rm \
      --gpus=all --runtime=nvidia \
      -e NGC_API_KEY \
      -u $(id -u) \
      -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache/" \
      nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0 \
      list-model-profiles
    

    Review the output to determine the model profiles to cache.

  4. Download the model profiles to cache:

    $ docker run --rm \
      --gpus=all --runtime=nvidia \
      -e NGC_API_KEY \
      -u $(id -u) \
      -v "$AIR_GAP_NIM_CACHE:/opt/nim/.cache/" \
      nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0 \
      download-to-cache -p <profile-one> [-p <profile-two> -p <...>]
    

    For more information about the commands, refer to Utilities in the NIM for LLMs documentation.
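
After the download finishes, the cache directory has to be moved to the air-gapped system. The following is a minimal sketch of one way to package and verify it, assuming GNU `tar` and `sha256sum` are available on both machines. The function names `pack_cache` and `unpack_cache` and the archive name are hypothetical and are not part of the microservice or its utilities.

```shell
# pack_cache DIR ARCHIVE: archive the contents of DIR and write a checksum file.
# Illustrative helper; not a NIM utility.
pack_cache() {
  local dir="$1" archive="$2"
  # -C makes the archive paths relative to the cache directory.
  tar -C "$dir" -czf "$archive" . || return 1
  # Record only the file name so the checksum verifies from any location.
  ( cd "$(dirname "$archive")" &&
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# unpack_cache ARCHIVE DIR: verify the checksum, then extract into DIR.
# Expects ARCHIVE.sha256 to sit next to ARCHIVE.
unpack_cache() {
  local archive="$1" dir="$2"
  ( cd "$(dirname "$archive")" &&
    sha256sum -c "$(basename "$archive").sha256" ) || return 1
  mkdir -p "$dir" && tar -C "$dir" -xzf "$archive"
}
```

For example, run `pack_cache "${AIR_GAP_NIM_CACHE}" "$PWD/air-gap-nim-cache.tar.gz"` on the connected machine, copy the archive and its `.sha256` file across, and then run `unpack_cache` on the air-gapped system with the destination directory of your choice. Verifying the checksum before extracting catches an incomplete or corrupted copy before the microservice tries to load it.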

Running on Air-Gapped Systems

After the container exits and the model profiles are downloaded, transfer the model cache directory, ~/.cache/air-gap-nim-cache, to a path that is accessible to the air-gapped system. Perform the following steps on the air-gapped system to start the microservice and use the model cache:

  1. Export an environment variable with the path to the model cache:

    $ export LOCAL_NIM_CACHE=<path-to-transferred-model-cache>
    
  2. Start the container:

    $ docker run -d \
      --name nemoguard-jailbreakdetect \
      --gpus=all --runtime=nvidia \
      -u $(id -u) \
      -p 8000:8000 \
      -v "$LOCAL_NIM_CACHE:/opt/nim/.cache/" \
      -e NIM_MODEL_PROFILE=<profile-one> \
      nvcr.io/nim/nvidia/nemoguard-jailbreak-detect:1.0.0
    

    This command differs from the usual command to start the microservice in the following ways:

    • The NGC_API_KEY environment variable is omitted because the microservice does not download model profiles.

    • Optional: The NIM_MODEL_PROFILE variable specifies the model profile to run.
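
Because the container cannot download a missing profile on an air-gapped system, it can help to confirm that the transferred cache is in place before starting the container. The following is a minimal sketch of such a pre-flight check. The function name `check_cache` is hypothetical, and the check only confirms that the directory exists, is readable by the current user, and is not empty. It does not validate the profile contents.

```shell
# check_cache DIR: succeed only if DIR exists, is readable, and is not empty.
# Illustrative helper; not a NIM utility.
check_cache() {
  local dir="$1"
  if [ ! -d "$dir" ] || [ ! -r "$dir" ]; then
    echo "model cache not found or not readable: $dir" >&2
    return 1
  fi
  # ls -A lists every entry except . and ..; empty output means an empty cache.
  if [ -z "$(ls -A "$dir")" ]; then
    echo "model cache is empty: $dir" >&2
    return 1
  fi
  echo "model cache looks usable: $dir"
}
```

Running `check_cache "${LOCAL_NIM_CACHE}"` as the same user who starts the container is a reasonable choice, because the `-u $(id -u)` argument makes the container process use that user's ID when it reads the mounted cache.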