NeMo Guardrails Microservice Deployment Guide
Find more information on prerequisites and configuring the key parts of the values.yaml file in the NeMo Microservices Helm Chart.
Prerequisites
- A persistent volume provisioner that uses network storage such as NFS, S3, vSAN, and so on. The guardrails configuration is stored in persistent storage. 
- Optional: Install NVIDIA GPU Operator if your nodes have NVIDIA GPUs. The demonstration configuration shown on this page does not require NVIDIA GPUs. 
- To use a single model from one locally-deployed NIM for LLMs container, deploy the container and note its service name so that you can specify it in the NIM_ENDPOINT_URL environment variable when you install the NeMo Microservices Helm Chart.
- To use models from multiple locally-deployed NIM for LLMs containers, deploy the following services as prerequisites: 
Install NeMo Guardrails as a Standalone Microservice
Follow the steps below to install NeMo Guardrails as a standalone microservice.
- Set the environment variables with your NGC API key and NVIDIA API key:
    $ export NGC_CLI_API_KEY="M2..."
    $ export NVIDIA_API_KEY="nvapi-..."
- Create a namespace for the microservice:
    $ kubectl create namespace guardrails-ms
- Add a Docker registry secret for downloading the container image from NVIDIA NGC:
    $ kubectl create secret -n guardrails-ms docker-registry nvcrimagepullsecret \
        --docker-server=nvcr.io \
        --docker-username='$oauthtoken' \
        --docker-password=$NGC_CLI_API_KEY
- Go to the NeMo Microservices Helm Chart page in the NGC Catalog and select the desired version of the chart. For more information about using Helm charts provided in the NGC Catalog in general, see Helm Charts in the NGC Catalog User Guide.
- Download the chart using the helm fetch command with your user name and NGC API key as follows:
    $ helm fetch "https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nemo-microservices-helm-chart-25.7.0.tgz" \
        --username='$oauthtoken' \
        --password=$NGC_CLI_API_KEY
- Save the default chart values in a file:
    $ helm show values nemo-microservices-helm-chart > values.yaml
- Edit the values.yaml file as needed or create a custom values file. For more information about the default values, refer to NeMo Microservices Helm Chart.
- By default, the values.yaml file is configured to install all of the microservices and use NIM Proxy to route requests to deployed NIM for LLMs microservices. To install the service as a standalone microservice, make the following value overrides:
    customizer:
      enabled: false
    data-store:
      enabled: false
    entity-store:
      enabled: false
    nemo-operator:
      enabled: false
    evaluator:
      enabled: false
    guardrails:
      enabled: true
    deployment-management:
      enabled: true
    nim-operator:
      enabled: true
    nim-proxy:
      enabled: true
- Install the chart and then port-forward the service:
    helm --namespace guardrails-ms install guardrails-ms \
        nemo-microservices-helm-chart \
        -f values.yaml
  Partial Output
    Get the application URL by running these commands:
      export POD_NAME=...
      export CONTAINER_PORT=...
      echo "Visit http://127.0.0.1:8080 to use your application"
      kubectl port-forward $POD_NAME 8080:$CONTAINER_PORT
  Running the export and kubectl commands prints the URL that you can use to interact with the microservice.
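The installation steps above depend on two environment variables and a fixed set of value overrides, so it can help to check the variables first and keep the overrides in their own file rather than editing values.yaml by hand. The following is a minimal sketch under those assumptions. The helper names require_env and standalone_overrides are hypothetical and are not part of the chart or any NVIDIA tooling; the YAML keys are copied from the override list above.

```shell
#!/bin/sh
# Sketch only: require_env and standalone_overrides are hypothetical helpers,
# not part of the NeMo Microservices Helm Chart or any NVIDIA tooling.

# Return non-zero (and name the variable on stderr) if any listed
# environment variable is unset or empty.
require_env() {
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "$value" ]; then
      echo "missing required environment variable: ${name}" >&2
      return 1
    fi
  done
}

# Print the standalone overrides listed in the step above as YAML on stdout.
standalone_overrides() {
  cat <<'EOF'
customizer:
  enabled: false
data-store:
  enabled: false
entity-store:
  enabled: false
nemo-operator:
  enabled: false
evaluator:
  enabled: false
guardrails:
  enabled: true
deployment-management:
  enabled: true
nim-operator:
  enabled: true
nim-proxy:
  enabled: true
EOF
}
```

One possible use: run require_env NGC_CLI_API_KEY NVIDIA_API_KEY before the kubectl and helm commands, write the overrides with standalone_overrides > standalone-values.yaml (a hypothetical file name), and pass -f standalone-values.yaml after -f values.yaml on the helm install line. When several -f files are given, Helm gives precedence to the last one.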
To install NeMo Guardrails with NIM for LLMs
If you want to use locally deployed NIM for LLMs microservices directly instead of using NIM Proxy, update the values.yaml file to specify the service address for NIM for LLMs as follows.
- Update the values.yaml file to specify the service address for NIM for LLMs:
    customizer:
      enabled: false
    data-store:
      enabled: false
    entity-store:
      enabled: false
    nemo-operator:
      enabled: false
    evaluator:
      enabled: false
    guardrails:
      env:
        # NIM Proxy service address
        # NIM_ENDPOINT_URL: nemo-nim-proxy:8000
        # A single NIM for LLMs service address
        NIM_ENDPOINT_URL: <meta-llama3-8b-instruct:8000>
    deployment-management:
      enabled: false
    nim-operator:
      enabled: false
    nim-proxy:
      enabled: false
- Optional: Configure access to models on build.nvidia.com.
  - Add the following secret that populates the NVIDIA_API_KEY environment variable in the container:
      $ kubectl create secret -n guardrails-ms generic nvidia-api-secret \
          --from-literal=NVIDIA_API_KEY=$NVIDIA_API_KEY
  - Update the values.yaml file so that NVIDIA_API_KEY is populated in the container from the secret:
      guardrails:
        guardrails:
          nvcfAPIKeySecretName: nvidia-api-secret
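The NIM_ENDPOINT_URL override in the first step above is shown with a placeholder in angle brackets. The sketch below renders the same overrides with a real service address substituted in, and refuses a value that is empty or still contains the brackets. It assumes the brackets are placeholder markers rather than literal syntax, and nim_direct_overrides is a hypothetical helper name, not something the chart provides.

```shell
#!/bin/sh
# Sketch only: nim_direct_overrides is a hypothetical helper, not part of the chart.
# It prints the direct-to-NIM overrides with the given service address filled in.
nim_direct_overrides() {
  addr="${1:-}"
  # Assumption: angle brackets mark a placeholder, so reject an empty value
  # or one that still contains "<" or ">".
  case "$addr" in
    ''|*'<'*|*'>'*)
      echo "usage: nim_direct_overrides SERVICE:PORT (without < >)" >&2
      return 1
      ;;
  esac
  cat <<EOF
customizer:
  enabled: false
data-store:
  enabled: false
entity-store:
  enabled: false
nemo-operator:
  enabled: false
evaluator:
  enabled: false
guardrails:
  env:
    NIM_ENDPOINT_URL: ${addr}
deployment-management:
  enabled: false
nim-operator:
  enabled: false
nim-proxy:
  enabled: false
EOF
}
```

For example, nim_direct_overrides meta-llama3-8b-instruct:8000 > nim-direct-values.yaml writes the overrides to a separate file (the file name is illustrative) that can be passed to helm install with an additional -f option.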
 
Set Up NeMo Guardrails to Use NIM Endpoints from build.nvidia.com
- If you set the NIM_ENDPOINT_URL environment variable to a NIM endpoint from build.nvidia.com, add the following secret that populates the NIM_ENDPOINT_API_KEY environment variable in the container:
    $ kubectl create secret -n guardrails-ms generic nim-endpoint-api-secret \
        --from-literal=nim-endpoint-api-key=$NVIDIA_API_KEY
- Edit the values.yaml file with the following changes:
    guardrails:
      env:
        NIM_ENDPOINT_URL: <nim-endpoint-url>
        NIM_ENDPOINT_API_KEY:
          valueFrom:
            secretKeyRef:
              name: nim-endpoint-api-secret
              key: nim-endpoint-api-key