NeMo Guardrails Microservice Deployment Guide#
Find more information on prerequisites and configuring the key parts of the values.yaml file in the NeMo Microservices Helm Chart.
Prerequisites#
A persistent volume provisioner that uses network storage such as NFS, S3, or vSAN. The guardrails configuration is stored in persistent storage.
To use a single model from one locally deployed NIM for LLMs container, deploy the container and note its service name so that you can specify it in the NIM_ENDPOINT_URL environment variable when you install the NeMo Microservices Helm Chart.
To use models from multiple locally deployed NIM containers, deploy the following services as prerequisites:
An external PostgreSQL database. Refer to Configure with External PostgreSQL Database for more information.
(Optional) Install NVIDIA GPU Operator if your nodes have NVIDIA GPUs. The demonstration configuration shown on this page does not require NVIDIA GPUs.
Install NeMo Guardrails as a Standalone Microservice#
Follow the steps below to install NeMo Guardrails as a standalone microservice.
Set the environment variables with your NGC API key and NVIDIA API key:
$ export NGC_CLI_API_KEY="M2..."
$ export NVIDIA_API_KEY="nvapi-..."
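Before continuing, it can help to confirm that both variables are set in the current shell, because the later kubectl and helm commands would otherwise pass an empty value. The following is a minimal sketch; the helper name require_env is hypothetical and is not part of any NVIDIA tooling.

```shell
# Hypothetical helper: check that each named environment variable is set.
# Returns non-zero if any variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Look the variable up by name; ':-' keeps this safe under 'set -u'.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, run require_env NGC_CLI_API_KEY NVIDIA_API_KEY and continue only if it succeeds.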
Create a namespace for the microservice:
$ kubectl create namespace guardrails-ms
Add a Docker registry secret for downloading the container image from NVIDIA NGC:
$ kubectl create secret -n guardrails-ms docker-registry nvcrimagepullsecret \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password=$NGC_CLI_API_KEY
Go to the NeMo Microservices Helm Chart page in the NGC Catalog and select the desired version of the chart. For more information about using Helm charts provided in the NGC Catalog in general, see Helm Charts in the NGC Catalog User Guide.
Add the chart repository with the helm repo add command, using your user name and NGC API key as follows:
$ helm repo add nmp "https://helm.ngc.nvidia.com/nvidia/nemo-microservices" \
    --username='$oauthtoken' \
    --password=$NGC_CLI_API_KEY
Then run helm repo update.
Save the default chart values in a file for reference:
$ helm show values nmp/nemo-microservices-helm-chart > values.yaml
Edit the values.yaml file as needed or create a custom values file. For more information about the default values, refer to NeMo Microservices Helm Chart.
By default, the values.yaml file is configured to install all of the microservices and to use NIM Proxy to route requests to deployed NIM for LLMs microservices. To install the service as a standalone microservice, make the following value overrides:
tags:
  platform: false
  guardrails: true
Or pass these values to the helm install command with --set tags.platform=false --set tags.guardrails=true, which is equivalent.
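As an alternative to editing the full values.yaml file, the standalone overrides can be kept in a small custom values file. This is only a sketch; the file name guardrails-standalone.yaml is an arbitrary choice and is not required by the chart.

```shell
# Write only the standalone overrides to a custom values file.
# The file name is an arbitrary example, not something the chart expects.
cat > guardrails-standalone.yaml <<'EOF'
tags:
  platform: false
  guardrails: true
EOF
```

The file can then be passed to helm install with -f guardrails-standalone.yaml. When several -f files are given, values in later files take precedence over earlier ones.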
Install the chart and then port-forward the service:
$ helm --namespace guardrails-ms install guardrails-ms nmp/nemo-microservices-helm-chart \
    -f values.yaml
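The install command above does not show the port-forward step. A minimal sketch follows; the helper name is hypothetical, and the service name and port are placeholders, so list the services first and substitute the values that your release actually created.

```shell
# Hypothetical helper: forward a local port to the Guardrails service.
# Both arguments are placeholders; find the real values with:
#   kubectl get svc -n guardrails-ms
forward_guardrails() {
  service_name="$1"   # service name reported by 'kubectl get svc'
  port="$2"           # service port reported by 'kubectl get svc'
  kubectl port-forward -n guardrails-ms "service/${service_name}" "${port}:${port}"
}
```

Call it as forward_guardrails <service-name> <port>. The port-forward runs in the foreground until you interrupt it.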
Install NeMo Guardrails with NIM for LLMs#
If you want to use locally deployed NIM for LLMs microservices directly instead of using NIM Proxy, complete the following steps.
Update the values.yaml file to specify the service address for NIM for LLMs:
tags:
  platform: false
  guardrails: true
nim:
  enabled: true
guardrails:
  env:
    # NIM Proxy service address
    # NIM_ENDPOINT_URL: nemo-nim-proxy:8000
    # A single NIM for LLMs service address
    NIM_ENDPOINT_URL: <meta-llama3-8b-instruct:8000>
deployment-management:
  enabled: false
nim-proxy:
  enabled: false
Optional: Configure access to models on build.nvidia.com.
Add the following secret that populates the NVIDIA_API_KEY environment variable in the container:
$ kubectl create secret -n guardrails-ms generic nvidia-api-secret \
    --from-literal=NVIDIA_API_KEY=$NVIDIA_API_KEY
Update the values.yaml file so that NVIDIA_API_KEY is populated in the container from the secret:
guardrails:
  guardrails:
    nvcfAPIKeySecretName: nvidia-api-secret
Set Up NeMo Guardrails to Use NIM Endpoints from build.nvidia.com#
If you set the NIM_ENDPOINT_URL environment variable to a NIM endpoint from build.nvidia.com, add the following secret that populates the NIM_ENDPOINT_API_KEY environment variable in the container:
$ kubectl create secret -n guardrails-ms generic nim-endpoint-api-secret \
    --from-literal=nim-endpoint-api-key=$NVIDIA_API_KEY
Edit the values.yaml file with the following changes:
guardrails:
  env:
    NIM_ENDPOINT_URL: <nim-endpoint-url>
    NIM_ENDPOINT_API_KEY:
      valueFrom:
        secretKeyRef:
          name: nim-endpoint-api-secret
          key: nim-endpoint-api-key
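The secretKeyRef entry only resolves if the secret contains a key with exactly the referenced name. One way to check this is sketched below; the helper name secret_has_key is hypothetical, and the check relies on kubectl describe secret listing each data key at the start of a line followed by a colon.

```shell
# Hypothetical helper: confirm that a secret in the guardrails-ms namespace
# contains the key that secretKeyRef points at.
secret_has_key() {
  secret_name="$1"
  key_name="$2"
  # 'kubectl describe secret' lists each data key as "<key>:  <N> bytes".
  kubectl describe secret "$secret_name" -n guardrails-ms | grep -q "^${key_name}:"
}
```

For the secret created in this section, the call would be secret_has_key nim-endpoint-api-secret nim-endpoint-api-key, which succeeds only when the key is present.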
Configure with External PostgreSQL Database#
By default, the NeMo Microservices Helm Chart uses the Bitnami PostgreSQL chart to deploy a PostgreSQL database. To use an external PostgreSQL database, refer to PostgreSQL.
Configure for High Availability#
To deploy the NeMo Guardrails microservice in a high-availability configuration, set up the service with a local or an external PostgreSQL database. The NeMo Guardrails microservice is deployed with a single replica by default. You can increase the number of replicas to achieve high availability and load balancing.
For example, to deploy with three replicas for failover and load balancing, add the following to the values.yaml file:
guardrails:
  replicaCount: 3
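If the release is already installed, the changed replica count takes effect only after the release is upgraded with the edited values file. A minimal sketch, assuming the release name and chart reference used earlier on this page; the helper name apply_replicas is hypothetical.

```shell
# Hypothetical helper: apply an updated values file to the existing release,
# then list the pods so the replica count can be checked by eye.
apply_replicas() {
  values_file="$1"
  helm --namespace guardrails-ms upgrade guardrails-ms \
    nmp/nemo-microservices-helm-chart -f "$values_file" &&
    kubectl get pods -n guardrails-ms
}
```

Call it as apply_replicas values.yaml. The pod list is printed only if the upgrade command succeeds.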