Managing NeMo Guardrails#
About NeMo Guardrails#
NVIDIA NeMo Guardrails microservice enables adding programmable guardrails to LLM endpoints. NeMo Guardrails sits between your application code and the LLM to provide a way for you to adjust user prompts before sending them to the LLM and LLM responses before they are given to users.
When you deploy a NeMo Guardrails microservice, the NIM Operator creates a Deployment and Service endpoint for the NeMo Guardrails. Guardrails configurations are stored in a PVC directory and mounted into the Guardrails Deployment.
Read the NeMo Guardrails documentation for details on using guardrails.
Prerequisites#
All the common NeMo microservice prerequisites.
A NIM endpoint where your models are hosted. NIM endpoints must support the OpenAI spec and can be deployed as one of the following:
A NIM Cache and NIM Service locally on your cluster.
A NIM Proxy. Refer to the NeMo microservices documentation for details on deploying a NIM Proxy. Note that the NIM Operator does not support a NIM Proxy with multiple NIMs.
A hosted model from an LLM provider, such as an NVIDIA-hosted model from https://integrate.api.nvidia.com/v1. Hosted models require an API key and secret to access the model, as described in the Kubernetes prerequisites below.
Kubernetes
A persistent volume provisioner that uses network storage (such as NFS, S3, or vSAN) to hold NeMo Guardrails configuration files.
You can create a PVC and specify the name in the configuration file when you create the NeMo Guardrails resource, or you can request that the Operator creates a PVC.
If you plan to use a hosted NIM, create a secret containing your API key for https://build.nvidia.com or OpenAI.
Create a secret file, such as `nemo-guardrail-secret.yaml`, like the following example:

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: <nim-api-key>
  namespace: nemo
type: Opaque
stringData:
  NIM_ENDPOINT_API_KEY: <API-key>
```
Apply the secret file.
$ kubectl apply -n nemo -f nemo-guardrail-secret.yaml
Deploy NeMo Guardrails#
Update the following sample manifests with values for your cluster configuration.
Create a file, such as `nemo-guardrail.yaml`, with contents similar to the following example. If you have a NIM endpoint you'd like to use, update the `nimEndpoint.baseURL` value with your NIM Service URL and port:

```yaml
apiVersion: apps.nvidia.com/v1alpha1
kind: NemoGuardrail
metadata:
  name: nemoguardrails-sample
  namespace: nemo
spec:
  configStore:
    pvc:
      name: "pvc-guardrail-config"
      create: true
      storageClass: "<storage-class>"
      volumeAccessMode: ReadWriteMany
      size: "1Gi"
  nimEndpoint:
    baseURL: "https://integrate.api.nvidia.com/v1"
    # Required if you are using a hosted NIM endpoint. Create a secret with your API key.
    apiKeySecret: "<nim-api-key>"
  expose:
    service:
      type: ClusterIP
      port: 8000
  image:
    repository: nvcr.io/nvidia/nemo-microservices/guardrails
    tag: "25.04"
    pullPolicy: IfNotPresent
    pullSecrets:
      - ngc-secret
  metrics:
    serviceMonitor: {}
  replicas: 1
  resources:
    limits:
      cpu: "1"
      ephemeral-storage: 10Gi
```
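If your NIM is served by a NIM Service in the same cluster, `nimEndpoint.baseURL` can point at the in-cluster service instead of a hosted endpoint. A minimal sketch, assuming a hypothetical NIM Service named `meta-llama3-8b-instruct` in a `nimservice` namespace listening on port 8000; a local endpoint typically does not need an API key secret:

```yaml
# Hypothetical in-cluster endpoint; substitute your NIM Service name,
# namespace, and port.
nimEndpoint:
  baseURL: "http://meta-llama3-8b-instruct.nimservice.svc.cluster.local:8000/v1"
```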
Apply the manifest:
$ kubectl apply -n nemo -f nemo-guardrail.yaml
Optional: View information about the NeMo Guardrails services:
$ kubectl describe nemoguardrails.apps.nvidia.com -n nemo
Partial Output
```
...
Conditions:
  Last Transition Time:  2024-08-12T19:09:43Z
  Message:               Deployment is ready
  Reason:                Ready
  Status:                True
  Type:                  Ready
  Last Transition Time:  2024-08-12T19:09:43Z
  Message:
  Reason:                Ready
  Status:                False
  Type:                  Failed
State:                   Ready
```
You now have a NeMo Guardrails microservice deployed to your cluster.
This sample NemoGuardrail resource deploys the microservice with an empty configuration store. Before using NeMo Guardrails with a model, populate the configuration store as described in the NeMo Guardrails documentation, which also explains how to create a configuration and update the configuration store.
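The configuration store can also be backed by a ConfigMap instead of a PVC: create a ConfigMap that holds your guardrail configurations and pass its name in `spec.configStore.configMap.name`, as noted in the configuration reference. A minimal sketch, assuming a hypothetical ConfigMap named `guardrail-config` already exists in the `nemo` namespace:

```yaml
spec:
  configStore:
    configMap:
      # Hypothetical ConfigMap name; create it with your guardrail
      # configurations before deploying the NemoGuardrail resource.
      name: "guardrail-config"
```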
Verify NeMo Guardrails#
After NeMo Guardrails is deployed on your cluster, use the following steps to verify that the service is up and running.
Start a pod that has access to the `curl` command. Substitute any pod that has this command and meets your organization's security requirements:

```console
$ kubectl run --rm -it -n default curl --image=curlimages/curl:latest -- ash
```
After the pod starts, you are connected to the `ash` shell in the pod. Connect to the NeMo Guardrails service:

```console
$ curl -X GET "http://nemoguardrails-sample.nemo:8000/v1/guardrail/configs"
```
Example Output
```json
{"object":"list","data":[],"pagination":{"page":1,"page_size":10,"current_page_size":0,"total_pages":0,"total_results":0},"sort":"created_at"}
```
Press Ctrl+D to exit and delete the pod.
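The listing endpoint returns a paginated list object. As a quick sanity check, the sample payload above can be parsed to confirm the configuration store is empty; a minimal sketch in Python, with the example response inlined as a string:

```python
import json

# Example response body from GET /v1/guardrail/configs (copied from above).
body = '{"object":"list","data":[],"pagination":{"page":1,"page_size":10,"current_page_size":0,"total_pages":0,"total_results":0},"sort":"created_at"}'

resp = json.loads(body)

# A freshly deployed Guardrails service has an empty configuration store.
assert resp["object"] == "list"
assert resp["data"] == []
assert resp["pagination"]["total_results"] == 0
print("configuration store is empty")
```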
Configuration Reference#
The following table shows information about the commonly modified fields for the NeMo Guardrails custom resource.
| Field | Description | Default Value |
|---|---|---|
| `spec.annotations` | Specifies the user-supplied annotations to add to the pod. | None |
| `spec.configStore.configMap` | Specifies the NeMo Guardrails configuration location as a ConfigMap. Before deploying the NeMo Guardrails service, create a ConfigMap with your guardrail configurations, then pass the name as `spec.configStore.configMap.name`. | |
| `spec.configStore.pvc.create` | When set to `true`, the Operator creates a PVC for the configuration store. Refer to the NeMo Guardrails Configuration Store documentation for more details. If you deploy a NeMo microservice with an empty configuration store, you must update it with a valid configuration before you start running guardrails. | `false` |
| `spec.configStore.pvc.name` | Specifies the name for the PVC. | None |
| `spec.configStore.pvc.size` | Specifies the size, in Gi, for the PVC to create. This field is required if you specify `spec.configStore.pvc.create: true`. | None |
| `spec.configStore.pvc.storageClass` | Specifies the StorageClass for the PVC to create. Leave empty to use the cluster default StorageClass. | None |
| `spec.configStore.pvc.subPath` | Specifies to create a subpath on the PVC and cache the model profiles in the directory. | |
| `spec.configStore.pvc.volumeAccessMode` | Specifies the access mode for the PVC to create. | None |
| `spec.expose.ingress` | When set to `enabled: true`, the Operator creates an ingress for the microservice. If you have an ingress controller, values like the sample shown after this table configure an ingress for the service. | |
| `spec.expose.service.port` | Specifies the network port number for the NeMo Guardrails microservice. | `8000` |
| `spec.expose.service.type` | Specifies the Kubernetes service type to create for the NeMo microservice. | `ClusterIP` |
| `spec.groupID` | Specifies the group for the pods. This value is used to set the security context of the pod in the `fsGroup` field. | |
| `spec.image` | Specifies the repository, tag, pull policy, and pull secrets for the container image. | None |
| `spec.labels` | Specifies the user-supplied labels to add to the pod. | None |
| `spec.metrics.serviceMonitor` | Specifies a Prometheus ServiceMonitor configuration for the microservice metrics. | |
| | Specifies the key in the secret that contains the API key for accessing NVIDIA-hosted models from https://build.nvidia.com. | `NIM_ENDPOINT_API_KEY` |
| `spec.nimEndpoint.apiKeySecret` | Specifies the name of the secret that contains the API key for accessing NVIDIA-hosted models from https://build.nvidia.com. This is required if the base URL is for a NIM proxy. Generate your API key from https://build.nvidia.com. | |
| `spec.nimEndpoint.baseURL` | Specifies the base URL of the service where your NIM is hosted. This is required if you include `spec.nimEndpoint`. The default base URL for NVIDIA-hosted models is `https://integrate.api.nvidia.com/v1`. When using a hosted model, you must also set `apiKeySecret`. | |
| `spec.replicas` | Specifies the number of replicas to have on the cluster. | None |
| `spec.resources.requests` | Specifies the memory and CPU requests. | None |
| `spec.resources.limits` | Specifies the memory and CPU limits. | None |
| `spec.tolerations` | Specifies the tolerations for the pods. | None |
| `spec.userID` | Specifies the user ID for the pod. This value is used to set the security context of the pod in the `runAsUser` field. | |

A sample ingress configuration for `spec.expose.ingress`:

```yaml
ingress:
  enabled: true
  spec:
    ingressClassName: nginx
    host: demo.nvidia.example.com
    paths:
      - path: /v1/chat/completions
        pathType: Prefix
```
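Several of the fields above are commonly tuned together when scheduling the microservice. A minimal sketch, with illustrative values only (the toleration key and resource sizes are assumptions, not recommendations):

```yaml
spec:
  replicas: 2
  resources:
    requests:
      cpu: "500m"
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi
  tolerations:
    - key: "dedicated"        # assumed taint key, for illustration only
      operator: "Equal"
      value: "nemo"
      effect: "NoSchedule"
```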
Next Steps#
Refer to the NeMo Guardrails documentation for details on using guardrails and creating guardrail configurations.