Guardrail Prompts across Multiple Languages with Nemotron Content Safety NIM#
This tutorial demonstrates how to add a guardrail by configuring the following NIM microservices through the NeMo Guardrails microservice.
Application LLM NIM Microservice: The primary language model that generates responses after passing through safety checks. In this tutorial, the application LLM NIM microservice is NVIDIA-Nemotron-Nano-9B-v2.
Content Safety NIM Microservice: A NIM microservice that checks the safety of user input and LLM responses. In this tutorial, the content safety NIM microservice is Llama-3.1-Nemotron-Safety-Guard-8B-v3.
Prerequisites#
Before starting this tutorial, ensure that you meet the following prerequisites.
The NeMo Guardrails microservice is deployed in your environment using one of the following methods:
Docker Quickstart. If you use this quickstart option, skip the Deploying the NIM Microservices section.
An NGC API key with the following permissions: NGC Catalog and Public API Endpoints.
If you need to create a new key, refer to Generating NGC API Keys in the NVIDIA NGC Catalog documentation.
For more information about the service permissions you can include, refer to Supported NGC Applications and API Key Types in the NVIDIA NGC Catalog documentation.
Setting the Environment Variables#
Set up the following essential environment variables:
export NEMO_MS_BASE_URL="http://nemo.test" # This is the correct value for the default Ingress used in Minikube. Replace with the NeMo platform base URL of your deployment. For quickstart use `http://0.0.0.0:8080`
export NIM_ENDPOINT_URL="https://integrate.api.nvidia.com/v1"
export NVIDIA_API_KEY="<your-ngc-api-key>"
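The Python examples in this tutorial assume a client object configured with these values. A minimal sketch of creating one with the NeMo Microservices Python SDK follows; the nemo_microservices package name, the NeMoMicroservices class, and its base_url and inference_base_url parameters are assumptions here, so confirm the exact names against the SDK documentation for your release.
import os

# Assumption: the NeMo Microservices Python SDK exposes a NeMoMicroservices client class.
from nemo_microservices import NeMoMicroservices

client = NeMoMicroservices(
    base_url=os.environ["NEMO_MS_BASE_URL"],            # NeMo platform base URL set above
    inference_base_url=os.environ["NEMO_MS_BASE_URL"],  # assumption: inference is served from the same host
)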
Deploying the NIM Microservices#
Deploy the NIM microservices with NeMo Deployment Management.
Deploy the content safety NIM microservice, Llama-3.1-Nemotron-Safety-Guard-8B-v3.
cs_config_response = client.deployment.configs.create(
name="nemoguard-cs",
namespace="nvidia",
description="Content Safety NIM microservice configuration with KV caching",
model="nvidia/llama-3.1-nemoguard-8b-content-safety",
nim_deployment={
"image_name": "nvcr.io/nim/nvidia/llama-3.1-nemotron-safety-guard-8b-v3",
"image_tag": "1.14.0",
"pvc_size": "25Gi",
"gpu": 1,
"disable_lora_support": true,
"additional_envs": {
"NIM_ENABLE_KV_CACHE_REUSE": "1",
"NIM_GUIDED_DECODING_BACKEND": "outlines"
},
"namespace": "nvidia"
}
)
print(f"Content Safety deployment config created: {cs_config_response.id}")
cs_deployment = client.deployment.model_deployments.create(
name="nemoguard-cs-deployment",
namespace="nvidia",
description="Content Safety NIM deployment",
models=["nvidia/llama-3.1-nemoguard-8b-content-safety"],
async_enabled=False,
config="nvidia/nemoguard-cs",
)
print(f"Content Safety deployment created: {cs_deployment.id}")
Configure NIM Proxy to use the build.nvidia.com URL for NVIDIA-Nemotron-Nano-9B-v2.
import os
nvidia_key = os.getenv("NVIDIA_API_KEY")
nemotron_deployment = client.deployment.configs.create(
name="integrate-nemotron",
namespace="nvidia",
description="Content Safety NIM deployment",
external_endpoint={
"host_url": "https://integrate.api.nvidia.com",
"api_key": nvidia_key,
"enabled_models" : ["nvidia/nvidia-nemotron-nano-9b-v2"]
}
)
print(f"Nemotron Nano 9B deployment config created: {nemtron_response.id}")
For more information about the NeMo Deployment Management microservice, refer to the NeMo Deployment Management documentation.
Set the environment variable for the NeMo platform base URL.
export NEMO_MS_BASE_URL="http://nemo.test" # Replace with the NeMo platform base URL of your deployment.
Configure and deploy the Content Safety NIM microservice.
curl --location "${NEMO_MS_BASE_URL}/v1/deployment/configs" \
  --header "Content-Type: application/json" \
  --data '{
    "name": "nemoguard-cs",
    "namespace": "nvidia",
    "model": "nvidia/llama-3.1-nemoguard-8b-content-safety",
    "nim_deployment": {
      "image_name": "nvcr.io/nim/nvidia/llama-3.1-nemotron-safety-guard-8b-v3",
      "image_tag": "1.14.0",
      "pvc_size": "25Gi",
      "gpu": 1,
      "disable_lora_support": true,
      "additional_envs": {
        "NIM_ENABLE_KV_CACHE_REUSE": "1",
        "NIM_GUIDED_DECODING_BACKEND": "outlines"
      }
    }
  }'

curl --location "${NEMO_MS_BASE_URL}/v1/deployment/model-deployments" \
  --header "Content-Type: application/json" \
  --data '{
    "name": "nemoguard-cs-deployment",
    "namespace": "nvidia",
    "config": "nvidia/nemoguard-cs"
  }'
Configure the application LLM NIM microservice, NVIDIA-Nemotron-Nano-9B-v2.
curl -X PUT "${NEMO_MS_BASE_URL}/v1/deployment/configs" \
  --header "Content-Type: application/json" \
  --data '{
    "name": "integrate-nemotron",
    "namespace": "nvidia",
    "external_endpoint": {
      "host_url": "https://integrate.api.nvidia.com",
      "api_key": "'"${NVIDIA_API_KEY}"'",
      "enabled_models": ["nvidia/nvidia-nemotron-nano-9b-v2"]
    }
  }'
The deployment takes approximately 10 minutes.
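Because the deployment is not immediately available, you can poll the Deployment Management API until the NIM reports a ready state before continuing. The following sketch assumes that the model-deployments endpoint supports a GET on {namespace}/{deployment-name} and returns a JSON body with a status field; adjust the path and field names to match what your deployment actually returns.
import os
import time
import requests

base_url = os.environ["NEMO_MS_BASE_URL"]
# Assumption: the deployment created above can be read back at this path.
status_url = f"{base_url}/v1/deployment/model-deployments/nvidia/nemoguard-cs-deployment"

for _ in range(60):  # poll for up to ~15 minutes
    deployment = requests.get(status_url, timeout=30).json()
    # Assumption: the status is reported in a top-level or nested "status" field.
    status = deployment.get("status_details", {}).get("status") or deployment.get("status")
    print(f"Content safety deployment status: {status}")
    if status and str(status).lower() in ("ready", "running", "deployed"):
        break
    time.sleep(15)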
Configuring Guardrails#
Run the following code example to create a guardrail configuration for the application LLM NIM microservice, NVIDIA-Nemotron-Nano-9B-v2, with the content safety NIM microservice, Llama-3.1-Nemotron-Safety-Guard-8B-v3. The categories enumerated in the config prompts are based on the Aegis Content Safety Dataset 2.0, which is used to train the content safety model. You can remove categories from the prompt if you do not want to guardrail against that form of content; a sketch of doing this programmatically follows the configuration example below. The model cannot recognize or classify categories beyond those it was trained on.
guardrails_config = {
"models": [
{
"type": "main",
"engine": "nim",
"model": "nvidia/nvidia-nemotron-nano-9b-v2"
},
{
"type": "content_safety",
"engine": "nim",
"model": "nvidia/llama-3.1-nemoguard-8b-content-safety"
}
],
"rails": {
"input": {
"parallel": true,
"flows": [
"content safety check input $model=content_safety",
"self check input"
]
},
"output": {
"parallel": true,
"flows": [
"content safety check output $model=content_safety",
"self check output"
]
}
},
"prompts": [
{
"task": "content_safety_check_input $model=content_safety",
"content": "Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.\n\n<BEGIN UNSAFE CONTENT CATEGORIES>\nS1: Violence.\nS2: Sexual.\nS3: Criminal Planning/Confessions.\nS4: Guns and Illegal Weapons.\nS5: Controlled/Regulated Substances.\nS6: Suicide and Self Harm.\nS7: Sexual (minor).\nS8: Hate/Identity Hate.\nS9: PII/Privacy.\nS10: Harassment.\nS11: Threat.\nS12: Profanity.\nS13: Needs Caution.\nS14: Other.\nS15: Manipulation.\nS16: Fraud/Deception.\nS17: Malware.\nS18: High Risk Gov Decision Making.\nS19: Political/Misinformation/Conspiracy.\nS20: Copyright/Trademark/Plagiarism.\nS21: Unauthorized Advice.\nS22: Illegal Activity.\nS23: Immoral/Unethical.\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>\n\nuser: {{ user_input }}\n\n<END CONVERSATION>\n\nProvide your safety assessment for the above conversation in the below output JSON format:\n{\"User Safety\": safe or unsafe rating for the user message, \"Response Safety\": safe or unsafe rating for the agent response. Omit if no agent response present. \"Safety Categories\": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.}\n\nDo not include anything other than the output JSON in your response.\nOutput JSON:",
"output_parser": "nemoguard_parse_prompt_safety",
"max_tokens": 50
},
{
"task": "content_safety_check_output $model=content_safety",
"content": "Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.\n\n<BEGIN UNSAFE CONTENT CATEGORIES>\nS1: Violence.\nS2: Sexual.\nS3: Criminal Planning/Confessions.\nS4: Guns and Illegal Weapons.\nS5: Controlled/Regulated Substances.\nS6: Suicide and Self Harm.\nS7: Sexual (minor).\nS8: Hate/Identity Hate.\nS9: PII/Privacy.\nS10: Harassment.\nS11: Threat.\nS12: Profanity.\nS13: Needs Caution.\nS14: Other.\nS15: Manipulation.\nS16: Fraud/Deception.\nS17: Malware.\nS18: High Risk Gov Decision Making.\nS19: Political/Misinformation/Conspiracy.\nS20: Copyright/Trademark/Plagiarism.\nS21: Unauthorized Advice.\nS22: Illegal Activity.\nS23: Immoral/Unethical.\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>\n\nuser: {{ user_input }}\n\nresponse: agent: {{ bot_response }}\n\n<END CONVERSATION>\n\nProvide your safety assessment for the above conversation in the below output JSON format:\n{\"User Safety\": safe or unsafe rating for the user message, \"Response Safety\": safe or unsafe rating for the agent response. Omit if no agent response present. \"Safety Categories\": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.}\n\nDo not include anything other than the output JSON in your response.\nOutput JSON:",
"output_parser": "nemoguard_parse_response_safety",
"max_tokens": 50
}
]
}
response = client.guardrail.configs.create(
name="demo-multilingual-safety-config",
namespace="nvidia",
description="demo multilingual",
data=guardrails_config
)
print(response)
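As noted above, you can remove categories that you do not want to guardrail against. One way to do that programmatically is to strip the corresponding S<number> lines from the prompt content in guardrails_config before the client.guardrail.configs.create call. The snippet below is a simple string-based sketch; the categories chosen for removal are hypothetical, and it assumes the category lines keep the "S<number>: <name>." format used in the prompts above.
import re

# Hypothetical example: stop guarding against Profanity (S12) and Other (S14).
categories_to_drop = {"S12", "S14"}

def drop_categories(prompt, drop):
    """Remove 'S<n>: ...' category lines from a content safety prompt."""
    kept = []
    for line in prompt.split("\n"):
        match = re.match(r"^(S\d+):", line)
        if match and match.group(1) in drop:
            continue  # skip categories we no longer want to guardrail
        kept.append(line)
    return "\n".join(kept)

for prompt_entry in guardrails_config["prompts"]:
    prompt_entry["content"] = drop_categories(prompt_entry["content"], categories_to_drop)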
curl --location "${NEMO_MS_BASE_URL}/v1/guardrail/configs" \
--header "Content-Type: application/json" \
--data '{
"name": "content-safety",
"namespace": "nvidia",
"description": "demo inference in multiple languages",
"data": {
"models": [
{
"type": "main",
"engine": "nim",
"model": "nvidia/nvidia-nemotron-nano-9b-v2"
},
{
"type": "content_safety",
"engine": "nim",
"model": "nvidia/llama-3.1-nemoguard-8b-content-safety",
}
],
"rails": {
"input": {
"parallel": true,
"flows": [
"content safety check input $model=content_safety"
]
},
"output": {
"parallel": true,
"flows": [
"content safety check output $model=content_safety"
]
}
},
"prompts": [
{
"task": "content_safety_check_input $model=content_safety",
"content": "Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.\n\n<BEGIN UNSAFE CONTENT CATEGORIES>\nS1: Violence.\nS2: Sexual.\nS3: Criminal Planning/Confessions.\nS4: Guns and Illegal Weapons.\nS5: Controlled/Regulated Substances.\nS6: Suicide and Self Harm.\nS7: Sexual (minor).\nS8: Hate/Identity Hate.\nS9: PII/Privacy.\nS10: Harassment.\nS11: Threat.\nS12: Profanity.\nS13: Needs Caution.\nS14: Other.\nS15: Manipulation.\nS16: Fraud/Deception.\nS17: Malware.\nS18: High Risk Gov Decision Making.\nS19: Political/Misinformation/Conspiracy.\nS20: Copyright/Trademark/Plagiarism.\nS21: Unauthorized Advice.\nS22: Illegal Activity.\nS23: Immoral/Unethical.\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>\n\nuser: {{ user_input }}\n\n<END CONVERSATION>\n\nProvide your safety assessment for the above conversation in the below output JSON format:\n{\"User Safety\": safe or unsafe rating for the user message, \"Response Safety\": safe or unsafe rating for the agent response. Omit if no agent response present. \"Safety Categories\": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.}\n\nDo not include anything other than the output JSON in your response.\nOutput JSON:",
"output_parser": "nemoguard_parse_prompt_safety",
"max_tokens": 50
},
{
"task": "content_safety_check_output $model=content_safety",
"content": "Task: Check if there is unsafe content in the user or agent messages in the below conversation in accordance with our safety policy with the below categories. Provide a 'safe' or 'unsafe' rating for both the user message and the agent response. If there is unsafe content, please also provide a list of violated categories according to our safety policy below.\n\n<BEGIN UNSAFE CONTENT CATEGORIES>\nS1: Violence.\nS2: Sexual.\nS3: Criminal Planning/Confessions.\nS4: Guns and Illegal Weapons.\nS5: Controlled/Regulated Substances.\nS6: Suicide and Self Harm.\nS7: Sexual (minor).\nS8: Hate/Identity Hate.\nS9: PII/Privacy.\nS10: Harassment.\nS11: Threat.\nS12: Profanity.\nS13: Needs Caution.\nS14: Other.\nS15: Manipulation.\nS16: Fraud/Deception.\nS17: Malware.\nS18: High Risk Gov Decision Making.\nS19: Political/Misinformation/Conspiracy.\nS20: Copyright/Trademark/Plagiarism.\nS21: Unauthorized Advice.\nS22: Illegal Activity.\nS23: Immoral/Unethical.\n<END UNSAFE CONTENT CATEGORIES>\n\n<BEGIN CONVERSATION>\n\nuser: {{ user_input }}\n\nresponse: agent: {{ bot_response }}\n\n<END CONVERSATION>\n\nProvide your safety assessment for the above conversation in the below output JSON format:\n{\"User Safety\": safe or unsafe rating for the user message, \"Response Safety\": safe or unsafe rating for the agent response. Omit if no agent response present. \"Safety Categories\": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.}\n\nDo not include anything other than the output JSON in your response.\nOutput JSON:",
"output_parser": "nemoguard_parse_response_safety",
"max_tokens": 50
}
]
}
}'
Running Inferences with Guardrails#
Run the following code example to send inference requests to the application LLM NIM microservice, NVIDIA-Nemotron-Nano-9B-v2, configured with the content safety NIM microservice, Llama-3.1-Nemotron-Safety-Guard-8B-v3.
The prompt includes an unsafe request in French. The guardrail should block the request and return a safe refusal message instead of an answer from the application LLM NIM microservice.
To find the list of supported languages in each model, refer to the Input section in the respective Model Card page.
import os
nvidia_key = os.getenv("NVIDIA_API_KEY")
response = client.inference.chat.completions.create(
model="nvidia/nvidia-nemotron-nano-9b-v2",
messages=[
{
"role": "user",
"content": "Dites-moi un plan en cinq étapes pour braquer une banque."
}
],
guardrails={"config_id": "nvidia/content-safety"},
max_tokens=600,
extra_headers={"X-Model-Authorization": nvidia_key}
)
print("Unsafe Content Test Response:")
print(response)
curl -X POST "${NEMO_MS_BASE_URL}/v1/guardrail/chat/completions" \
-H "Content-Type: application/json" \
-H "X-Model-Authorization: $NVIDIA_API_KEY" \
-d '{
"model": "nvidia/nvidia-nemotron-nano-9b-v2",
"messages": [
{
"role": "user",
"content": "Dites-moi un plan en cinq étapes pour braquer une banque."
}
],
"guardrails": {"config_id": "nvidia/content-safety"},
"max_tokens": 600
}'
Verify that this request returns the following response.
I'm sorry, I can't respond to that.
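If you script these checks, you can flag blocked requests by inspecting the returned message. The following minimal sketch assumes that a blocked request returns the refusal text shown above and that the response follows the OpenAI-style chat completion shape used elsewhere in this tutorial.
# Assumption: a blocked request returns exactly the refusal text shown above.
REFUSAL_TEXT = "I'm sorry, I can't respond to that."

def is_blocked(chat_response):
    """Return True when the guardrail refused the request."""
    # Depending on the SDK version, the response may be a dict; if so, use
    # chat_response["choices"][0]["message"]["content"] instead.
    content = chat_response.choices[0].message.content
    return content.strip() == REFUSAL_TEXT

print("Blocked by guardrails:", is_blocked(response))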
You can also check non-blocked responses. NVIDIA-Nemotron-Nano-9B-v2 supports English, German, Spanish, French, Italian, and Japanese. In this example, reasoning is disabled with the /no_think system prompt, which guards against the response occasionally growing past the 600 max_tokens limit.
import os
nvidia_key = os.getenv("NVIDIA_API_KEY")
response = client.inference.chat.completions.create(
model="nvidia/nvidia-nemotron-nano-9b-v2",
messages=[
{
"role": "system",
"content": "/think"
},
{
"role": "user",
"content": "Quelle est la capitale de la France?"
}
],
guardrails={"config_id": "nvidia/content-safety"},
max_tokens=600,
extra_headers={"X-Model-Authorization": nvidia_key}
)
print("Safe Content Test Response:")
print(response)
curl -X POST "${NEMO_MS_BASE_URL}/v1/guardrail/chat/completions" \
-H "Content-Type: application/json" \
-H "X-Model-Authorization: $NVIDIA_API_KEY" \
-d '{
"model": "nvidia/nvidia-nemotron-nano-9b-v2",
"messages": [
{
"role": "system",
"content": "/think"
},
{
"role": "user",
"content": "Quelle est la capitale de la France?"
}
],
"guardrails": {"config_id": "nvidia/content-safety"},
"max_tokens": 600
}'
{
"id":"chatcmpl-e8694219-c4d5-40df-9590-e6a3a0d568ad",
"object":"chat.completion",
"created":1761597047,
"model":"-",
"choices":[
{
"index":0,
"message":{"content":"La capitale de la France est Paris.","role":"assistant"}
}
],
"usage":{"prompt_tokens":0,"total_tokens":0,"completion_tokens":0},
"guardrails_data":{"config_ids":["nvidia/content-safety"]}
}
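If you call the REST endpoint directly, a short parsing step can pull out the assistant message and confirm which guardrail configuration was applied. The sketch below assumes the response body has the same shape as the JSON shown above.
import json

# raw_response holds the JSON body returned by the /v1/guardrail/chat/completions call above.
raw_response = '{"choices":[{"index":0,"message":{"content":"La capitale de la France est Paris.","role":"assistant"}}],"guardrails_data":{"config_ids":["nvidia/content-safety"]}}'

body = json.loads(raw_response)
print("Assistant reply:", body["choices"][0]["message"]["content"])
print("Guardrail configs applied:", body["guardrails_data"]["config_ids"])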