About Guardrails#
Use the NVIDIA NeMo Guardrails microservice to add safety checks and content moderation to applications built on a large language model (LLM).
How NeMo Guardrails Interacts with LLMs#
The NVIDIA NeMo Guardrails microservice configures a system that places NemoGuard NIM microservices between your application and the application’s LLM. By setting up guardrail configurations tailored to your use case, you can add safety checks and content moderation to your LLMs.
A single NeMo Guardrails microservice instance can serve multiple applications and help manage multiple guardrail configurations and LLMs. The NeMo Guardrails microservice can call both internal and external LLMs, providing a solution for guarding models whether they are inside or outside the NeMo microservices cluster.
To have user inputs and the application LLM’s outputs checked by the guardrails, configure your application to call the NeMo Guardrails microservice’s inference endpoint instead of the LLM endpoint directly.
Each guardrail configuration specifies the LLM to use for text generation, the NemoGuard NIM microservices to run on input and output text, and the guardrail policies those NIM microservices apply. You select a configuration in each inference request that you send to NeMo Guardrails.
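As a sketch of what such a request looks like, the following Python snippet builds an OpenAI-compatible chat completions body with the additional `guardrails` field that selects a configuration. The base URL, configuration name, and model name are assumptions for illustration; substitute the values from your own deployment.

```python
import json

# Hypothetical values -- replace with your deployment's endpoint and config.
GUARDRAILS_BASE_URL = "http://nemo-guardrails:7331"  # assumption
CONFIG_ID = "content-safety-config"                  # assumption

# OpenAI-compatible request body. The "guardrails" field selects the
# guardrail configuration that NeMo Guardrails applies around the LLM call.
payload = {
    "model": "meta/llama-3.1-8b-instruct",  # the generation LLM (assumption)
    "messages": [
        {"role": "user", "content": "How do I reset my password?"},
    ],
    "guardrails": {"config_id": CONFIG_ID},
}

# The application would POST this payload to the Guardrails inference
# endpoint instead of calling the LLM endpoint directly, for example:
#   requests.post(f"{GUARDRAILS_BASE_URL}/v1/guardrail/chat/completions",
#                 json=payload)
print(json.dumps(payload, indent=2))
```

Because the request body is otherwise standard OpenAI chat completions format, an existing application usually only needs to change the endpoint URL and add the `guardrails` field.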
The following architecture diagram visualizes how NeMo Guardrails acts as a central hub, routing requests to different models and services for tasks like content safety and topic control. It also shows that the platform can interact with both internal and external clusters.
The diagram starts with a Chatbot and a Document Summary Service on the left.
The Chatbot sends a Chatbot Request with a Content Safety Config to NeMo Guardrails.
The Document Summary Service sends a Document Service Request with a Topic Control Config to NeMo Guardrails.
Inside the NeMo Microservices Platform, NeMo Guardrails is the central component.
The solid green arrows, representing the Content Safety Workflow, flow from NeMo Guardrails to the NemoGuard Content Safety NIM. At this stage, the NemoGuard Content Safety NIM checks the content safety of the user input. If the input passes the check, the Llama Nemotron Nano NIM receives the input and generates an output. The Content Safety NIM then checks the output from the Llama Nemotron Nano NIM. If the output passes the check, the output is sent back to the Chatbot user. If the input or output fails the check, a message indicating that the check failed is sent back to the Chatbot user instead.
The dashed orange arrows, representing the Topic Control Workflow, flow from NeMo Guardrails to the NemoGuard Topic Control NIM. At this stage, the NemoGuard Topic Control NIM checks the topic safety of the user input. If the input passes the check, the Llama Nemotron Super NIM in the external cluster receives the input and generates an output. The Topic Control NIM then checks the output from the Llama Nemotron Super NIM. If the output passes the check, the output is sent back to the Document Summary Service. If the input or output fails the check, a message indicating that the check failed is sent back to the Document Summary Service instead.
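Both workflows above follow the same check-generate-check pattern. The following minimal sketch shows that control flow; the `check` and `generate` callables stand in for the NemoGuard NIM check and the application LLM call, and the stand-in implementations and refusal message are assumptions for illustration only.

```python
from typing import Callable

def guarded_completion(
    user_input: str,
    check: Callable[[str], bool],    # stand-in for a NemoGuard NIM check
    generate: Callable[[str], str],  # stand-in for the application LLM call
    refusal: str = "I'm sorry, I can't respond to that.",
) -> str:
    """Run the input rail, generate a response, then run the output rail."""
    if not check(user_input):   # input fails the check
        return refusal
    output = generate(user_input)
    if not check(output):       # output fails the check
        return refusal
    return output               # both checks passed

# Toy stand-ins for the NIM microservice and LLM (illustrative only):
is_safe = lambda text: "forbidden" not in text
llm = lambda prompt: f"Answer to: {prompt}"

print(guarded_completion("What is 2+2?", is_safe, llm))
print(guarded_completion("forbidden question", is_safe, llm))
```

In the real system, NeMo Guardrails performs this orchestration for you; your application only sends a single inference request.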
Tutorials#
Use the following tutorials to learn how to accomplish common guardrail tasks using the NeMo Guardrails microservice.
Quickstart with Docker Compose. This tutorial is for developers who want to experiment with the microservice before deploying it in production.
Tutorials for common guardrail tasks. These tutorials use a Helm installation to prepare you for a production setup.
API Usage Guides#
The following guides provide detailed information on how to perform common guardrail operations using the NeMo Guardrails microservice APIs, either with the NeMo Microservices Python SDK or directly with the REST API.
Manage guardrail configurations.
Check content with input and output guardrails.
Run inference and apply safety checks.
When the microservice is deployed individually, use the REST API to configure access to models.
Differences between the Guardrails Microservice and the Toolkit#
The NeMo Guardrails microservice is powered by the open-source NeMo Guardrails toolkit and runs in a containerized environment for guardrail operations with other infrastructure microservices. The following list identifies the differences between the NeMo Guardrails microservice and the NeMo Guardrails open-source toolkit.
Certain toolkit server features are unavailable in the microservice, such as threads and the chat user interface.
The microservice supports Colang 1.0 guardrail configurations only.
The microservice supports NIM for LLMs and LLMs hosted on an OpenAI-compatible API endpoint.
The API is OpenAI compatible, with the addition of the `guardrails` field in the request body and the `guardrails_data` field in the response body.
In the microservice, the LLM provider is determined by the `model` field specified in the HTTP request. In the toolkit, the LLM provider is determined by the guardrail configuration.
The microservice supports a default guardrail configuration that is used when no guardrail configuration is specified as part of an HTTP request.
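To make the two additional fields concrete, the following sketch shows an illustrative response shape: a standard OpenAI-compatible body plus the `guardrails_data` field described above. The contents of `guardrails_data` shown here (such as the `status` key) are assumptions for illustration; consult the API reference for the exact schema your version returns.

```python
# Illustrative response shape only -- an OpenAI-compatible chat completions
# body with the microservice's additional "guardrails_data" field.
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "Here is a safe answer."}},
    ],
    "guardrails_data": {
        "status": "success",  # hypothetical key: outcome of the rails
    },
}

# An application reads the answer the same way it would from any
# OpenAI-compatible endpoint, and inspects guardrails_data separately.
answer = response["choices"][0]["message"]["content"]
rails_info = response.get("guardrails_data", {})
print(answer)
print(rails_info)
```

Because the additional field is additive, existing OpenAI-compatible client code that ignores unknown fields continues to work unchanged.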