Overview of the NVIDIA NeMo Guardrails Library API Server
The NVIDIA NeMo Guardrails library API server provides the following capabilities:
- Loads guardrails configurations at startup.
- Exposes an OpenAI-compatible REST API for chat completions and model listing.
- Works with the OpenAI Python SDK. Use
OpenAI(base_url="http://localhost:8000/v1"). - Includes a built-in chat UI for testing.
- Supports multiple configurations and combines them for each request.
Quick Start
The following steps show how to start the NVIDIA NeMo Guardrails library API server with the provided configuration files and send test requests to the endpoints.
Prerequisites
Meet the following prerequisites before you use the NVIDIA NeMo Guardrails library API server.
-
Install the NVIDIA NeMo Guardrails library with the
serverextra. For instructions, refer to Extra Dependencies. -
Set the environment variable for your NVIDIA API key.
This key is required to access NVIDIA-hosted models on build.nvidia.com. The provided example configurations and code examples throughout the documentation use NVIDIA-hosted models.
Start the Server
Follow these steps to start the server:
-
Point the server to a parent directory that contains multiple configuration subdirectories:
-
To check if the server is running and list the available configurations, use the following command:
Each subdirectory that contains a config.yml or config.yaml file becomes an available configuration ID.
Send a Request
Send a chat completion request to the server:
View the Chat UI
Open http://localhost:8000 in your browser to access the built-in chat UI for testing.