PolicyAI Integration
NeMo Guardrails supports the PolicyAI content moderation API as an input and output rail out of the box. The integration requires the POLICYAI_API_KEY environment variable to be set.
PolicyAI provides flexible policy-based content moderation, allowing you to define custom policies for your specific use cases and manage them through tags.
Setup
- Sign up for a PolicyAI account at musubilabs.ai
- Create your policies and organize them with tags
- Set the required environment variables:
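The required variable can also be checked from Python before the rails configuration is loaded. The following is a minimal sketch; the helper name `require_env` is hypothetical and is not part of NeMo Guardrails or PolicyAI.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before loading the rails config")
    return value


if __name__ == "__main__":
    # POLICYAI_API_KEY is the variable this integration requires.
    try:
        require_env("POLICYAI_API_KEY")
        print("POLICYAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```

Failing early like this surfaces a missing key at startup rather than on the first moderated message.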
Usage
Basic Input Moderation
Basic Output Moderation
Using Different Tags
To use different policy tags for different environments, set the POLICYAI_TAG_NAME environment variable:
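If the application is started from Python, the tag could be chosen per deployment environment along these lines. This is only a sketch: the mapping, the tag names, and the function `select_policy_tag` are hypothetical, and it assumes the variable is set before the integration reads it.

```python
import os

# Hypothetical mapping from deployment environment to PolicyAI tag name.
# The tag names are placeholders; use the tags defined in your own PolicyAI account.
TAG_BY_ENVIRONMENT = {
    "development": "dev",
    "staging": "staging",
    "production": "prod",
}


def select_policy_tag(environment: str) -> str:
    """Set POLICYAI_TAG_NAME for the given environment and return the chosen tag."""
    tag = TAG_BY_ENVIRONMENT[environment]  # raises KeyError for an unknown environment
    os.environ["POLICYAI_TAG_NAME"] = tag
    return tag
```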
Complete Example
How It Works
- Input Rails: When a user sends a message, PolicyAI evaluates it against all policies attached to the configured tag. If any policy returns UNSAFE, the message is blocked.
- Output Rails: Before the bot’s response is sent to the user, PolicyAI evaluates it. If the content violates any policy, the response is replaced with a refusal message.
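The decision rule described above can be sketched in a few lines of Python. This illustrates the logic only, not the integration's actual code: the function names and the refusal text are assumptions, and the per-policy results are represented simply as a list of "SAFE" / "UNSAFE" strings.

```python
from typing import Iterable

# Placeholder refusal text; the message used by the real rail may differ.
REFUSAL_MESSAGE = "I'm sorry, I can't respond to that."


def is_blocked(assessments: Iterable[str]) -> bool:
    """Content is blocked as soon as any policy under the tag returns UNSAFE."""
    return any(assessment == "UNSAFE" for assessment in assessments)


def apply_output_rail(bot_response: str, assessments: Iterable[str]) -> str:
    """Return the bot response unchanged, or the refusal message if it was flagged."""
    return REFUSAL_MESSAGE if is_blocked(assessments) else bot_response
```

For example, `is_blocked(["SAFE", "UNSAFE"])` is true because a single UNSAFE result is enough to block.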
Response Format
PolicyAI returns the following information for each evaluation:
- assessment: "SAFE" or "UNSAFE"
- category: The category of violation (if UNSAFE)
- severity: Severity level from 0 (safe) to 3 (high severity)
- reason: Human-readable explanation
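One way to work with these fields in Python is to parse them into a small typed structure. The sketch below assumes the evaluation arrives as a flat dictionary keyed by the four field names; the class `Evaluation` and the function `parse_evaluation` are hypothetical, and the actual wire format may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evaluation:
    assessment: str          # "SAFE" or "UNSAFE"
    category: Optional[str]  # category of violation, present when UNSAFE
    severity: int            # 0 (safe) to 3 (high severity)
    reason: str              # human-readable explanation

    @property
    def is_unsafe(self) -> bool:
        return self.assessment == "UNSAFE"


def parse_evaluation(payload: dict) -> Evaluation:
    """Validate a payload (assumed flat dict) and convert it to an Evaluation."""
    assessment = payload["assessment"]
    if assessment not in ("SAFE", "UNSAFE"):
        raise ValueError(f"unexpected assessment: {assessment!r}")
    severity = int(payload.get("severity", 0))
    if not 0 <= severity <= 3:
        raise ValueError(f"severity out of range: {severity}")
    return Evaluation(
        assessment=assessment,
        category=payload.get("category"),
        severity=severity,
        reason=payload.get("reason", ""),
    )
```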
Customizing Behavior
To customize the behavior when content is flagged, you can override the default flows in your config:
Environment Variables
Error Handling
If the PolicyAI API is unavailable or returns an error, the action raises an exception. To implement fail-open behavior (allow the content) or fail-closed behavior (block it), catch the exception around the action call in your custom flows.
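The difference between the two policies can be illustrated with a small Python wrapper. This is a sketch of the pattern, not the integration's API: `moderate_with_fallback` is a hypothetical helper, and `check` stands in for whatever callable performs the moderation request and returns True when content is allowed.

```python
from typing import Callable


def moderate_with_fallback(check: Callable[[str], bool], text: str, fail_open: bool) -> bool:
    """Return True if the text is allowed.

    `check` returns True for allowed content and may raise if the
    moderation service is unavailable or returns an error.
    """
    try:
        return check(text)
    except Exception:
        # Fail-open: allow content when moderation is unavailable.
        # Fail-closed: block content when moderation is unavailable.
        return fail_open
```

Fail-closed is the more conservative choice for moderation, since an outage then blocks content instead of letting it through unchecked; fail-open favors availability.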