GuardrailsAI Integration

NeMo Guardrails provides out-of-the-box support for GuardrailsAI validators, enabling input and output validation with a large ecosystem of community-built validators. GuardrailsAI offers validators for content safety, PII detection, toxic language filtering, jailbreak detection, topic restriction, and more.

The integration provides access to both built-in validators and the entire Guardrails Hub ecosystem, allowing you to dynamically load and configure validators for your specific use cases.

Setup

To use GuardrailsAI validators, you need to install the guardrails-ai package:

$ pip install guardrails-ai

You may also need to install specific validators from the Guardrails Hub:

$ guardrails hub install guardrails/toxic_language
$ guardrails hub install guardrails/detect_jailbreak
$ guardrails hub install guardrails/guardrails_pii

Usage

The GuardrailsAI integration uses a flexible configuration system that allows you to define validators with their parameters and metadata, then reference them in your input and output rails.

Configuration Structure

Add GuardrailsAI validators to your config.yml:

rails:
  config:
    guardrails_ai:
      validators:
        - name: toxic_language
          parameters:
            threshold: 0.5
            validation_method: "sentence"
          metadata: {}
        - name: guardrails_pii
          parameters:
            entities: ["phone_number", "email", "ssn"]
          metadata: {}
        - name: competitor_check
          parameters:
            competitors: ["Apple", "Google", "Microsoft"]
          metadata: {}

Input Rails

To use GuardrailsAI validators for input validation:

rails:
  input:
    flows:
      - guardrailsai check input $validator="guardrails_pii"
      - guardrailsai check input $validator="competitor_check"

Output Rails

To use GuardrailsAI validators for output validation:

rails:
  output:
    flows:
      - guardrailsai check output $validator="toxic_language"
      - guardrailsai check output $validator="restricttotopic"

Result Format in Colang Flows

The GuardrailsAI actions (validate_guardrails_ai_input and validate_guardrails_ai_output) return a dict that is stored in $result when used in flows. This dict contains:

  • validation_result: The raw GuardrailsAI validation outcome (e.g., PassResult or FailResult).
  • valid: A boolean derived from the GuardrailsAI validation_passed field. Use this in flow conditions such as if not $result["valid"] to decide whether to block.
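
The shape of this dict can be illustrated outside Colang with a small Python sketch. It is only an illustration of the two keys described above: FakeFailResult is a hypothetical stand-in for the real GuardrailsAI result classes, should_block is a helper name invented here, and treating a missing "valid" key as a failure is an assumption of this sketch rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class FakeFailResult:
    """Hypothetical stand-in for a GuardrailsAI FailResult; not the real class."""
    error_message: str


def should_block(result: Dict[str, Any]) -> bool:
    """Mirror the flow condition `if not $result["valid"]`."""
    # Assumption of this sketch: a result without a "valid" key fails closed.
    return not result.get("valid", False)


# Two example results with the keys described above.
passing = {"validation_result": object(), "valid": True}
failing = {"validation_result": FakeFailResult("PII detected"), "valid": False}

print(should_block(passing))  # False
print(should_block(failing))  # True
```

Only the valid key drives the decision here; validation_result is carried along so that a flow or log message can inspect the raw outcome if needed.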

Built-in Validators

The integration includes support for the following validators that are pre-registered in the NeMo Guardrails validator registry. For detailed parameter specifications and usage examples, refer to the official GuardrailsAI Hub documentation for each validator:

  • competitor_check - hub://guardrails/competitor_check
  • detect_jailbreak - hub://guardrails/detect_jailbreak
  • guardrails_pii - hub://guardrails/guardrails_pii
  • one_line - hub://guardrails/one_line
  • provenance_llm - hub://guardrails/provenance_llm
  • regex_match - hub://guardrails/regex_match
  • restricttotopic - hub://tryolabs/restricttotopic
  • toxic_language - hub://guardrails/toxic_language
  • valid_json - hub://guardrails/valid_json
  • valid_length - hub://guardrails/valid_length
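
One way to picture the registry is as a mapping from validator name to hub URI. The sketch below copies the names and URIs from the list above into a plain dict and separates a set of requested names into pre-registered ones and ones that would need a hub lookup. BUILT_IN_VALIDATORS and split_by_registry are hypothetical names for illustration, not part of the NeMo Guardrails API.

```python
from typing import Dict, List, Tuple

# Names and hub URIs copied from the list above (illustrative copy, not the library's registry).
BUILT_IN_VALIDATORS: Dict[str, str] = {
    "competitor_check": "hub://guardrails/competitor_check",
    "detect_jailbreak": "hub://guardrails/detect_jailbreak",
    "guardrails_pii": "hub://guardrails/guardrails_pii",
    "one_line": "hub://guardrails/one_line",
    "provenance_llm": "hub://guardrails/provenance_llm",
    "regex_match": "hub://guardrails/regex_match",
    "restricttotopic": "hub://tryolabs/restricttotopic",
    "toxic_language": "hub://guardrails/toxic_language",
    "valid_json": "hub://guardrails/valid_json",
    "valid_length": "hub://guardrails/valid_length",
}


def split_by_registry(names: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Return (pre-registered names with their URIs, names not in the registry)."""
    known = {n: BUILT_IN_VALIDATORS[n] for n in names if n in BUILT_IN_VALIDATORS}
    unknown = [n for n in names if n not in BUILT_IN_VALIDATORS]
    return known, unknown


known, unknown = split_by_registry(["restricttotopic", "my_custom_validator"])
print(known)    # {'restricttotopic': 'hub://tryolabs/restricttotopic'}
print(unknown)  # ['my_custom_validator']
```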

Complete Example

Here’s a comprehensive example configuration:

models:
  - type: main
    engine: openai
    model: gpt-4

rails:
  config:
    guardrails_ai:
      validators:
        - name: toxic_language
          parameters:
            threshold: 0.5
            validation_method: "sentence"
          metadata: {}
        - name: guardrails_pii
          parameters:
            entities: ["phone_number", "email", "ssn", "credit_card"]
          metadata: {}
        - name: competitor_check
          parameters:
            competitors: ["Apple", "Google", "Microsoft", "Amazon"]
          metadata: {}
        - name: restricttotopic
          parameters:
            valid_topics: ["technology", "science", "education"]
          metadata: {}
        - name: valid_length
          parameters:
            min: 10
            max: 500
          metadata: {}

  input:
    flows:
      - guardrailsai check input $validator="guardrails_pii"
      - guardrailsai check input $validator="competitor_check"

  output:
    flows:
      - guardrailsai check output $validator="toxic_language"
      - guardrailsai check output $validator="restricttotopic"
      - guardrailsai check output $validator="valid_length"
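
When a configuration grows, it can help to cross-check the validator names referenced in flows against the entries under validators. The following Python sketch does that on an already-parsed config, represented here as a plain dict so that no YAML library is needed. The function name unconfigured_validators is hypothetical, and a reported name is not necessarily an error: whether a flow may reference a validator that has no entry is not stated here, so the output is best read as a list of names to review.

```python
import re
from typing import Any, Dict, List

# Extracts the name from entries like: guardrailsai check input $validator="guardrails_pii"
VALIDATOR_REF = re.compile(r'\$validator="([^"]+)"')


def unconfigured_validators(config: Dict[str, Any]) -> List[str]:
    """List validators referenced in input/output flows that have no validators entry."""
    rails = config.get("rails", {})
    entries = rails.get("config", {}).get("guardrails_ai", {}).get("validators", [])
    configured = {entry["name"] for entry in entries}

    referenced: List[str] = []
    for direction in ("input", "output"):
        for flow in rails.get(direction, {}).get("flows", []):
            match = VALIDATOR_REF.search(flow)
            if match and match.group(1) not in referenced:
                referenced.append(match.group(1))

    return [name for name in referenced if name not in configured]


# A reduced config in which one referenced validator has no entry.
config = {
    "rails": {
        "config": {
            "guardrails_ai": {
                "validators": [
                    {"name": "toxic_language", "parameters": {"threshold": 0.5}, "metadata": {}},
                ]
            }
        },
        "output": {
            "flows": [
                'guardrailsai check output $validator="toxic_language"',
                'guardrailsai check output $validator="restricttotopic"',
            ]
        },
    }
}

print(unconfigured_validators(config))  # ['restricttotopic']
```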

Custom Validators from Guardrails Hub

You can use any validator from the Guardrails Hub by specifying its hub path:

rails:
  config:
    guardrails_ai:
      validators:
        - name: custom_validator_name
          parameters:
            # Custom parameters specific to the validator
          metadata: {}

The integration will automatically fetch validator information from the hub if it’s not in the built-in registry.

Performance Considerations

  • Validators are cached to improve performance on repeated use
  • Guard instances are reused when the same validator is called with identical parameters
  • Consider the latency impact when chaining multiple validators
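
The reuse of guard instances for identical parameters can be sketched as a cache keyed on the validator name plus its parameters. This is a minimal illustration of the idea, not the integration's actual code: get_guard, cache_key, and the build callback are hypothetical names, and serializing the parameters to JSON is one possible way to obtain a stable key when parameters contain lists.

```python
import json
from typing import Any, Callable, Dict

# Cache of constructed guards, keyed by validator name and parameters.
_guard_cache: Dict[str, Any] = {}


def cache_key(name: str, parameters: Dict[str, Any]) -> str:
    """Build a stable key; sort_keys makes the key independent of dict ordering."""
    return json.dumps({"name": name, "parameters": parameters}, sort_keys=True)


def get_guard(
    name: str,
    parameters: Dict[str, Any],
    build: Callable[[str, Dict[str, Any]], Any],
) -> Any:
    """Return a cached guard for (name, parameters), building it only on first use."""
    key = cache_key(name, parameters)
    if key not in _guard_cache:
        _guard_cache[key] = build(name, parameters)
    return _guard_cache[key]


# Demonstration with a stand-in builder that records how often it runs.
build_calls = []


def fake_build(name: str, parameters: Dict[str, Any]) -> object:
    build_calls.append(name)
    return object()


first = get_guard("toxic_language", {"threshold": 0.5, "validation_method": "sentence"}, fake_build)
second = get_guard("toxic_language", {"validation_method": "sentence", "threshold": 0.5}, fake_build)
third = get_guard("toxic_language", {"threshold": 0.8, "validation_method": "sentence"}, fake_build)

print(first is second)   # True: same parameters, same instance
print(first is third)    # False: a different threshold builds a new guard
print(len(build_calls))  # 2
```

A cache like this removes repeated construction cost, but each validator in a chain still runs on every message, so the latency of chained validators adds up regardless.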

For a complete working example, see the GuardrailsAI example configuration.