Logging and Debugging Guardrails Generated Responses | NVIDIA NeMo Guardrails Library Developer Guide

This guide covers the methods for logging, debugging, and inspecting what happens while the NeMo Guardrails library generates a response.

Overview

The NeMo Guardrails library provides multiple ways to inspect and debug guardrails generation:

| Method | Use Case |
| --- | --- |
| Verbose Mode | Real-time console logging during development |
| Explain Method | Quick summary of the last generation |
| Generation Options (log) | Detailed structured logs returned with responses |
| Output Variables | Return specific context variables |

Verbose Mode

Enable detailed console logging by setting verbose=True when creating the LLMRails instance:

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("path/to/config")
rails = LLMRails(config, verbose=True)

This outputs detailed information about:

LLM calls and their prompts/completions
Rail activations and decisions
Action executions
Flow transitions

Explain Method

Get a quick summary of the last generation using the explain() method:

response = rails.generate(messages=[
    {"role": "user", "content": "Hello!"}
])

info = rails.explain()
info.print_llm_calls_summary()

The ExplainInfo object provides methods to inspect:

LLM calls summary
Colang history
Generated events

Generation Options: Log

For detailed structured logging, use the log generation option. This returns comprehensive information about what happened during generation.

Enabling Log Options

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={
        "log": {
            "activated_rails": True,
            "llm_calls": True,
            "internal_events": True,
            "colang_history": True
        }
    }
)

Log Option Reference

| Option | Description |
| --- | --- |
| `activated_rails` | Detailed information about the rails activated during generation |
| `llm_calls` | Information about all LLM calls (prompt, completion, tokens, timing) |
| `internal_events` | List of internally generated events |
| `colang_history` | Conversation history in Colang format |

Response Structure

{
  "response": [...],
  "log": {
    "activated_rails": [...],
    "stats": {...},
    "llm_calls": [...],
    "internal_events": [...],
    "colang_history": "..."
  }
}
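
As a rough illustration of working with this structure, the sketch below assumes the response is available as a plain dictionary (for example, a JSON body that has already been parsed) and reports which log sections actually contain data. The helper name `populated_log_sections` and the sample payload are hypothetical and only mirror the keys listed above.

```python
from typing import Any, Dict, List

# Log sections named in the response structure above.
LOG_SECTIONS = (
    "activated_rails",
    "stats",
    "llm_calls",
    "internal_events",
    "colang_history",
)


def populated_log_sections(response: Dict[str, Any]) -> List[str]:
    """Return the names of log sections that are present and non-empty."""
    log = response.get("log") or {}
    return [name for name in LOG_SECTIONS if log.get(name)]


# Hypothetical payload shaped like the structure above (values are made up).
example = {
    "response": [{"role": "assistant", "content": "Hi there!"}],
    "log": {
        "activated_rails": [{"type": "input", "name": "self check input"}],
        "stats": {"total_duration": 0.9},
        "llm_calls": [],  # requested but empty, so it is not reported
        "colang_history": 'user "Hello!"',
    },
}

print(populated_log_sections(example))
```

A section that was not requested in the log options is simply absent, so a check like this can help confirm that the options were passed as intended.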

Using print_summary()

The log object has a print_summary() method for a human-readable overview:

response.log.print_summary()

Example output:

# General stats
- Total time: 2.85s
  - [0.56s][19.64%]: INPUT Rails
  - [1.40s][49.02%]: DIALOG Rails
  - [0.58s][20.22%]: GENERATION Rails
  - [0.31s][10.98%]: OUTPUT Rails
- 5 LLM calls, 2.74s total duration, 1641 total prompt tokens, 103 total completion tokens, 1744 total tokens.
# Detailed stats
- [0.56s] INPUT (self check input): 1 actions (self_check_input), 1 llm calls [0.56s]
- [0.43s] DIALOG (generate user intent): 1 actions (generate_user_intent), 1 llm calls [0.43s]
- [0.96s] DIALOG (generate next step): 1 actions (generate_next_step), 1 llm calls [0.95s]
- [0.58s] GENERATION (generate bot message): 2 actions (retrieve_relevant_chunks, generate_bot_message), 1 llm calls [0.49s]
- [0.31s] OUTPUT (self check output): 1 actions (self_check_output), 1 llm calls [0.31s]

Accessing Detailed Data

Access specific log components programmatically:

# Access LLM calls
for call in response.log.llm_calls:
    print(f"Task: {call.task}")
    print(f"Duration: {call.duration}s")
    print(f"Prompt tokens: {call.prompt_tokens}")
    print(f"Completion tokens: {call.completion_tokens}")
    print(f"Total tokens: {call.total_tokens}")

# Access activated rails
for rail in response.log.activated_rails:
    print(f"Type: {rail.type}, Name: {rail.name}")
    print(f"Decisions: {rail.decisions}")
    print(f"Duration: {rail.duration}s")

# Access stats
stats = response.log.stats
print(f"Total duration: {stats.total_duration}s")
print(f"Input rails: {stats.input_rails_duration}s")
print(f"Dialog rails: {stats.dialog_rails_duration}s")
print(f"Output rails: {stats.output_rails_duration}s")
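
When a generation involves several LLM calls, it can help to aggregate them per task. The following is a minimal sketch, not part of the library: `summarize_by_task` is a hypothetical helper that relies only on the `task`, `duration`, and `total_tokens` attributes used above, and it runs here against stand-in objects so it can be tried without a configured model. Treating missing values as zero is an assumption.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def summarize_by_task(llm_calls: Iterable[Any]) -> Dict[str, Dict[str, float]]:
    """Aggregate call count, total tokens, and duration per task name."""
    summary: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"calls": 0, "total_tokens": 0, "duration": 0.0}
    )
    for call in llm_calls:
        entry = summary[call.task]
        entry["calls"] += 1
        # Assumption: a field may be None when no value was recorded.
        entry["total_tokens"] += call.total_tokens or 0
        entry["duration"] += call.duration or 0.0
    return dict(summary)


# Stand-in objects; in practice pass response.log.llm_calls instead.
fake_calls = [
    SimpleNamespace(task="self_check_input", duration=0.5, total_tokens=100),
    SimpleNamespace(task="generate_bot_message", duration=0.25, total_tokens=200),
    SimpleNamespace(task="self_check_input", duration=0.25, total_tokens=50),
]

for task, totals in summarize_by_task(fake_calls).items():
    print(task, totals)
```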

Output Variables

Return specific context variables using the output_vars option:

Return Specific Variables

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={
        "output_vars": ["triggered_input_rail", "triggered_output_rail"]
    }
)

print(response.output_data)
# {'triggered_input_rail': None, 'triggered_output_rail': None}

Return All Context Variables

Set output_vars to True to return the complete context:

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={
        "output_vars": True
    }
)

# Access all context data
print(response.output_data.keys())

Common Output Variables

| Variable | Description |
| --- | --- |
| `last_user_message` | The last user message |
| `last_bot_message` | The last bot message |
| `triggered_input_rail` | Name of the input rail that triggered (if any) |
| `triggered_output_rail` | Name of the output rail that triggered (if any) |
| `relevant_chunks` | Retrieved knowledge base chunks |
| `allowed` | Whether the input was allowed |
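
One way to turn these variables into a single status line is sketched below. `describe_block` is a hypothetical helper, and it assumes `output_data` behaves like a dictionary in which an untriggered rail is reported as `None`, as in the printed example earlier in this section.

```python
from typing import Any, Dict, Optional


def describe_block(output_data: Optional[Dict[str, Any]]) -> str:
    """Summarize which rail, if any, blocked the exchange."""
    data = output_data or {}
    input_rail = data.get("triggered_input_rail")
    output_rail = data.get("triggered_output_rail")
    if input_rail:
        return f"input blocked by '{input_rail}'"
    if output_rail:
        return f"output blocked by '{output_rail}'"
    return "no rail triggered"


# Made-up values for illustration; in practice pass response.output_data.
print(describe_block({"triggered_input_rail": "self check input"}))
print(describe_block({"triggered_input_rail": None, "triggered_output_rail": None}))
```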

Combining Log and Output Variables

Use both options together for comprehensive debugging:

response = rails.generate(
    messages=[{"role": "user", "content": "Tell me about the company."}],
    options={
        "output_vars": ["triggered_input_rail", "relevant_chunks"],
        "log": {
            "activated_rails": True,
            "llm_calls": True
        }
    }
)

# Check if any rail was triggered
if response.output_data.get("triggered_input_rail"):
    print(f"Input blocked by: {response.output_data['triggered_input_rail']}")

# Inspect what happened
response.log.print_summary()

Debugging Common Issues

Input Blocked Unexpectedly

response = rails.generate(
    messages=[{"role": "user", "content": "Your message"}],
    options={
        "output_vars": ["triggered_input_rail"],
        "log": {"activated_rails": True}
    }
)

if response.output_data.get("triggered_input_rail"):
    # Find the input rail that blocked
    for rail in response.log.activated_rails:
        if rail.type == "input" and rail.stop:
            print(f"Blocked by: {rail.name}")
            # Check the LLM decision
            for action in rail.executed_actions:
                for llm_call in action.llm_calls:
                    print(f"Prompt: {llm_call.prompt}")
                    print(f"Completion: {llm_call.completion}")
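
The same traversal can be wrapped in a function that returns data instead of printing it, which is convenient in tests or when logging to a file. This is only a sketch: `blocking_input_rails` is a hypothetical helper, it uses the attribute names from the snippet above, and the objects it runs against here are stand-ins rather than real log entries.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List


def blocking_input_rails(activated_rails: Iterable[Any]) -> List[Dict[str, Any]]:
    """Return one record per input rail that stopped processing."""
    records = []
    for rail in activated_rails:
        if rail.type != "input" or not rail.stop:
            continue
        # Collect every prompt/completion pair produced by the rail's actions.
        exchanges = [
            (llm_call.prompt, llm_call.completion)
            for action in rail.executed_actions
            for llm_call in action.llm_calls
        ]
        records.append({"name": rail.name, "llm_exchanges": exchanges})
    return records


# Stand-in objects; in practice pass response.log.activated_rails instead.
fake_rails = [
    SimpleNamespace(
        type="input",
        name="self check input",
        stop=True,
        executed_actions=[
            SimpleNamespace(
                llm_calls=[
                    SimpleNamespace(
                        prompt="Should the message be blocked?",
                        completion="Yes",
                    )
                ]
            )
        ],
    ),
    SimpleNamespace(
        type="dialog", name="generate user intent", stop=False, executed_actions=[]
    ),
]

for record in blocking_input_rails(fake_rails):
    print(record["name"], record["llm_exchanges"])
```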

Understanding Flow Execution

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={
        "log": {
            "internal_events": True,
            "colang_history": True
        }
    }
)

# View internal events
for event in response.log.internal_events:
    print(f"{event['type']}: {event}")

# View Colang history
print(response.log.colang_history)
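
The raw event list can be long, so a count per event type is often a useful first look. The sketch below assumes each event is a dictionary with a `type` key, as in the loop above. `count_event_types` is a hypothetical helper, and the sample events are illustrative rather than captured output.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_event_types(events: Iterable[Dict[str, Any]]) -> Counter:
    """Count how many times each event type occurs."""
    return Counter(event.get("type", "<unknown>") for event in events)


# Illustrative events; in practice pass response.log.internal_events instead.
fake_events = [
    {"type": "UtteranceUserActionFinished", "final_transcript": "Hello!"},
    {"type": "StartInternalSystemAction", "action_name": "generate_user_intent"},
    {"type": "StartInternalSystemAction", "action_name": "generate_bot_message"},
]

# Most frequent event types first.
for event_type, count in count_event_types(fake_events).most_common():
    print(f"{count:>3}  {event_type}")
```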

Analyzing LLM Performance

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={"log": {"llm_calls": True}}
)

total_tokens = 0
total_duration = 0

for call in response.log.llm_calls:
    print(f"Task: {call.task}")
    print(f"  Duration: {call.duration:.2f}s")
    print(f"  Tokens: {call.total_tokens}")
    total_tokens += call.total_tokens
    total_duration += call.duration

print(f"\nTotal: {total_tokens} tokens in {total_duration:.2f}s")
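
To focus on the calls that dominate latency, the totals above can be complemented with a simple threshold filter. This is a sketch under assumptions: `slow_calls` and its `threshold_s` parameter are hypothetical, the one-second default is arbitrary, and the stand-in objects only carry the `task` and `duration` attributes used above.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def slow_calls(llm_calls: Iterable[Any], threshold_s: float = 1.0) -> List[Any]:
    """Return calls slower than threshold_s, slowest first."""
    flagged = [call for call in llm_calls if (call.duration or 0.0) > threshold_s]
    return sorted(flagged, key=lambda call: call.duration, reverse=True)


# Stand-in objects; in practice pass response.log.llm_calls instead.
fake_calls = [
    SimpleNamespace(task="self_check_input", duration=0.56),
    SimpleNamespace(task="generate_next_step", duration=0.95),
    SimpleNamespace(task="generate_user_intent", duration=0.43),
]

for call in slow_calls(fake_calls, threshold_s=0.5):
    print(f"{call.task}: {call.duration:.2f}s")
```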

Server API Logging

When using the server API, include options in the request body:

{
    "config_id": "my_config",
    "messages": [{"role": "user", "content": "Hello!"}],
    "options": {
        "output_vars": ["triggered_input_rail"],
        "log": {
            "activated_rails": true,
            "llm_calls": true
        }
    }
}
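
A request body like this can be assembled and sent from Python using only the standard library. The sketch below makes several assumptions that are not stated in this guide: the helper names `build_request_body` and `post_chat` are hypothetical, and the `/v1/chat/completions` path should be checked against your own server deployment. Only the body-building step is exercised here, and `post_chat` is defined but not called.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def build_request_body(
    config_id: str,
    user_message: str,
    output_vars: Optional[List[str]] = None,
    log_options: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a request body with optional output_vars and log options."""
    body: Dict[str, Any] = {
        "config_id": config_id,
        "messages": [{"role": "user", "content": user_message}],
    }
    options: Dict[str, Any] = {}
    if output_vars is not None:
        options["output_vars"] = output_vars
    if log_options:
        # Each requested log section is switched on with a boolean flag.
        options["log"] = {name: True for name in log_options}
    if options:
        body["options"] = options
    return body


def post_chat(base_url: str, body: Dict[str, Any]) -> Dict[str, Any]:
    """POST the body as JSON. The endpoint path is an assumption."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


body = build_request_body(
    "my_config",
    "Hello!",
    output_vars=["triggered_input_rail"],
    log_options=["activated_rails", "llm_calls"],
)
print(json.dumps(body, indent=4))
```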

Complete Debugging Example

from nemoguardrails import LLMRails, RailsConfig

# Enable verbose mode for console output
config = RailsConfig.from_path("path/to/config")
rails = LLMRails(config, verbose=True)

# Generate with full logging
response = rails.generate(
    messages=[{"role": "user", "content": "What is the company policy?"}],
    options={
        "output_vars": True,
        "log": {
            "activated_rails": True,
            "llm_calls": True,
            "internal_events": True,
            "colang_history": True
        }
    }
)

# Print summary
print("=== Generation Summary ===")
response.log.print_summary()

# Check for blocked content
print("\n=== Rail Triggers ===")
print(f"Input rail triggered: {response.output_data.get('triggered_input_rail')}")
print(f"Output rail triggered: {response.output_data.get('triggered_output_rail')}")

# Analyze LLM calls
print("\n=== LLM Calls ===")
for call in response.log.llm_calls:
    print(f"{call.task}: {call.total_tokens} tokens, {call.duration:.2f}s")

# View final response
print("\n=== Response ===")
print(response.response[0]["content"])
