Logging and Debugging Guardrails Generated Responses

This guide covers the various methods for logging, debugging, and understanding what happens during guardrails generation.

Overview

The NeMo Guardrails library provides multiple ways to inspect and debug guardrails generation:

| Method | Use Case |
| --- | --- |
| Verbose Mode | Real-time console logging during development |
| Explain Method | Quick summary of the last generation |
| Generation Options (log) | Detailed structured logs returned with responses |
| Output Variables | Return specific context variables |

Verbose Mode

Enable detailed console logging by setting verbose=True when creating the LLMRails instance:

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("path/to/config")
rails = LLMRails(config, verbose=True)

This outputs detailed information about:

  • LLM calls and their prompts/completions
  • Rail activations and decisions
  • Action executions
  • Flow transitions

Explain Method

Get a quick summary of the last generation using the explain() method:

response = rails.generate(messages=[
    {"role": "user", "content": "Hello!"}
])

info = rails.explain()
info.print_llm_calls_summary()

The ExplainInfo object provides methods to inspect:

  • LLM calls summary
  • Colang history
  • Generated events

Generation Options: Log

For detailed structured logging, use the log generation option. This returns comprehensive information about what happened during generation.

Enabling Log Options

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={
        "log": {
            "activated_rails": True,
            "llm_calls": True,
            "internal_events": True,
            "colang_history": True
        }
    }
)

Log Option Reference

| Option | Description |
| --- | --- |
| activated_rails | Detailed information about rails activated during generation |
| llm_calls | Information about all LLM calls (prompt, completion, tokens, timing) |
| internal_events | Array of internal generated events |
| colang_history | Conversation history in Colang format |
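
Because the log options are plain dictionary keys, a misspelled name is easy to miss. The sketch below shows one way to build the `options` dict from the four names in the table and reject anything else. `build_log_options` is a hypothetical helper written for this guide, not part of the library.

```python
# The four log option names documented in the table above.
LOG_OPTION_NAMES = ("activated_rails", "llm_calls", "internal_events", "colang_history")


def build_log_options(*enabled: str) -> dict:
    """Return an `options` dict with the requested log fields enabled.

    Hypothetical helper (not a library function): it only assembles the
    dict shape shown in "Enabling Log Options" and validates the names.
    """
    unknown = set(enabled) - set(LOG_OPTION_NAMES)
    if unknown:
        raise ValueError(f"Unknown log option(s): {sorted(unknown)}")
    # Keep the documented order and include only the fields that were requested.
    return {"log": {name: True for name in LOG_OPTION_NAMES if name in enabled}}


options = build_log_options("llm_calls", "activated_rails")
print(options)
```

The resulting dict can be passed as the `options` argument of `rails.generate(...)` in the same way as the hand-written dict above.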

Response Structure

{
    "response": [...],
    "log": {
        "activated_rails": [...],
        "stats": {...},
        "llm_calls": [...],
        "internal_events": [...],
        "colang_history": "..."
    }
}

Using print_summary()

The log object has a print_summary() method for a human-readable overview:

response.log.print_summary()

Example output:

# General stats
- Total time: 2.85s
  - [0.56s][19.64%]: INPUT Rails
  - [1.40s][49.02%]: DIALOG Rails
  - [0.58s][20.22%]: GENERATION Rails
  - [0.31s][10.98%]: OUTPUT Rails
- 5 LLM calls, 2.74s total duration, 1641 total prompt tokens, 103 total completion tokens, 1744 total tokens.
# Detailed stats
- [0.56s] INPUT (self check input): 1 actions (self_check_input), 1 llm calls [0.56s]
- [0.43s] DIALOG (generate user intent): 1 actions (generate_user_intent), 1 llm calls [0.43s]
- [0.96s] DIALOG (generate next step): 1 actions (generate_next_step), 1 llm calls [0.95s]
- [0.58s] GENERATION (generate bot message): 2 actions (retrieve_relevant_chunks, generate_bot_message), 1 llm calls [0.49s]
- [0.31s] OUTPUT (self check output): 1 actions (self_check_output), 1 llm calls [0.31s]

Accessing Detailed Data

Access specific log components programmatically:

# Access LLM calls
for call in response.log.llm_calls:
    print(f"Task: {call.task}")
    print(f"Duration: {call.duration}s")
    print(f"Prompt tokens: {call.prompt_tokens}")
    print(f"Completion tokens: {call.completion_tokens}")
    print(f"Total tokens: {call.total_tokens}")

# Access activated rails
for rail in response.log.activated_rails:
    print(f"Type: {rail.type}, Name: {rail.name}")
    print(f"Decisions: {rail.decisions}")
    print(f"Duration: {rail.duration}s")

# Access stats
stats = response.log.stats
print(f"Total duration: {stats.total_duration}s")
print(f"Input rails: {stats.input_rails_duration}s")
print(f"Dialog rails: {stats.dialog_rails_duration}s")
print(f"Output rails: {stats.output_rails_duration}s")
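
The per-phase durations are easier to compare as shares of the total, which is what the percentages in the print_summary() output express. The following is a minimal sketch of that calculation on plain numbers. `phase_shares` is a hypothetical helper, and the sample values are copied from the rounded example output above, so the results can differ slightly from the percentages printed there.

```python
from typing import Dict, Optional


def phase_shares(durations: Dict[str, Optional[float]], total: Optional[float]) -> Dict[str, float]:
    """Return each phase's percentage of the total duration, rounded to 2 places.

    Missing (None) durations are treated as 0 seconds.
    """
    if not total or total <= 0:
        return {phase: 0.0 for phase in durations}
    return {
        phase: round(100.0 * (seconds or 0.0) / total, 2)
        for phase, seconds in durations.items()
    }


# Sample values taken from the example summary output (already rounded there).
sample = {"input": 0.56, "dialog": 1.40, "generation": 0.58, "output": 0.31}
shares = phase_shares(sample, total=2.85)
for phase, percent in shares.items():
    print(f"{phase:>10}: {percent:5.2f}%")
```

With a real response, the mapping could be filled from the stats attributes used above, for example `{"input": stats.input_rails_duration, "dialog": stats.dialog_rails_duration, "output": stats.output_rails_duration}` with `stats.total_duration` as the total.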

Output Variables

Return specific context variables using the output_vars option:

Return Specific Variables

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={
        "output_vars": ["triggered_input_rail", "triggered_output_rail"]
    }
)

print(response.output_data)
# {'triggered_input_rail': None, 'triggered_output_rail': None}

Return All Context Variables

Set output_vars to True to return the complete context:

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={
        "output_vars": True
    }
)

# Access all context data
print(response.output_data.keys())

Common Output Variables

| Variable | Description |
| --- | --- |
| last_user_message | The last user message |
| last_bot_message | The last bot message |
| triggered_input_rail | Name of input rail that triggered (if any) |
| triggered_output_rail | Name of output rail that triggered (if any) |
| relevant_chunks | Retrieved knowledge base chunks |
| allowed | Whether the input was allowed |
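
As an illustration of how these variables might be consumed, the sketch below turns an `output_data` dict into a one-line status. `describe_block` is a hypothetical helper, and the rail name in the second call is only sample data. It relies on nothing beyond the two `triggered_*` keys from the table.

```python
from typing import Any, Dict, Optional


def describe_block(output_data: Optional[Dict[str, Any]]) -> str:
    """Summarize which rail, if any, was triggered.

    Hypothetical helper: reads only `triggered_input_rail` and
    `triggered_output_rail`, and tolerates a missing or empty dict.
    """
    data = output_data or {}
    input_rail = data.get("triggered_input_rail")
    output_rail = data.get("triggered_output_rail")
    if input_rail:
        return f"input blocked by '{input_rail}'"
    if output_rail:
        return f"output blocked by '{output_rail}'"
    return "no rail triggered"


# Same shape as the `response.output_data` printed earlier.
print(describe_block({"triggered_input_rail": None, "triggered_output_rail": None}))
# Sample data: the rail name here is illustrative.
print(describe_block({"triggered_input_rail": "self check input"}))
```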

Combining Log and Output Variables

Use both options together for comprehensive debugging:

response = rails.generate(
    messages=[{"role": "user", "content": "Tell me about the company."}],
    options={
        "output_vars": ["triggered_input_rail", "relevant_chunks"],
        "log": {
            "activated_rails": True,
            "llm_calls": True
        }
    }
)

# Check if any rail was triggered
if response.output_data.get("triggered_input_rail"):
    print(f"Input blocked by: {response.output_data['triggered_input_rail']}")

# Inspect what happened
response.log.print_summary()

Debugging Common Issues

Input Blocked Unexpectedly

response = rails.generate(
    messages=[{"role": "user", "content": "Your message"}],
    options={
        "output_vars": ["triggered_input_rail"],
        "log": {"activated_rails": True}
    }
)

if response.output_data.get("triggered_input_rail"):
    # Find the input rail that blocked
    for rail in response.log.activated_rails:
        if rail.type == "input" and rail.stop:
            print(f"Blocked by: {rail.name}")
            # Check the LLM decision
            for action in rail.executed_actions:
                for llm_call in action.llm_calls:
                    print(f"Prompt: {llm_call.prompt}")
                    print(f"Completion: {llm_call.completion}")

Understanding Flow Execution

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={
        "log": {
            "internal_events": True,
            "colang_history": True
        }
    }
)

# View internal events
for event in response.log.internal_events:
    print(f"{event['type']}: {event}")

# View Colang history
print(response.log.colang_history)
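
The internal event list can be long, so a count per event type is often a quicker first look than printing every event. The sketch below assumes only what the loop above uses: each event is a dict with a `type` key. `count_event_types` is a hypothetical helper, and the sample events are illustrative stand-ins rather than captured output.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_event_types(events: Iterable[Dict[str, Any]]) -> Counter:
    """Count events per `type`, grouping events without a type under '<unknown>'."""
    return Counter(event.get("type", "<unknown>") for event in events)


# Illustrative sample data; with a real response, pass response.log.internal_events.
sample_events = [
    {"type": "UtteranceUserActionFinished", "final_transcript": "Hello!"},
    {"type": "StartInternalSystemAction", "action_name": "generate_user_intent"},
    {"type": "InternalSystemActionFinished", "action_name": "generate_user_intent"},
    {"type": "StartInternalSystemAction", "action_name": "generate_next_step"},
]

counts = count_event_types(sample_events)
for event_type, count in counts.most_common():
    print(f"{count:3d}  {event_type}")
```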

Analyzing LLM Performance

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}],
    options={"log": {"llm_calls": True}}
)

total_tokens = 0
total_duration = 0

for call in response.log.llm_calls:
    print(f"Task: {call.task}")
    print(f"  Duration: {call.duration:.2f}s")
    print(f"  Tokens: {call.total_tokens}")
    total_tokens += call.total_tokens
    total_duration += call.duration

print(f"\nTotal: {total_tokens} tokens in {total_duration:.2f}s")
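
When the same task runs more than once, grouping the calls by task shows where tokens and time go. Below is a minimal sketch of that aggregation. `CallRecord` is a hypothetical stand-in that exposes only the three attributes used in the loop above (`task`, `duration`, `total_tokens`), so the example runs without a model, and the sample numbers are illustrative. Treating missing values as zero is a defensive assumption, not documented behavior.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class CallRecord:
    """Hypothetical stand-in for one entry of response.log.llm_calls."""
    task: Optional[str]
    duration: Optional[float]
    total_tokens: Optional[int]


def summarize_by_task(calls: Iterable[Any]) -> Dict[str, Dict[str, float]]:
    """Aggregate call count, tokens, and duration per task name."""
    summary: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"calls": 0, "tokens": 0, "duration": 0.0}
    )
    for call in calls:
        entry = summary[call.task or "<unknown>"]
        entry["calls"] += 1
        # Treat missing values as zero so one incomplete record does not break the totals.
        entry["tokens"] += call.total_tokens or 0
        entry["duration"] += call.duration or 0.0
    return dict(summary)


# Illustrative sample data; with a real response, pass response.log.llm_calls.
sample_calls = [
    CallRecord("self_check_input", 0.56, 210),
    CallRecord("generate_user_intent", 0.43, 480),
    CallRecord("generate_user_intent", 0.95, 520),
    CallRecord(None, None, None),
]

by_task = summarize_by_task(sample_calls)
for task, entry in by_task.items():
    print(f"{task}: {entry['calls']} call(s), {entry['tokens']} tokens, {entry['duration']:.2f}s")
```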

Server API Logging

When using the server API, include options in the request body:

{
    "config_id": "my_config",
    "messages": [{"role": "user", "content": "Hello!"}],
    "options": {
        "output_vars": ["triggered_input_rail"],
        "log": {
            "activated_rails": true,
            "llm_calls": true
        }
    }
}
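
For reference, here is a sketch of how a Python client might assemble that request with the standard library. The base URL `http://localhost:8000` and the `/v1/chat/completions` path are assumptions about a locally running server and are not stated in this guide, so adjust them to your deployment. `build_request` is a hypothetical helper, and the example only builds the request; the lines that send it are left commented out.

```python
import json
import urllib.request


def build_request(base_url: str, config_id: str, user_message: str) -> urllib.request.Request:
    """Build a POST request carrying the body shown above.

    The endpoint path is an assumption (see the note before this block).
    """
    body = {
        "config_id": config_id,
        "messages": [{"role": "user", "content": user_message}],
        "options": {
            "output_vars": ["triggered_input_rail"],
            "log": {"activated_rails": True, "llm_calls": True},
        },
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("http://localhost:8000", "my_config", "Hello!")
print(request.full_url)
print(request.data.decode("utf-8"))

# To send it against a running server:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
```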

Complete Debugging Example

from nemoguardrails import LLMRails, RailsConfig

# Enable verbose mode for console output
config = RailsConfig.from_path("path/to/config")
rails = LLMRails(config, verbose=True)

# Generate with full logging
response = rails.generate(
    messages=[{"role": "user", "content": "What is the company policy?"}],
    options={
        "output_vars": True,
        "log": {
            "activated_rails": True,
            "llm_calls": True,
            "internal_events": True,
            "colang_history": True
        }
    }
)

# Print summary
print("=== Generation Summary ===")
response.log.print_summary()

# Check for blocked content
print("\n=== Rail Triggers ===")
print(f"Input rail triggered: {response.output_data.get('triggered_input_rail')}")
print(f"Output rail triggered: {response.output_data.get('triggered_output_rail')}")

# Analyze LLM calls
print("\n=== LLM Calls ===")
for call in response.log.llm_calls:
    print(f"{call.task}: {call.total_tokens} tokens, {call.duration:.2f}s")

# View final response
print("\n=== Response ===")
print(response.response[0]["content"])

Related Topics

  • Tracing - Production monitoring and observability with OpenTelemetry