Reasoning#
Overview#
The ResponseReasoningInterceptor handles models that generate explicit reasoning steps, typically enclosed in special tokens. It removes reasoning content from the final response and tracks reasoning metrics for analysis.
Configuration#
Python Configuration#
CLI Configuration#
--overrides 'target.api_endpoint.adapter_config.use_reasoning=True,target.api_endpoint.adapter_config.end_reasoning_token="</think>",target.api_endpoint.adapter_config.start_reasoning_token="<think>"'
YAML Configuration#
target:
  api_endpoint:
    adapter_config:
      interceptors:
        - name: "endpoint"
          enabled: true
          config: {}
        - name: reasoning
          config:
            start_reasoning_token: "<think>"
            end_reasoning_token: "</think>"
            add_reasoning: true
            enable_reasoning_tracking: true
Configuration Options#
For detailed configuration options, please refer to the nemo_evaluator.adapters.interceptors Python API reference.
Processing Examples#
Basic Reasoning Stripping#
# Original response from model
original_content = "<think>Let me solve this step by step. 2+2 is basic addition. 2 plus 2 equals 4.</think>The answer is 4."
# After reasoning interceptor processing
# The content field has reasoning removed
processed_content = "The answer is 4."
Multi-Step Reasoning#
# Original response with multi-line reasoning
original_content = """<think>
This is a word problem. Let me break it down:
1. John has 5 apples
2. He gives away 2 apples
3. So he has 5 - 2 = 3 apples left
</think>John has 3 apples remaining."""
# After processing: reasoning tokens and content are removed
processed_content = "John has 3 apples remaining."
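The stripping shown in the two examples above can be sketched as a small standalone function. This is an illustration only, not the interceptor's actual implementation: the name `strip_reasoning`, its return shape, and the handling of a missing end token (everything after the start token is treated as reasoning) are assumptions made for this sketch.

```python
def strip_reasoning(content, start_token="<think>", end_token="</think>"):
    """Hypothetical helper: split a response into visible text and reasoning.

    Returns (visible_text, reasoning_text, started, finished).
    """
    start = content.find(start_token)
    if start == -1:
        # No start token: nothing to strip.
        return content, "", False, False

    body_start = start + len(start_token)
    end = content.find(end_token, body_start)
    if end == -1:
        # Assumption: reasoning that never closes consumes the rest of the text.
        return content[:start].strip(), content[body_start:].strip(), True, False

    # Drop the tokens and everything between them; keep the surrounding text.
    visible = content[:start] + content[end + len(end_token):]
    reasoning = content[body_start:end]
    return visible.strip(), reasoning.strip(), True, True


visible, reasoning, started, finished = strip_reasoning(
    "<think>2 plus 2 equals 4.</think>The answer is 4."
)
print(visible)  # The answer is 4.
```

The `started` and `finished` flags mirror the distinction the tracked metrics draw between responses where reasoning started and responses where the end token was found. Custom delimiters such as `[REASONING]` and `[/REASONING]` can be passed as the second and third arguments.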
Tracked Metrics#
The interceptor automatically tracks the following statistics:
| Metric | Description |
|---|---|
| | Total number of responses processed |
| | Number of responses containing reasoning content |
| | Number of responses where reasoning completed (end token found) |
| | Number of responses where reasoning started |
| | Average word count in reasoning content |
| | Average token count in reasoning content |
| | Average word count in original content (before processing) |
| | Average word count in updated content (after processing) |
| | Average token count in updated content |
| | Maximum word count in reasoning content |
| | Maximum token count in reasoning content |
| | Maximum word count in original content (before processing) |
| | Maximum word count in updated content (after processing) |
| | Maximum token count in updated content |
| | Total word count across all reasoning content |
| | Total token count across all reasoning content |
| | Total word count in original content (before processing) |
| | Total word count in updated content (after processing) |
| | Total token count in updated content |
These statistics are saved to eval_factory_metrics.json under the reasoning key after evaluation completes.
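After a run, the saved statistics can be inspected with a few lines of standard-library Python. The sketch below assumes that `eval_factory_metrics.json` is a JSON object with a top-level `reasoning` key, as described above; the helper name `load_reasoning_metrics` and the file's location are hypothetical, so adjust the path to wherever your evaluation writes its output.

```python
import json
from pathlib import Path


def load_reasoning_metrics(metrics_path):
    """Hypothetical helper: return the dict stored under the "reasoning" key.

    Returns an empty dict if the key is absent, for example when
    reasoning tracking was not enabled for the run.
    """
    with Path(metrics_path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    return data.get("reasoning", {})


def print_reasoning_metrics(metrics_path):
    """Print each tracked statistic on its own line, sorted by name."""
    for name, value in sorted(load_reasoning_metrics(metrics_path).items()):
        print(f"{name}: {value}")
```

Calling `print_reasoning_metrics("eval_factory_metrics.json")` from the output directory lists whatever statistics were recorded, without relying on specific metric names.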
Example: Custom Reasoning Tokens#
target:
  api_endpoint:
    adapter_config:
      interceptors:
        - name: reasoning
          config:
            start_reasoning_token: "[REASONING]"
            end_reasoning_token: "[/REASONING]"
            add_reasoning: true
            enable_reasoning_tracking: true
        - name: "endpoint"
          enabled: true
          config: {}