nemo_evaluator.adapters.interceptors#
- class nemo_evaluator.adapters.interceptors.CachingInterceptor(
- params: Params,
Bases:
RequestToResponseInterceptor,ResponseInterceptorCaching interceptor is special in the sense that it intercepts both requests and responses.
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for caching.
- field cache_dir: str = '/tmp'#
Directory to store cache files
- field max_saved_requests: int | None = None#
Maximum number of requests to save
- field max_saved_responses: int | None = None#
Maximum number of responses to cache. Note: This is automatically set to None if reuse_cached_responses is True
- field reuse_cached_responses: bool = False#
Whether to reuse cached responses. If True, this overrides save_responses (sets it to True) and max_saved_responses (sets it to None)
- field save_requests: bool = False#
Whether to save requests to cache
- field save_responses: bool = True#
Whether to save responses to cache. Note: This is automatically set to True if reuse_cached_responses is True
- headers_cache: Cache#
- intercept_request(
- req: AdapterRequest,
- context: AdapterGlobalContext,
Shall return request if no cache hit, and response if it is. :param req: The adapter request to intercept :type req: AdapterRequest :param context: Global context containing server-level configuration :type context: AdapterGlobalContext
- intercept_response(
- resp: AdapterResponse,
- context: AdapterGlobalContext,
Cache the response if caching is enabled and response is successful.
- requests_cache: Cache#
- responses_cache: Cache#
- class nemo_evaluator.adapters.interceptors.EndpointInterceptor(
- params: Params,
Bases:
RequestToResponseInterceptorRequired interceptor that handles the actual API communication. This interceptor must be present in every configuration as it performs the final request to the target API endpoint. Important: This interceptor should always be placed after the last request interceptor and before the first response interceptor.
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for endpoint interceptor.
- intercept_request(
- ar: AdapterRequest,
- context: AdapterGlobalContext,
Make the actual request to the upstream API.
- Parameters:
ar – The adapter request
context – Global context containing server-level configuration
- Returns:
AdapterResponse with the response from the upstream API
- class nemo_evaluator.adapters.interceptors.PayloadParamsModifierInterceptor(
- params: Params,
Bases:
RequestInterceptorAdapter for modifying request payload by removing, adding, and renaming parameters
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for payload modifier interceptor.
- field params_to_add: Dict[str, Any] | None = None#
Dictionary of parameters to add to payload
- field params_to_remove: List[str] | None = None#
List of parameters to remove from payload
- field params_to_rename: Dict[str, str] | None = None#
Dictionary mapping old parameter names to new names
- intercept_request(
- ar: AdapterRequest,
- context: AdapterGlobalContext,
Function that will be called by AdapterServer on the way upstream.
This interceptor can modify the request but must return an AdapterRequest to continue the chain upstream.
- Parameters:
req – The adapter request to intercept
context – Global context containing server-level configuration
Ex.: This is used for request preprocessing, logging, etc.
- class nemo_evaluator.adapters.interceptors.ProgressTrackingInterceptor(
- params: Params,
Bases:
ResponseInterceptor,PostEvalHookProgress tracking via external webhook.
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for progress tracking interceptor.
- field output_dir: str | None = None#
Evaluation output directory. If provided, the progress tracking will be saved to a file in this directory.
- field progress_tracking_interval: Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Gt(gt=0)])] = 1#
How often (every how many samples) to send a progress information.
- Constraints:
gt = 0
- field progress_tracking_interval_seconds: Annotated[float | None, FieldInfo(annotation=NoneType, required=True, metadata=[Gt(gt=0)])] | None = None#
How often (every N seconds) to send a progress information in addition to progress_tracking_interval.
- field progress_tracking_url: str | None = 'http://localhost:8000'#
URL to post the number of processed samples to. Supports expansion of shell variables if present.
- field request_method: str = 'PATCH'#
Request method to use for updating the evaluation progress.
- intercept_response(
- ar: AdapterResponse,
- context: AdapterGlobalContext,
Function that will be called by AdapterServer on the way downstream.
- Parameters:
resp – The adapter response to intercept
context – Global context containing server-level configuration
- post_eval_hook(
- context: AdapterGlobalContext,
Function that will be called by the evaluation system after evaluation completes.
- Parameters:
context – Global context containing server-level configuration and evaluation results
Ex.: This is used for report generation, cleanup, metrics collection, etc.
- progress_filepath: Path | None#
- progress_tracking_interval: int#
- progress_tracking_url: str | None#
- request_method: str#
- class nemo_evaluator.adapters.interceptors.RaiseClientErrorInterceptor(
- params: Params,
Bases:
ResponseInterceptorAdapter for handling non-retryable client error to raise an exception instead of continuing the benchmark.
- pydantic model Params[source]#
Bases:
BaseModelConfiguration parameters for raise client error interceptor.
- field exclude_status_codes: List[int] | None = [408, 429]#
Status codes to exclude from raising client errors when present in status_code_range.
- field status_code_range_end: int | None = 499#
End range of status codes to raise exception. Use with status_code_range_start to define an inclusive range e.g. [400, 499].
- field status_code_range_start: int | None = 400#
Start range of status codes to raise exception. Use with status_code_range_end to define an inclusive range e.g. [400, 499].
- field status_codes: List[int] | None = None#
List of status codes to raise exception.
- exclude_status_codes: List[int] | None#
- intercept_response(
- resp: AdapterResponse,
- context: AdapterGlobalContext,
Intercept response and handle client errors.
- status_code_range_end: int | None#
- status_code_range_start: int | None#
- status_codes: List[int] | None#
- class nemo_evaluator.adapters.interceptors.RequestLoggingInterceptor(
- params: Params,
Bases:
RequestInterceptorLogs incoming requests.
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for request logging.
- field log_request_body: bool = True#
Whether to log request body
- field log_request_headers: bool = True#
Whether to log request headers
- field max_requests: int | None = 2#
Maximum number of requests to log (None for unlimited)
- intercept_request(
- ar: AdapterRequest,
- context: AdapterGlobalContext,
Log the incoming request.
- log_request_body: bool#
- log_request_headers: bool#
- max_requests: int | None#
- class nemo_evaluator.adapters.interceptors.ResponseLoggingInterceptor(
- params: Params,
Bases:
ResponseInterceptorLogs responses.
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for response logging.
- field log_response_body: bool = True#
Whether to log response body
- field log_response_headers: bool = True#
Whether to log response headers
- field max_responses: int | None = None#
Maximum number of responses to log (None for unlimited)
- intercept_response(
- resp: AdapterResponse,
- context: AdapterGlobalContext,
Log the outgoing response.
- log_response_body: bool#
- log_response_headers: bool#
- max_responses: int | None#
- class nemo_evaluator.adapters.interceptors.ResponseReasoningInterceptor(
- params: Params,
Bases:
ResponseInterceptor,PostEvalHookProcesses reasoning tokens from response. Collects statistics. Strips and/or moves reasoning tokens.
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for reasoning interceptor.
- field add_reasoning: bool = True#
Whether to add reasoning information
- field cache_dir: str = '/tmp/reasoning_interceptor'#
Custom cache directory for reasoning stats interceptor.
- field enable_caching: bool = True#
Whether to enable caching of individual request reasoning statistics and aggregated reasoning stats. Useful for resuming interrupted runs.
- field enable_reasoning_tracking: bool = True#
Enable reasoning tracking and logging
- field end_reasoning_token: str = '</think>'#
Token that marks the end of reasoning section, not used if reasoning_content is provided
- field include_if_not_finished: bool = True#
Include reasoning content if reasoning is not finished (end token not found)
- field logging_aggregated_stats_interval: int = 100#
How often (every how many responses) to log aggregated reasoning statistics. Default is 100.
- field migrate_reasoning_content: bool = False#
If reasoning traces are found in reasoning_content, they will be moved to content field end surrounded by start_reasoning_token and end_reasoning_token
- field start_reasoning_token: str | None = '<think>'#
Token that marks the start of reasoning section, used for tracking if reasoning has started
- field stats_file_saving_interval: int | None = None#
How often (every how many responses) to save stats to a file. If None, stats are only saved via post_eval_hook.
- add_reasoning: bool#
- cache_dir: str | None#
- enable_caching: bool#
- enable_reasoning_tracking: bool#
- end_reasoning_token: str#
- include_if_not_finished: bool#
- intercept_response(
- resp: AdapterResponse,
- context: AdapterGlobalContext,
Remove reasoning tokens from assistant message content in the response and track reasoning info.
- logging_aggregated_stats_interval: int#
- migrate_reasoning_content: bool#
- post_eval_hook(
- context: AdapterGlobalContext,
Write collected reasoning statistics to eval_factory_metrics.json.
- start_reasoning_token: str | None#
- stats_file_saving_interval: int | None#
- class nemo_evaluator.adapters.interceptors.ResponseStatsInterceptor(
- params: Params,
Bases:
ResponseInterceptor,PostEvalHookCollects aggregated statistics from API responses for metrics collection.
Tracks the following statistics: - Token usage (prompt, completion, total) with averages and maximums - Response status codes and counts - Finish reasons and stop reasons - Tool calls and function calls counts - Response latency (average and maximum) - Total response count - Number of runs, inference times (approximated by processing time from the first to the last response)
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for response stats collection.
- field cache_dir: str = '/tmp/response_stats_interceptor'#
Custom cache directory for response stats interceptor.
- field collect_finish_reasons: bool = True#
Whether to collect finish reasons
- field collect_token_stats: bool = True#
Whether to collect token statistics
- field collect_tool_calls: bool = True#
Whether to collect tool call statistics
- field logging_aggregated_stats_interval: int = 100#
How often (every how many responses) to log aggregated response statistics. Default is 100.
- field save_individuals: bool = True#
Whether to save individual request statistics. If True, saves all individuals; if False, saves only aggregated stats.
- field stats_file_saving_interval: int | None = None#
How often (every how many responses) to save stats to a file. If None, stats are only saved via post_eval_hook.
- intercept_response(
- resp: AdapterResponse,
- context: AdapterGlobalContext,
Collect aggregated statistics from the response.
- post_eval_hook(
- context: AdapterGlobalContext,
Write collected response statistics to eval_factory_metrics.json.
- class nemo_evaluator.adapters.interceptors.SystemMessageInterceptor(
- params: Params,
Bases:
RequestInterceptorAdds or replaces system message in requests.
- pydantic model Params[source]#
Bases:
BaseLoggingParamsConfiguration parameters for system message interceptor.
- field system_message: str [Required]#
System message to add to requests
- intercept_request(
- ar: AdapterRequest,
- context: AdapterGlobalContext,
Function that will be called by AdapterServer on the way upstream.
This interceptor can modify the request but must return an AdapterRequest to continue the chain upstream.
- Parameters:
req – The adapter request to intercept
context – Global context containing server-level configuration
Ex.: This is used for request preprocessing, logging, etc.