nat.plugins.langchain.eval.utils#

Attributes#

Functions#

_import_from_dotted_path(→ Any)

Import an attribute from a Python dotted path.

eval_input_item_to_openevals_kwargs(→ dict[str, Any])

Convert a NAT EvalInputItem to openevals keyword arguments.

eval_input_item_to_run_and_example(→ tuple[Any, Any])

Convert a NAT EvalInputItem to synthetic LangSmith Run and Example objects.

_extract_field(→ Any)

Extract a value from a nested dict using dot-notation.

_handle_custom_schema_result(...)

Handle a raw dict from a custom output_schema evaluator.

_handle_list_result(...)

Handle a bare list of results (e.g., from create_json_match_evaluator).

_handle_evaluation_result(...)

Handle an EvaluationResult object (from RunEvaluator classes).

_handle_dict_result(...)

Handle a plain dict result (from openevals / function evaluators).

langsmith_result_to_eval_output_item(...)

Convert a LangSmith/openevals evaluation result to a NAT EvalOutputItem.

Module Contents#

_MISSING#
_import_from_dotted_path(
dotted_path: str,
*,
label: str = 'object',
) → Any#

Import an attribute from a Python dotted path.

Resolves 'module.path.attribute' into the corresponding Python object but does not instantiate classes. Used by langsmith_custom_evaluator._import_evaluator and langsmith_judge._build_create_kwargs (for output_schema).

Args:

dotted_path: Full Python dotted path (e.g., 'my_pkg.module.MyClass').
label: Human-readable label for error messages (e.g., 'evaluator', 'output_schema').

Returns:

The imported attribute.

Raises:

ValueError: If the path does not contain a module/attribute separator.
ImportError: If the module cannot be imported.
AttributeError: If the attribute cannot be found in the module.
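The behaviour described above can be sketched with `importlib` alone. This is a minimal illustrative version, not the library's implementation; the function name and error messages here are assumptions.

```python
import importlib
from typing import Any


def import_from_dotted_path(dotted_path: str, *, label: str = "object") -> Any:
    """Resolve 'module.path.attribute' into the named object without instantiating it."""
    module_path, sep, attr_name = dotted_path.rpartition(".")
    if not sep:
        # No separator: the path cannot be split into module + attribute.
        raise ValueError(f"invalid {label} path {dotted_path!r}: expected 'module.attribute'")
    module = importlib.import_module(module_path)  # raises ImportError if missing
    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise AttributeError(f"module {module_path!r} has no {label} {attr_name!r}") from None


# Resolving a stdlib attribute returns the object itself, not an instance.
decoder_cls = import_from_dotted_path("json.JSONDecoder", label="object")
```

Note that `rpartition` splits on the last dot, so nested attribute paths like `'pkg.sub.module.MyClass'` resolve the module `pkg.sub.module` first.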

eval_input_item_to_openevals_kwargs(
item: nat.data_models.evaluator.EvalInputItem,
extra_fields: dict[str, str] | None = None,
) → dict[str, Any]#

Convert a NAT EvalInputItem to openevals keyword arguments.

Maps NAT evaluation data to the (inputs, outputs, reference_outputs) convention used by openevals evaluators. When extra_fields is provided, additional values are pulled from item.full_dataset_entry and included as extra keyword arguments (e.g., context, plan).

Args:

item: NAT evaluation input item.
extra_fields: Mapping of kwarg names to dataset field names, looked up in item.full_dataset_entry.

Returns:

Dictionary with at least inputs, outputs, and reference_outputs keys, plus any extra fields.

Raises:

ValueError: If an extra_fields key conflicts with inputs, outputs, or reference_outputs.
KeyError: If a requested extra field is not present in the dataset entry.

eval_input_item_to_run_and_example(
item: nat.data_models.evaluator.EvalInputItem,
) → tuple[Any, Any]#

Convert a NAT EvalInputItem to synthetic LangSmith Run and Example objects.

Creates minimal Run and Example instances with the data that most LangSmith evaluators need (inputs, outputs, expected outputs).

Args:

item: NAT evaluation input item.

Returns:

Tuple of (Run, Example) instances.

_extract_field(data: dict, field_path: str) → Any#

Extract a value from a nested dict using dot-notation.

Args:

data: The dictionary to extract from.
field_path: Dot-separated path (e.g., 'analysis.score').

Returns:

The extracted value.

Raises:

KeyError: If any segment of the path is missing.
TypeError: If an intermediate value is not a dict.
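Dot-notation extraction is a short loop over path segments. A minimal sketch of the behaviour described above (function name and messages are illustrative):

```python
from typing import Any


def extract_field(data: dict, field_path: str) -> Any:
    """Walk a nested dict one dot-separated segment at a time."""
    value: Any = data
    for segment in field_path.split("."):
        if not isinstance(value, dict):
            # An intermediate value is not a dict, so we cannot descend further.
            raise TypeError(f"cannot descend into non-dict value at segment {segment!r}")
        if segment not in value:
            raise KeyError(f"missing segment {segment!r} in path {field_path!r}")
        value = value[segment]
    return value


result = {"analysis": {"score": 0.9, "verdict": {"label": "pass"}}}
extract_field(result, "analysis.score")          # 0.9
extract_field(result, "analysis.verdict.label")  # 'pass'
```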

_handle_custom_schema_result(
item_id: Any,
result: dict,
score_field: str,
) → nat.data_models.evaluator.EvalOutputItem#

Handle a raw dict from a custom output_schema evaluator.

The score is extracted using _extract_field() with dot-notation.

_handle_list_result(
item_id: Any,
result: list,
) → nat.data_models.evaluator.EvalOutputItem#

Handle a bare list of results (e.g., from create_json_match_evaluator).

Scores are averaged; per-item details are preserved in reasoning.
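The averaging step can be sketched with a plain dict standing in for EvalOutputItem. Boolean scores (as produced by match-style evaluators) coerce to 0/1 via `float`; the key names here are assumptions.

```python
from typing import Any


def handle_list_result(item_id: Any, results: list[dict]) -> dict:
    # Each entry is assumed to carry a numeric 'score'; bools count as 0/1.
    scores = [float(r.get("score", 0.0)) for r in results]
    avg = sum(scores) / len(scores) if scores else 0.0
    # The raw per-item results are kept intact as the reasoning payload.
    return {"id": item_id, "score": avg, "reasoning": results}


batch = [
    {"key": "json_match:name", "score": True},
    {"key": "json_match:age", "score": False},
]
averaged = handle_list_result("item-1", batch)  # score 0.5
```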

_handle_evaluation_result(
item_id: Any,
result: langsmith.evaluation.evaluator.EvaluationResult,
) → nat.data_models.evaluator.EvalOutputItem#

Handle an EvaluationResult object (from RunEvaluator classes).

_handle_dict_result(
item_id: Any,
result: dict,
) → nat.data_models.evaluator.EvalOutputItem#

Handle a plain dict result (from openevals / function evaluators).

langsmith_result_to_eval_output_item(
item_id: Any,
result: dict | list | Any,
score_field: str | None = None,
) → nat.data_models.evaluator.EvalOutputItem#

Convert a LangSmith/openevals evaluation result to a NAT EvalOutputItem.

Dispatches to specialised handlers based on the result type:

  • Custom output_schema dict (when score_field is set)

  • Bare list (e.g., create_json_match_evaluator)

  • EvaluationResults batch (dict with "results" key)

  • EvaluationResult object (from RunEvaluator classes)

  • Plain dict (from openevals / function evaluators)

  • Fallback for unexpected types

Args:

item_id: The id from the corresponding EvalInputItem.
result: The evaluation result.
score_field: Dot-notation path to the score in custom output_schema results (e.g., 'analysis.score').

Returns:

NAT EvalOutputItem with score and reasoning.