nat.plugins.langchain.eval.utils#

Attributes#

Functions#

_import_from_dotted_path(→ Any)

Import an attribute from a Python dotted path.

eval_input_item_to_openevals_kwargs(→ dict[str, Any])

Convert a NAT EvalInputItem to openevals keyword arguments.

eval_input_item_to_run_and_example(→ tuple[Any, Any])

Convert a NAT EvalInputItem to synthetic LangSmith Run and Example objects.

_extract_field(→ Any)

Extract a value from a nested dict using dot-notation.

_handle_custom_schema_result(...)

Handle a raw dict from a custom output_schema evaluator.

_handle_list_result(...)

Handle a bare list of results (e.g., from create_json_match_evaluator).

_handle_evaluation_result(...)

Handle an EvaluationResult object (from RunEvaluator classes).

_handle_dict_result(...)

Handle a plain dict result (from openevals / function evaluators).

langsmith_result_to_eval_output_item(...)

Convert a LangSmith/openevals evaluation result to a NAT EvalOutputItem.

Module Contents#

_MISSING#
_import_from_dotted_path(
dotted_path: str,
*,
label: str = 'object',
) → Any#

Import an attribute from a Python dotted path.

Resolves 'module.path.attribute' into the corresponding Python object but does not instantiate classes. Used by langsmith_custom_evaluator._import_evaluator and langsmith_judge._build_create_kwargs (for output_schema).

Args:

dotted_path: Full Python dotted path (e.g., 'my_pkg.module.MyClass').
label: Human-readable label for error messages (e.g., 'evaluator', 'output_schema').

Returns:

The imported attribute.

Raises:

ValueError: If the path does not contain a module/attribute separator.
ImportError: If the module cannot be imported.
AttributeError: If the attribute cannot be found in the module.
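The behaviour described above can be sketched with `importlib` alone. This is a minimal illustrative version, not the library's implementation; the function name and error messages here are assumptions.

```python
import importlib
from typing import Any


def import_from_dotted_path(dotted_path: str, *, label: str = "object") -> Any:
    """Resolve 'module.path.attribute' into the named object without instantiating it."""
    module_path, sep, attr_name = dotted_path.rpartition(".")
    if not sep:
        # No separator: the path cannot be split into module + attribute.
        raise ValueError(f"invalid {label} path {dotted_path!r}: expected 'module.attribute'")
    module = importlib.import_module(module_path)  # raises ImportError if missing
    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise AttributeError(f"module {module_path!r} has no {label} {attr_name!r}") from None


# Resolving a stdlib attribute returns the object itself, not an instance.
decoder_cls = import_from_dotted_path("json.JSONDecoder", label="object")
```

Note that `rpartition` splits on the last dot, so nested attribute paths like `'pkg.sub.module.MyClass'` resolve the module `pkg.sub.module` first.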

eval_input_item_to_openevals_kwargs(
item: nat.data_models.evaluator.EvalInputItem,
extra_fields: dict[str, str] | None = None,
) → dict[str, Any]#

Convert a NAT EvalInputItem to openevals keyword arguments.

Maps NAT evaluation data to the (inputs, outputs, reference_outputs) convention used by openevals evaluators. When extra_fields is provided, additional values are pulled from item.full_dataset_entry and included as extra keyword arguments (e.g., context, plan).

Args:

item: NAT evaluation input item.
extra_fields: Mapping of kwarg names to dataset field names, looked up in item.full_dataset_entry.

Returns:

Dictionary with at least inputs, outputs, and reference_outputs keys, plus any extra fields.

Raises:

ValueError: If an extra_fields key conflicts with inputs, outputs, or reference_outputs.
KeyError: If a requested extra field is not present in the dataset entry.

eval_input_item_to_run_and_example(
item: nat.data_models.evaluator.EvalInputItem,
) → tuple[Any, Any]#

Convert a NAT EvalInputItem to synthetic LangSmith Run and Example objects.

Creates minimal Run and Example instances with the data that most LangSmith evaluators need (inputs, outputs, expected outputs).

Args:

item: NAT evaluation input item.

Returns:

Tuple of (Run, Example) instances.

_extract_field(data: dict, field_path: str) → Any#

Extract a value from a nested dict using dot-notation.

Args:

data: The dictionary to extract from.
field_path: Dot-separated path (e.g., 'analysis.score').

Returns:

The extracted value.

Raises:

KeyError: If any segment of the path is missing.
TypeError: If an intermediate value is not a dict.
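Dot-notation extraction is a short loop over path segments. A minimal sketch of the behaviour described above (function name and messages are illustrative):

```python
from typing import Any


def extract_field(data: dict, field_path: str) -> Any:
    """Walk a nested dict one dot-separated segment at a time."""
    value: Any = data
    for segment in field_path.split("."):
        if not isinstance(value, dict):
            # An intermediate value is not a dict, so we cannot descend further.
            raise TypeError(f"cannot descend into non-dict value at segment {segment!r}")
        if segment not in value:
            raise KeyError(f"missing segment {segment!r} in path {field_path!r}")
        value = value[segment]
    return value


result = {"analysis": {"score": 0.9, "verdict": {"label": "pass"}}}
extract_field(result, "analysis.score")          # 0.9
extract_field(result, "analysis.verdict.label")  # 'pass'
```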

_handle_custom_schema_result(
item_id: Any,
result: dict,
score_field: str,
) → nat.data_models.evaluator.EvalOutputItem#

Handle a raw dict from a custom output_schema evaluator.

The score is extracted using _extract_field() with dot-notation.

_handle_list_result(
item_id: Any,
result: list,
) → nat.data_models.evaluator.EvalOutputItem#

Handle a bare list of results (e.g., from create_json_match_evaluator).

Scores are averaged; per-item details are preserved in reasoning.
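The averaging step can be sketched with a plain dict standing in for EvalOutputItem. Boolean scores (as produced by match-style evaluators) coerce to 0/1 via `float`; the key names here are assumptions.

```python
from typing import Any


def handle_list_result(item_id: Any, results: list[dict]) -> dict:
    # Each entry is assumed to carry a numeric 'score'; bools count as 0/1.
    scores = [float(r.get("score", 0.0)) for r in results]
    avg = sum(scores) / len(scores) if scores else 0.0
    # The raw per-item results are kept intact as the reasoning payload.
    return {"id": item_id, "score": avg, "reasoning": results}


batch = [
    {"key": "json_match:name", "score": True},
    {"key": "json_match:age", "score": False},
]
averaged = handle_list_result("item-1", batch)  # score 0.5
```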

_handle_evaluation_result(
item_id: Any,
result: langsmith.evaluation.evaluator.EvaluationResult,
) → nat.data_models.evaluator.EvalOutputItem#

Handle an EvaluationResult object (from RunEvaluator classes).

_handle_dict_result(
item_id: Any,
result: dict,
) → nat.data_models.evaluator.EvalOutputItem#

Handle a plain dict result (from openevals / function evaluators).

langsmith_result_to_eval_output_item(
item_id: Any,
result: dict | list | Any,
score_field: str | None = None,
) → nat.data_models.evaluator.EvalOutputItem#

Convert a LangSmith/openevals evaluation result to a NAT EvalOutputItem.

Dispatches to specialised handlers based on the result type:

  • Custom output_schema dict (when score_field is set)

  • Bare list (e.g., create_json_match_evaluator)

  • EvaluationResults batch (dict with "results" key)

  • EvaluationResult object (from RunEvaluator classes)

  • Plain dict (from openevals / function evaluators)

  • Fallback for unexpected types

Args:

item_id: The id from the corresponding EvalInputItem.
result: The evaluation result.
score_field: Dot-notation path to the score in custom output_schema results (e.g., 'analysis.score').

Returns:

NAT EvalOutputItem with score and reasoning.