nat.plugins.langchain.eval.langsmith_judge#
Attributes#

| logger |

Classes#

| LangSmithJudgeConfig | LLM-as-judge evaluator powered by openevals. |

Functions#

| _resolve_prompt | Resolve a prompt name to the actual prompt string. |
| _build_create_kwargs | Assemble keyword arguments for openevals.create_async_llm_as_judge. |
| register_langsmith_judge | Register an LLM-as-judge evaluator with NAT. |
Module Contents#
- logger#
- _resolve_prompt(prompt_value: str) → str#
Resolve a prompt name to the actual prompt string.
Prompt names are resolved dynamically by convention: the short name is uppercased and suffixed with _PROMPT to form the constant name in openevals.prompts (e.g., 'correctness' -> CORRECTNESS_PROMPT). If the name doesn't match a constant in openevals.prompts, it is treated as a literal prompt template string (e.g., a custom f-string).
- Args:
prompt_value: A short prompt name (e.g., 'correctness') or a literal prompt template string.
- Returns:
The resolved prompt string.
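The resolution convention can be sketched as follows. This is a minimal illustration, not the module's actual implementation; `prompts_module` stands in for `openevals.prompts` so the sketch stays self-contained, and the prompt text is a placeholder.

```python
from types import SimpleNamespace

# Stand-in for openevals.prompts; the real module exposes constants
# such as CORRECTNESS_PROMPT and HALLUCINATION_PROMPT.
prompts_module = SimpleNamespace(CORRECTNESS_PROMPT="Judge the correctness of ...")

def resolve_prompt(prompt_value: str) -> str:
    """Resolve a short name ('correctness') to its prompt constant,
    or pass a literal template string through unchanged."""
    constant_name = prompt_value.upper() + "_PROMPT"   # 'correctness' -> 'CORRECTNESS_PROMPT'
    resolved = getattr(prompts_module, constant_name, None)
    if isinstance(resolved, str):
        return resolved
    return prompt_value  # no matching constant: treat as a literal template

print(resolve_prompt("correctness"))           # resolves to the prebuilt prompt
print(resolve_prompt("Rate this: {outputs}"))  # literal template, returned as-is
```

Because unmatched names fall through to the literal-template branch, a typo in a short name (e.g., `'corectness'`) is silently used as the prompt text itself, which is worth keeping in mind when debugging unexpected judge behavior.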
- class LangSmithJudgeConfig#
Bases: nat.data_models.evaluator.EvaluatorBaseConfig, nat.data_models.retry_mixin.RetryMixin, nat.plugins.langchain.eval.langsmith_evaluator.LangSmithExtraFieldsMixin
LLM-as-judge evaluator powered by openevals.
Uses a prebuilt or custom prompt with a judge LLM to score workflow outputs. Prebuilt prompt names (e.g., 'correctness', 'hallucination') are resolved from openevals automatically.
Common create_async_llm_as_judge parameters are exposed as typed fields for discoverability and validation. Any additional or future parameters can be forwarded via the judge_kwargs pass-through dict.
Important: The judge LLM must support structured output (JSON schema mode via with_structured_output). Models that do not support structured output will produce parsing errors and zero scores. Verify that your chosen model supports this capability before use.
- llm_name: nat.data_models.component_ref.LLMRef = None#
- _validate_scoring() → LangSmithJudgeConfig#
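The structured-output requirement noted above can be guarded against before running an evaluation. A minimal sketch, assuming a LangChain-style chat model interface (LangChain chat models expose `with_structured_output`); the check is a heuristic, since a provider may define the method yet still reject JSON-schema mode at request time. The classes below are illustrative stand-ins, not part of the module.

```python
def supports_structured_output(llm: object) -> bool:
    """Heuristic pre-flight check: LangChain-style chat models that support
    structured output expose a callable `with_structured_output` method."""
    return callable(getattr(llm, "with_structured_output", None))

class FakeJudgeLLM:
    """Stand-in judge model that mirrors the LangChain method name."""
    def with_structured_output(self, schema):
        return self

class FakeCompletionLLM:
    """Stand-in model with no structured-output support."""

print(supports_structured_output(FakeJudgeLLM()))       # True
print(supports_structured_output(FakeCompletionLLM()))  # False
```

Running such a check at registration time surfaces an unsupported model as a clear error instead of the parsing failures and zero scores described above.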
- _build_create_kwargs(
- config: LangSmithJudgeConfig,
- resolved_prompt: str,
- judge_llm: Any,
Assemble keyword arguments for openevals.create_async_llm_as_judge.
Typed config fields are added first, then optional fields are merged only when set. Finally, judge_kwargs is merged with overlap detection so that users cannot accidentally shadow typed fields.
- Args:
config: The judge evaluator configuration.
resolved_prompt: The prompt string, already resolved from a short name or left as-is for custom templates.
judge_llm: The LLM instance to use as the judge.
- Returns:
Dictionary of keyword arguments ready for create_async_llm_as_judge.
- Raises:
ValueError: If judge_kwargs keys overlap with typed fields.
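The merge-with-overlap-detection behavior can be illustrated with a small standalone sketch. The field names below are illustrative only, not the evaluator's actual typed fields, and the function is a simplification of the real helper.

```python
def build_create_kwargs(typed_fields: dict, judge_kwargs: dict) -> dict:
    """Merge typed config fields with the judge_kwargs pass-through dict,
    rejecting any key that would shadow a typed field."""
    # Typed fields first; optional fields that were never set (None) are skipped.
    kwargs = {k: v for k, v in typed_fields.items() if v is not None}
    overlap = kwargs.keys() & judge_kwargs.keys()
    if overlap:
        raise ValueError(f"judge_kwargs overlaps typed fields: {sorted(overlap)}")
    kwargs.update(judge_kwargs)
    return kwargs

merged = build_create_kwargs(
    {"prompt": "Judge ...", "continuous": None},   # 'continuous' unset, so dropped
    {"few_shot_examples": []},                     # pass-through extra
)
print(merged)  # {'prompt': 'Judge ...', 'few_shot_examples': []}

try:
    build_create_kwargs({"prompt": "Judge ..."}, {"prompt": "oops"})
except ValueError as exc:
    print(exc)  # judge_kwargs overlaps typed fields: ['prompt']
```

Failing loudly on overlap, rather than letting `judge_kwargs` silently win, keeps the typed fields authoritative and makes misconfiguration visible immediately.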
- async register_langsmith_judge(
- config: LangSmithJudgeConfig,
- builder: nat.builder.builder.EvalBuilder,
Register an LLM-as-judge evaluator with NAT.