nat.plugins.langchain.langsmith.langsmith_optimization_callback#
Attributes#
Classes#
LangSmithOptimizationCallback | Per-trial experiment projects with OTEL trace linking and prompt management.
Module Contents#
- logger#
- class LangSmithOptimizationCallback( )#
Per-trial experiment projects with OTEL trace linking and prompt management.
Each optimizer trial gets its own experiment project linked to a shared dataset. OTEL traces are routed to per-trial projects via get_trial_project_name(), which also pre-creates the project with reference_dataset_id. After eval, OTEL runs are retroactively linked to dataset examples with feedback and parameter metadata.
- needs_root_span_ids = True#
- _client#
- _project#
- _experiment_prefix = 'NAT'#
- _dataset_name_hint = None#
- _build_base_name() str#
Build the base name used for datasets and run numbering.
Format:
Optimization Benchmark (<dataset>) (<project>)
- get_trial_project_name(trial_number: int) str#
Return the per-trial OTEL project name and pre-create it as an experiment.
Called by the parameter/prompt optimizer BEFORE the eval run starts. Pre-creates the project with reference_dataset_id so OTEL traces land in an experiment project (visible in Datasets & Experiments UI).
- _create_dataset_with_examples( ) None#
Create the LangSmith dataset and populate it with examples.
- Args:
items: List of (item_id, question, expected) tuples.
- pre_create_experiment(dataset_items: list) None#
Create the dataset upfront (before any trials run).
Must be called BEFORE get_trial_project_name() so the dataset exists when per-trial projects are pre-created with reference_dataset_id. Accepts list[EvalInputItem] from the eval framework.
- classmethod _estimate_retry_budget(expected_count: int) tuple[int, float]#
Estimate the retry budget for OTEL run linking based on dataset size.
Uses the shared indexing constants from langsmith_evaluation_callback (pipeline latency, throughput, retry delay) with a safety multiplier to scale the retry window proportionally.
Formula:
indexing_time = pipeline_latency + (expected_count / throughput)
total_budget = indexing_time × safety_multiplier
max_retries = clamp(total_budget / retry_delay, min=10, max=60)
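| Items | Indexing Est. | ×3 Safety | Max Retries | Total Budget |
|-------|---------------|-----------|-------------|--------------|
| 5 | 10.5 s | 31.5 s | 10 (floor) | 100 s |
| 150 | 25.0 s | 75.0 s | 10 (floor) | 100 s |
| 600 | 70.0 s | 210.0 s | 21 | 210 s |
| 5 000 | 510.0 s | 1 530.0 s | 60 (cap) | 600 s |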
Warning
Datasets above 5 000 items per trial may exceed the maximum retry window (600 s). Some runs may not be linked in the LangSmith UI, although all traces will have been delivered.
- Returns:
(max_retries, retry_delay) tuple for
_match_and_link_otel_runs.
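The formula can be sketched as a standalone function. The constant values below (10 s pipeline latency, 10 items/s throughput, 10 s retry delay, ×3 safety multiplier) are not stated in this page; they are inferred from the documented rows and may differ from the real constants in langsmith_evaluation_callback. The function name and the truncating division are likewise assumptions of this sketch.

```python
# Values inferred from the documented table, not read from the library source.
PIPELINE_LATENCY_S = 10.0       # assumed fixed indexing latency
THROUGHPUT_ITEMS_PER_S = 10.0   # assumed indexing throughput
RETRY_DELAY_S = 10.0            # assumed delay between retries
SAFETY_MULTIPLIER = 3.0         # the "×3 Safety" column
MIN_RETRIES, MAX_RETRIES = 10, 60


def estimate_retry_budget(expected_count: int) -> tuple[int, float]:
    """Return (max_retries, retry_delay) following the documented formula."""
    indexing_time = PIPELINE_LATENCY_S + expected_count / THROUGHPUT_ITEMS_PER_S
    total_budget = indexing_time * SAFETY_MULTIPLIER
    # Truncation is an assumption; the rounding mode is not documented.
    raw_retries = int(total_budget / RETRY_DELAY_S)
    max_retries = max(MIN_RETRIES, min(MAX_RETRIES, raw_retries))
    return max_retries, RETRY_DELAY_S


for count in (5, 150, 600, 5000):
    retries, delay = estimate_retry_budget(count)
    print(f"{count} items: {retries} retries, {retries * delay:.0f} s total")
```

With these assumed constants the sketch reproduces the Max Retries and Total Budget columns for all four documented rows.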
- _link_otel_runs(trial_number: int, eval_result: Any, parameters: dict[str, Any] | None = None, prompt_commit_tags: dict[str, str] | None = None)#
Link OTEL runs in the trial’s project to dataset examples and attach feedback.
- static _format_params(parameters: dict[str, Any]) dict[str, Any]#
Sanitize parameter names (dots->underscores) and round floats.
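A minimal sketch of this kind of sanitization, assuming a rounding precision of four decimal places (the source says only that floats are rounded) and using a hypothetical standalone function name:

```python
from typing import Any


def format_params(parameters: dict[str, Any], ndigits: int = 4) -> dict[str, Any]:
    """Replace dots in keys with underscores and round float values."""
    formatted: dict[str, Any] = {}
    for name, value in parameters.items():
        key = name.replace(".", "_")
        # ndigits=4 is an assumed precision, not taken from the library.
        formatted[key] = round(value, ndigits) if isinstance(value, float) else value
    return formatted


print(format_params({"llms.nim_llm.temperature": 0.123456789,
                     "llms.nim_llm.max_tokens": 512}))
```

Non-float values such as integers and strings pass through unchanged in this sketch.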
- static _humanize_param_name(param_name: str) str#
Convert ‘functions.email_phishing_analyzer.prompt’ to ‘Email Phishing Analyzer Prompt’.
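One way such a conversion could work is sketched below. Dropping a leading "functions" segment is inferred from the single documented example; the actual method may handle other prefixes differently.

```python
def humanize_param_name(param_name: str) -> str:
    """Turn a dotted parameter path into a title-cased label."""
    parts = param_name.split(".")
    # Assumption: only a leading "functions" segment is dropped.
    if len(parts) > 1 and parts[0] == "functions":
        parts = parts[1:]
    words = " ".join(parts).replace("_", " ").split()
    return " ".join(word.capitalize() for word in words)


print(humanize_param_name("functions.email_phishing_analyzer.prompt"))
```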
- _get_prompt_repo_name(param_name: str) str#
Get or create a unique prompt repo name for this optimization run.
Format:
<project>-<param>-run-<N>, e.g. aiq-shallow-researcher-full-optimization-system-prompt-run-1
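The naming scheme can be illustrated with a small helper that picks the first unused run number. The function and its `existing` argument are hypothetical; how the real method discovers existing repos and caches the chosen name is not described here.

```python
def next_prompt_repo_name(project: str, param: str, existing: set[str]) -> str:
    """Return the first <project>-<param>-run-<N> name not present in `existing`."""
    run = 1
    while f"{project}-{param}-run-{run}" in existing:
        run += 1
    return f"{project}-{param}-run-{run}"


taken = {"aiq-shallow-researcher-full-optimization-system-prompt-run-1"}
print(next_prompt_repo_name("aiq-shallow-researcher-full-optimization",
                            "system-prompt", taken))
```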
- VALID_TEMPLATE_FORMATS#
- _JINJA2_MARKERS = ('{%', '{#')#
- _JINJA2_EXPR_KEYWORDS = ('| ', ' if ', ' else ', ' for ')#
- _MUSTACHE_MARKERS = ('{{#', '{{/', '{{>', '{{^')#
- classmethod _detect_template_format(text: str) str#
Auto-detect template format from prompt content.
- Detection priority (first match wins):
1. Jinja2 block/comment tags ({%, {#) → "jinja2"
2. Mustache section markers ({{#, {{/, {{>, {{^) → "mustache"
3. Jinja2 expression keywords inside {{ }} (pipes, conditionals, loops) → "jinja2"
4. Plain {{ }} without keywords → "jinja2" (ambiguous with mustache, but Jinja2 is far more common in Python/LangChain prompts)
5. No curly-brace templating detected → "f-string"
Used as a fallback when SearchSpace.prompt_format is not explicitly set.
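The detection order can be sketched as follows, reusing the marker tuples listed for the class. This is an illustration, not the library's implementation. In particular, the lookbehind that stops the mustache marker `{{#` from matching the Jinja2 comment marker `{#` is an assumption of this sketch; the source does not say how that overlap is handled.

```python
import re

JINJA2_MARKERS = ("{%", "{#")
JINJA2_EXPR_KEYWORDS = ("| ", " if ", " else ", " for ")
MUSTACHE_MARKERS = ("{{#", "{{/", "{{>", "{{^")


def detect_template_format(text: str) -> str:
    """Guess a template format, checking the most specific markers first."""
    # 1. Jinja2 block/comment tags, ignoring markers preceded by "{" (assumed guard).
    if any(re.search(r"(?<!\{)" + re.escape(m), text) for m in JINJA2_MARKERS):
        return "jinja2"
    # 2. Mustache section markers.
    if any(m in text for m in MUSTACHE_MARKERS):
        return "mustache"
    # 3. Jinja2 expression keywords inside {{ ... }}.
    expressions = re.findall(r"\{\{(.*?)\}\}", text, flags=re.DOTALL)
    if any(kw in expr for expr in expressions for kw in JINJA2_EXPR_KEYWORDS):
        return "jinja2"
    # 4. Plain {{ ... }} is ambiguous and defaults to Jinja2.
    if expressions:
        return "jinja2"
    # 5. No double-brace templating found.
    return "f-string"


for sample in ("{% if x %}a{% endif %}", "{{#items}}{{name}}{{/items}}",
               "{{ name | upper }}", "{{ name }}", "Hello {name}"):
    print(repr(sample), "->", detect_template_format(sample))
```

Steps 3 and 4 return the same value; they are kept separate here only to mirror the documented priority list.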
- classmethod _validate_template_format(fmt: str) str#
Validate that a template format string is supported.
Raises ValueError with the list of valid options if not.
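A minimal sketch of such a check, assuming VALID_TEMPLATE_FORMATS holds the three supported values named elsewhere in this page ("f-string", "jinja2", "mustache"). The container type and the exact error message are assumptions.

```python
# Assumed contents; the page lists the attribute without showing its value.
VALID_TEMPLATE_FORMATS = ("f-string", "jinja2", "mustache")


def validate_template_format(fmt: str) -> str:
    """Return fmt unchanged if supported, otherwise raise ValueError."""
    if fmt not in VALID_TEMPLATE_FORMATS:
        options = ", ".join(VALID_TEMPLATE_FORMATS)
        raise ValueError(f"Unsupported template format {fmt!r}; valid options: {options}")
    return fmt


print(validate_template_format("jinja2"))
```

Returning the validated string lets a caller validate and assign in a single expression.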
- _resolve_template_format( ) str#
Resolve the LangChain template_format for a prompt.
- Priority:
1. Explicit prompt_formats from TrialResult (set via SearchSpace.prompt_format)
2. Auto-detection from prompt content
Supported values: "f-string", "jinja2", "mustache".
- on_study_end(
- *,
- best_trial: nat.profiler.parameter_optimization.optimizer_callbacks.TrialResult,
- total_trials: int,