nat.atof#
Pydantic models for the Agentic Trajectory Observability Format (ATOF).
ATOF is a JSON-Lines wire format for agent runtime event streams. These
models define the two event kinds (ScopeEvent, MarkEvent), the
behavioral flag enum (Flags), and the canonical category
vocabulary (Category).
See atof-event-format.md for the core wire format. For payload
extraction, see nat.atof.extractors (schema-map-driven LLM
extractors for OpenAI, Anthropic, and Gemini). For the open question
of how producers should declare their schemas to consumers (a future
spec revision), see the DESIGN NOTE block at the top of
nat.atof.schemas.
Submodules#
Attributes#
Discriminated union of the 2 ATOF event kinds, keyed on |
|
Classes#
Point-in-time checkpoint (spec §3.2). |
|
Scope lifecycle event (spec §3.1). |
|
Extracts ATIF-relevant fields from an |
|
Classifies a mark event payload as either a role-lifted step |
|
Declarative description of where ATIF-relevant fields live within a |
|
Generic LLM payload extractor driven by a |
|
Extracts a serialized result string from a |
|
Canonical behavioral flags for scope events (spec §2.1). |
Functions#
|
Install the Anthropic Messages JSON Schema and LLM extractor. |
Install the Gemini generateContent JSON Schema and LLM extractor. |
|
|
Register an LLM payload extractor for |
|
Register a mark payload extractor for |
|
Register a tool payload extractor for |
|
Read an ATOF JSON-Lines file and return a list of typed Event objects. |
|
Write a list of Event objects to a JSON-Lines file. |
|
Return the registered schema for |
|
Register a JSON Schema for ATOF events whose |
Package Contents#
- Category#
- Event#
Discriminated union of the 2 ATOF event kinds, keyed on
kind(spec §3).
- class MarkEvent(/, **data: Any)#
Bases:
_EventBasePoint-in-time checkpoint (spec §3.2).
Unpaired (no start/end semantics). MAY carry
category+category_profileto indicate the kind of work the checkpoint relates to; when both are absent, the mark is a generic named timestamp. Does NOT carryscope_categoryorattributes.Create a new model by parsing and validating input data from keyword arguments.
Raises [
ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.selfis explicitly positional-only to allowselfas a field name.- kind: Literal['mark'] = 'mark'#
- _validate_category_subtype_coherence() Self#
- _reject_scope_only_fields() Self#
- class ScopeEvent(/, **data: Any)#
Bases:
_EventBaseScope lifecycle event (spec §3.1).
A single scope span produces two
ScopeEventinstances sharing the sameuuid: one withscope_category: "start"when the scope is pushed onto the active scope stack, and one withscope_category: "end"when the scope is popped.Create a new model by parsing and validating input data from keyword arguments.
Raises [
ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.selfis explicitly positional-only to allowselfas a field name.- kind: Literal['scope'] = 'scope'#
- scope_category: Literal['start', 'end'] = None#
- _validate_category_subtype_coherence() Self#
- ANTHROPIC_MESSAGES_V1_MAP#
- GEMINI_GENERATE_CONTENT_V1_MAP#
- OPENAI_CHAT_COMPLETIONS_V1_MAP#
- class LlmPayloadExtractor#
Bases:
ProtocolExtracts ATIF-relevant fields from an
llmscope event’sdata.Implementations MUST be pure functions over
data— no side effects, no network, no filesystem access. Return empty collections or strings when a field is not present; the converter distinguishes “legitimately empty” from “shape mismatch” at the dispatch layer.- extract_input_messages(data: Any) list[dict[str, Any]]#
Return the chat history messages from an LLM scope-start payload.
Each message SHOULD carry
roleandcontentkeys;contentMAY be a string or a multimodal part list (ATIF v1.6+).
- class MarkPayloadExtractor#
Bases:
ProtocolClassifies a mark event payload as either a role-lifted step (user/system/agent) or an opaque system step.
- extract_role_and_content(data: Any) tuple[str, Any] | None#
If the mark should lift to an ATIF step with a specific
source, return(source, content). Otherwise returnNoneto fall through to the opaque-system-step path.sourceMUST be one of"user","system","agent".contentis passed through as-is (string or part list).
- class SchemaMap#
Declarative description of where ATIF-relevant fields live within a provider’s LLM payload, plus optional hooks for irreducible transforms.
A
SchemaMapcaptures three things:Field paths — dotted paths (with numeric list indices) telling the engine where to find input messages, output text, and output tool calls. Each field accepts a tuple of candidate paths; the engine tries them in order and uses the first hit.
Per-tool-call sub-paths — for providers whose tool-call shape fits the OpenAI flat-or-nested convention. Each tool call is a dict; these paths name where ID/name/arguments live within that dict.
Optional hooks — escape hatches for the three transforms that can’t be expressed declaratively:
normalize_input_messages: inputdata→ ATIF-shaped message list. Use when content is polymorphic (Anthropic string-or-blocks, Gemini parts) and a single field-path can’t flatten it.normalize_output_message: outputdata→(text, tool_calls)pair. Use when output text and tool calls coexist in the same polymorphic structure (Anthropiccontentblocks).transform_tool_call: per-call dict adapter. Use when tool calls don’t carry an ID (Gemini synthesizes from name+index) or use non-OpenAI nesting.
Hooks always win over paths. If
normalize_output_messageis set, the engine ignoresoutput_text_pathsandoutput_tool_calls_paths.Pure-paths providers (OpenAI) leave the hooks at
None. Mixed providers (Anthropic, Gemini) use one or two hooks.- Parameters:
name – Schema name (e.g.
"openai/chat-completions").version – Schema version string.
input_messages_paths – Candidate paths to the input messages array.
output_text_paths – Candidate paths to the output assistant text.
output_tool_calls_paths – Candidate paths to the output tool-calls array.
tool_call_id_paths – Candidate sub-paths for tool-call ID.
tool_call_name_paths – Candidate sub-paths for tool-call function name.
tool_call_args_paths – Candidate sub-paths for tool-call arguments.
tool_call_args_parse_json – When True, parse string arguments as JSON.
role_aliases – Map of provider role values to canonical role values (e.g.,
{"model": "assistant"}for Gemini). Applied to messages extracted via field paths; hooks bypass this.normalize_input_messages – Optional hook overriding path-based input extraction. Signature:
(data) -> list[{"role", "content", ...}].normalize_output_message – Optional hook overriding path-based output extraction. Signature:
(data) -> (text, tool_calls).transform_tool_call – Optional per-call adapter. Signature:
(raw_call_dict, index) -> ATIF-shaped {"tool_call_id", "function_name", "arguments"}. When set, replaces the per-tool-call path resolution entirely.
- role_aliases: collections.abc.Mapping[str, str]#
- class SchemaMapLlmExtractor(schema_map: SchemaMap)#
Generic LLM payload extractor driven by a
SchemaMap.Implements
LlmPayloadExtractorby routing extraction through the map’s hooks (when set) or its declarative field paths (otherwise). A single instance per(name, version)is the intended pattern; register it withregister_llm_extractor().- schema_map#
- class ToolPayloadExtractor#
Bases:
ProtocolExtracts a serialized result string from a
toolscope-end payload.
- register_anthropic_messages_v1() None#
Install the Anthropic Messages JSON Schema and LLM extractor.
Idempotent — safe to call multiple times. Registers
anthropic/messages@1in bothSCHEMA_REGISTRY(validation) andLLM_EXTRACTOR_REGISTRY(extraction). Call this once at process startup before invoking the converter on Anthropic-shaped payloads.
- register_gemini_generate_content_v1() None#
Install the Gemini generateContent JSON Schema and LLM extractor.
Idempotent — safe to call multiple times. Registers
gemini/generate-content@1in bothSCHEMA_REGISTRYandLLM_EXTRACTOR_REGISTRY. Call this once at process startup before invoking the converter on Gemini-shaped payloads.
- register_llm_extractor(
- name: str,
- version: str,
- extractor: LlmPayloadExtractor,
Register an LLM payload extractor for
(name, version).
- register_mark_extractor(
- name: str,
- version: str,
- extractor: MarkPayloadExtractor,
Register a mark payload extractor for
(name, version).
- register_tool_extractor(
- name: str,
- version: str,
- extractor: ToolPayloadExtractor,
Register a tool payload extractor for
(name, version).
- class Flags#
Bases:
enum.StrEnumCanonical behavioral flags for scope events (spec §2.1).
Each flag describes the exceptional runtime property of a scope; absence means the documented default applies.
Initialize self. See help(type(self)) for accurate signature.
- PARALLEL = 'parallel'#
- RELOCATABLE = 'relocatable'#
- STATEFUL = 'stateful'#
- STREAMING = 'streaming'#
- REMOTE = 'remote'#
- read_jsonl(path: str | pathlib.Path) list[nat.atof.events.Event]#
Read an ATOF JSON-Lines file and return a list of typed Event objects.
Each line is parsed as a JSON object and validated against the Event discriminated union. Blank lines are skipped. Events are returned sorted by
.ts_micros(the normalized int-microsecond timestamp, spec §5.1) so downstream consumers get a stable ordering across mixed str/int timestamp streams.
- write_jsonl(
- events: list[nat.atof.events.Event],
- path: str | pathlib.Path,
Write a list of Event objects to a JSON-Lines file.
Each event is serialized as a single JSON line. The file ends with a trailing newline. Optional fields with
Nonevalues are emitted as explicitnullon the wire (matching the spec wire envelope example in atof-event-format.md §1).