Tool Call and Reasoning Parsing
Parse tool calls and reasoning out of model output into OpenAI-compatible tool_calls and reasoning_content
Parse tool calls and reasoning out of model output into OpenAI-compatible tool_calls and reasoning_content
Dynamo parses tool-call and reasoning markup out of raw model output and surfaces it as OpenAI-compatible tool_calls and reasoning_content on the response. Tool calling is controlled by the tool_choice and tools request parameters; reasoning parsing is enabled per-model with a reasoning parser.
There are two ways to parse, depending on whether the parser lives in Dynamo’s own registry or in an upstream engine frontend (vllm serve, sglang serve, or trtllm-serve).
Start with the Dynamo path. Fall back to the engine path only when Dynamo’s registry doesn’t list a parser for your model. For exactly which flags combine and which combinations don’t make sense, see Parser Configuration.
In vllm serve, sglang serve, and trtllm-serve, tool-call and reasoning parsing happen in each engine’s own frontend, with subtle behavioral differences across them. For performance, Dynamo orchestrates routing and tokenization and passes tokens directly to each engine, bypassing the engine’s OpenAI server to avoid duplicate work per request. So Dynamo implements parsing in its frontend as a framework-agnostic Rust layer — one tested OpenAI-compatible contract across vLLM, SGLang, and TRT-LLM, on a hot path that stays concurrent without a Python GIL bottleneck. The vllm/sglang chat processors (engine fallback) opt back into the engine’s own parser when Dynamo doesn’t ship one for your model.
logprobs so issues can be localized