Tool Call Parsing (Engine Fallback)

Use upstream vLLM or SGLang tool-call parsers when Dynamo does not ship one

View as Markdown

When Dynamo’s registry does not list a tool-call parser for your model, fall back to the upstream engine’s parser via a chat-processor swap, which keeps frontend tokenization and KV routing.

For Dynamo-native parsers, see Tool Call Parsing (Dynamo). For the equivalent reasoning fallback, see Reasoning Parsing (Engine Fallback).

Known Issue: Engine-fallback tool call parsing does not currently work with disaggregated serving. Use the Dynamo-native tool call parser for disaggregated deployments.

Configurations

Frontend flagsWorker flagsKV routingNotes
vLLM chat processor--dyn-chat-processor vllm --tool-call-parser <name>(none)YesParsing runs in vLLM’s Python preprocessor. See vLLM Chat Processor.
SGLang chat processor--dyn-chat-processor sglang --tool-call-parser <name>(none)YesParsing runs in SGLang’s Python preprocessor. See SGLang Chat Processor.

Upstream parser names come from the engine’s registry and may differ from Dynamo’s name for the same model (e.g., SGLang’s deepseekv3 vs Dynamo’s deepseek_v3). They are pinned to the engine version shipped in the Dynamo container.

Examples

$# vLLM chat processor
$python -m dynamo.vllm ...
$python -m dynamo.frontend --dyn-chat-processor vllm --tool-call-parser hermes
$
$# SGLang chat processor
$python -m dynamo.sglang ...
$python -m dynamo.frontend --dyn-chat-processor sglang --tool-call-parser kimi_k2

See Also