Tool Call Parsing (Engine Fallback) | NVIDIA Dynamo Documentation

When Dynamo’s registry does not list a tool-call parser for your model, fall back to the upstream engine’s parser via a chat-processor swap, which keeps frontend tokenization and KV routing.

For Dynamo-native parsers, see Tool Call Parsing (Dynamo). For the equivalent reasoning fallback, see Reasoning Parsing (Engine Fallback).

Known Issue: Engine-fallback tool call parsing does not currently work with disaggregated serving. Use the Dynamo-native tool call parser for disaggregated deployments.

Configurations

	Frontend flags	Worker flags	KV routing	Notes
vLLM chat processor	`--dyn-chat-processor vllm --tool-call-parser <name>`	(none)	Yes	Parsing runs in vLLM’s Python preprocessor. See vLLM Chat Processor.
SGLang chat processor	`--dyn-chat-processor sglang --tool-call-parser <name>`	(none)	Yes	Parsing runs in SGLang’s Python preprocessor. See SGLang Chat Processor.

Upstream parser names come from the engine’s registry and may differ from Dynamo’s name for the same model (e.g., SGLang’s deepseekv3 vs Dynamo’s deepseek_v3). They are pinned to the engine version shipped in the Dynamo container.

Examples

$ # vLLM chat processor
$ python -m dynamo.vllm ...
$ python -m dynamo.frontend --dyn-chat-processor vllm --tool-call-parser hermes
$ 
$ # SGLang chat processor
$ python -m dynamo.sglang ...
$ python -m dynamo.frontend --dyn-chat-processor sglang --tool-call-parser kimi_k2

Configurations

Examples

See Also