Reasoning Parsing (Engine Fallback)

Use upstream vLLM or SGLang reasoning parsers when Dynamo does not ship one

When Dynamo’s registry does not list a reasoning parser for your model, fall back to the upstream engine’s parser via a chat-processor swap, which keeps frontend tokenization and KV routing.

For Dynamo-native parsers, see Reasoning Parsing (Dynamo). For the equivalent tool-call fallback, see Tool Call Parsing (Engine Fallback).

Known Issue: Engine-fallback reasoning parsing does not currently work with disaggregated serving (support coming soon). Use the Dynamo-native reasoning parser for disaggregated deployments today.

Configurations

| Configuration | Frontend flags | Worker flags | KV routing | Notes |
| --- | --- | --- | --- | --- |
| vLLM chat processor | --dyn-chat-processor vllm --reasoning-parser <name> | (none) | Yes | Parsing runs in vLLM’s Python preprocessor. See vLLM Chat Processor. |
| SGLang chat processor | --dyn-chat-processor sglang --reasoning-parser <name> | (none) | Yes | Parsing runs in SGLang’s Python preprocessor. See SGLang Chat Processor. |
| TRTLLM chat processor | (work in progress) | (work in progress) | | Engine-fallback support for TRTLLM is in progress. Use the Dynamo-native reasoning parser for TRTLLM today. |

--dyn-reasoning-parser selects the Dynamo-native parser path, while --reasoning-parser selects the engine-fallback (vLLM or SGLang) parser path. Each flag takes its accepted values from a different registry, so the name for the same model can differ between frameworks (e.g., vLLM’s nemotron_v3 vs. Dynamo’s nemotron3).
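As a minimal sketch of the configurations listed above, the helper below assembles the frontend command for the engine-fallback path and rejects engines that have no fallback chat processor yet. The function name fallback_frontend_cmd is hypothetical and not part of Dynamo; it only prints a command string built from the flags shown in this section, and it does not check the parser name against either registry.

```shell
#!/bin/sh
# Hypothetical helper (not part of Dynamo): prints the frontend command for
# the engine-fallback reasoning parser path. It does not launch anything.
fallback_frontend_cmd() {
  engine="$1"   # chat processor to swap in: vllm or sglang
  parser="$2"   # parser name from the upstream engine's registry

  if [ -z "$parser" ]; then
    echo "usage: fallback_frontend_cmd <vllm|sglang> <parser-name>" >&2
    return 2
  fi

  case "$engine" in
    vllm|sglang)
      # Only frontend flags are needed; the worker takes no parser flags.
      echo "python -m dynamo.frontend --dyn-chat-processor $engine --reasoning-parser $parser"
      ;;
    *)
      # TRTLLM (and anything else) has no engine-fallback path listed yet.
      echo "no engine-fallback chat processor for '$engine'; use the Dynamo-native parser" >&2
      return 1
      ;;
  esac
}
```

For instance, fallback_frontend_cmd vllm deepseek_r1 prints the same frontend invocation used in the vLLM example, while fallback_frontend_cmd trtllm deepseek_r1 returns a non-zero status. Keeping the engine check in one place makes it harder to pass a fallback flag to a configuration that does not support it.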

Examples

    # vLLM chat processor
    python -m dynamo.vllm ...
    python -m dynamo.frontend --dyn-chat-processor vllm --reasoning-parser deepseek_r1

    # SGLang chat processor
    python -m dynamo.sglang ...
    python -m dynamo.frontend --dyn-chat-processor sglang --reasoning-parser kimi_k25

See Also