Reasoning Parsing (Engine Fallback)

Use upstream vLLM or SGLang reasoning parsers when Dynamo does not ship one

When Dynamo’s registry does not list a reasoning parser for your model, fall back to the upstream engine’s parser via a chat-processor swap, which keeps frontend tokenization and KV routing.

For Dynamo-native parsers, see Reasoning Parsing (Dynamo). For the equivalent tool-call fallback, see Tool Call Parsing (Engine Fallback).

Known Issue: Engine-fallback reasoning parsing does not currently work with disaggregated serving. Use the Dynamo-native reasoning parser for disaggregated deployments.

Configurations

| Configuration | Frontend flags | Worker flags | KV routing | Notes |
| --- | --- | --- | --- | --- |
| vLLM chat processor | `--dyn-chat-processor vllm --reasoning-parser <name>` | (none) | Yes | Parsing runs in vLLM’s Python preprocessor. See vLLM Chat Processor. |
| SGLang chat processor | `--dyn-chat-processor sglang --reasoning-parser <name>` | (none) | Yes | Parsing runs in SGLang’s Python preprocessor. See SGLang Chat Processor. |

Upstream parser names come from the engine’s registry and may differ from Dynamo’s name for the same model (e.g., vLLM’s nemotron_v3 vs. Dynamo’s nemotron3). The set of available names is pinned to the engine version shipped in the Dynamo container.

Examples

# vLLM chat processor
python -m dynamo.vllm ...
python -m dynamo.frontend --dyn-chat-processor vllm --reasoning-parser deepseek_r1

# SGLang chat processor
python -m dynamo.sglang ...
python -m dynamo.frontend --dyn-chat-processor sglang --reasoning-parser kimi_k25
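
Both examples share the same frontend invocation and differ only in the chat processor and the upstream parser name. As a minimal sketch, the pattern could be wrapped in a small shell function. The function name `frontend_cmd`, its arguments, and its exit code are hypothetical and not part of Dynamo; only the two flags come from this page. The function prints the command instead of running it, so the result can be inspected first.

```shell
# Hypothetical helper (not part of Dynamo): print the frontend command for an
# engine-fallback reasoning parser. Only --dyn-chat-processor and
# --reasoning-parser are taken from the documentation above.
frontend_cmd() {
  engine="$1"   # chat processor to swap in: "vllm" or "sglang"
  parser="$2"   # upstream parser name from the engine's registry

  # Only the two chat processors listed in the table are accepted.
  case "$engine" in
    vllm|sglang) ;;
    *) echo "unsupported chat processor: $engine" >&2; return 2 ;;
  esac

  # The parser name is required; it is not validated against any registry here.
  if [ -z "$parser" ]; then
    echo "missing upstream parser name" >&2
    return 2
  fi

  printf '%s\n' "python -m dynamo.frontend --dyn-chat-processor $engine --reasoning-parser $parser"
}
```

Calling `frontend_cmd vllm deepseek_r1` prints the same frontend command as the vLLM example above. The sketch does not check that the parser name exists in the engine version shipped in the container, and it does not start the worker.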

See Also