Copyright © 2026, NVIDIA Corporation.

Reasoning Parsing (Engine Fallback)

Use upstream vLLM or SGLang reasoning parsers when Dynamo does not ship one

When Dynamo’s registry does not list a reasoning parser for your model, fall back to the upstream engine’s parser via a chat-processor swap, which keeps frontend tokenization and KV routing.

For Dynamo-native parsers, see Reasoning Parsing (Dynamo). For the equivalent tool-call fallback, see Tool Call Parsing (Engine Fallback).

Known Issue: Engine-fallback reasoning parsing does not currently work with disaggregated serving (support coming soon). Use the Dynamo-native reasoning parser for disaggregated deployments today.

Configurations

| Chat processor | Frontend flags | Worker flags | KV routing | Notes |
|---|---|---|---|---|
| vLLM chat processor | --dyn-chat-processor vllm --reasoning-parser <name> | (none) | Yes | Parsing runs in vLLM’s Python preprocessor. See vLLM Chat Processor. |
| SGLang chat processor | --dyn-chat-processor sglang --reasoning-parser <name> | (none) | Yes | Parsing runs in SGLang’s Python preprocessor. See SGLang Chat Processor. |
| TRTLLM chat processor | (work in progress) | (work in progress) | — | Engine-fallback support for TRTLLM is in progress. Use the Dynamo-native reasoning parser for TRTLLM today. |

--dyn-reasoning-parser selects the Dynamo-native parser path, while --reasoning-parser selects the engine-fallback (vLLM or SGLang) parser path. Each flag takes its accepted values from a different registry, so the name for the same parser can differ between frameworks (e.g., vLLM’s nemotron_v3 vs Dynamo’s nemotron3).

Examples

# vLLM chat processor
$ python -m dynamo.vllm ...
$ python -m dynamo.frontend --dyn-chat-processor vllm --reasoning-parser deepseek_r1

# SGLang chat processor
$ python -m dynamo.sglang ...
$ python -m dynamo.frontend --dyn-chat-processor sglang --reasoning-parser kimi_k25
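
If you launch the frontend from a script, a small wrapper can encode the rules from the Configurations table so that an unsupported engine fails early. The following is a minimal sketch, not part of Dynamo: the function name fallback_frontend_args is hypothetical, and it only assembles the two frontend flags documented above. It does not validate the parser name against the engine's registry.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Dynamo): prints the frontend flags for
# the engine-fallback reasoning path, following the Configurations table.
fallback_frontend_args() {
  local engine="${1:-}"
  local parser="${2:-}"

  if [ -z "$parser" ]; then
    echo "usage: fallback_frontend_args <vllm|sglang> <parser-name>" >&2
    return 2
  fi

  case "$engine" in
    vllm|sglang)
      # Both engines take the same pair of frontend flags; no worker flags.
      printf '%s\n' "--dyn-chat-processor $engine --reasoning-parser $parser"
      ;;
    trtllm)
      # The table lists TRTLLM engine fallback as work in progress.
      echo "TRTLLM engine fallback is in progress; use the Dynamo-native parser" >&2
      return 1
      ;;
    *)
      echo "unknown engine: $engine" >&2
      return 1
      ;;
  esac
}

# Print the resulting frontend command instead of running it.
echo "python -m dynamo.frontend $(fallback_frontend_args sglang kimi_k25)"
```

To launch rather than print, leave the command substitution unquoted so the flags are split into separate arguments, for example: python -m dynamo.frontend $(fallback_frontend_args vllm deepseek_r1). The parser name must still come from the engine's own registry, not Dynamo's.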

See Also

  • Reasoning Parsing (Dynamo) — Dynamo-native parsers and common pairings
  • Tool Call Parsing (Engine Fallback) — Equivalent fallback for tool-call parsers
  • vLLM Chat Processor — vLLM chat-processor details
  • SGLang Chat Processor — SGLang chat-processor details
  • Frontend Configuration Reference — Full CLI flag reference