
Copyright © 2026, NVIDIA Corporation.

Parser Configuration

How --dyn-chat-processor, --dyn-tool-call-parser, and --dyn-reasoning-parser fit together


Dynamo turns a model’s raw tool-call and reasoning markup into structured tool_calls and reasoning_content. Two independent choices control how that parsing happens. This page is the single source of truth for which flags combine and which combinations don’t make sense. For the parser names themselves, follow the per-stage links at the bottom.

The choices

1. Who parses — --dyn-chat-processor (a frontend flag; default dynamo):

  • dynamo (default) — Dynamo’s framework-agnostic Rust parser. Works on every backend (vLLM, SGLang, TRT-LLM) and with disaggregated serving.
  • vllm / sglang — delegate parsing to that engine’s own Python parser (“engine fallback”). Use only when Dynamo does not ship a parser for your model.

2. Which parser — the flag name and where it goes depend on choice 1:

| Chat processor | Parser flag(s) and where they go | Parses with | Disaggregated serving | Backends |
|---|---|---|---|---|
| dynamo (default) | --dyn-tool-call-parser <name> and/or --dyn-reasoning-parser <name>, on the worker | Dynamo Rust frontend | Supported | vLLM, SGLang, TRT-LLM |
| vllm | --tool-call-parser <name> and/or --reasoning-parser <name>, on the frontend | vLLM Python | Not yet | vLLM |
| sglang | --tool-call-parser <name> and/or --reasoning-parser <name>, on the frontend | SGLang Python | Not yet | SGLang |

The pairing rule

  • The --dyn-* parser flags pair with the dynamo chat processor and go on the worker: --dyn-tool-call-parser, --dyn-reasoning-parser.
  • The bare --tool-call-parser / --reasoning-parser flags pair with vllm / sglang and go on the frontend.

Tool calling and reasoning are independent — set one, the other, or both — but always from the same family as your chat processor. You never mix the two families.
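
The pairing rule can be checked mechanically before launching anything. The following is a minimal sketch, not part of Dynamo: `check_pairing` and its argument layout are hypothetical, and it only recognizes flags passed as separate tokens (not the `--flag=value` form).

```python
from typing import List

# Flag families taken from the pairing rule above.
DYN_FLAGS = {"--dyn-tool-call-parser", "--dyn-reasoning-parser"}
BARE_FLAGS = {"--tool-call-parser", "--reasoning-parser"}


def check_pairing(chat_processor: str,
                  frontend_args: List[str],
                  worker_args: List[str]) -> List[str]:
    """Return pairing-rule violations; an empty list means the flags are consistent.

    Hypothetical helper for illustration only.
    """
    problems: List[str] = []
    frontend = set(frontend_args)
    worker = set(worker_args)

    if chat_processor == "dynamo":
        # Bare flags drive the engine-fallback path, not the Dynamo-native one.
        for flag in sorted(BARE_FLAGS & frontend):
            problems.append(
                f"{flag} belongs to engine fallback; use --dyn-{flag[2:]} on the worker"
            )
        # The --dyn-* parser flags go on the worker, not the frontend.
        for flag in sorted(DYN_FLAGS & frontend):
            problems.append(f"{flag} goes on the worker, not the frontend")
    elif chat_processor in ("vllm", "sglang"):
        # An engine processor reads its own bare flags on the frontend.
        for flag in sorted(DYN_FLAGS & (frontend | worker)):
            bare = "--" + flag[len("--dyn-"):]
            problems.append(
                f"{flag} only drives Dynamo's native parser; use {bare} on the frontend"
            )
    else:
        problems.append(f"unknown chat processor: {chat_processor!r}")
    return problems


if __name__ == "__main__":
    # Mixed families: vllm chat processor with a --dyn-* flag on the worker.
    for line in check_pairing("vllm",
                              ["--tool-call-parser", "hermes"],
                              ["--dyn-reasoning-parser", "qwen3"]):
        print(line)
```

A check like this fits naturally in a launch script or CI step, so a mixed-family command line is reported before any process starts.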

What does NOT make sense

| Combination | Why it's wrong |
|---|---|
| --dyn-chat-processor dynamo + --tool-call-parser / --reasoning-parser | The bare flags drive the engine-fallback path; the default Dynamo path uses the --dyn- flags. Use --dyn-tool-call-parser / --dyn-reasoning-parser. |
| --dyn-chat-processor vllm/sglang + --dyn-tool-call-parser / --dyn-reasoning-parser | The --dyn- flags only drive Dynamo's native parser; an engine processor reads its own --tool-call-parser / --reasoning-parser. |
| --dyn-chat-processor vllm/sglang + disaggregated serving | Engine-fallback parsing does not support disaggregated serving. Use the default dynamo processor. |
| --dyn-chat-processor vllm/sglang on TRT-LLM | TRT-LLM engine fallback is a work in progress. Use the default dynamo processor. |
| Reusing a parser name across families | The registries differ, e.g. Dynamo deepseek_v3 vs vLLM/SGLang deepseekv3, and Dynamo nemotron3 vs vLLM nemotron_v3. Use the name from the registry that matches your chat processor. |
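
The deployment-level rows of the table above (disaggregated serving, TRT-LLM, and cross-registry names) can be expressed as a small pre-flight check. This is a sketch under stated assumptions: `check_deployment` is a hypothetical helper, the name maps contain only the two pairs quoted in the table (the real registries are larger), and the backend-mismatch check is inferred from the Backends column rather than stated outright.

```python
from typing import List, Optional

# Only the name pairs quoted in the table; not a complete registry.
DYNAMO_TO_ENGINE = {
    "deepseek_v3": {"vllm": "deepseekv3", "sglang": "deepseekv3"},
    "nemotron3": {"vllm": "nemotron_v3"},
}
ENGINE_TO_DYNAMO = {"deepseekv3": "deepseek_v3", "nemotron_v3": "nemotron3"}


def check_deployment(chat_processor: str,
                     backend: str,
                     disaggregated: bool = False,
                     parser_name: Optional[str] = None) -> List[str]:
    """Return a list of problems for a planned deployment (hypothetical helper)."""
    problems: List[str] = []
    fallback = chat_processor in ("vllm", "sglang")

    if fallback and disaggregated:
        problems.append("engine fallback does not support disaggregated serving")
    if fallback and backend == "trtllm":
        problems.append("engine fallback on TRT-LLM is a work in progress")
    # Assumption: each engine processor is only used with its own backend.
    if fallback and backend in ("vllm", "sglang") and backend != chat_processor:
        problems.append(f"{chat_processor} chat processor expects the {chat_processor} backend")

    if parser_name is not None:
        if fallback and parser_name in DYNAMO_TO_ENGINE:
            msg = f"{parser_name!r} is a Dynamo registry name"
            hint = DYNAMO_TO_ENGINE[parser_name].get(chat_processor)
            if hint:
                msg += f"; the {chat_processor} registry uses {hint!r}"
            problems.append(msg)
        elif chat_processor == "dynamo" and parser_name in ENGINE_TO_DYNAMO:
            problems.append(
                f"{parser_name!r} is an engine registry name; "
                f"Dynamo uses {ENGINE_TO_DYNAMO[parser_name]!r}"
            )
    return problems


if __name__ == "__main__":
    for line in check_deployment("vllm", "vllm", disaggregated=True,
                                 parser_name="deepseek_v3"):
        print(line)
```

In every failing case except a cross-registry name, the table's own remedy applies: fall back to the default dynamo processor.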

Examples

Default (Dynamo-native) — the common case. The same --dyn-* flags work on every backend; pick one worker. The chat processor defaults to dynamo, so the frontend flag is optional:

# Frontend: the chat processor defaults to `dynamo`, so these two are identical:
python -m dynamo.frontend
python -m dynamo.frontend --dyn-chat-processor dynamo

# Worker selects the Dynamo parsers (same flags on vLLM, SGLang, or TRT-LLM):
python -m dynamo.vllm --model Qwen/Qwen3-0.6B \
  --dyn-tool-call-parser hermes --dyn-reasoning-parser qwen3
python -m dynamo.sglang --model Qwen/Qwen3-0.6B \
  --dyn-tool-call-parser hermes --dyn-reasoning-parser qwen3
python -m dynamo.trtllm --model-path Qwen/Qwen3-0.6B --served-model-name Qwen/Qwen3-0.6B \
  --dyn-tool-call-parser hermes --dyn-reasoning-parser qwen3

Engine fallback — only when Dynamo lacks a parser for your model. Supported on vLLM and SGLang (not TRT-LLM); the parser flags go on the frontend and use the engine’s own parser names:

# vLLM chat processor: the frontend carries the parser flags; then launch the worker.
python -m dynamo.frontend --dyn-chat-processor vllm --tool-call-parser hermes --reasoning-parser qwen3
python -m dynamo.vllm --model Qwen/Qwen3-0.6B

# SGLang chat processor
python -m dynamo.frontend --dyn-chat-processor sglang --tool-call-parser qwen25 --reasoning-parser qwen3
python -m dynamo.sglang --model Qwen/Qwen3-0.6B
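
For launch scripts, both command lines above can be derived from one description of the deployment. The sketch below is illustrative only: `build_commands` is a hypothetical helper, and it uses just the module names and flags that appear in the examples on this page.

```python
import shlex
from typing import List, Optional, Tuple

# Worker modules as they appear in the examples above.
WORKER_MODULES = {"vllm": "dynamo.vllm", "sglang": "dynamo.sglang", "trtllm": "dynamo.trtllm"}


def build_commands(backend: str,
                   model: str,
                   chat_processor: str = "dynamo",
                   tool_parser: Optional[str] = None,
                   reasoning_parser: Optional[str] = None) -> Tuple[List[str], List[str]]:
    """Return (frontend_argv, worker_argv) with parser flags placed per the pairing rule."""
    frontend = ["python", "-m", "dynamo.frontend"]
    worker = ["python", "-m", WORKER_MODULES[backend]]

    # The TRT-LLM example uses --model-path / --served-model-name instead of --model.
    if backend == "trtllm":
        worker.extend(["--model-path", model, "--served-model-name", model])
    else:
        worker.extend(["--model", model])

    if chat_processor == "dynamo":
        # Default path: --dyn-* flags on the worker; the frontend flag is optional.
        target, prefix = worker, "--dyn-"
    else:
        # Engine fallback: bare flags on the frontend.
        frontend.extend(["--dyn-chat-processor", chat_processor])
        target, prefix = frontend, "--"

    if tool_parser:
        target.extend([prefix + "tool-call-parser", tool_parser])
    if reasoning_parser:
        target.extend([prefix + "reasoning-parser", reasoning_parser])
    return frontend, worker


if __name__ == "__main__":
    fe, wk = build_commands("vllm", "Qwen/Qwen3-0.6B",
                            tool_parser="hermes", reasoning_parser="qwen3")
    print(shlex.join(fe))
    print(shlex.join(wk))
```

Returning argv lists rather than a shell string keeps the result usable with `subprocess` without extra quoting; `shlex.join` is used only for display.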

Parser names and per-stage details

  • Tool calling: Tool Call Parsing (Dynamo) (native parser names).
  • Reasoning: Reasoning Parsing (Dynamo) (native parser names).
  • Engine fallback (vLLM / SGLang): Parser Engine Fallback.
  • Engine processors: vLLM Chat Processor and SGLang Chat Processor.
  • Every frontend flag: Frontend Configuration Reference.