For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
Digest
  • Getting Started
    • Quickstart
    • Introduction
    • Local Installation
    • Building from Source
    • Kubernetes Deployment
    • Contribution Guide
  • Resources
    • Support Matrix
    • Feature Matrix
    • Release Artifacts
    • Examples
    • Glossary
  • Digest
    • NVIDIA Dynamo Snapshot: Fast Startup for Inference Workloads on Kubernetes
    • DynoSim: Simulating the Pareto Frontier
    • Dynamo Day 0 support for TokenSpeed
    • Multi-Turn Agentic Harnesses
    • Full-Stack Optimizations for Agentic Inference
    • Flash Indexer: Inter-Galactic KV Routing
  • Kubernetes Deployment
  • Feature Guides
    • KV Cache Aware Routing
    • Disaggregated Serving
    • KV Cache Offloading
    • Benchmarking
    • Tool Calling & Reasoning Parsing
      • Tool Call Parsing (Dynamo)
      • Reasoning Parsing (Dynamo)
      • Parser Engine Fallback
      • Parser Configuration
      • Tool Calling Probe Snapshot for Dynamo 1.2
      • Troubleshooting Tool Calls
    • Fault Tolerance
    • Observability (Local)
    • Inference Simulation
    • Agents
    • LoRA Adapters
    • Multimodal
    • Diffusion
    • Fastokens Tokenizer
  • Backends
    • SGLang
    • TensorRT-LLM
    • vLLM
  • Components
    • Frontend
    • Router
    • Planner
    • Profiler
    • KVBM
  • Integrations
  • Design Docs
    • Overall Architecture
    • Architecture Flow
    • Disaggregated Serving
    • Distributed Runtime
  • Documentation
    • Dynamo Docs Guide
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogoDocumentation
Digest
Feature GuidesTool Calling & Reasoning Parsing

Tool Calling Probe Snapshot for Dynamo 1.2

Static release snapshot of tool-calling probe results across supported model families

||View as Markdown|
Previous

Parser Configuration

Next

Troubleshooting Tool Calls

This page captures a one-time Dynamo 1.2.0 release snapshot from the tool-calling probe harness generated on 2026-06-05 at 07:24 UTC. It is not a live dashboard.

Failures are non-passing probe requests, and lower is better. The same scenario can contribute separate failures for streaming and non-streaming request modes. Dynamo errors counts Dynamo/parser/API-contract failures, including boundary cases. It also counts Dynamo runtime or endpoint/deployment failures where the request timed out before a usable OpenAI response was returned. Other errors counts engine/model behavior and mixed/needs inspection failures. Issue notes use the probe classifier:

  • Dynamo/parser likely: raw model-native tool-call syntax leaked into the OpenAI response instead of structured tool_calls, final assistant text was routed into reasoning output, delimiter-like literal text was not preserved in a structured argument, or the parser/API contract was otherwise not satisfied.
  • Engine/model behavior likely: the endpoint returned a response, but the model behavior did not satisfy the requested tool workflow.
  • Endpoint/deployment: the request timed out before a usable response. These are counted as Dynamo runtime failures in this static release table.
  • Mixed/needs inspection: raw request/response details need follow-up before assigning ownership.

Some current-main rows were run with a different number of probes than the Dynamo 1.2.0 snapshot. Compare each failures / total count directly instead of treating every row as an exact A/B pass-rate comparison.

The release-note cells below are based on the failed request and response artifacts for both Dynamo 1.2.0 and current main.

With this classification, Dynamo runtime/parser/API failures improve on Kimi K2.6, GLM 5.1, and Qwen3.6-35B-A3B. MiniMax 2.7 improves in total failures, but its remaining parser-boundary failure count is unchanged.

ModelTool-call formatDynamo 1.2.0 releaseCurrent mainRelease notes
TotalDynamo errorsOther errorsTotalDynamo errorsOther errorsCurrent failuresImprovement from 1.2 to main
Kimi K2.6Kimi tool-call and reasoning format22 / 362112 / 3602Current main only fails a multi-step search-and-crawl workflow in streaming and non-streaming modes. The model returns no structured tool calls and asks for endpoint clarification instead of executing the workflow. No raw marker leakage was observed in current main.Dynamo 1.2.0 had 18 parser/API-boundary failures and three endpoint timeouts. Model-native tool-call syntax appeared in reasoning instead of structured tool_calls, and some final assistant text was routed away from assistant content. Current main removes those Dynamo failures and leaves two model-workflow failures.
DeepSeek V4 ProDeepSeek tool-call and reasoning format0 / 46000 / 4600No failures in the captured current-main run.No change needed. Dynamo 1.2.0 and current main are both clean.
GLM 5.1GLM tool-call format4 / 48403 / 4830Current main still fails delimiter-literal preservation in streaming and non-streaming modes because delimiter-looking text is not preserved in the structured argument. One non-streaming no-tools request also timed out.Current main improves from 4 to 3 Dynamo/runtime failures by removing a Dynamo 1.2.0 timeout in the multi-step search-and-crawl workflow. The delimiter-string preservation issue remains.
MiniMax 2.7MiniMax tool-call format8 / 46264 / 4622Current main has four failures. A simple arithmetic auto-tool prompt answers in text instead of producing the requested structured tool call in streaming and non-streaming modes. A delimiter-like literal string prompt returns a structured tool call in both modes, but the marker-looking text inside the argument is not preserved exactly; this is counted as a parser/API-boundary failure.Current main now uses the full 46-probe coverage and improves from 8 failures to 4. The multi-step tool-loop workflow and context echo auto-tool prompt that failed in Dynamo 1.2.0 now pass. Dynamo/parser-boundary failures remain at 2, while other failures drop from 6 to 2.
Gemma 4 31B ITGemma tool-call and reasoning format2 / 48202 / 4620Current main still fails delimiter-literal preservation in streaming and non-streaming modes. The response produces a structured tool call, but the SQL string is truncated before the expected literal marker text.No observed failure-count improvement. Dynamo 1.2.0 and current main have the same failure class, with fewer probes in the current-main run.
Qwen3.6-35B-A3BQwen tool-call format1 / 48100 / 4600No failures in the captured current-main run.Current main is clean. The Dynamo 1.2.0 non-streaming timeout in the multi-step search-and-crawl workflow is gone.
GPT-OSS 120BGPT-OSS tool-call format14 / 4821214 / 48212Current main still has 14 failures. Multi-tool and parallel-tool prompts produce only one structured tool call, a simple calculation prompt answers in text instead of calling the tool, a marker-literal string argument omits the requested marker-like text, and the search/crawl final answer still misses the expected evidence. No raw model-native marker leakage was observed.The refreshed GPT-OSS current-main run is no longer worse than Dynamo 1.2.0 by count; both are 14 / 48. The prior main-only required-tool regression is gone, and the streaming multi-step workflow now returns final content instead of an empty assistant message, but the core multi-tool, parallel-tool, literal-marker, and final-answer gaps remain.