Tool-Calling Reliability for Local Inference
Tool-Calling Reliability for Local Inference
Tool-Calling Reliability for Local Inference
Local inference is useful for privacy, cost control, and offline development, but tool-calling agents place stricter demands on the model server than simple chat.
The model server must return structured tool_calls, not a JSON-looking string inside normal assistant text.
Use this page when the TUI shows raw JSON such as:
If that appears as text in the assistant reply, OpenClaw cannot dispatch the tool because the inference response did not include a structured tool call.
Ollama can work well for lightweight local chat and some simple tool surfaces.
For OpenClaw-style agent loops with multiple tools, long instructions, or multi-turn dispatch, use a server that exposes OpenAI-compatible /v1/chat/completions with a tool-call parser.
vLLM is the common local choice.
The common failure mode is:
tool_calls field.This is different from a network or policy block.
nemoclaw <name> status, nemoclaw <name> logs, and nemoclaw debug --quick can all look healthy while tool dispatch still fails inside the conversation.
For persistent NemoClaw use, start vLLM with auto tool choice and the parser that matches your model family, then rerun onboarding and select Local vLLM [experimental] or Other OpenAI-compatible endpoint.
For Hermes 3 style models, a known-good vLLM command shape is:
For a Docker Compose setup:
Then onboard against that endpoint:
If the endpoint does not require authentication, set COMPATIBLE_API_KEY to any non-empty placeholder, such as dummy.
NemoClaw-managed sandboxes normally block direct openclaw config set writes inside the sandbox because those edits do not survive rebuilds.
Prefer rerunning nemoclaw onboard for a persistent provider change.
If you are intentionally testing a mutable OpenClaw config, prepare a batch file like this:
Apply it only in environments where OpenClaw allows config writes:
After testing, persist the working provider through nemoclaw onboard so the sandbox image, OpenShell inference route, and host-managed credentials stay in sync.
After switching to vLLM, ask for an action that should use a tool. Good signs:
nemoclaw <name> status reports the local vLLM or compatible endpoint as the active provider.If JSON still appears as text, confirm that you started vLLM with both --enable-auto-tool-choice and the correct --tool-call-parser value for your model.