Tool-Calling Reliability for Local Inference
Tool-Calling Reliability for Local Inference
Tool-Calling Reliability for Local Inference
Local inference is useful for privacy, cost control, and offline development, but
tool-calling agents place stricter demands on the model server than simple chat.
The model server must return structured tool_calls, not a JSON-looking string
inside normal assistant text.
Use this page when the TUI shows raw JSON such as:
If that appears as text in the assistant reply, OpenClaw cannot dispatch the tool because the inference response did not include a structured tool call.
Ollama can work well for lightweight local chat and some simple tool surfaces.
For OpenClaw-style agent loops with multiple tools, long instructions, or
multi-turn dispatch, use a server that exposes OpenAI-compatible
/v1/chat/completions with a tool-call parser. vLLM is the common local choice.
The common failure mode is:
tool_calls field.This is different from a network or policy block. nemoclaw <name> status,
nemoclaw <name> logs, and nemoclaw debug --quick can all look healthy while
tool dispatch still fails inside the conversation.
For persistent NemoClaw use, start vLLM with auto tool choice and the parser that matches your model family, then rerun onboarding and select Local vLLM [experimental] or Other OpenAI-compatible endpoint.
For Hermes 3 style models, a known-good vLLM command shape is:
For a Docker Compose setup:
Then onboard against that endpoint:
If the endpoint does not require authentication, set COMPATIBLE_API_KEY to any
non-empty placeholder, such as dummy.
NemoClaw-managed sandboxes normally block direct openclaw config set writes
inside the sandbox because those edits do not survive rebuilds. Prefer rerunning
nemoclaw onboard for a persistent provider change.
If you are intentionally testing a mutable OpenClaw config, prepare a batch file like this:
Apply it only in environments where OpenClaw config writes are allowed:
After testing, persist the working provider through nemoclaw onboard so the
sandbox image, OpenShell inference route, and host-managed credentials stay in
sync.
After switching to vLLM, ask for an action that should use a tool. Good signs:
nemoclaw <name> status reports the local vLLM or compatible endpoint as the
active provider.If JSON still appears as text, confirm that vLLM was started with both
--enable-auto-tool-choice and the correct --tool-call-parser value for your
model.