NemoClaw supports multiple inference providers. During onboarding, the wizard presents a numbered list of providers to choose from. Your selection determines where the agent’s inference traffic is routed.
For OpenClaw onboarding, use nemoclaw onboard.
The provider flow is the same, with the NVIDIA Endpoints route available for OpenClaw Agent.
The agent inside the sandbox talks to inference.local.
It never connects to a provider directly.
OpenShell intercepts inference traffic on the host and forwards it to the provider you selected.
Provider credentials stay on the host.
The sandbox does not receive your API key.
Local Ollama and local vLLM do not require your host OPENAI_API_KEY.
NemoClaw uses provider-specific local tokens for those routes, and rebuilds of legacy local-inference sandboxes migrate away from stale OpenAI credential requirements.
The onboard wizard presents the following provider options by default. The first six are always available. Ollama appears when it is installed or running on the host. Local vLLM appears when NemoClaw detects a running vLLM server. The managed install/start vLLM entry appears by default on DGX Spark and DGX Station, and appears on generic Linux NVIDIA GPU hosts after opt-in.
NVIDIA Nemotron models expose OpenAI-compatible APIs across every supported deployment surface, so two onboarding options can route to Nemotron.
For Option 3, the API key environment variable is COMPATIBLE_API_KEY. Set it to whatever credential your endpoint expects, or any non-empty placeholder if your endpoint does not require auth.
The Model Router option uses the routed inference profile in nemoclaw-blueprint/blueprint.yaml.
When you select it, NemoClaw starts the router proxy on the host, waits for its health endpoint, registers the nvidia-router provider with OpenShell, and creates the sandbox with the same inference.local route the agent uses for other providers.
The sandbox does not call the router port directly.
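The "waits for its health endpoint" step can be pictured as a bounded polling loop. This is a minimal sketch, not NemoClaw's implementation: `wait_until_healthy`, the timeout, and the polling interval are hypothetical, and the actual health check (URL, port, expected response) is passed in as a callable because the source does not specify it.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout_s: float = 30.0,   # assumption: not a documented NemoClaw value
    interval_s: float = 0.5,   # assumption: not a documented NemoClaw value
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it returns True or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        try:
            if check():
                return True
        except OSError:
            # A proxy that is still starting typically refuses connections;
            # treat that as "not healthy yet" and keep polling.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Only after a call like this returns `True` would the later steps (registering the provider and creating the sandbox) make sense to run.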
The router model pool lives in nemoclaw-blueprint/router/pool-config.yaml.
Edit that file to define which models the router can choose from.
The default pool routes between NVIDIA-hosted Nemotron models and uses the tolerance value to choose the lowest-cost model whose predicted quality stays within the configured threshold.
The tolerance parameter controls the accuracy-cost tradeoff.
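To make the tradeoff concrete, here is a minimal sketch of tolerance-based selection. It assumes tolerance is an absolute allowance below the best predicted quality in the pool; the router's real scoring and the schema of pool-config.yaml are not described here, so `Candidate`, `pick_model`, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str                 # hypothetical model identifier
    cost: float               # relative cost per request (made-up units)
    predicted_quality: float  # predicted quality score, higher is better


def pick_model(pool: list[Candidate], tolerance: float) -> Candidate:
    """Return the cheapest candidate within `tolerance` of the best quality."""
    if not pool:
        raise ValueError("model pool is empty")
    best = max(c.predicted_quality for c in pool)
    # Assumption: "within the configured threshold" means best - tolerance.
    eligible = [c for c in pool if c.predicted_quality >= best - tolerance]
    # Cheapest first; ties go to the higher predicted quality.
    return min(eligible, key=lambda c: (c.cost, -c.predicted_quality))


pool = [
    Candidate("model-large", cost=10.0, predicted_quality=0.90),
    Candidate("model-medium", cost=4.0, predicted_quality=0.85),
    Candidate("model-small", cost=1.0, predicted_quality=0.70),
]
```

Under these assumptions, a tolerance of 0 always picks the highest-quality model, and raising the tolerance lets cheaper models qualify.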
The router runs on the host, not inside the sandbox.
Credentials flow through the OpenShell provider system. The sandbox never sees raw API keys.
To use the router in scripted setup, set:
The Model Router runs in a host-side virtual environment that NemoClaw creates during onboarding.
NemoClaw probes python3.13, python3.12, python3.11, python3.10, and bare python3, and adopts the first interpreter that satisfies both of:
- Its version is in the range [3.10, 3.14).
- ensurepip, pyexpat, ssl, and venv all import without error.

If no candidate qualifies, onboarding aborts and prints the real failure for each candidate.
This surfaces issues like Homebrew python@3.14 whose pyexpat extension fails to dlopen against the older system libexpat on macOS.
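The two checks can be approximated from the outside with a short script, which is handy for diagnosing a host before onboarding. This is a sketch under stated assumptions, not NemoClaw's probe: `probe`, `version_in_range`, and the 30-second timeout are hypothetical, and only the version range and module list come from the text above.

```python
import subprocess

MIN_VERSION = (3, 10)            # inclusive lower bound
MAX_VERSION_EXCLUSIVE = (3, 14)  # exclusive upper bound
REQUIRED_MODULES = ("ensurepip", "pyexpat", "ssl", "venv")

# Runs inside the candidate interpreter: import the modules, print X.Y.
_PROBE_SNIPPET = (
    "import sys\n"
    "import " + ", ".join(REQUIRED_MODULES) + "\n"
    "print('%d.%d' % sys.version_info[:2])\n"
)


def version_in_range(version: tuple[int, int]) -> bool:
    """True when version is in [3.10, 3.14)."""
    return MIN_VERSION <= version < MAX_VERSION_EXCLUSIVE


def probe(python: str) -> tuple[bool, str]:
    """Return (ok, detail) for one candidate interpreter."""
    try:
        result = subprocess.run(
            [python, "-c", _PROBE_SNIPPET],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"{python}: {exc}"
    if result.returncode != 0:
        # Keep the last stderr line, usually the ImportError itself.
        lines = result.stderr.strip().splitlines()
        return False, f"{python}: {lines[-1] if lines else 'import probe failed'}"
    major, minor = (int(part) for part in result.stdout.strip().split("."))
    if not version_in_range((major, minor)):
        return False, f"{python}: version {major}.{minor} is outside [3.10, 3.14)"
    return True, f"{python}: {major}.{minor}"
```

Calling `probe("python3.12")` returns a boolean plus a one-line reason, which mirrors the idea of reporting the real failure for each candidate.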
To pin a specific interpreter, set NEMOCLAW_MODEL_ROUTER_PYTHON to its absolute path before running nemoclaw onboard:
The pin is strict.
NemoClaw probes only that interpreter and aborts with the failure reason if it does not qualify, rather than silently falling back to a different python on PATH.
Relative command names such as python3.12 are rejected; use command -v python3.12 to find the absolute path.
If python -m venv itself fails for a probe-clean interpreter (for example, a corrupt ensurepip seed), NemoClaw retries with the next healthy candidate when no pin is set; with a pin set, the failure stops onboarding so you can fix or repoint the pinned python.
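The difference between pinned and unpinned selection can be summarized in a few lines. The sketch below is illustrative only: `select_interpreter` and its error messages are hypothetical, the `probe` argument stands in for whatever check decides that an interpreter qualifies, and the venv-creation retry described above is left out.

```python
import os
from typing import Callable, Mapping

# Probe order taken from the text above.
DEFAULT_CANDIDATES = ("python3.13", "python3.12", "python3.11", "python3.10", "python3")

Probe = Callable[[str], tuple[bool, str]]  # returns (ok, detail)


def select_interpreter(
    env: Mapping[str, str],
    probe: Probe,
    candidates: tuple[str, ...] = DEFAULT_CANDIDATES,
) -> str:
    """Pick an interpreter, honoring a strict pin when one is set."""
    pin = env.get("NEMOCLAW_MODEL_ROUTER_PYTHON")
    if pin:
        if not os.path.isabs(pin):
            raise RuntimeError(f"pin must be an absolute path, got {pin!r}")
        ok, detail = probe(pin)
        if not ok:
            # Strict: report the failure, never fall back to PATH.
            raise RuntimeError(f"pinned interpreter rejected: {detail}")
        return pin
    failures = []
    for candidate in candidates:
        ok, detail = probe(candidate)
        if ok:
            return candidate
        failures.append(detail)
    raise RuntimeError("no usable python found:\n" + "\n".join(failures))
```

With no pin, the first qualifying candidate wins. With a pin, exactly one interpreter is ever probed.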
The following local inference options come with caveats.
Local NIM and generic Linux managed vLLM install/start require NEMOCLAW_EXPERIMENTAL=1; DGX Spark and DGX Station managed vLLM entries appear by default.
An already-running vLLM server appears directly in the onboarding selection list.
For setup instructions, refer to Use a Local Inference Server.
NemoClaw validates the selected provider and model before creating the sandbox.
If credential validation fails, the wizard asks whether to re-enter the API key, choose a different provider, retry, or exit.
Transient upstream validation failures are retried before the wizard reports a provider failure.
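Retrying transient failures before surfacing an error is a common pattern, and a minimal version looks like the sketch below. The attempt count, the backoff schedule, and the `TransientValidationError` type are assumptions for illustration; the source does not state how NemoClaw classifies or times its retries.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientValidationError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx)."""


def validate_with_retry(
    validate: Callable[[], T],
    attempts: int = 3,          # assumption: not a documented value
    base_delay_s: float = 1.0,  # assumption: not a documented value
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `validate`, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return validate()
        except TransientValidationError:
            if attempt == attempts:
                raise  # out of retries: report the provider failure
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Non-transient errors, such as a rejected credential, propagate immediately, so the caller can offer choices such as re-entering the key.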
The nvapi- prefix check applies only to NVIDIA_API_KEY.
Other provider credentials, such as OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, and compatible endpoint keys, use provider-aware validation during retry.
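The scoping of the prefix check can be illustrated with a small local format check. This is a hedged sketch: `check_key_format` is a hypothetical helper, the empty-value rule is an assumption, and the real provider-aware validation (which contacts the provider) is not modeled.

```python
from typing import Optional


def check_key_format(env_var: str, value: str) -> Optional[str]:
    """Return an error message for an obviously malformed key, else None."""
    if not value.strip():
        # Assumption: an empty credential is never acceptable.
        return f"{env_var} is empty"
    # The nvapi- prefix rule applies to NVIDIA_API_KEY only.
    if env_var == "NVIDIA_API_KEY" and not value.startswith("nvapi-"):
        return "NVIDIA_API_KEY should start with 'nvapi-'"
    return None
```

For example, `check_key_format("OPENAI_API_KEY", "sk-example")` returns `None`, because no NVIDIA-style prefix is expected for other providers.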