Switch Inference Providers at Runtime#
Change the active inference provider while the sandbox is running. No restart is required.
Prerequisites#
A running NemoClaw sandbox.
The OpenShell CLI on your PATH.
Switch to NVIDIA Cloud#
Set the provider to nvidia-nim and specify a model from build.nvidia.com:
$ openshell inference set --provider nvidia-nim --model nvidia/nemotron-3-super-120b-a12b
This profile requires the NVIDIA_API_KEY environment variable.
The nemoclaw setup command stores this key in ~/.nemoclaw/credentials.json on first run.
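If you have not run setup yet, you can also export the key for the current shell session before switching (the value below is a placeholder; use your own key from build.nvidia.com):

```shell
# Make the API key available to the CLI for this session.
# "nvapi-..." is a placeholder, not a real key.
export NVIDIA_API_KEY="nvapi-..."
```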
Switch to Local vLLM#
Set the provider to vllm-local and specify a model served by vLLM on the host:
$ openshell inference set --provider vllm-local --model nvidia/nemotron-3-nano-30b-a3b
The vLLM server must be running before you switch.
Bind the server to 0.0.0.0 and make sure the host firewall allows the bridge subnet to reach port 8000.
Refer to Set Up Local vLLM for setup instructions.
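As a sketch, a launch command that satisfies the binding requirement above might look like the following. It uses the standard `vllm serve` flags and the model name from the command above; adjust both for your deployment:

```shell
# Serve the model on all interfaces so the sandbox's bridge network
# can reach port 8000 on the host.
vllm serve nvidia/nemotron-3-nano-30b-a3b \
  --host 0.0.0.0 \
  --port 8000
```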
Switch to Local NIM#
Set the provider to nim-local and specify a model served by a NIM container on your network:
$ openshell inference set --provider nim-local --model nvidia/nemotron-3-super-120b-a12b
This profile requires the NIM_API_KEY environment variable.
Refer to Set Up a Local NIM Service for setup instructions.
Verify the Active Provider#
Run the status command to confirm the change:
$ openshell nemoclaw status
Add the --json flag for machine-readable output:
$ openshell nemoclaw status --json
The output includes the active provider, model, and endpoint.
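For scripting, the JSON output can be filtered with jq. A minimal sketch using a sample payload, since the exact field names and values in your output may vary; in practice you would pipe the status command's output directly into jq:

```shell
# Sample payload standing in for the status command's --json output.
STATUS='{"provider":"vllm-local","model":"nvidia/nemotron-3-nano-30b-a3b","endpoint":"http://localhost:8000/v1"}'
# Extract the active provider (requires jq).
echo "$STATUS" | jq -r '.provider'
# → vllm-local
```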
Available Models#
The following table lists the models registered with the nvidia-nim provider.
You can switch to any of these models at runtime.
| Model ID | Label | Context Window (tokens) | Max Output (tokens) |
|---|---|---|---|
| nvidia/nemotron-3-super-120b-a12b | Nemotron 3 Super 120B | 131,072 | 8,192 |
| | Nemotron Ultra 253B | 131,072 | 4,096 |
| | Nemotron Super 49B v1.5 | 131,072 | 4,096 |
| nvidia/nemotron-3-nano-30b-a3b | Nemotron 3 Nano 30B | 131,072 | 4,096 |