Switch Inference Providers at Runtime#

Change the active inference provider while the sandbox is running. No restart is required.

Prerequisites#

  • A running NemoClaw sandbox.

  • The OpenShell CLI on your PATH.

Switch to NVIDIA Cloud#

Set the provider to nvidia-nim and specify a model from build.nvidia.com:

$ openshell inference set --provider nvidia-nim --model nvidia/nemotron-3-super-120b-a12b

This profile requires the NVIDIA_API_KEY environment variable. The nemoclaw setup command stores this key in ~/.nemoclaw/credentials.json on first run.
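If you have not run setup yet, you can export the key in the current shell before switching; the value shown is a placeholder for your own key:

$ export NVIDIA_API_KEY=<your-nvidia-api-key>
$ openshell inference set --provider nvidia-nim --model nvidia/nemotron-3-super-120b-a12b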

Switch to Local vLLM#

Set the provider to vllm-local and specify a model served by vLLM on the host:

$ openshell inference set --provider vllm-local --model nvidia/nemotron-3-nano-30b-a3b

The vLLM server must be running before you switch. Bind the server to 0.0.0.0 and make sure the host firewall allows the bridge subnet to reach port 8000. Refer to Set Up Local vLLM for setup instructions.
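As an illustration, a vLLM OpenAI-compatible server bound to all interfaces on port 8000 might be started as follows; exact flags depend on your vLLM version, so treat this as a sketch rather than the full setup procedure:

$ vllm serve nvidia/nemotron-3-nano-30b-a3b --host 0.0.0.0 --port 8000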

Switch to Local NIM#

Set the provider to nim-local and specify a model served by a NIM container on your network:

$ openshell inference set --provider nim-local --model nvidia/nemotron-3-super-120b-a12b

This profile requires the NIM_API_KEY environment variable. Refer to Set Up a Local NIM Service for setup instructions.
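Before switching, you can export the key and confirm that the NIM endpoint is reachable; the host and port below are examples and should match your deployment:

$ export NIM_API_KEY=<your-nim-api-key>
$ curl -H "Authorization: Bearer $NIM_API_KEY" http://<nim-host>:8000/v1/models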

Verify the Active Provider#

Run the status command to confirm the change:

$ openshell nemoclaw status

Add the --json flag for machine-readable output:

$ openshell nemoclaw status --json

The output includes the active provider, model, and endpoint.
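For scripting, you can pipe the JSON output to a tool such as jq to extract individual fields; the field names shown here are illustrative, so check the actual output of your installation:

$ openshell nemoclaw status --json | jq -r '.provider, .model, .endpoint'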

Available Models#

The following table lists the models registered with the nvidia-nim provider. You can switch to any of these models at runtime.

| Model ID | Label | Context Window (tokens) | Max Output (tokens) |
|---|---|---|---|
| nvidia/nemotron-3-super-120b-a12b | Nemotron 3 Super 120B | 131,072 | 8,192 |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | Nemotron Ultra 253B | 131,072 | 4,096 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | Nemotron Super 49B v1.5 | 131,072 | 4,096 |
| nvidia/nemotron-3-nano-30b-a3b | Nemotron 3 Nano 30B | 131,072 | 4,096 |