Architecture

NemoClaw combines a host CLI, a TypeScript plugin that runs with OpenClaw inside the sandbox, and a versioned YAML blueprint that defines the sandbox image, policies, and inference profiles applied through OpenShell.

System Overview

NVIDIA OpenShell is a general-purpose agent runtime. It provides sandbox containers, a credential-storing gateway, inference proxying, and policy enforcement, but has no opinions about what runs inside. NemoClaw is an opinionated reference stack built on OpenShell that handles what goes in the sandbox and makes the setup accessible.

Deployment Topology

The logical diagram above shows how components relate. This section shows what actually runs where on the host. NemoClaw relies on the host's Docker daemon. The OpenShell gateway runs as a Docker container that embeds a k3s cluster, and the sandbox runs as a Kubernetes pod inside that embedded cluster.

Layering from top to bottom:

| Layer | Runs as | Role |
| --- | --- | --- |
| Host CLI | Host process (nemoclaw on Node.js) | Orchestrates OpenShell via openshell CLI calls. |
| Docker daemon | Host service | Runs the OpenShell gateway container. |
| Gateway container | Docker container | Hosts the credential store, the L7 proxy, and the embedded k3s control plane. |
| k3s | Process tree inside the gateway container | Kubernetes control plane that schedules the sandbox pod. |
| Sandbox pod | Pod in the embedded k3s cluster | Runs the OpenClaw agent and the NemoClaw plugin under Landlock + seccomp + netns. |
| OpenShell L7 proxy | Process in the gateway container | Intercepts agent egress and rewrites Authorization headers (Bearer/Bot) and URL-path segments to inject the real credential at the network boundary. |

NemoClaw never gives the sandbox a raw provider key. At onboard time it registers credentials with OpenShell’s provider/placeholder system, and the L7 proxy substitutes the real value into outbound requests at egress. The CLI helper isInferenceRouteReady (in src/lib/onboard.ts) is not a runtime component. It is a host-side readiness check that the resume flow uses to decide whether the active route already covers the chosen provider and model.
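The placeholder substitution described above can be illustrated with a minimal sketch. The placeholder syntax (`{{provider:<name>}}`), the in-memory store, and the function names are all hypothetical; the source does not document OpenShell's actual placeholder format or proxy internals.

```typescript
// Hypothetical sketch only: the placeholder syntax and store shape are
// assumptions for illustration, not OpenShell's real format.
type CredentialStore = ReadonlyMap<string, string>;

// Assumed placeholder form, e.g. "{{provider:nvidia}}".
const PLACEHOLDER = /\{\{provider:([a-z0-9_-]+)\}\}/g;

// Replace every placeholder in a string with the stored secret.
function substitute(value: string, store: CredentialStore): string {
  return value.replace(PLACEHOLDER, (_match: string, name: string) => {
    const secret = store.get(name);
    if (secret === undefined) {
      throw new Error(`no credential registered for provider "${name}"`);
    }
    return secret;
  });
}

interface OutboundRequest {
  url: string;
  headers: Record<string, string>;
}

// Rewrite the Authorization header (Bearer/Bot schemes) and the URL,
// returning a new request so the sandbox-side copy is never mutated.
function injectCredentials(
  req: OutboundRequest,
  store: CredentialStore,
): OutboundRequest {
  const headers: Record<string, string> = { ...req.headers };
  const auth = headers["authorization"];
  if (auth !== undefined && /^(Bearer|Bot) /.test(auth)) {
    headers["authorization"] = substitute(auth, store);
  }
  return { url: substitute(req.url, store), headers };
}
```

The point of the design is that only the placeholder ever exists inside the sandbox; the real value appears only in the copy of the request that leaves the gateway.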

For the DGX Spark-specific variant of this topology (cgroup v2, aarch64, unified memory), refer to the NVIDIA Spark playbook.

NemoClaw Plugin

The plugin is a thin TypeScript package that registers an inference provider and the /nemoclaw slash command. It runs in-process with the OpenClaw gateway inside the sandbox. It also registers runtime hooks that keep the agent aware of its environment. Before an agent turn starts, the plugin prepends a short context block with the active sandbox name, sandbox phase, network policy summary, and filesystem policy summary. When the policy or phase changes during a session, the plugin sends a smaller update block instead of repeating the full context.
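As a rough illustration of the full-block versus update-block behavior, the sketch below compares the previous and current context and emits only what changed. The type, field names, and block text are hypothetical and are not taken from runtime-context.ts. Returning null when nothing changed is also an assumption.

```typescript
// Hypothetical sketch: field names and block wording are assumptions.
interface SandboxContext {
  sandboxName: string;
  phase: string;
  networkPolicy: string;    // short human-readable summary
  filesystemPolicy: string; // short human-readable summary
}

// Full context block, prepended before the first agent turn.
function fullContextBlock(ctx: SandboxContext): string {
  return [
    "[nemoclaw context]",
    `sandbox: ${ctx.sandboxName}`,
    `phase: ${ctx.phase}`,
    `network policy: ${ctx.networkPolicy}`,
    `filesystem policy: ${ctx.filesystemPolicy}`,
  ].join("\n");
}

// Decide what to prepend to the next turn: the full block on the first turn,
// a smaller update when phase or policy changed, or nothing (assumed).
function nextContextBlock(
  prev: SandboxContext | null,
  curr: SandboxContext,
): string | null {
  if (prev === null) return fullContextBlock(curr);
  const changed: string[] = [];
  if (prev.phase !== curr.phase) changed.push(`phase: ${curr.phase}`);
  if (prev.networkPolicy !== curr.networkPolicy) {
    changed.push(`network policy: ${curr.networkPolicy}`);
  }
  if (prev.filesystemPolicy !== curr.filesystemPolicy) {
    changed.push(`filesystem policy: ${curr.filesystemPolicy}`);
  }
  if (changed.length === 0) return null;
  return ["[nemoclaw context update]", ...changed].join("\n");
}
```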

nemoclaw/
├── src/
│   ├── index.ts                Plugin entry: registers all commands
│   ├── cli.ts                  Commander.js subcommand wiring
│   ├── runtime-context.ts      Sandbox and policy context injection
│   ├── commands/
│   │   ├── launch.ts           Fresh install into OpenShell
│   │   ├── connect.ts          Interactive shell into sandbox
│   │   ├── status.ts           Blueprint run state + sandbox health
│   │   ├── logs.ts             Stream blueprint and sandbox logs
│   │   └── slash.ts            /nemoclaw chat command handler
│   └── blueprint/
│       ├── resolve.ts          Version resolution, cache management
│       ├── fetch.ts            Download blueprint from OCI registry
│       ├── verify.ts           Digest verification, compatibility checks
│       ├── exec.ts             Subprocess execution of blueprint runner
│       └── state.ts            Persistent state (run IDs)
├── openclaw.plugin.json        Plugin manifest
└── package.json                Commands declared under openclaw.extensions

NemoClaw Blueprint

The blueprint is a versioned YAML package with its own release stream. The runner resolves, verifies, and applies the blueprint through the OpenShell CLI. The blueprint defines the sandbox shape, default policies, and inference profiles; the runner performs the OpenShell operations.

nemoclaw-blueprint/
├── blueprint.yaml              Manifest: version, profiles, compatibility
├── model-specific-setup/       Agent-scoped model/provider compatibility manifests
├── router/                     Model Router config and routing engine
├── policies/
│   └── openclaw-sandbox.yaml   Default network + filesystem policy

The blueprint runtime (TypeScript) lives in the plugin source tree:

nemoclaw/src/blueprint/
├── runner.ts                   CLI runner: plan / apply / status / rollback
├── ssrf.ts                     SSRF endpoint validation (IP + DNS checks)
├── snapshot.ts                 Migration snapshot / restore lifecycle
├── state.ts                    Persistent run state management
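To give a sense of what the IP half of an SSRF endpoint check involves, here is a minimal sketch that rejects IPv4 literals in loopback, private, link-local, and similar internal ranges. It is not the contents of ssrf.ts; the function names and the exact set of blocked ranges are assumptions. IPv6 handling and the DNS half (resolving a hostname and checking each resulting address) are omitted.

```typescript
// Hypothetical sketch, not the real ssrf.ts: names and ranges are assumptions.

// Parse a dotted-quad IPv4 literal; return null for anything else.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const n = Number(part);
    if (n > 255) return null;
    octets.push(n);
  }
  return octets;
}

// True when the literal falls in a range that should not be reachable
// from a user-supplied endpoint URL.
function isBlockedIPv4(host: string): boolean {
  const octets = parseIPv4(host);
  if (octets === null) return false; // not an IPv4 literal; needs a DNS check instead
  const [a = 0, b = 0] = octets;
  if (a === 0) return true;                          // 0.0.0.0/8 "this network"
  if (a === 10) return true;                         // 10.0.0.0/8 private
  if (a === 127) return true;                        // 127.0.0.0/8 loopback
  if (a === 100 && b >= 64 && b <= 127) return true; // 100.64.0.0/10 shared address space
  if (a === 169 && b === 254) return true;           // 169.254.0.0/16 link-local (metadata services)
  if (a === 172 && b >= 16 && b <= 31) return true;  // 172.16.0.0/12 private
  if (a === 192 && b === 168) return true;           // 192.168.0.0/16 private
  return false;
}
```

A literal check alone is not sufficient, because a public hostname can resolve to an internal address. That is why the source lists DNS checks alongside IP checks.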

Blueprint Lifecycle

  1. Resolve. The plugin locates the blueprint artifact and checks the version against min_openshell_version and min_openclaw_version constraints in blueprint.yaml.
  2. Verify. The plugin checks the artifact digest against the expected value.
  3. Plan. The runner determines what OpenShell resources to create or update, such as the gateway, providers, sandbox, inference route, and policy.
  4. Apply. The runner executes the plan by calling openshell CLI commands.
  5. Status. The runner reports current state.
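The version check in the Resolve step can be sketched as follows. The two field names come from the text above; the `Installed` type, the function names, and the assumption that versions are plain dotted numbers are hypothetical.

```typescript
// Hypothetical sketch of the Resolve-step compatibility check.
// Assumes dotted numeric versions such as "0.9.2"; pre-release tags are ignored.
function parseVersion(v: string): number[] {
  return v.split(".").map((part) => {
    const n = Number.parseInt(part, 10);
    if (Number.isNaN(n)) throw new Error(`invalid version "${v}"`);
    return n;
  });
}

// True when `actual` is greater than or equal to `minimum`.
function atLeast(actual: string, minimum: string): boolean {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  const len = Math.max(a.length, m.length);
  for (let i = 0; i < len; i++) {
    const x = a[i] ?? 0; // missing components count as 0
    const y = m[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true;
}

// Constraint names as they appear in blueprint.yaml.
interface Compatibility {
  min_openshell_version: string;
  min_openclaw_version: string;
}

// Hypothetical shape for the versions detected on the machine.
interface Installed {
  openshell: string;
  openclaw: string;
}

// Return a list of human-readable problems; empty means compatible.
function checkCompatibility(c: Compatibility, i: Installed): string[] {
  const problems: string[] = [];
  if (!atLeast(i.openshell, c.min_openshell_version)) {
    problems.push(`openshell ${i.openshell} < required ${c.min_openshell_version}`);
  }
  if (!atLeast(i.openclaw, c.min_openclaw_version)) {
    problems.push(`openclaw ${i.openclaw} < required ${c.min_openclaw_version}`);
  }
  return problems;
}
```

Comparing components numerically rather than as strings matters here: a string comparison would rank "0.10.0" below "0.9.2".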

Sandbox Environment

The sandbox runs the ghcr.io/nvidia/openshell-community/sandboxes/openclaw container image. Inside the sandbox:

  • OpenClaw runs with the NemoClaw plugin pre-installed.
  • Inference calls are routed through OpenShell to the configured provider.
  • Network egress is restricted by the baseline policy in openclaw-sandbox.yaml.
  • Filesystem access is confined to /sandbox and /tmp for read-write access, with system paths read-only.
  • The NemoClaw plugin injects sandbox and policy context into agent turns so the agent can report policy blocks accurately.
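The filesystem rule above (read-write under /sandbox and /tmp, read-only elsewhere) can be expressed as a small path check. This is only an illustration of the rule: the names are hypothetical, and the sketch does not resolve symlinks.

```typescript
// Hypothetical sketch of the read-write rule; not how enforcement is implemented.
const READ_WRITE_ROOTS: readonly string[] = ["/sandbox", "/tmp"];

// Collapse ".", "..", and repeated slashes in an absolute POSIX path.
function normalizePath(path: string): string {
  const out: string[] = [];
  for (const seg of path.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

// True when the path is one of the read-write roots or lies beneath one.
function isWritable(path: string): boolean {
  if (!path.startsWith("/")) return false; // assumption: only absolute paths are considered
  const p = normalizePath(path);
  return READ_WRITE_ROOTS.some((root) => p === root || p.startsWith(root + "/"));
}
```

Normalizing before matching prevents a path such as /sandbox/../etc/passwd from passing, and matching on `root + "/"` keeps /tmpfoo from being mistaken for a child of /tmp.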

Inference Routing

Inference requests from the agent never leave the sandbox directly. OpenShell intercepts them and routes them to the configured provider:

Agent (sandbox) ──▶ OpenShell gateway ──▶ NVIDIA Endpoint (build.nvidia.com)

When you select the Model Router provider, the OpenShell gateway routes to a host-side router process instead of a single upstream model. The router selects from the configured pool, then calls the upstream NVIDIA endpoint with the credential held outside the sandbox.

Some model and provider combinations need agent-specific compatibility setup. NemoClaw keeps those declarations under nemoclaw-blueprint/model-specific-setup/<agent>/ so OpenClaw and Hermes fixes can be tested and reviewed independently.

Refer to Inference Options for provider configuration details.

Provider Credential Storage

Provider credentials live in the OpenShell gateway store, not on the host filesystem. NemoClaw never writes them to host disk; the OpenShell L7 proxy injects values at egress. See Credential Storage for the inspection, rotation, and migration flow.

Host-Side State and Config

NemoClaw keeps non-secret operator-facing state on the host rather than inside the sandbox.

| Path | Purpose |
| --- | --- |
| ~/.nemoclaw/sandboxes.json | Registered sandbox metadata, including the default sandbox selection. |
| ~/.openclaw/openclaw.json | Host OpenClaw configuration that NemoClaw snapshots or restores during migration flows. |

The following environment variables configure optional services and local access.

| Variable | Purpose |
| --- | --- |
| TELEGRAM_BOT_TOKEN | Telegram bot token you provide before nemoclaw onboard. OpenShell stores it in a provider; the sandbox receives placeholders, not the raw secret. |
| TELEGRAM_ALLOWED_IDS | Comma-separated Telegram user or chat IDs for allowlists when onboarding applies channel restrictions. |
| CHAT_UI_URL | URL for the optional chat UI endpoint. |
| NEMOCLAW_DISABLE_DEVICE_AUTH | Build-time-only toggle that disables gateway device pairing when set to 1 before the sandbox image is created. |

For normal setup and reconfiguration, prefer nemoclaw onboard over editing these files by hand. Do not treat NEMOCLAW_DISABLE_DEVICE_AUTH as a runtime setting for an already-created sandbox.
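As an illustration of the comma-separated format used by TELEGRAM_ALLOWED_IDS, the sketch below parses and validates such a value. The function name is hypothetical, and NemoClaw's own parsing may be stricter or looser. IDs are kept as strings, and a leading minus sign is accepted because Telegram group chat IDs can be negative.

```typescript
// Hypothetical sketch: how a comma-separated ID allowlist might be parsed.
function parseAllowedIds(raw: string | undefined): string[] {
  if (raw === undefined) return [];
  const ids = raw
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0); // tolerate trailing commas and stray spaces
  for (const id of ids) {
    if (!/^-?\d+$/.test(id)) {
      throw new Error(`invalid Telegram ID "${id}"`);
    }
  }
  return ids;
}
```

In a Node.js process this could be called as `parseAllowedIds(process.env.TELEGRAM_ALLOWED_IDS)`; an unset variable yields an empty list.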