Claude Code

Use this guide to observe Claude Code sessions with NeMo Relay. Claude Code is the supported integration target. The Claude application, Claude web, and Claude desktop sessions are unsupported unless they expose the same local hook and gateway controls as Claude Code.

Transparent Run

Use the wrapper for no-install local observability:

$ nemo-relay claude

Pass Claude Code arguments after --:

$ nemo-relay claude -- "summarize this repository"

This shortcut is equivalent to nemo-relay run -- claude. The wrapper starts a gateway on a dynamic 127.0.0.1 port, creates a temporary Claude plugin directory with NeMo Relay hooks, and passes that plugin with --plugin-dir. Because Claude Code gives its first --settings source precedence over the process environment, Relay also creates a private settings overlay that preserves that source and overrides only ANTHROPIC_BASE_URL for the launched process. The source settings and installed plugin enablement remain unchanged.

If a Relay plugin is already enabled, its MCP process authenticates and borrows the dynamic gateway instead of launching the fixed sidecar, then monitors that exact gateway while MCP stdio remains open. Its persistent hooks exit without forwarding. Only the wrapper-owned temporary hooks deliver lifecycle payloads, and those hooks authenticate the wrapper gateway before sending a payload, so installed and source-marketplace plugin IDs cannot duplicate the captured stream.

Inspect what would be launched without starting Claude Code:

$ nemo-relay run \
>   --dry-run \
>   --print \
>   -- claude

Persistent Plugin Install

Use persistent plugin installation when Claude Code should load NeMo Relay from its normal plugin marketplace instead of through nemo-relay claude:

$ nemo-relay install claude-code

The installer creates a local marketplace, installs nemo-relay-plugin@nemo-relay-local at user scope, and enables Claude Code provider routing through the local NeMo Relay sidecar. It uses the existing nemo-relay binary on PATH; it does not install a plugin-local Relay binary.

The plugin starts nemo-relay mcp, a lightweight Rust lifecycle client that starts or reuses the shared gateway on 127.0.0.1:47632 immediately when the MCP process launches. The client verifies the gateway identity and effective persistent configuration, heartbeats it while MCP stdio remains open, and performs one coordinated restart if the gateway becomes unhealthy. Claude Code, Codex, and Hermes MCP clients share a compatible gateway, and the gateway exits after the final client’s idle timeout. The MCP server advertises no tools.

The generated entry sets alwaysLoad: true, so Claude Code 2.1.121 or newer waits for the MCP connection before session startup. The installed command hook carries the same install-generation fence, waits for the MCP-owned gateway, and forwards the canonical payload once. MCP owns the session-long gateway lifecycle.

Persistent plugin mode loads system and user Relay configuration only and uses the user configuration directory as its working directory. Use the transparent wrapper for project-specific .nemo-relay configuration. Run nemo-relay install claude-code --force to replace an existing generation-fenced installation safely.

Check or remove the installed plugin with:

$ nemo-relay doctor --plugin claude-code
$ nemo-relay uninstall claude-code

Refer to Coding Agent Installation for install directories, rollback behavior, and source marketplace notes.

Shared Config

Create .nemo-relay/config.toml for project defaults or ~/.config/nemo-relay/config.toml for user defaults:

[upstream]
anthropic_base_url = "https://api.anthropic.com"

[agents.claude]
command = "claude"

To forward Claude Code requests to a custom provider host, configure Relay’s endpoint and authentication as described in Provider Upstreams.

Then configure observability with nemo-relay plugins edit --project or .nemo-relay/plugins.toml:

version = 1

[[components]]
kind = "observability"
enabled = true

[components.config.atif]
enabled = true
output_directory = ".nemo-relay/atif"

[components.config.openinference]
enabled = true
endpoint = "http://127.0.0.1:4318/v1/traces"

Run nemo-relay run --agent claude to use the configured command and plugin config. User config takes priority over project and system config.

Standalone Gateway

Use the long-running gateway only when you want Claude Code running outside the wrapper (e.g., already configured by an IDE):

$ nemo-relay --bind 127.0.0.1:4040

Launch Claude Code from another terminal with the gateway environment:

$ export ANTHROPIC_BASE_URL=http://127.0.0.1:4040
$ claude

The gateway forwards Anthropic /v1/messages, /v1/messages/count_tokens, and model routes without rewriting provider JSON. Hook events require either the persistent NeMo Relay plugin or a transparent nemo-relay claude or nemo-relay run --agent claude invocation, which injects ephemeral hooks into the launched process.
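When Claude Code is launched from a script rather than an interactive shell, the same routing can be applied to the child process only. The following is a minimal sketch under stated assumptions: `gateway_environment` is a hypothetical helper, not part of NeMo Relay, and the bind address must match the one passed to --bind.

```python
import os
from typing import Mapping, Optional


def gateway_environment(bind_address: str,
                        base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with ANTHROPIC_BASE_URL set.

    bind_address is the host:port given to the standalone gateway,
    for example "127.0.0.1:4040". The calling process environment
    is not modified. (Hypothetical helper for illustration.)
    """
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = f"http://{bind_address}"
    return env
```

For example, `subprocess.run(["claude"], env=gateway_environment("127.0.0.1:4040"))` would start Claude Code with the variable set for that child process only.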

Captured Events

Generated Claude Code hooks include SessionStart, SessionEnd, UserPromptSubmit, UserPromptExpansion, PreToolUse, PostToolUse, PostToolUseFailure, PermissionRequest, SubagentStart, SubagentStop, Notification, Stop, PreCompact, and PostCompact. Both compaction hooks emit canonical compaction marks. Relay normalizes the other hooks as scope, prompt, tool, notification, subagent, or private LLM correlation events according to the hook payload. UserPromptExpansion records nonempty slash-command expansions as skill.load.inferred; Claude Code does not identify whether an expansion came from a skill or a legacy custom command. The normal Skill pre-tool hook emits an observed skill.load mark.
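When checking a captured session, it can help to compare the hook names that actually arrived against the generated set listed above. This is a minimal sketch, assuming you have already collected the hook_event_name values from a session; `missing_hooks` is a hypothetical helper. A short session will not fire every hook, so a missing name is a prompt to investigate, not proof of a fault.

```python
# Hook names copied from the list of generated Claude Code hooks above.
GENERATED_HOOKS = frozenset({
    "SessionStart", "SessionEnd", "UserPromptSubmit", "UserPromptExpansion",
    "PreToolUse", "PostToolUse", "PostToolUseFailure", "PermissionRequest",
    "SubagentStart", "SubagentStop", "Notification", "Stop",
    "PreCompact", "PostCompact",
})


def missing_hooks(observed_event_names) -> list[str]:
    """Return generated hook names that never appeared, sorted by name.

    observed_event_names is any iterable of hook_event_name strings.
    (Hypothetical helper for illustration.)
    """
    return sorted(GENERATED_HOOKS - set(observed_event_names))
```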

The wrapper requires Claude Code 2.1.121 or newer. Earlier versions do not support every required hook event, and nemo-relay doctor reports the version mismatch.
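The version requirement can also be checked in a script before launching the wrapper. The sketch below is illustrative only: it assumes the version text contains a dotted numeric version such as 2.1.121 (the exact output format of the Claude Code version command is an assumption), and `parse_version` and `meets_minimum` are hypothetical helper names.

```python
import re

# Minimum Claude Code version stated in this guide.
MINIMUM_CLAUDE_CODE = (2, 1, 121)


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first MAJOR.MINOR.PATCH triple found in text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(text: str,
                  minimum: tuple[int, ...] = MINIMUM_CLAUDE_CODE) -> bool:
    """Return True when the parsed version is at least the minimum."""
    return parse_version(text) >= minimum
```

Comparing integer tuples avoids the string-comparison pitfall in which "2.1.99" sorts after "2.1.121".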

Tool hooks preserve canonical fields such as tool_use_id, tool_name, tool_input, error, duration_ms, and is_interrupt. Subagent hooks use agent_id as the subagent identifier and preserve agent_type in metadata.

Claude Code traces are turn-oriented. A multi-turn conversation can produce one root claude-code-turn agent span or trajectory per user turn. That is expected when each span has a real UserPromptSubmit payload and assistant output. NeMo Relay excludes the known Claude Code startup/preflight probe and late uncorrelatable lifecycle hooks from exported user traces so they do not appear as synthetic null, user: test, idle_timeout, or lifecycle-only turns. The gateway records suppressed startup probes as debug-level observability_bypassed events when debug or trace logging is enabled.

Smoke Test

Run a small Claude Code prompt that starts a session and uses one tool. Then check that hook forwarding reaches the gateway:

$ curl -f http://127.0.0.1:4040/healthz
$ printf '{"session_id":"smoke-claude","hook_event_name":"SessionStart"}' \
>   | NEMO_RELAY_GATEWAY_URL=http://127.0.0.1:4040 nemo-relay hook-forward claude --fail-closed

The response should be valid Claude Code hook JSON. For most lifecycle events it is an allow/continue response. A full observability smoke fixture should produce expected OpenInference spans, raw ATOF events, and ATIF trajectories for at least one user turn, one LLM call, and one tool call.
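The smoke-test payload and a first-pass check of the response can be scripted. This is a minimal sketch, not Relay's own validation: `build_hook_payload` and `is_json_object` are hypothetical helpers, and parsing as a JSON object is a weaker property than being valid Claude Code hook JSON.

```python
import json


def build_hook_payload(session_id: str, hook_event_name: str) -> str:
    """Serialize the minimal payload used by the smoke test above."""
    return json.dumps({
        "session_id": session_id,
        "hook_event_name": hook_event_name,
    })


def is_json_object(response_text: str) -> bool:
    """Return True when the text parses as a JSON object.

    This only checks well-formedness; it does not validate the
    Claude Code hook response schema.
    """
    try:
        return isinstance(json.loads(response_text), dict)
    except json.JSONDecodeError:
        return False
```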

Verify Export

End the Claude Code session and confirm that the SessionEnd hook closed the NeMo Relay agent scope and wrote an Agent Trajectory Interchange Format (ATIF) file:

$ ls .nemo-relay/atif

The gateway exports the session’s ATIF file on SessionEnd. Unless you change filename_template, the file is named nemo-relay-atif-<session-id>.json. If no file appears, confirm that SessionEnd hooks fire, plugins.toml enables the ATIF exporter, and the gateway process can write to the configured directory.
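The default naming rule makes the export easy to check from a script. The sketch below assumes the default filename_template and the output_directory from the example plugins.toml; `expected_atif_path` and `atif_exported` are hypothetical helpers.

```python
from pathlib import Path


def expected_atif_path(session_id: str,
                       output_directory: str = ".nemo-relay/atif") -> Path:
    """Build the ATIF path for a session under the default filename_template."""
    return Path(output_directory) / f"nemo-relay-atif-{session_id}.json"


def atif_exported(session_id: str,
                  output_directory: str = ".nemo-relay/atif") -> bool:
    """Return True when the expected file exists and is not empty."""
    path = expected_atif_path(session_id, output_directory)
    return path.is_file() and path.stat().st_size > 0
```

If filename_template has been changed, the path built here will not match and the check reports False even when the export succeeded.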

Troubleshoot LLM Lifecycle

Missing hooks usually mean that Claude Code did not load the local hook config or that the nemo-relay binary is not on PATH.

If hook spans are present but LLM spans are missing, Anthropic traffic is not routed through the gateway. Verify ANTHROPIC_BASE_URL in the Claude Code process environment and confirm that requests hit /v1/messages.

If LLM spans exist but attach to the session instead of a subagent, pass x-nemo-relay-subagent-id on gateway requests or include shared conversation_id, generation_id, or request_id values in both hook payloads and provider requests.

When Relay cannot prove a stronger LLM owner, it keeps the span under the active turn and adds correlation metadata instead of guessing. Inspect llm_correlation_status, llm_correlation_source, and llm_correlation_subagent_id on LLM spans. Common statuses include explicit, single_hint, request_affinity, agent_fallback, and ambiguous_fallback. Tool spans expose the same diagnostic pattern with tool_correlation_status, tool_correlation_source, and tool_correlation_subagent_id.
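A quick tally of the correlation statuses shows how often Relay had to fall back. This is a minimal sketch: it assumes each exported LLM span has been loaded into a flat dict of attributes (the actual export shape may differ), and grouping agent_fallback and ambiguous_fallback as fallbacks is an interpretation of their names, not a documented rule.

```python
from collections import Counter

# Assumption: statuses ending in "_fallback" mean no stronger owner was proven.
FALLBACK_STATUSES = frozenset({"agent_fallback", "ambiguous_fallback"})


def summarize_llm_correlation(spans: list[dict]) -> Counter:
    """Count llm_correlation_status values; spans without one count as 'missing'."""
    return Counter(
        span.get("llm_correlation_status", "missing") for span in spans
    )


def fallback_spans(spans: list[dict]) -> list[dict]:
    """Return the spans whose status is one of the assumed fallback statuses."""
    return [
        span for span in spans
        if span.get("llm_correlation_status") in FALLBACK_STATUSES
    ]
```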

Late SubagentStop hooks can arrive after Claude Code has already closed the turn, especially when Claude reports background stop-hook state. If the subagent id has no matching SubagentStart and there is no active turn, Relay logs the missing subagent and suppresses the hook from exported observability so it does not create a synthetic null turn. If an unknown subagent end arrives while a turn is active, Relay may emit a subagent_end_without_start diagnostic mark under that turn.
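The handling of a SubagentStop hook described above reduces to a small decision. The sketch below restates that prose as code and is not Relay's implementation: `classify_subagent_stop` and the labels close_subagent and suppress are hypothetical, and the diagnostic branch is something Relay may do, not something it is guaranteed to do.

```python
def classify_subagent_stop(agent_id: str,
                           started_ids: set[str],
                           turn_active: bool) -> str:
    """Return a label for how a SubagentStop hook would be treated.

    started_ids holds the agent_id values seen in SubagentStart hooks.
    """
    if agent_id in started_ids:
        # Normal case: a matching SubagentStart was recorded.
        return "close_subagent"
    if turn_active:
        # Unknown subagent while a turn is open: a diagnostic mark
        # may be emitted under that turn.
        return "subagent_end_without_start"
    # Unknown subagent and no active turn: logged and kept out of
    # exported observability so no synthetic null turn is created.
    return "suppress"
```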

Hook Limitations

Claude Code hooks are available only when Claude Code loads the persistent NeMo Relay plugin or the ephemeral plugin generated by nemo-relay claude or nemo-relay run --agent claude. The standalone gateway can still observe Anthropic LLM traffic, but it cannot invent missing tool, prompt, compaction, notification, or subagent hooks.

UserPromptSubmit contributes prompt and LLM correlation data. Stop is a private correlation and turn-boundary hint rather than a standalone user-visible mark. The other generated hooks normalize to their corresponding scope, tool, mark, notification, compaction, or subagent semantics.