The nemo-relay binary observes coding agents that do not expose every
LLM call site directly. It combines agent-specific hook endpoints with a
passthrough LLM gateway so NeMo Relay owns both the agent lifecycle and the model
request lifecycle.
Use the gateway when you need one observability boundary for OpenAI Codex, Claude Code, Cursor, and Hermes without replacing each agent’s canonical hook payload.
Each hook endpoint accepts the agent’s native hook JSON directly. Do not wrap the payload in a shared gateway envelope.
- POST /hooks/codex accepts Codex hook JSON and returns the Codex-compatible hook response object.
- POST /hooks/claude-code accepts Claude Code hook JSON and returns Claude-compatible fields such as continue and permission decisions when the hook event supports them.
- POST /hooks/cursor accepts Cursor hook JSON and returns Cursor-compatible fields such as continue, permission, user_message, and agent_message when the hook event supports them.
- POST /hooks/hermes accepts Hermes shell hook JSON and returns the empty JSON object expected by Hermes hook commands.

The adapters preserve vendor fields such as session IDs, working directories, transcript paths, model names, tool payloads, shell payloads, MCP payloads, file payloads, user identity, and subagent metadata in NeMo Relay event metadata.
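As a rough illustration of sending a native payload without an envelope, the following Python sketch builds a POST for one of the hook endpoints. The payload fields, the fallback URL with port 8080, and the Content-Type header are assumptions made for the example; real payloads are whatever the agent emits.

```python
import json
import os
import urllib.request


def build_hook_request(agent: str, payload: dict, gateway_url: str) -> urllib.request.Request:
    """Build a POST to /hooks/<agent> that carries the hook JSON unchanged."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{gateway_url.rstrip('/')}/hooks/{agent}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed, not confirmed by the docs
    )


# Hypothetical payload; the field names are placeholders, not a documented schema.
payload = {"hook_event_name": "SessionStart", "session_id": "abc123"}

# NEMO_RELAY_GATEWAY_URL is set by the wrapper; the fallback URL is an assumption.
gateway = os.environ.get("NEMO_RELAY_GATEWAY_URL", "http://127.0.0.1:8080")
request = build_hook_request("claude-code", payload, gateway)

# To actually send it: urllib.request.urlopen(request), then parse the JSON response.
```

The payload is serialized as-is, with no wrapping object around it, which is the point of the native-payload rule above.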
Route all coding-agent LLM traffic through the gateway when full LLM lifecycle observability is required.
- POST /v1/responses
- POST /v1/chat/completions
- POST /v1/messages
- POST /v1/messages/count_tokens
- GET /v1/models

The gateway forwards raw provider JSON without rewriting OpenAI or Anthropic payload schemas. It removes only hop-by-hop transport headers, forwards streaming responses as streams, and emits NeMo Relay LLM start and end events under the active session scope.
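To make the hop-by-hop rule concrete, here is a minimal Python sketch of header filtering for a passthrough proxy. The header set follows the connection-specific headers commonly listed in the HTTP specifications; the exact set the gateway removes is not stated here, so treat the list as an assumption.

```python
# Connection-specific headers that a proxy normally does not forward.
# Assumption: the gateway's real list may differ.
HOP_BY_HOP = frozenset({
    "connection",
    "keep-alive",
    "proxy-authenticate",
    "proxy-authorization",
    "te",
    "trailer",
    "transfer-encoding",
    "upgrade",
})


def strip_hop_by_hop(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of headers without hop-by-hop entries."""
    # Any header named inside the Connection header is hop-by-hop too.
    connection_tokens = set()
    for name, value in headers.items():
        if name.lower() == "connection":
            connection_tokens.update(
                token.strip().lower() for token in value.split(",") if token.strip()
            )
    drop = HOP_BY_HOP | connection_tokens
    return {name: value for name, value in headers.items() if name.lower() not in drop}


incoming = {
    "Authorization": "Bearer x",
    "Connection": "keep-alive, X-Trace",
    "Keep-Alive": "timeout=5",
    "X-Trace": "1",
    "Content-Type": "application/json",
}
forwarded = strip_hop_by_hop(incoming)
```

End-to-end headers such as Authorization and Content-Type pass through untouched, which matches the passthrough behavior described above.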
Use the agent shortcuts for no-install local observability. The wrapper starts
a gateway on a dynamic 127.0.0.1 port, injects the resolved hook and gateway
configuration into the launched coding agent, and stops the gateway when the
agent exits.
Use nemo-relay run -- <command> when you want to launch an explicit command
instead of the built-in shortcut:
If a launcher or wrapper hides the real agent name, set that wrapper as the
configured command and pass --agent. The same pattern applies to Claude Code,
Codex, Cursor, and Hermes:
Hermes is different from the other transparent modes: run --agent hermes
starts the gateway and exports the dynamic NEMO_RELAY_GATEWAY_URL, but Hermes
shell hooks still need to be installed or otherwise approved in Hermes config.
Use --dry-run --print to inspect the generated hook config, gateway
environment, gateway URL, and final command without launching the agent.
Shared TOML config is optional. The gateway loads defaults, then system config, then project config, then user config. User config takes priority over system and project config. CLI flags and environment variables override file config.
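The precedence order above can be sketched as a layered merge in which later layers win. This is an illustration only: the key names are hypothetical, the relative order of CLI flags and environment variables is not specified here (they are treated as one override layer), and merging nested tables key by key is an assumption about how the gateway combines files.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Merge override into base; nested dicts merge key by key, other values replace."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(*layers: dict[str, Any]) -> dict[str, Any]:
    """Apply layers from lowest to highest priority."""
    result: dict[str, Any] = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Hypothetical keys, chosen only to show the ordering.
defaults = {"gateway": {"bind": "127.0.0.1:0", "mode": "passthrough"}}
system = {"gateway": {"bind": "0.0.0.0:9000"}}
project = {"gateway": {"mode": "required"}}
user = {"gateway": {"mode": "hook-only"}}
overrides: dict[str, Any] = {}  # CLI flags and environment variables

resolved = resolve_config(defaults, system, project, user, overrides)
```

In this example the user layer's mode wins over the project layer's, while the system bind address survives because no higher layer sets it.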
Config file locations are:
- /etc/nemo-relay/config.toml
- .nemo-relay/config.toml
- $XDG_CONFIG_HOME/nemo-relay/config.toml
- ~/.config/nemo-relay/config.toml

Example:
Observability exporters are configured in plugins.toml. Use
nemo-relay plugins edit for the user file, nemo-relay plugins edit --project
for .nemo-relay/plugins.toml, or write the plugin config directly:
Transparent runs always bind the managed gateway to 127.0.0.1:0. The selected
port is discovered by the wrapper and exposed to hooks through
NEMO_RELAY_GATEWAY_URL.
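Binding to port 0 asks the operating system for any free port, which the wrapper then has to read back. The Python sketch below shows that general technique with a plain socket. It is not the wrapper's implementation, and the http:// scheme in the exported URL is an assumption.

```python
import socket


def bind_ephemeral(host: str = "127.0.0.1") -> tuple[socket.socket, str]:
    """Bind to an OS-assigned port and return the listener with its URL."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0 lets the OS choose a free port
    listener.listen()
    port = listener.getsockname()[1]  # read back the port that was chosen
    return listener, f"http://{host}:{port}"


listener, gateway_url = bind_ephemeral()

# The discovered URL is what a launched agent would see in its environment.
child_env = {"NEMO_RELAY_GATEWAY_URL": gateway_url}

listener.close()
```

Because the port is chosen at bind time, concurrent runs do not collide on a fixed port.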
Common environment variables for direct gateway server use are:
- NEMO_RELAY_GATEWAY_BIND
- NEMO_RELAY_OPENAI_BASE_URL
- NEMO_RELAY_ANTHROPIC_BASE_URL

Plugin configuration controls process-level Observability exporters. Per-session configuration controls structured metadata on the top-level agent begin event and the plugin configuration metadata associated with the session.
hook-forward can also pass per-session configuration through headers:
- x-nemo-relay-config-profile
- x-nemo-relay-session-metadata
- x-nemo-relay-plugin-config
- x-nemo-relay-gateway-mode

The accepted gateway mode values are hook-only, passthrough, and required. The gateway records this value as session metadata so downstream exporters and review tooling can distinguish hook-only traces from sessions where provider traffic was expected to pass through the gateway.
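A client that sets these headers itself may want to validate the gateway mode before sending. The Python sketch below does that. The function name is hypothetical, and encoding the session metadata as JSON is an assumption, since the value format of that header is not described here.

```python
import json
from typing import Any, Optional

# The three accepted values listed in this section.
GATEWAY_MODES = ("hook-only", "passthrough", "required")


def session_headers(
    mode: str,
    profile: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> dict[str, str]:
    """Build per-session headers, rejecting unknown gateway modes early."""
    if mode not in GATEWAY_MODES:
        raise ValueError(f"unsupported gateway mode: {mode!r}")
    headers = {"x-nemo-relay-gateway-mode": mode}
    if profile is not None:
        headers["x-nemo-relay-config-profile"] = profile
    if metadata is not None:
        # Assumption: metadata travels as a JSON string.
        headers["x-nemo-relay-session-metadata"] = json.dumps(metadata, sort_keys=True)
    return headers


headers = session_headers("required", profile="ci", metadata={"team": "infra"})
```

Validating on the client side surfaces a mistyped mode immediately instead of leaving it to be discovered in exported traces.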
The gateway normalizes vendor hook payloads into private internal events before calling NeMo Relay APIs.
- ScopeType::Agent scope on a dedicated ScopeStackHandle.
- ScopeType::Agent scope. Subagent stop closes that scope when it is still active.

Gateway requests can provide explicit correlation identifiers with these headers:
- x-nemo-relay-session-id
- x-nemo-relay-subagent-id
- x-nemo-relay-conversation-id
- x-nemo-relay-generation-id
- x-nemo-relay-request-id

When those headers are absent, the gateway also looks for
conversation_id/conversationId/conversation.id,
generation_id/generationId/generation.id, and
request_id/requestId/request.id fields in the provider request body.
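The body lookup can be pictured as trying three spellings per identifier. The Python sketch below assumes the body is already parsed JSON and checks the snake_case key, then the camelCase key, then a nested object with an id field. The lookup order and the helper names are assumptions.

```python
from typing import Any, Optional


def find_id(body: dict[str, Any], name: str) -> Optional[str]:
    """Look for <name>_id, <name>Id, then <name>.id in a parsed request body."""
    for key in (f"{name}_id", f"{name}Id"):
        value = body.get(key)
        if isinstance(value, str) and value:
            return value
    nested = body.get(name)
    if isinstance(nested, dict):
        value = nested.get("id")
        if isinstance(value, str) and value:
            return value
    return None


def correlation_hints(body: dict[str, Any]) -> dict[str, Optional[str]]:
    """Collect the three correlation identifiers, using None where one is missing."""
    return {name: find_id(body, name) for name in ("conversation", "generation", "request")}


# Hypothetical request body that mixes two of the accepted spellings.
body = {"model": "example-model", "conversationId": "c-1", "generation": {"id": "g-2"}}
hints = correlation_hints(body)
```

Here the request identifier is absent, so its slot stays None and correlation would have to rely on the other hints or on the fallbacks.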
Correlation hints expire after five minutes. If the gateway cannot select one
unambiguous hint, it falls back to the previous LLM owner, then to the only
active subagent, then to the top-level agent scope.
Every gateway LLM event includes llm_correlation_status metadata. Possible
values are explicit, single_hint, matched_hint, sticky_last_owner,
active_subagent, agent_fallback, and ambiguous_fallback. Matched hints can
also add llm_correlation_source, llm_correlation_subagent_id,
llm_correlation_conversation_id, llm_correlation_generation_id,
llm_correlation_request_id, and llm_correlation_agent_type.
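The fallback order can be sketched as a small selection function. This is a simplified illustration, not the gateway's logic: it covers only four of the status values, the mapping from each branch to a status name is an assumption based on the names alone, and the Hint type and the "agent" owner label are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional

HINT_TTL_SECONDS = 5 * 60  # correlation hints expire after five minutes


@dataclass
class Hint:
    owner: str         # scope the hint points at, for example a subagent ID
    created_at: float  # timestamp on the same clock as `now`


def select_owner(
    hints: list[Hint],
    last_owner: Optional[str],
    active_subagents: list[str],
    now: Optional[float] = None,
) -> tuple[str, str]:
    """Return (owner, status) following the documented fallback order."""
    now = time.monotonic() if now is None else now
    live = {h.owner for h in hints if now - h.created_at < HINT_TTL_SECONDS}
    if len(live) == 1:
        return live.pop(), "single_hint"
    if last_owner is not None:
        return last_owner, "sticky_last_owner"  # previous LLM owner
    if len(active_subagents) == 1:
        return active_subagents[0], "active_subagent"  # the only active subagent
    return "agent", "agent_fallback"  # top-level agent scope


fresh = select_owner([Hint("sub-1", 0.0)], None, [], now=10.0)
expired = select_owner([Hint("sub-1", 0.0)], "sub-2", ["sub-3"], now=301.0)
```

In the second call the hint is older than five minutes, so selection falls through to the previous LLM owner even though a single subagent is active.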
Generated hook bundles subscribe to the events needed for that mapping:
Cursor hook-only mode observes agent, subagent, and tool lifecycle. To observe Cursor LLM lifecycle completely, configure Cursor model traffic to use the gateway.
Hooks generated by the wrapper (ephemeral for Claude, Codex, and Cursor; installed via
setup for Hermes) invoke nemo-relay hook-forward <agent> with the hook payload on
stdin. Inside the wrapper, the gateway URL comes from NEMO_RELAY_GATEWAY_URL,
which is injected on every run; outside the wrapper (Hermes standalone,
IDE-launched Claude or Codex), the hook command falls back to its embedded
--gateway-url.
hook-forward reads the canonical hook payload from standard input, sends it
to the matching endpoint, and prints the endpoint response. It fails open by
default so observability outages do not block the coding agent. Add
--fail-closed only when policy requires hook delivery to block the agent.
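The difference between failing open and failing closed can be sketched in a few lines of Python. The sender is injected so the example runs without a gateway. Returning an empty JSON object with exit status 0 when delivery fails, and status 1 under fail-closed, are assumptions about reasonable behavior rather than the documented output of hook-forward.

```python
import sys
from typing import Callable


def forward_hook(
    raw_payload: str,
    send: Callable[[str], str],
    fail_closed: bool = False,
) -> tuple[str, int]:
    """Forward a hook payload and return (stdout_text, exit_status)."""
    try:
        return send(raw_payload), 0
    except Exception as error:
        print(f"hook delivery failed: {error}", file=sys.stderr)
        if fail_closed:
            return "", 1  # assumed: a nonzero status lets the agent treat the hook as failed
        return "{}", 0  # assumed: a neutral response so the agent continues


def unreachable(_payload: str) -> str:
    """Stand-in sender that simulates a gateway outage."""
    raise ConnectionError("gateway down")


open_result = forward_hook('{"event": "example"}', unreachable)
closed_result = forward_hook('{"event": "example"}', unreachable, fail_closed=True)
```

With the default, an outage costs only the missing telemetry. With fail_closed set, the same outage becomes visible to the agent as a failed hook.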
Optional flags map to gateway headers:
- --session-metadata sets x-nemo-relay-session-metadata.
- --plugin-config sets x-nemo-relay-plugin-config.
- --profile sets x-nemo-relay-config-profile.
- --gateway-mode sets x-nemo-relay-gateway-mode.

Use the per-agent guide for end-to-end setup, smoke tests, and GUI or application-mode caveats.
Each guide covers transparent run setup, gateway routing, hook smoke tests, Agent Trajectory Interchange Format (ATIF) export verification on session end, and troubleshooting missing LLM lifecycle data.