Copyright © 2026, NVIDIA Corporation.

Hermes Agent

Use this guide to observe local Hermes Agent sessions with NeMo Relay through Hermes shell hooks and the nemo-relay gateway.

Hermes shell hooks provide session, subagent, tool, and LLM lifecycle events. For NeMo Relay contract work, pre_api_request, post_api_request, and api_request_error are the authoritative LLM lifecycle hooks. The legacy pre_llm_call and post_llm_call hooks still exist, but only as private hint-style signals. Complete LLM request and response observability still requires model traffic to route through the gateway.

Transparent Run

Use the wrapper when you want the gateway lifetime managed for a local Hermes process:

$ nemo-relay hermes

Pass Hermes arguments after --:

$ nemo-relay hermes -- chat --provider custom

Once a NeMo Relay config exists, this shortcut is equivalent to nemo-relay run --agent hermes. The wrapper starts a gateway on a dynamic 127.0.0.1 port and exports NEMO_RELAY_GATEWAY_URL for the launched process. In this mode the Hermes hook configuration is temporary: the launcher merges the NeMo Relay hook-forward commands into the configured Hermes hook file for the duration of the run and restores the original file afterward. The wrapper also sets HERMES_ACCEPT_HOOKS=1 so Hermes can use the injected hook commands without extra manual approval prompts.
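
As a quick way to confirm that a process was started by the wrapper, the sketch below reports the two environment variables described above. Only the variable names NEMO_RELAY_GATEWAY_URL and HERMES_ACCEPT_HOOKS come from this guide; the function name relay_wrapper_status and its output wording are illustrative assumptions, not part of the NeMo Relay CLI.

```shell
# Report whether the current environment looks like one prepared by
# `nemo-relay hermes`. The function name and messages are hypothetical;
# only the two variable names are taken from the guide.
relay_wrapper_status() {
    if [ -n "${NEMO_RELAY_GATEWAY_URL:-}" ]; then
        echo "gateway: ${NEMO_RELAY_GATEWAY_URL}"
    else
        echo "gateway: unset"
    fi
    if [ "${HERMES_ACCEPT_HOOKS:-}" = "1" ]; then
        echo "hook consent: pre-accepted"
    else
        echo "hook consent: not pre-accepted"
    fi
}
```

Calling relay_wrapper_status from a shell spawned inside the wrapped Hermes process should print the dynamic gateway URL, while a plain terminal would typically report both values as unset.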

If no NeMo Relay config exists yet, nemo-relay hermes triggers the setup flow first and then launches Hermes through the same wrapped pipeline.

Inspect what would be launched without starting Hermes:

$ nemo-relay run \
    --dry-run \
    --print \
    -- hermes

Shared Config

Create .nemo-relay/config.toml for project defaults or ~/.config/nemo-relay/config.toml for user defaults:

[agents.hermes]
command = "hermes"

Then configure observability, either by running nemo-relay plugins edit --project or by editing .nemo-relay/plugins.toml directly:

version = 1

[[components]]
kind = "observability"
enabled = true

[components.config.atif]
enabled = true
output_directory = ".nemo-relay/atif"

[components.config.openinference]
enabled = true
endpoint = "http://127.0.0.1:4318/v1/traces"

Run nemo-relay run --agent hermes to use the configured command and plugin config. User config takes priority over project and system config.
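
To see which config files are in play, the following sketch lists the ones that exist, in the priority order stated above (user before project). The two paths are the ones named in this section. The guide does not give a system-level path, so none is checked here, and the helper name list_relay_configs is an assumption for illustration.

```shell
# Print the NeMo Relay config files that exist, highest priority first.
# Order follows the guide: user config, then project config.
# No system-level path is listed because the guide does not name one.
list_relay_configs() {
    for cfg_path in "${HOME}/.config/nemo-relay/config.toml" ".nemo-relay/config.toml"; do
        [ -f "$cfg_path" ] && echo "$cfg_path"
    done
    return 0
}
```

Run it from the project root. If both files are printed, the ordering above suggests that values in the first one take effect when the same setting appears in both.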

Hermes Hook Setup

Unlike the other agents, Hermes reads hooks from .hermes/config.yaml. The setup wizard writes that file for you when you select hermes: running nemo-relay config (or nemo-relay config hermes to scope to one agent) merges the NeMo Relay hook commands into the YAML, preserves any existing config, and records the path under [agents.hermes].hooks_path in .nemo-relay/config.toml.

The generated Hermes hooks cover on_session_start, on_session_end, on_session_finalize, on_session_reset, pre_llm_call, post_llm_call, pre_api_request, post_api_request, api_request_error, pre_tool_call, post_tool_call, subagent_start, and subagent_stop.

The API hooks are the main Hermes LLM lifecycle path for NeMo Relay. The legacy LLM hooks remain installed because they can still provide useful private hints, but they are not treated as equal peers to the API hooks in the observability contract.
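
The grouping described above can be written down as a small lookup. This is only a sketch for reasoning about captured events: the hook names come from the list in this section, while the function name hermes_hook_role and the role labels are made up for illustration and are not part of the NeMo Relay contract.

```shell
# Map a Hermes hook name to the role this guide assigns to it.
# Role labels are hypothetical; hook names are the ones listed above.
hermes_hook_role() {
    case "${1:-}" in
        pre_api_request|post_api_request|api_request_error)
            echo "llm-authoritative" ;;
        pre_llm_call|post_llm_call)
            echo "llm-legacy-hint" ;;
        on_session_start|on_session_end|on_session_finalize|on_session_reset)
            echo "session" ;;
        pre_tool_call|post_tool_call)
            echo "tool" ;;
        subagent_start|subagent_stop)
            echo "subagent" ;;
        *)
            echo "unknown"
            return 1 ;;
    esac
}
```

For example, hermes_hook_role post_api_request prints llm-authoritative, while hermes_hook_role post_llm_call prints llm-legacy-hint.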

Hermes hook forwarding prefers NEMO_RELAY_GATEWAY_URL when it is set (this is what nemo-relay hermes injects on every run). When Hermes is launched outside the wrapper, for example bare hermes against a long-running gateway, the hook command falls back to --gateway-url http://127.0.0.1:4040.
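
The fallback order can be mirrored in a script with shell parameter expansion. This is a sketch of the documented precedence, not the hook command's actual implementation; the function name resolve_gateway_url is an assumption.

```shell
# Resolve the gateway URL the same way the guide describes:
# NEMO_RELAY_GATEWAY_URL if set, otherwise the standalone default.
# Note: ${VAR:-default} also falls back when the variable is set but empty.
resolve_gateway_url() {
    echo "${NEMO_RELAY_GATEWAY_URL:-http://127.0.0.1:4040}"
}
```

A script that needs to talk to the same gateway as the hooks could then call, for instance, curl -f "$(resolve_gateway_url)/healthz".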

For standalone gateway mode, start the daemon manually:

$ nemo-relay --bind 127.0.0.1:4040

Then point Hermes provider traffic at http://127.0.0.1:4040 for any provider mode that exposes a local OpenAI-compatible or Anthropic-compatible base URL. This is the practical routed-provider validation path today.

Important distinction:

  • Wrapped execution is the authoritative path for Hermes hook-path validation.
  • Wrapped execution does not automatically rewrite Hermes provider base_url, custom_providers, or api_mode.
  • Routed /v1/messages, /v1/chat/completions, or /v1/responses validation therefore requires explicit Hermes provider configuration in addition to the wrapped or standalone gateway.
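
When validating routed traffic, it can help to spell out the full request URL each API style should reach on the gateway. The sketch below assumes the provider base URL is the gateway root, as described above. The three paths come from this section, but the style labels (anthropic-messages, openai-chat, openai-responses) and the function name are hypothetical and are not Hermes api_mode values.

```shell
# Build the gateway URL a routed request should hit for a given API style.
# $1: illustrative style label (not a Hermes setting)
# $2: gateway base URL, defaulting to the standalone address from the guide
gateway_route_url() {
    route_base=${2:-http://127.0.0.1:4040}
    case "${1:-}" in
        anthropic-messages) route_path=/v1/messages ;;
        openai-chat)        route_path=/v1/chat/completions ;;
        openai-responses)   route_path=/v1/responses ;;
        *)
            echo "unknown API style: ${1:-}" >&2
            return 1 ;;
    esac
    # Strip one trailing slash from the base so the path joins cleanly.
    echo "${route_base%/}${route_path}"
}
```

If requests from Hermes never show up at the printed URL, the provider configuration is probably still pointing at the upstream provider rather than the gateway.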

Smoke Test

Run a small Hermes session that starts, invokes one tool, and exits. Then check hook forwarding directly:

$ curl -f http://127.0.0.1:4040/healthz
$ printf '{"session_id":"smoke-hermes","hook_event_name":"on_session_start"}' \
    | NEMO_RELAY_GATEWAY_URL=http://127.0.0.1:4040 nemo-relay hook-forward hermes --fail-closed

The response should be {}. If Hermes prompts for hook consent, approve the NeMo Relay hook command interactively or through Hermes configuration before relying on unattended capture.
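
For repeated smoke tests, the payload and the response check can be wrapped in two small helpers. This is a minimal sketch: the payload shape is copied from the command above, both function names are assumptions, and it further assumes that the {} response is written to standard output.

```shell
# Build a minimal hook payload like the one used in the smoke test.
# Values are inserted verbatim, so they must not contain quotes or backslashes.
hermes_hook_payload() {
    printf '{"session_id":"%s","hook_event_name":"%s"}' "$1" "$2"
}

# Succeed only when stdin is an empty JSON object, ignoring whitespace.
is_empty_json_object() {
    [ "$(tr -d ' \t\r\n')" = "{}" ]
}
```

With a gateway running, the two can be chained around the documented command: hermes_hook_payload smoke-hermes on_session_start | nemo-relay hook-forward hermes --fail-closed | is_empty_json_object.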

Verify Export

End a Hermes turn or finalize the session, then confirm that an Agent Trajectory Interchange Format (ATIF) file exists:

$ ls .nemo-relay/atif

The gateway writes or updates an ATIF snapshot when it receives on_session_end, on_session_finalize, or on_session_reset. on_session_end is a per-turn snapshot boundary: it does not close the NeMo Relay session and does not emit a visible trajectory mark. on_session_finalize and on_session_reset close the session.
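
The snapshot rules above, together with a basic export check, can be sketched as two helpers. The event names and the default directory come from this guide; the function names and effect labels are illustrative, and because the guide does not specify ATIF file naming, the count simply includes every regular file under the directory.

```shell
# Describe what the guide says happens for each session hook.
# Effect labels are hypothetical shorthand.
session_event_effect() {
    case "${1:-}" in
        on_session_end)                        echo "snapshot" ;;
        on_session_finalize|on_session_reset)  echo "snapshot+close" ;;
        *)                                     echo "other" ;;
    esac
}

# Count regular files under the ATIF output directory (0 if it is missing).
atif_file_count() {
    atif_dir=${1:-.nemo-relay/atif}
    [ -d "$atif_dir" ] || { echo 0; return 0; }
    find "$atif_dir" -type f | wc -l | tr -d ' '
}
```

A count of 0 after a completed turn suggests that either the ATIF component is disabled or none of the three session hooks reached the gateway.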

Troubleshoot LLM Lifecycle

If hook events appear but LLM spans are missing, Hermes model traffic is not routed through the gateway. If LLM spans exist but attach to the top-level agent instead of a subagent, include shared identifiers in Hermes hook payloads and gateway requests, such as conversation_id, generation_id, request_id, or x-nemo-relay-subagent-id.
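
When checking span attachment, a quick test is whether a captured hook payload carries any of the shared identifiers named above. The sketch below is a plain text match rather than a JSON parser, and the function name is an assumption. It covers only the three JSON-style keys; x-nemo-relay-subagent-id looks like a request header, so it is assumed not to appear in the payload body and is not checked here.

```shell
# Succeed when the JSON text on stdin contains at least one of the
# shared identifier keys listed in the guide. Text match only.
has_shared_identifier() {
    grep -Eq '"(conversation_id|generation_id|request_id)"[[:space:]]*:'
}
```

For example, printf '{"session_id":"s1","request_id":"r1"}' | has_shared_identifier succeeds, while a payload that carries only session_id fails the check.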