About | NVIDIA NeMo Relay

Use this section when you own an application, agent harness, or workflow and can route tool and LLM calls through NeMo Relay directly.

Direct instrumentation puts NeMo Relay at the boundaries where work happens. Scopes define request and agent ownership, managed execution helpers wrap tool and LLM calls, middleware applies policy and transformation, and subscribers receive lifecycle events. This path gives the runtime a complete view of agent work while keeping the application callback result unchanged.

Start Here When

Use this guide when you need to:

Trace nested agent work across tools and model calls
Redact or normalize event payloads before export
Block unsafe or invalid calls before execution
Wrap calls with timing, routing, retries, or fallback behavior
Isolate request-specific middleware and subscribers

If the tool or LLM boundary is owned by a framework, use Integrate into Frameworks instead.

Guides

These guides show how to instrument applications with scopes, tool calls, LLM calls, middleware, and direct API examples.

Adding Scopes and Marks shows how to create ownership boundaries and checkpoint events before adding call instrumentation.
Instrument a Tool Call shows the smallest managed tool wrapper with event validation.
Instrument an LLM Call shows the smallest managed model-provider wrapper with event validation.
Add Middleware shows how to add guardrails, request intercepts, execution intercepts, and scope-local behavior.
Code Examples collects direct API examples for tools, LLMs, streaming calls, scopes, and partial middleware helpers.

Start with scopes and marks, then instrument the call boundaries your application owns. Add one middleware behavior at a time after the tool or LLM wrapper is emitting the expected lifecycle events.

For production usage, keep tool names stable, keep payloads JSON-compatible, use sanitize guardrails for sensitive fields, and prefer scope-local middleware when behavior should apply to one request, tenant, or experiment.