Copyright © 2026, NVIDIA Corporation.

Add Middleware

Use this guide when instrumentation is working and you want NeMo Relay to enforce policy, transform requests, wrap execution, or sanitize observability payloads around tool and LLM calls.

What You Build

You will add middleware to an instrumented application and verify that it runs in the expected part of the pipeline:

  • Request intercepts transform the real request before execution.
  • Sanitize guardrails transform only the payload recorded on events.
  • Conditional-execution guardrails can block execution.
  • Execution intercepts wrap the callback and can add timing, retries, routing, or fallback behavior.

Before You Start

Complete Instrument a Tool Call or Instrument an LLM Call first. Middleware runs only when the call goes through a lifecycle API managed by NeMo Relay.

Choose the Middleware Type

Use this table to match the behavior you need with the correct middleware family.

| Need | Middleware Type | Changes Real Execution |
| --- | --- | --- |
| Redact event payloads | Sanitize-request or sanitize-response guardrail | No |
| Normalize tool arguments or model requests | Request intercept | Yes |
| Block unsafe or invalid work | Conditional-execution guardrail | Yes, by rejecting |
| Add timing, retries, routing, or fallback | Execution intercept | Yes |
| Wrap streaming model output | LLM stream execution intercept | Yes |

Use the narrowest middleware type that matches the behavior. For example, do not use a request intercept when you only need to hide a secret from exported events.
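
The difference between the two can be sketched without the library. The following is a minimal illustration, not NeMo Relay's implementation: `redact_for_events` follows the `(tool_name, args)` callback shape that this guide uses for sanitize guardrails, while `normalize_query` and its signature are hypothetical stand-ins for a request intercept.

```python
def redact_for_events(tool_name, args):
    """Sanitize-style callback: returns a copy meant for event payloads only."""
    safe_args = dict(args)
    if "api_key" in safe_args:
        safe_args["api_key"] = "<redacted>"
    return safe_args


def normalize_query(tool_name, args):
    """Request-intercept-style callback (hypothetical signature):
    returns the arguments the tool should actually receive."""
    normalized = dict(args)
    normalized["query"] = normalized.get("query", "").strip().lower()
    return normalized


original = {"query": "  NeMo Relay  ", "api_key": "secret-123"}

# What an exported event would carry: the secret is hidden.
event_payload = redact_for_events("search", original)

# What the tool would receive: the query is cleaned, the secret is intact.
real_request = normalize_query("search", original)
```

If redaction were done in the request intercept instead, the tool itself would receive `<redacted>` as its key, which is why the narrower sanitize guardrail is the better fit for hiding secrets from events.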

Add a Tool Policy

This example adds three behaviors around a search tool:

  • Redact api_key from emitted request events.
  • Reject empty queries before execution.
  • Measure execution duration.
```python
import time

import nemo_relay


def redact_api_key(tool_name, args):
    # Sanitize guardrail: return a redacted copy for emitted events.
    safe_args = dict(args)
    if "api_key" in safe_args:
        safe_args["api_key"] = "<redacted>"
    return safe_args


def require_query(tool_name, args):
    # Conditional-execution guardrail: return a message to reject, or None to allow.
    if not args.get("query"):
        return "search.query is required"
    return None


async def measure_tool(tool_name, args, next_call):
    # Execution intercept: time the wrapped call, even if it raises.
    started = time.perf_counter()
    try:
        return await next_call(args)
    finally:
        elapsed_ms = round((time.perf_counter() - started) * 1000, 2)
        print(f"{tool_name} completed in {elapsed_ms} ms")


# Register each behavior with a unique name, a priority, and the callback.
nemo_relay.guardrails.register_tool_sanitize_request("search.redact_api_key", 10, redact_api_key)
nemo_relay.guardrails.register_tool_conditional_execution("search.require_query", 20, require_query)
nemo_relay.intercepts.register_tool_execution("search.measure", 30, measure_tool)
```

Scope Middleware to One Request

Use scope-local middleware when a policy applies only to one request, tenant, experiment, or agent run.

  1. Create or receive the active scope handle.
  2. Register middleware with the scope-local helper for that handle.
  3. Execute tools or LLM calls inside that scope.
  4. Let the scope end. Ending the scope removes its scope-local registrations automatically.

Use global middleware for process-wide behavior, such as organization-wide redaction. Use scope-local middleware for request-specific policy, such as tenant routing or an A/B test.
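
The lifetime difference between global and scope-local registrations can be modeled with a small toy. This is only a sketch of the idea: `ToyRegistry` and all of its methods are hypothetical and are not part of the NeMo Relay API, which exposes its own scope-local helpers.

```python
from contextlib import contextmanager


class ToyRegistry:
    """Hypothetical stand-in for a middleware registry; not the NeMo Relay API."""

    def __init__(self):
        self.global_entries = {}  # name -> callback, lives for the whole process
        self.scope_entries = {}   # scope id -> {name -> callback}

    def register_global(self, name, callback):
        self.global_entries[name] = callback

    def register_scope_local(self, scope_id, name, callback):
        self.scope_entries.setdefault(scope_id, {})[name] = callback

    def active(self, scope_id):
        """Names visible inside a scope: global entries plus that scope's own."""
        names = set(self.global_entries)
        names.update(self.scope_entries.get(scope_id, {}))
        return sorted(names)

    @contextmanager
    def scope(self, scope_id):
        try:
            yield scope_id
        finally:
            # Ending the scope drops its registrations; global ones remain.
            self.scope_entries.pop(scope_id, None)


registry = ToyRegistry()
registry.register_global("org.redact_api_key", lambda tool_name, args: args)

with registry.scope("request-1") as handle:
    registry.register_scope_local(handle, "tenant.route", lambda tool_name, args: args)
    inside = registry.active(handle)

after = registry.active("request-1")
```

Inside the scope both names are active. After the scope ends, only the global redaction remains, so the tenant routing policy cannot leak into unrelated requests.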

Middleware Registration Families

NeMo Relay exposes the same core middleware families for tools and LLMs:

| Family | Tool Registration | LLM Registration | Changes Real Execution |
| --- | --- | --- | --- |
| Sanitize request | register_tool_sanitize_request | register_llm_sanitize_request | No |
| Sanitize response | register_tool_sanitize_response | register_llm_sanitize_response | No |
| Conditional execution | register_tool_conditional_execution | register_llm_conditional_execution | Yes, by rejecting |
| Request intercept | register_tool_request | register_llm_request | Yes |
| Execution intercept | register_tool_execution | register_llm_execution | Yes |
| Stream execution intercept | Not applicable | register_llm_stream_execution | Yes |

Sanitize guardrails affect only the payload recorded on emitted events. Request intercepts affect the real request that reaches the tool or provider. Execution intercepts wrap the callback itself and are only available when the invocation uses managed execution.
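
Because an execution intercept wraps the callback, it can call it more than once. The sketch below shows a retry wrapper that reuses the `(tool_name, args, next_call)` shape from the timing example in this guide. The `make_retry_intercept` factory, the `flaky_search` callback, and the choice to retry only on `ConnectionError` are assumptions made for illustration, and the wrapper is invoked directly here rather than registered with the library.

```python
import asyncio


def make_retry_intercept(max_attempts=3):
    """Build an execution-intercept-style wrapper that retries failed calls."""

    async def retry_tool(tool_name, args, next_call):
        last_error = None
        for _ in range(max_attempts):
            try:
                return await next_call(args)
            except ConnectionError as error:  # retry only transient failures
                last_error = error
        raise last_error

    return retry_tool


calls = {"count": 0}


async def flaky_search(args):
    """Hypothetical tool callback that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"results": [args["query"]]}


retry_tool = make_retry_intercept()
result = asyncio.run(retry_tool("search", {"query": "relay"}, flaky_search))
```

The caller sees a single successful result even though the callback ran three times, which is the kind of behavior only an execution intercept can provide.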

Scope-local variants are available through the Python nemo_relay.scope_local.register_* functions, the Node.js scopeRegister* helpers, and the Rust scope_register_* functions.

Validate the Middleware

Run one allowed request and one rejected request:

  • The allowed request should return the same business result as before.
  • The rejected request should fail before the tool callback executes.
  • Subscriber output should show redacted api_key values.
  • The timing intercept should print once for each executed tool call.
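
The first two checks can be rehearsed in isolation before wiring up the library. This is a minimal sketch under stated assumptions: `run_with_guardrail` is a hypothetical stand-in for a managed tool call, and raising `PermissionError` on rejection is an illustrative choice, not necessarily how NeMo Relay surfaces a rejected call.

```python
import asyncio


def require_query(tool_name, args):
    if not args.get("query"):
        return "search.query is required"
    return None


executed = []  # records every query that reaches the tool callback


async def search_tool(args):
    executed.append(args["query"])
    return {"results": [f"hit for {args['query']}"]}


async def run_with_guardrail(tool_name, args, callback):
    """Toy stand-in for a managed tool call; not the NeMo Relay API."""
    rejection = require_query(tool_name, args)
    if rejection is not None:
        raise PermissionError(rejection)
    return await callback(args)


async def validate():
    allowed = await run_with_guardrail("search", {"query": "relay"}, search_tool)
    try:
        await run_with_guardrail("search", {"query": ""}, search_tool)
        rejected_message = None
    except PermissionError as error:
        rejected_message = str(error)
    return allowed, rejected_message


allowed, rejected_message = asyncio.run(validate())
```

The `executed` list is the important check: it should contain only the allowed query, which confirms that the rejected request never reached the tool callback.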

Debug Middleware Order

Within each middleware family, middleware runs in ascending priority order. For managed tool calls, the families and lifecycle event emission run in this order:

  1. Conditional-execution guardrails.
  2. Request intercepts.
  3. Sanitize-request guardrails and start-event emission.
  4. Execution intercepts and the real callback or replacement.
  5. Sanitize-response guardrails and end-event emission.

If a later middleware does not run, check whether an earlier conditional-execution guardrail rejected the call or a request intercept raised an error.
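
When debugging, it can help to reason about the order with a toy model. The following sketch imitates the five documented steps in plain Python. It is not the NeMo Relay implementation: the `(priority, name, callback)` tuples, the single-argument callbacks, and the choice to make the lowest-priority execution intercept the outermost wrapper are all assumptions of this example.

```python
def by_priority(entries):
    """Sort (priority, name, callback) entries so lower priorities run first."""
    return sorted(entries, key=lambda entry: entry[0])


def wrap_execution(name, intercept, next_call, trace):
    """Wrap next_call with one execution intercept and record that it ran."""
    def wrapped(args):
        trace.append(name)
        return intercept(args, next_call)
    return wrapped


def run_managed_tool_call(args, middleware, callback):
    """Toy model of the documented order; not the NeMo Relay implementation."""
    trace, events = [], {}

    # 1. Conditional-execution guardrails can reject before anything else runs.
    for _, name, check in by_priority(middleware["conditional_execution"]):
        trace.append(name)
        reason = check(args)
        if reason is not None:
            return trace, f"rejected: {reason}", events

    # 2. Request intercepts change the real request.
    for _, name, intercept in by_priority(middleware["request"]):
        trace.append(name)
        args = intercept(args)

    # 3. Sanitize-request guardrails shape only the start-event payload.
    events["start"] = args
    for _, name, sanitize in by_priority(middleware["sanitize_request"]):
        trace.append(name)
        events["start"] = sanitize(events["start"])
    trace.append("start-event")

    # 4. Execution intercepts wrap the callback (lowest priority outermost,
    #    which is an assumption of this sketch).
    call = callback
    for _, name, intercept in reversed(by_priority(middleware["execution"])):
        call = wrap_execution(name, intercept, call, trace)
    result = call(args)

    # 5. Sanitize-response guardrails shape only the end-event payload.
    events["end"] = result
    for _, name, sanitize in by_priority(middleware["sanitize_response"]):
        trace.append(name)
        events["end"] = sanitize(events["end"])
    trace.append("end-event")
    return trace, result, events


middleware = {
    "conditional_execution": [
        (20, "require_query", lambda a: None if a.get("query") else "query is required"),
    ],
    "request": [(10, "strip_query", lambda a: {**a, "query": a["query"].strip()})],
    "sanitize_request": [(10, "redact_api_key", lambda a: {**a, "api_key": "<redacted>"})],
    "execution": [(30, "measure", lambda a, next_call: next_call(a))],
    "sanitize_response": [],
}

trace, result, events = run_managed_tool_call(
    {"query": " relay ", "api_key": "k"}, middleware, lambda a: a["query"].upper()
)
blocked_trace, blocked, _ = run_managed_tool_call(
    {"query": ""}, middleware, lambda a: "unreachable"
)
```

In the blocked run the trace stops after `require_query`, which mirrors the symptom described above: nothing later in the pipeline runs once a conditional-execution guardrail rejects the call.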

Common Issues

Check these symptoms first when the workflow does not behave as expected.

  • Sanitized data reaches the real tool: Use a sanitize guardrail only for event payloads. Use a request intercept when the real request should change.
  • Middleware affects unrelated requests: Register it scope-locally instead of globally.
  • Duplicate names replace behavior: Middleware names are registry keys. Use stable, unique names for each behavior.
  • Execution intercept never prints: Confirm that the application uses the managed execute helper and that no guardrail rejected the request.
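
The duplicate-name symptom follows from names acting as registry keys. A toy dictionary-backed registry makes the effect visible. The `register` function below is hypothetical and only imitates the `(name, priority, callback)` argument order used in this guide.

```python
registry = {}


def register(name, priority, callback):
    """Toy registration keyed by name: reusing a name overwrites the earlier entry."""
    registry[name] = (priority, callback)


register("search.policy", 10, lambda tool_name, args: "first")
register("search.policy", 20, lambda tool_name, args: "second")  # replaces "first"
register("search.redact_api_key", 10, lambda tool_name, args: "redact")

# Call every surviving callback to see which behaviors are still registered.
active = {name: callback("search", {}) for name, (_, callback) in registry.items()}
```

Only two entries survive, and `search.policy` now maps to the second callback. Prefixing names with the tool and the behavior, as in `search.redact_api_key`, keeps them unique.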

Next Steps

Use these links to continue from this workflow into the next related task.

  • Use Middleware to review execution order.
  • Use Code Examples for direct registration and partial-execution examples.
  • Use Handle Non-Serializable Data if middleware needs to work with framework objects.