Code Examples

This page collects concrete examples for the surrounding guide area.

Dynamic Header Injection

Use an LLM request intercept when a plugin needs to inject tenant or routing metadata into every provider request.

LLM request intercepts receive three arguments: name, request, and annotated. The request object is immutable, but an intercept can return a new request instance that carries its edits. Intercepts written in Rust are the exception.

from typing import Any

import nemo_relay

class HeaderPlugin:
    def validate(self, plugin_config: dict[str, Any]) -> list[dict[str, str]]:
        if "header_name" not in plugin_config or "value" not in plugin_config:
            return [{
                "level": "error",
                "code": "header-plugin.invalid_config",
                "message": "header_name and value are required",
            }]
        return []

    def register(self, plugin_config: dict[str, Any], context: nemo_relay.plugin.PluginContext):
        def add_header(
            name: str,
            request: nemo_relay.LLMRequest,
            annotated: nemo_relay.AnnotatedLLMRequest | None
        ) -> tuple[nemo_relay.LLMRequest, nemo_relay.AnnotatedLLMRequest | None]:
            # The request object is immutable, but we can return a new instance with updated headers.
            headers = request.headers.copy()
            headers[plugin_config["header_name"]] = plugin_config["value"]
            return nemo_relay.LLMRequest(headers=headers, content=request.content), annotated

        context.register_llm_request_intercept("inject-header", 100, False, add_header)

nemo_relay.plugin.register("header-plugin", HeaderPlugin())

This pattern is useful for:

  • Tenant identity
  • Trace correlation
  • Region or deployment routing
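
The copy-and-return step can be tried without the runtime installed. The sketch below is illustrative only: `FakeRequest` is a hypothetical frozen dataclass standing in for `nemo_relay.LLMRequest`, and only its `headers` and `content` fields are taken from the example above. The header name and value are made up for the demonstration.

```python
from dataclasses import dataclass, field, replace
from typing import Any


@dataclass(frozen=True)
class FakeRequest:
    """Hypothetical stand-in for nemo_relay.LLMRequest (not the real class)."""
    headers: dict = field(default_factory=dict)
    content: Any = None


def make_add_header(header_name: str, value: str):
    """Build an intercept-shaped callable that injects one header."""
    def add_header(name, request, annotated):
        # Copy the headers so the original request is left untouched.
        headers = dict(request.headers)
        headers[header_name] = value
        # Return a new request instance instead of mutating the old one.
        return replace(request, headers=headers), annotated
    return add_header


original = FakeRequest(headers={"accept": "application/json"}, content="hello")
intercept = make_add_header("x-tenant-id", "tenant-42")
updated, annotated = intercept("inject-header", original, None)
```

After the call, `updated` carries the extra header while `original` still holds only the headers it started with, which is the property an immutable request is meant to guarantee.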

OpenInference Export

Use a subscriber-oriented plugin when the component should watch the full lifecycle rather than rewrite requests.

import nemo_relay

class OpenInferencePlugin:
    def validate(self, plugin_config):
        if "endpoint" not in plugin_config:
            return [{
                "level": "error",
                "code": "openinference-export.invalid_config",
                "message": "endpoint is required",
            }]
        return []

    def register(self, plugin_config, context):
        endpoint = plugin_config["endpoint"]

        def on_event(event):
            print("export", endpoint, event.kind, event.name)

        context.register_subscriber("openinference-export", on_event)

nemo_relay.plugin.register("openinference-export", OpenInferencePlugin())

This is the right pattern when the component:

  • Exports traces or metrics
  • Aggregates events across tools and LLMs
  • Should not change execution behavior
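
As a rough illustration of the aggregation case, a subscriber callback can tally events instead of printing them. This is a sketch under assumptions: `FakeEvent` is a hypothetical stand-in that exposes only the `kind` and `name` attributes used in the example above, and the kind values `"llm"` and `"tool"` are invented for the demonstration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FakeEvent:
    """Hypothetical stand-in for a runtime event; only kind and name are used."""
    kind: str
    name: str


class EventCounter:
    """Subscriber-shaped callable that tallies events and changes nothing."""

    def __init__(self) -> None:
        self.counts = Counter()

    def __call__(self, event) -> None:
        # Read-only: record the event, return nothing, never modify it.
        self.counts[(event.kind, event.name)] += 1


counter = EventCounter()
for event in [
    FakeEvent("llm", "chat"),
    FakeEvent("tool", "search"),
    FakeEvent("llm", "chat"),
]:
    counter(event)
```

An instance such as `counter` could presumably be passed wherever `on_event` is passed in the plugin above, since it is callable with a single event argument.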

Multi-Surface Policy Bundle

A plugin can register more than one runtime surface when one configuration document controls a related behavior bundle.

For example, a policy bundle can install:

  • A telemetry subscriber
  • LLM request intercepts for request metadata
  • Tool guardrails for policy enforcement
  • Sanitize guardrails for exported payloads
  • Shared component-local state used by those hooks

Use this pattern when the configured behavior is easier to reason about as one component than as several unrelated plugin components. Keep each registered surface small and make the component config explicit about which surfaces are enabled.
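
One possible shape for such a bundle is sketched below. It is not a reference implementation: `RecordingContext` is a hypothetical test double, the `surfaces` config key and the `policy-*` names are assumptions, and the `100, False` arguments are simply copied from the header example without interpreting them. Guardrail registration is omitted because this page does not show those calls.

```python
from typing import Any


class RecordingContext:
    """Hypothetical test double that records what a plugin registers."""

    def __init__(self) -> None:
        self.registered = []

    def register_subscriber(self, name, callback) -> None:
        self.registered.append(("subscriber", name))

    def register_llm_request_intercept(self, name, *args) -> None:
        # *args mirrors the positional call shape used in the header example.
        self.registered.append(("llm_request_intercept", name))


class PolicyBundlePlugin:
    def validate(self, plugin_config: dict[str, Any]) -> list[dict[str, str]]:
        # "surfaces" is an assumed config key, not a documented one.
        if not isinstance(plugin_config.get("surfaces"), dict):
            return [{
                "level": "error",
                "code": "policy-bundle.invalid_config",
                "message": "surfaces must map surface names to booleans",
            }]
        return []

    def register(self, plugin_config: dict[str, Any], context) -> None:
        surfaces = plugin_config["surfaces"]
        # Component-local state shared by every hook in this bundle.
        state = {"events_seen": 0}

        def on_event(event) -> None:
            state["events_seen"] += 1

        def add_metadata(name, request, annotated):
            # Pass-through placeholder; a real hook would return an edited copy.
            return request, annotated

        # Register only the surfaces the config explicitly enables.
        if surfaces.get("telemetry", False):
            context.register_subscriber("policy-telemetry", on_event)
        if surfaces.get("request_metadata", False):
            context.register_llm_request_intercept(
                "policy-metadata", 100, False, add_metadata
            )


config = {"surfaces": {"telemetry": True, "request_metadata": False}}
plugin = PolicyBundlePlugin()
context = RecordingContext()
errors = plugin.validate(config)
plugin.register(config, context)
```

Gating each registration on an explicit flag keeps the config the single place that says which surfaces are active, and a recording double like this makes that easy to check in a unit test.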

Framework-Facing Plugins

Plugins can stay framework-agnostic if they operate on the normalized runtime data rather than framework-specific objects.

Good examples:

  • Rewrite provider headers
  • Emit tracing data
  • Attach scheduling hints
  • Apply cross-framework safety policies