Code Examples

These examples show the direct runtime surfaces that the application instrumentation guides build on. Use them when you need to call those surfaces yourself.

Invocation API Selection

The following table shows which API to use based on your integration need:

| Need | Preferred API | Use When |
| --- | --- | --- |
| Run a tool with full instrumentation | tools.execute, toolCallExecute, tool_call_execute | Application code owns the callback. |
| Run an LLM call with full instrumentation | llm.execute, llmCallExecute, llm_call_execute | Application code owns the provider call. |
| Run a streaming LLM call | llm_stream_execute, typedLlmStreamExecute, llm_stream_call_execute | You need chunk collection and one final aggregate end event. |
| Emit start/end manually | call and call_end helpers | A framework owns the real invocation boundary. |
| Emit a checkpoint | scope.event, event | You need milestone visibility inside an active scope. |
| Attach work to one request | Scope-local registration helpers | Middleware or subscribers should disappear when that scope closes. |

Manual Tool Lifecycle

Use manual lifecycle calls only when the surrounding code owns the real tool invocation and exposes only reliable start and finish hooks. If you are replaying events or bridging a framework clock, pass an explicit timestamp to the manual start, end, or mark helpers. Python accepts timezone-aware datetime values, Node.js and WebAssembly accept Unix microseconds since epoch, Rust accepts DateTime<Utc>, and Go accepts time.Time.

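    import nemo_relay

    # Emit the start event and keep the handle for the matching end event.
    handle = nemo_relay.tools.call("search", {"query": "weather"}, data={"attempt": 1})
    try:
        # Stand-in for the framework-owned tool invocation.
        result = {"hits": 2}
    finally:
        # Close the call with its result so an end event is emitted.
        nemo_relay.tools.call_end(handle, result)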
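
When you bridge timestamps between bindings, the same instant has to be expressed in each binding's accepted format. The sketch below uses only the Python standard library to convert a timezone-aware datetime into Unix microseconds (the form the Node.js and WebAssembly bindings are described as accepting) and back. The helper names `to_unix_micros` and `from_unix_micros` are hypothetical and are not part of nemo_relay.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_micros(ts: datetime) -> int:
    """Convert a timezone-aware datetime to integer Unix microseconds."""
    if ts.tzinfo is None or ts.utcoffset() is None:
        raise ValueError("timestamp must be timezone-aware")
    # Dividing one timedelta by another with // yields an exact integer.
    return (ts - EPOCH) // timedelta(microseconds=1)


def from_unix_micros(micros: int) -> datetime:
    """Convert integer Unix microseconds back to a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


started = datetime(2024, 1, 1, 12, 0, 0, 250_000, tzinfo=timezone.utc)
micros = to_unix_micros(started)
```

Rejecting naive datetimes up front avoids silently interpreting a local wall-clock time as UTC when events are replayed on another machine.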

Managed LLM Execution

Use managed execution when NeMo Relay should run the full middleware pipeline around the provider call.

    import asyncio

    import nemo_relay
    from nemo_relay import LLMRequest

    request = LLMRequest({}, {"messages": [{"role": "user", "content": "hello"}]})

    async def invoke(req: LLMRequest):
        # Stand-in for the real provider call.
        return {"text": "hi", "request": req.content}

    async def main():
        # await is only valid inside a coroutine, so wrap the managed call.
        return await nemo_relay.llm.execute(
            "demo-provider",
            request,
            invoke,
            model_name="demo-model",
        )

    response = asyncio.run(main())
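
As a rough mental model of what a managed call wraps around the provider callback, the following self-contained sketch runs request intercepts, then a conditional-execution check, then the callback, and records a start and an end event. It does not use nemo_relay. The names `managed_execute` and `Blocked`, and the stage order itself, are assumptions made for illustration; the real runtime may order, name, or signal these stages differently.

```python
import asyncio
from typing import Any, Awaitable, Callable


class Blocked(Exception):
    """Hypothetical error raised when a guard blocks the call."""


async def managed_execute(
    request: dict,
    callback: Callable[[dict], Awaitable[Any]],
    intercepts: list[Callable[[dict], dict]],
    guards: list[Callable[[dict], bool]],
    events: list[tuple[str, Any]],
) -> Any:
    # 1. Request intercepts rewrite the real request.
    for intercept in intercepts:
        request = intercept(request)
    # 2. Guards may block the call before the provider runs.
    if not all(guard(request) for guard in guards):
        raise Blocked(request)
    # 3. Record start, run the callback, record end.
    events.append(("start", request))
    response = await callback(request)
    events.append(("end", response))
    return response


async def provider(req: dict) -> dict:
    # Stand-in for a real provider call.
    return {"text": "hi", "model": req["model"]}


events: list[tuple[str, Any]] = []
response = asyncio.run(
    managed_execute(
        {"prompt": "hello"},
        provider,
        intercepts=[lambda r: {**r, "model": "demo-model"}],
        guards=[lambda r: "prompt" in r],
        events=events,
    )
)
```

In this sketch a blocked call never reaches the provider and emits no events, which is one plausible policy, not a documented one.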

Streaming LLM Execution

Use the streaming helper when subscribers need chunk collection plus one final response payload.

    import asyncio
    from dataclasses import dataclass

    from nemo_relay import LLMRequest
    from nemo_relay.typed import DataclassCodec, llm_stream_execute

    @dataclass
    class Chunk:
        delta: str

    @dataclass
    class FinalResponse:
        text: str

    request = LLMRequest({}, {"messages": [{"role": "user", "content": "hello"}]})
    collected: list[Chunk] = []

    async def stream_impl(_request: LLMRequest):
        # Stand-in for a provider that yields chunks.
        yield Chunk(delta="hi")

    async def main() -> None:
        # await is only valid inside a coroutine, so wrap the streaming call.
        stream = await llm_stream_execute(
            "demo-provider",
            request,
            stream_impl,
            collector=collected.append,
            finalizer=lambda: FinalResponse(text="".join(chunk.delta for chunk in collected)),
            chunk_json_codec=DataclassCodec(Chunk),
            response_json_codec=DataclassCodec(FinalResponse),
        )
        # Consume the stream here, while the event loop is still running.

    asyncio.run(main())
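
The collector and finalizer pattern can be pictured with a small standalone sketch: every chunk is passed through to the caller and recorded, and one aggregate is reported after the stream drains. This uses only the standard library. `collect_stream`, `on_end`, and `fake_provider` are hypothetical names, and the sketch is not a description of how the runtime is implemented.

```python
import asyncio
from typing import AsyncIterator, Callable


async def collect_stream(
    chunks: AsyncIterator[str],
    collector: Callable[[str], None],
    finalizer: Callable[[], str],
    on_end: Callable[[str], None],
) -> AsyncIterator[str]:
    """Pass chunks through unchanged, then report one aggregate."""
    async for chunk in chunks:
        collector(chunk)  # record the chunk for the final aggregate
        yield chunk  # the caller still sees every chunk
    on_end(finalizer())  # one end payload, after the stream drains


async def fake_provider() -> AsyncIterator[str]:
    # Stand-in for a streaming provider.
    for piece in ("he", "llo"):
        yield piece


async def main() -> tuple[list[str], list[str]]:
    collected: list[str] = []
    end_events: list[str] = []
    stream = collect_stream(
        fake_provider(),
        collector=collected.append,
        finalizer=lambda: "".join(collected),
        on_end=end_events.append,
    )
    seen = [chunk async for chunk in stream]
    return seen, end_events


seen, end_events = asyncio.run(main())
```

Note that in this sketch the aggregate is reported only if the consumer reads the stream to completion.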

Partial Middleware Calls

These helpers are useful when framework code cannot use managed execution but still wants a request rewrite or block decision.

    import nemo_relay
    from nemo_relay import LLMRequest

    # Tool call: apply request intercepts, then run the conditional-execution check.
    tool_args = nemo_relay.tools.request_intercepts("search", {"query": "weather"})
    nemo_relay.tools.conditional_execution("search", tool_args)

    # LLM call: the same two steps against an LLMRequest.
    llm_request = LLMRequest({}, {"messages": [{"role": "user", "content": "hello"}]})
    llm_request = nemo_relay.llm.request_intercepts("demo-provider", llm_request)
    nemo_relay.llm.conditional_execution(llm_request)

Scope and Context Helpers

Use normal scope helpers first. Reach for explicit stack helpers only when work crosses thread, task, worker, or request boundaries.

    from concurrent.futures import ThreadPoolExecutor

    import nemo_relay

    with nemo_relay.scope.scope("request", nemo_relay.ScopeType.Agent):
        nemo_relay.scope.event("started", data={"ok": True})
        # Capture the active scope stack so another thread can adopt it.
        shared = nemo_relay.propagate_scope_to_thread()

        def worker() -> None:
            # Adopt the captured stack before emitting from the worker thread.
            nemo_relay.set_thread_scope_stack(shared)
            nemo_relay.scope.event("worker-ran")

        with ThreadPoolExecutor() as pool:
            pool.submit(worker).result()
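
To see why explicit propagation is needed across threads, consider this standard-library analogy built on `contextvars`. State bound to the caller's context is not visible in a pool thread unless the context is handed over. The `scope_stack` variable is a hypothetical stand-in and is not a claim about how nemo_relay stores its scope stack.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a per-context scope stack.
scope_stack: contextvars.ContextVar[tuple[str, ...]] = contextvars.ContextVar(
    "scope_stack", default=()
)


def current_scopes() -> tuple[str, ...]:
    return scope_stack.get()


token = scope_stack.set(("request",))
try:
    # Capture the caller's context while the scope is active.
    captured = contextvars.copy_context()
    with ThreadPoolExecutor(max_workers=1) as pool:
        # A plain submit runs in the worker thread's own context, so on a
        # default CPython build the worker sees the empty default stack.
        without_propagation = pool.submit(current_scopes).result()
        # Running inside the captured context makes the scope visible.
        with_propagation = pool.submit(captured.run, current_scopes).result()
finally:
    scope_stack.reset(token)
```

The same shape applies to the helpers above: capture in the owning thread while the scope is open, then adopt in the worker before emitting.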

Middleware Registration Families

The runtime exposes the same registration families for tool and LLM calls:

  • Sanitize-request guardrails change emitted start-event payloads only
  • Sanitize-response guardrails change emitted end-event payloads only
  • Conditional-execution guardrails return an allow-or-block decision
  • Request intercepts change the real request before execution
  • Execution intercepts wrap the callback and may post-process or short-circuit
  • LLM stream execution intercepts wrap streaming provider callbacks
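
The difference between a sanitize guardrail and a request intercept is easiest to see side by side. In this standalone sketch, which does not use nemo_relay, the intercept changes what the tool would receive, while the sanitizer changes only the copy placed in the emitted event. The function names and the event shape are hypothetical.

```python
import copy


def redact_for_event(payload: dict) -> dict:
    """Sanitize-style hook: return a redacted copy for the emitted event."""
    safe = copy.deepcopy(payload)
    safe["api_key"] = "***"
    return safe


def add_default_limit(request: dict) -> dict:
    """Intercept-style hook: return the request the callback will receive."""
    return {**request, "limit": request.get("limit", 10)}


request = {"query": "weather", "api_key": "secret"}

# The intercept changes what the tool actually receives.
real_request = add_default_limit(request)
# The sanitizer changes only what observers see in the start event.
start_event = {"name": "search", "input": redact_for_event(real_request)}
```

After this runs, `real_request` still carries the original key while `start_event` carries the redacted one.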

Every family also has a scope-local surface:

  • Python: nemo_relay.scope_local.register_*
  • Node.js: scopeRegister*
  • Rust: middleware scope_register_* functions under nemo_relay::api::registry; subscriber scope registration under nemo_relay::api::subscriber
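
The scope-local idea, registration that disappears when the scope closes, can be sketched with a context manager. This is a standard-library illustration only: `scoped_subscriber`, `subscribers`, and `emit` are hypothetical names and do not reflect the signatures of the `register_*` helpers listed above.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

# Hypothetical registry of active subscribers, keyed by name.
subscribers: dict[str, Callable[[str], None]] = {}


@contextmanager
def scoped_subscriber(name: str, fn: Callable[[str], None]) -> Iterator[None]:
    """Register fn for the duration of the with-block, then remove it."""
    subscribers[name] = fn
    try:
        yield
    finally:
        # Runs even if the block raises, so nothing leaks past the scope.
        subscribers.pop(name, None)


def emit(event: str) -> None:
    for fn in list(subscribers.values()):
        fn(event)


seen: list[str] = []
with scoped_subscriber("audit", seen.append):
    emit("inside")  # delivered: the subscriber is registered
emit("outside")  # not delivered: the scope has closed
```

Tying removal to scope exit means per-request middleware cannot outlive the request that registered it.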

See Add Middleware for an end-to-end policy example and the API Reference for symbol-level details.