Code Examples

Use these examples when you need the direct runtime surfaces behind the application instrumentation guides.

Invocation API Selection

The following table shows which API to use based on your integration need:

Need	Preferred API	Use When
Run a tool with full instrumentation	`tools.execute`, `toolCallExecute`, `tool_call_execute`	Application code owns the callback.
Run an LLM call with full instrumentation	`llm.execute`, `llmCallExecute`, `llm_call_execute`	Application code owns the provider call.
Run a streaming LLM call	`llm_stream_execute`, `typedLlmStreamExecute`, `llm_stream_call_execute`	You need chunk collection and one final aggregate end event.
Emit start/end manually	`call` and `call_end` helpers	A framework owns the real invocation boundary.
Emit a checkpoint	`scope.event`, `event`	You need milestone visibility inside an active scope.
Attach work to one request	Scope-local registration helpers	Middleware or subscribers should disappear when that scope closes.

Manual Tool Lifecycle

Use manual lifecycle calls only when the surrounding code owns the real tool invocation and exposes only reliable start and finish hooks. If you are replaying events or bridging a framework clock, pass an explicit timestamp to the manual start, end, or mark helpers. Python accepts timezone-aware datetime values, Node.js and WebAssembly accept microseconds since the Unix epoch, Rust accepts DateTime<Utc>, and Go accepts time.Time.

Python

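import nemo_relay

# Emit the start event; the returned handle identifies this call.
handle = nemo_relay.tools.call("search", {"query": "weather"}, data={"attempt": 1})
try:
    # Stand-in for the framework-owned tool invocation.
    result = {"hits": 2}
finally:
    # Emit the matching end event for the same handle.
    nemo_relay.tools.call_end(handle, result)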
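
For the replay case, the timestamp has to be built before the manual helper is called. The sketch below uses only the Python standard library to build a timezone-aware datetime and to derive the microseconds-since-epoch integer that the Node.js and WebAssembly bindings expect. `to_unix_micros` is a hypothetical helper, not part of NeMo Relay, and the keyword used to pass the value to the manual helpers is deliberately left out because this page does not name it.

```python
from datetime import datetime, timezone

def to_unix_micros(ts: datetime) -> int:
    """Hypothetical helper: timezone-aware datetime -> integer microseconds since the Unix epoch."""
    if ts.tzinfo is None or ts.utcoffset() is None:
        raise ValueError("timestamp must be timezone-aware")
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = ts - epoch
    # Integer arithmetic avoids the float rounding that ts.timestamp() can introduce.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds

# A recorded event time, as it might come from a replay log.
recorded = datetime(2024, 1, 1, 12, 0, 0, 250_000, tzinfo=timezone.utc)
micros = to_unix_micros(recorded)
```

Rejecting naive datetimes up front keeps a replayed event from being silently shifted by the local timezone of whichever machine performs the replay.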

Managed LLM Execution

Use managed execution when NeMo Relay should run the full middleware pipeline around the provider call.

Python

import nemo_relay
from nemo_relay import LLMRequest

request = LLMRequest({}, {"messages": [{"role": "user", "content": "hello"}]})

# Stand-in for the real provider call; application code owns this callback.
async def invoke(req: LLMRequest):
    return {"text": "hi", "request": req.content}

# Run this inside an async function; top-level await is not valid in a plain script.
response = await nemo_relay.llm.execute(
    "demo-provider",
    request,
    invoke,
    model_name="demo-model",
)

Streaming LLM Execution

Use the streaming helper when subscribers need chunk collection plus one final response payload.

Python

from dataclasses import dataclass

from nemo_relay import LLMRequest
from nemo_relay.typed import DataclassCodec, llm_stream_execute

@dataclass
class Chunk:
    delta: str

@dataclass
class FinalResponse:
    text: str

request = LLMRequest({}, {"messages": [{"role": "user", "content": "hello"}]})
collected: list[Chunk] = []

# Stand-in for the provider's streaming callback.
async def stream_impl(_request: LLMRequest):
    yield Chunk(delta="hi")

# Run this inside an async function; top-level await is not valid in a plain script.
# collector records each chunk; finalizer builds the single aggregate response.
stream = await llm_stream_execute(
    "demo-provider",
    request,
    stream_impl,
    collector=collected.append,
    finalizer=lambda: FinalResponse(text="".join(chunk.delta for chunk in collected)),
    chunk_json_codec=DataclassCodec(Chunk),
    response_json_codec=DataclassCodec(FinalResponse),
)
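
The collector and finalizer roles can be illustrated without the library. The following is a minimal standalone sketch of the pattern, not NeMo Relay's implementation: `collect_stream`, `fake_provider`, and `on_end` are hypothetical names, and the assumption is that every chunk is recorded as it passes through and that one aggregate is produced only after the stream is exhausted.

```python
import asyncio
from typing import AsyncIterator, Callable, TypeVar

C = TypeVar("C")
R = TypeVar("R")

async def collect_stream(
    chunks: AsyncIterator[C],
    collector: Callable[[C], None],
    finalizer: Callable[[], R],
    on_end: Callable[[R], None],
) -> AsyncIterator[C]:
    """Hypothetical wrapper: forward chunks, record each one, emit one aggregate at the end."""
    async for chunk in chunks:
        collector(chunk)  # record the chunk before handing it to the caller
        yield chunk
    on_end(finalizer())  # called once, after the source stream is exhausted

async def fake_provider() -> AsyncIterator[str]:
    # Stand-in for a streaming provider callback.
    for piece in ("he", "llo"):
        yield piece

async def demo() -> tuple[list[str], list[str]]:
    collected: list[str] = []
    end_events: list[str] = []
    seen = [
        chunk
        async for chunk in collect_stream(
            fake_provider(), collected.append, lambda: "".join(collected), end_events.append
        )
    ]
    return seen, end_events

seen, end_events = asyncio.run(demo())
```

The caller still receives every chunk in order, while `end_events` ends up with a single joined payload, which mirrors the "chunk collection plus one final response payload" behavior described above.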

Partial Middleware Calls

These helpers are useful when framework code cannot use managed execution but still wants a request rewrite or block decision.

Python

import nemo_relay
from nemo_relay import LLMRequest

# Tool path: apply request rewrites, then ask for the allow-or-block decision.
tool_args = nemo_relay.tools.request_intercepts("search", {"query": "weather"})
nemo_relay.tools.conditional_execution("search", tool_args)

# LLM path: the same two steps against an LLMRequest.
llm_request = LLMRequest({}, {"messages": [{"role": "user", "content": "hello"}]})
llm_request = nemo_relay.llm.request_intercepts("demo-provider", llm_request)
nemo_relay.llm.conditional_execution(llm_request)

Scope and Context Helpers

Use normal scope helpers first. Reach for explicit stack helpers only when work crosses thread, task, worker, or request boundaries.

Python

from concurrent.futures import ThreadPoolExecutor

import nemo_relay

with nemo_relay.scope.scope("request", nemo_relay.ScopeType.Agent):
    nemo_relay.scope.event("started", data={"ok": True})
    # Capture the active scope stack on the parent thread.
    shared = nemo_relay.propagate_scope_to_thread()

    def worker() -> None:
        # Install the captured stack before emitting from the worker thread.
        nemo_relay.set_thread_scope_stack(shared)
        nemo_relay.scope.event("worker-ran")

    with ThreadPoolExecutor() as pool:
        pool.submit(worker).result()
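
The need for explicit propagation can be shown with the standard library alone. This sketch assumes the scope stack lives in per-thread storage, which the helper names above suggest but this page does not state; `_local`, `current_stack`, and `set_stack` are hypothetical stand-ins, not NeMo Relay functions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # hypothetical per-thread storage for the scope stack

def current_stack() -> list[str]:
    # A thread that never set a stack sees an empty one.
    return getattr(_local, "stack", [])

def set_stack(stack: list[str]) -> None:
    _local.stack = list(stack)  # copy so threads do not share one mutable list

set_stack(["request"])
snapshot = current_stack()  # captured on the parent thread

def without_propagation() -> list[str]:
    return current_stack()

def with_propagation() -> list[str]:
    set_stack(snapshot)  # install the parent's stack in the worker
    return current_stack()

with ThreadPoolExecutor(max_workers=1) as pool:
    lost = pool.submit(without_propagation).result()
    kept = pool.submit(with_propagation).result()
```

Under that assumption, the worker sees an empty stack until the snapshot is installed, which is why events emitted from a pool thread would otherwise lose their parent scope.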

Middleware Registration Families

The runtime exposes the same registration families for tool and LLM calls:

Sanitize-request guardrails change emitted start-event payloads only
Sanitize-response guardrails change emitted end-event payloads only
Conditional-execution guardrails return an allow-or-block decision
Request intercepts change the real request before execution
Execution intercepts wrap the callback and may post-process or short-circuit
LLM stream execution intercepts wrap streaming provider callbacks
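
One way to picture how these families relate is a toy pipeline in plain Python. Every name below is hypothetical, and the ordering of the steps is an assumption made for illustration; the sketch only mirrors the distinctions listed above (sanitizers affect emitted events, request intercepts affect the real request, execution intercepts wrap the callback) and is not the runtime's implementation.

```python
from typing import Any, Callable

events: list[tuple[str, Any]] = []  # stand-in for emitted start and end events

def request_intercept(args: dict) -> dict:
    return {**args, "query": args["query"].strip()}  # changes the real request

def allow(args: dict) -> bool:
    return args["query"] != ""  # allow-or-block decision

def sanitize_request(args: dict) -> dict:
    return {**args, "api_key": "***"}  # changes the start-event payload only

def sanitize_response(result: dict) -> dict:
    return {"hits": result["hits"]}  # changes the end-event payload only

def execution_intercept(callback: Callable[[dict], dict]) -> Callable[[dict], dict]:
    def wrapped(args: dict) -> dict:
        result = callback(args)
        return {**result, "cached": False}  # post-process the real result
    return wrapped

def run_tool(args: dict, callback: Callable[[dict], dict]) -> dict:
    args = request_intercept(args)
    if not allow(args):
        raise PermissionError("blocked by conditional-execution guardrail")
    events.append(("start", sanitize_request(args)))
    result = execution_intercept(callback)(args)
    events.append(("end", sanitize_response(result)))
    return result

def search(args: dict) -> dict:
    # Stand-in tool callback that echoes the arguments it actually received.
    return {"hits": 2, "echo": args}

result = run_tool({"query": " weather ", "api_key": "secret"}, search)
```

In this sketch the callback receives the trimmed query together with the unredacted key, while the recorded start event carries the redacted key and the recorded end event carries only the hit count.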

Every family also has a scope-local surface:

Python: nemo_relay.scope_local.register_*
Node.js: scopeRegister*
Rust: middleware scope_register_* functions under nemo_relay::api::registry; subscriber scope registration under nemo_relay::api::subscriber
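
The scope-local idea, where a registration disappears when its scope closes, can be sketched with a context manager. The registry and function names here are hypothetical and do not correspond to the `register_*` signatures, which this page does not show.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

subscribers: list[Callable[[str], None]] = []  # hypothetical registry

@contextmanager
def scope_with_subscriber(callback: Callable[[str], None]) -> Iterator[None]:
    """Register a subscriber for the duration of the scope, then remove it."""
    subscribers.append(callback)
    try:
        yield
    finally:
        subscribers.remove(callback)  # runs even if the scope body raises

def emit(name: str) -> None:
    for callback in list(subscribers):
        callback(name)

seen: list[str] = []
with scope_with_subscriber(seen.append):
    emit("inside")
emit("outside")  # no subscriber is registered any more
```

Tying removal to scope exit means per-request middleware or subscribers cannot leak into later requests, which is the property the scope-local surfaces are described as providing.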

See Add Middleware for an end-to-end policy example and the API Reference for symbol-level details.