> For clean Markdown of any page, append .md to the page URL.
> For a complete documentation index, see https://docs.nvidia.com/nemo/relay/llms.txt.
> For full documentation content, see https://docs.nvidia.com/nemo/relay/llms-full.txt.
> For AI client integration (Claude Code, Cursor, etc.), connect to the MCP server at https://docs.nvidia.com/nemo/relay/_mcp/server.

# Struct LlmStream Wrapper

> Wraps an inner `Stream<Item = Result<Json>>` of raw chunks and:

Generated from `cargo doc --no-deps -p nemo-relay -p nemo-relay-adaptive -p nemo-relay-ffi`.

```rust
pub struct LlmStreamWrapper { /* private fields */ }
```

Wraps an inner `Stream<Item = Result<Json>>` of raw chunks and:

1. Passes each chunk to the user-supplied **collector** closure. If the collector returns `Err`, the stream terminates with that error.
2. On stream exhaustion, calls the **finalizer** to produce an aggregated [`Json`](/reference/api/rust-library-reference/nemo-relay/json/type-json) response, runs sanitize response guardrails on it, then emits the LLM END event.

This type is returned by [`crate::api::llm::llm_stream_call_execute`](/reference/api/rust-library-reference/nemo-relay/api/llm/fn-llm-stream-call-execute) and is usually consumed as an ordinary async stream. The wrapper preserves the originating scope stack so end-of-stream bookkeeping still uses the correct scope-local middleware and subscribers even when polling happens elsewhere.

## Implementations

### `impl LlmStreamWrapper`

<pre />

#### `new`

<pre />

Create a new `LlmStreamWrapper` around the given raw stream.

Captures the current [`ScopeStackHandle`](/reference/api/rust-library-reference/nemo-relay/api/runtime/scope_stack/type-scopestackhandle) at creation time so the correct scope stack is used when the stream is later polled, even if polling happens on a different task or thread.

##### Parameters

* `inner`: Raw stream of JSON chunks from the provider callback.
* `handle`: [`LlmHandle`](/reference/api/rust-library-reference/nemo-relay/api/llm/struct-llmhandle) identifying the managed LLM span.
* `collector`: Per-chunk callback used to accumulate stream state or forward chunks elsewhere. Returning `Err` terminates the stream.
* `finalizer`: One-shot callback invoked when the stream finishes to synthesize the aggregated response payload.
* `data`: Retained compatibility payload; Agent Trajectory Observability Format (ATOF) end data is the finalized response.
* `metadata`: Optional event metadata merged into the emitted LLM-end event.
* `response_codec`: Optional codec used to derive annotated response metadata from the aggregated final payload.

##### Returns

A new [`LlmStreamWrapper`](/reference/api/rust-library-reference/nemo-relay/stream/struct-llmstreamwrapper) ready to be polled.

#### `scope_stack`

<pre />

Return the captured scope stack handle for this stream.

Callers can use this to bind the correct scope stack when spawning the stream on a different task via `TASK_SCOPE_STACK.scope(...)`.

##### Returns

A shared reference to the [`ScopeStackHandle`](/reference/api/rust-library-reference/nemo-relay/api/runtime/scope_stack/type-scopestackhandle) captured when the stream wrapper was created.

## Trait Implementations

### `impl Drop for LlmStreamWrapper`

<pre />

#### `drop`

<pre />

### `impl Stream for LlmStreamWrapper`

<pre />

#### `Item`

<pre />

#### `poll_next`

<pre />

#### `size_hint`

<pre />