> For clean Markdown of any page, append .md to the page URL.
> For a complete documentation index, see https://docs.nvidia.com/nemo/relay/llms.txt.
> For full documentation content, see https://docs.nvidia.com/nemo/relay/llms-full.txt.
> For AI client integration (Claude Code, Cursor, etc.), connect to the MCP server at https://docs.nvidia.com/nemo/relay/_mcp/server.

# Module stream

> Streaming LLM response wrapper.

Generated from `cargo doc --no-deps -p nemo-relay -p nemo-relay-adaptive -p nemo-relay-ffi`.

Streaming LLM response wrapper.

This module provides [`LlmStreamWrapper`](/reference/api/rust-library-reference/nemo-relay/stream/struct-llmstreamwrapper), a \[`Stream`] adapter that sits between the raw stream from an LLM API and the consumer. It feeds chunks to a user-supplied collector, and automatically emits lifecycle events when the stream ends.

### Pipeline

```rust
raw chunk (Json) -> collector(chunk) -> Ok(()) -> yield chunk
                                     -> Err(e) -> terminate stream with error
upstream error -> terminate stream with error -> finalizer() -> Json -> SanitizeResponseGuardrails -> END event
stream ends -> finalizer() -> Json -> SanitizeResponseGuardrails -> END event
```

The **collector** receives each chunk (Json) and can accumulate state (e.g., concatenating tokens). If the collector returns `Err`, the stream terminates immediately with that error. Upstream stream errors also terminate the stream immediately. The **finalizer** is called once when the stream terminates and returns the aggregated response as [`Json`](/reference/api/rust-library-reference/nemo-relay/json/type-json). That aggregated response then flows through sanitize response guardrails before being included in the END event.

## Structs

* [LlmStreamWrapper](/reference/api/rust-library-reference/nemo-relay/stream/struct-llmstreamwrapper): Wraps an inner `Stream<Item = Result<Json>>` of raw chunks and: