
# Module streaming

> Streaming response codecs for the managed LLM execution pipeline.

Generated from `cargo doc --no-deps -p nemo-relay -p nemo-relay-adaptive -p nemo-relay-ffi`.

[`crate::codec::traits::LlmResponseCodec`](/reference/api/rust-library-reference/nemo-relay/codec/traits/trait-llmresponsecodec) decodes a complete provider response into a normalized [`AnnotatedLlmResponse`](/reference/api/rust-library-reference/nemo-relay/codec/response/struct-annotatedllmresponse). For streaming providers, the analogous job is to:

1. consume per-chunk events as they arrive on a streaming HTTP response, and
2. assemble a single non-streaming-shape JSON payload at end of stream.

Once assembled, the payload can be fed back through the matching [`crate::codec::traits::LlmResponseCodec`](/reference/api/rust-library-reference/nemo-relay/codec/traits/trait-llmresponsecodec) to produce an [`AnnotatedLlmResponse`](/reference/api/rust-library-reference/nemo-relay/codec/response/struct-annotatedllmresponse). Streaming and non-streaming requests therefore converge on the same observability output, with no per-route duplication of response-shape handling.

[`StreamingCodec`](/reference/api/rust-library-reference/nemo-relay/codec/streaming/trait-streamingcodec) is the trait that bundles the two functions ([`LlmCollectorFn`](/reference/api/rust-library-reference/nemo-relay/api/runtime/callbacks/type-llmcollectorfn), [`LlmFinalizerFn`](/reference/api/rust-library-reference/nemo-relay/api/runtime/callbacks/type-llmfinalizerfn)) used by [`crate::api::llm::llm_stream_call_execute`](/reference/api/rust-library-reference/nemo-relay/api/llm/fn-llm-stream-call-execute). Each provider supplies one implementation, whose internal state holds whatever incremental information is needed to materialize the final payload.
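The collect-then-finalize split can be illustrated with a minimal, standard-library-only sketch. Every name below (`ToyChunk`, `ToyFinal`, `ToyStreamingCodec`, `ToyTextCodec`, `assemble`) is hypothetical and chosen for illustration; the sketch does not reproduce the actual `StreamingCodec`, `LlmCollectorFn`, or `LlmFinalizerFn` signatures, and it uses plain structs where a real provider would exchange JSON.

```rust
/// Hypothetical per-chunk event. Real provider chunks are JSON objects;
/// a plain struct keeps this sketch free of external crates.
pub struct ToyChunk {
    pub delta: Option<String>,
    pub finish_reason: Option<String>,
}

/// Hypothetical stand-in for the assembled non-streaming-shape payload.
#[derive(Debug, PartialEq)]
pub struct ToyFinal {
    pub content: String,
    pub finish_reason: Option<String>,
}

/// Hypothetical trait mirroring the collector/finalizer pairing.
/// This is NOT the signature of the crate's `StreamingCodec`.
pub trait ToyStreamingCodec {
    /// Called once per chunk as it arrives.
    fn collect(&mut self, chunk: ToyChunk);
    /// Called once at end of stream; consumes the accumulated state.
    fn finalize(self) -> ToyFinal;
}

/// Accumulates text deltas and remembers the last finish reason seen.
#[derive(Default)]
pub struct ToyTextCodec {
    content: String,
    finish_reason: Option<String>,
}

impl ToyStreamingCodec for ToyTextCodec {
    fn collect(&mut self, chunk: ToyChunk) {
        if let Some(delta) = chunk.delta {
            self.content.push_str(&delta);
        }
        if chunk.finish_reason.is_some() {
            self.finish_reason = chunk.finish_reason;
        }
    }

    fn finalize(self) -> ToyFinal {
        ToyFinal {
            content: self.content,
            finish_reason: self.finish_reason,
        }
    }
}

/// Drives any codec over a sequence of chunks and returns the final payload.
pub fn assemble<C: ToyStreamingCodec>(mut codec: C, chunks: Vec<ToyChunk>) -> ToyFinal {
    for chunk in chunks {
        codec.collect(chunk);
    }
    codec.finalize()
}
```

Calling `assemble(ToyTextCodec::default(), chunks)` with deltas `"Hel"` and `"lo"` followed by a chunk carrying only a finish reason yields a single payload whose `content` is `"Hello"`. Keeping all incremental state inside the codec value is what lets the finalizer run once, after the last chunk, without the caller tracking anything.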

## Structs

* [SseEvent](/reference/api/rust-library-reference/nemo-relay/codec/streaming/struct-sseevent): One decoded SSE frame, paired with the parsed `data:` payload.
* [SseEventDecoder](/reference/api/rust-library-reference/nemo-relay/codec/streaming/struct-sseeventdecoder): Incremental decoder for `text/event-stream` byte streams that yields one JSON object per complete `data:` payload.
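
To show what incremental decoding of a `text/event-stream` byte stream involves, here is a rough standard-library-only sketch. `ToySseDecoder` and its `feed` method are hypothetical names, not the API of `SseEventDecoder`. As simplifying assumptions, it returns each complete `data:` payload as a raw string rather than a parsed JSON object, ignores other fields (`event:`, `id:`, `retry:`) and comment lines, and treats only `\n` and `\r\n` as line terminators.

```rust
/// Hypothetical stand-in for an incremental SSE decoder.
/// Not the crate's `SseEventDecoder`; payloads are returned as raw strings.
#[derive(Default)]
pub struct ToySseDecoder {
    /// Bytes received so far that are not yet terminated by a newline.
    buf: Vec<u8>,
    /// `data:` lines belonging to the event currently being read.
    data_lines: Vec<String>,
}

impl ToySseDecoder {
    /// Feeds one network chunk and returns the payload of every event
    /// that this chunk completed (possibly none).
    pub fn feed(&mut self, bytes: &[u8]) -> Vec<String> {
        self.buf.extend_from_slice(bytes);
        let mut out = Vec::new();

        // Process every complete line currently in the buffer.
        while let Some(pos) = self.buf.iter().position(|&b| b == b'\n') {
            let mut line: Vec<u8> = self.buf.drain(..=pos).collect();
            line.pop(); // drop the trailing '\n'
            if line.last() == Some(&b'\r') {
                line.pop(); // tolerate CRLF line endings
            }
            let line = String::from_utf8_lossy(&line).into_owned();

            if line.is_empty() {
                // A blank line ends the event; emit it if it carried data.
                if !self.data_lines.is_empty() {
                    out.push(self.data_lines.join("\n"));
                    self.data_lines.clear();
                }
            } else if let Some(rest) = line.strip_prefix("data:") {
                // A single leading space after the colon is not part of the value.
                let value = rest.strip_prefix(' ').unwrap_or(rest);
                self.data_lines.push(value.to_string());
            }
            // Any other field or comment line is ignored in this sketch.
        }
        out
    }
}
```

Because the decoder buffers raw bytes and only splits at newline bytes, a payload or even a multi-byte UTF-8 character may be cut at any chunk boundary without corrupting the result. For example, feeding `data: {"a"` returns nothing, and a later chunk that supplies the rest of the line plus the blank line completes the event.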

## Traits

* [StreamingCodec](/reference/api/rust-library-reference/nemo-relay/codec/streaming/trait-streamingcodec): Per-provider streaming codec used with [`crate::api::llm::llm_stream_call_execute`](/reference/api/rust-library-reference/nemo-relay/api/llm/fn-llm-stream-call-execute).