Module stream

Generated from cargo doc --no-deps -p nemo-relay -p nemo-relay-adaptive -p nemo-relay-ffi.

Streaming LLM response wrapper.

This module provides LlmStreamWrapper, a Stream adapter that sits between the raw stream from an LLM API and the consumer. It feeds each chunk to a user-supplied collector and automatically emits lifecycle events when the stream ends.

Pipeline

raw chunk (Json) -> collector(chunk) -> Ok(()) -> yield chunk
                                     -> Err(e) -> terminate stream with error
upstream error   -> terminate stream with error -> finalizer() -> Json -> SanitizeResponseGuardrails -> END event
stream ends      -> finalizer() -> Json -> SanitizeResponseGuardrails -> END event

The collector receives each chunk (Json) and can accumulate state (e.g., concatenating tokens). If the collector returns Err, the stream terminates immediately with that error. Upstream stream errors also terminate the stream immediately. The finalizer is called once when the stream terminates and returns the aggregated response as Json. That aggregated response then flows through sanitize response guardrails before being included in the END event.
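The flow described above can be illustrated with a small, synchronous sketch. It is not the crate's implementation: it uses a plain Iterator instead of an async Stream, String instead of Json, and a single on_end callback standing in for both the sanitize response guardrails and the END event. The names CollectingIter, on_end, and run_demo are hypothetical. The sketch also assumes the finalizer runs when the collector fails, which the prose suggests but the pipeline diagram does not show explicitly.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Stand-ins for the crate's types (assumptions, not the real API).
type Chunk = String;
type Error = String;

/// Hypothetical synchronous analogue of the wrapper described above.
struct CollectingIter<I, C, F, E> {
    inner: I,
    collector: C,
    finalizer: Option<F>, // taken on termination so it runs exactly once
    on_end: E,            // stands in for guardrails + END event emission
}

impl<I, C, F, E> CollectingIter<I, C, F, E>
where
    F: FnOnce() -> Chunk,
    E: FnMut(Chunk),
{
    /// Runs the finalizer once and hands the aggregate to `on_end`.
    fn finish(&mut self) {
        if let Some(finalizer) = self.finalizer.take() {
            let aggregated = finalizer();
            (self.on_end)(aggregated);
        }
    }
}

impl<I, C, F, E> Iterator for CollectingIter<I, C, F, E>
where
    I: Iterator<Item = Result<Chunk, Error>>,
    C: FnMut(&Chunk) -> Result<(), Error>,
    F: FnOnce() -> Chunk,
    E: FnMut(Chunk),
{
    type Item = Result<Chunk, Error>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.finalizer.is_none() {
            return None; // already terminated
        }
        match self.inner.next() {
            // Chunk accepted by the collector: pass it through unchanged.
            Some(Ok(chunk)) => match (self.collector)(&chunk) {
                Ok(()) => Some(Ok(chunk)),
                Err(e) => {
                    // Assumption: the finalizer also runs on collector errors.
                    self.finish();
                    Some(Err(e))
                }
            },
            // Upstream error: terminate with the error, then finalize.
            Some(Err(e)) => {
                self.finish();
                Some(Err(e))
            }
            // Normal end of stream: finalize.
            None => {
                self.finish();
                None
            }
        }
    }
}

/// Drives the sketch with a token-concatenating collector.
/// Returns the items the consumer saw and the payloads passed to `on_end`.
fn run_demo(chunks: Vec<Result<Chunk, Error>>) -> (Vec<Result<Chunk, Error>>, Vec<Chunk>) {
    let acc = Rc::new(RefCell::new(String::new()));
    let acc_for_finalizer = Rc::clone(&acc);
    let end_events: Rc<RefCell<Vec<Chunk>>> = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&end_events);

    let wrapped = CollectingIter {
        inner: chunks.into_iter(),
        collector: move |c: &Chunk| -> Result<(), Error> {
            acc.borrow_mut().push_str(c);
            Ok(())
        },
        finalizer: Some(move || acc_for_finalizer.borrow().clone()),
        on_end: move |aggregated: Chunk| sink.borrow_mut().push(aggregated),
    };

    let yielded: Vec<_> = wrapped.collect();
    let ends = end_events.borrow().clone();
    (yielded, ends)
}
```

With this sketch, feeding the chunks "Hel" and "lo" yields both chunks to the consumer and produces a single end payload, "Hello". If an upstream error arrives after the first chunk, the consumer sees that chunk followed by the error, later chunks are never polled, and the end payload contains only what was collected before the failure.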

Structs

  • LlmStreamWrapper: Wraps an inner Stream<Item = Result<Json>> of raw chunks and: