Copyright © 2026, NVIDIA Corporation.

Struct OpenAI Responses Streaming Codec

Generated from cargo doc --no-deps -p nemo-relay -p nemo-relay-adaptive -p nemo-relay-ffi.

pub struct OpenAIResponsesStreamingCodec { /* private fields */ }

Streaming counterpart to OpenAIResponsesCodec.

Replays the OpenAI Responses SSE event sequence into the same JSON shape the API returns for a non-streaming request ({id, model, status, output, usage, incomplete_details, ...}). Once finalized, the assembled JSON can be fed back through OpenAIResponsesCodec::decode_response to produce the canonical AnnotatedLlmResponse.

Strategy

The Responses API is a relatively forgiving streaming target: the events this codec relies on carry either the full response snapshot (response.created, response.in_progress, response.completed, response.failed, response.incomplete) or the final-state output item (response.output_item.done). The codec:

  1. Tracks the latest response snapshot. Terminal events (completed/failed/incomplete) typically carry the complete state including output, so those are preferred when present.
  2. Tracks output items by output_index. output_item.done events deliver the final per-item state, which is used as a fallback when the terminal response.output is missing or empty.
  3. Ignores per-token output_text.delta and function_call_arguments.delta events, because their content is redelivered in the matching output_item.done event. Skipping deltas keeps the codec resilient to schema additions and avoids double-accumulation.
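
The three steps above can be illustrated with a minimal, standard-library-only sketch. It is not the crate's implementation: the Event enum, the Assembler type, and the use of plain strings in place of JSON output items are all hypothetical simplifications chosen for this example.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of the Responses SSE events.
/// Output items are plain strings here instead of JSON values.
#[derive(Debug, Clone)]
pub enum Event {
    /// response.created / in_progress / completed / failed / incomplete
    Snapshot { status: String, output: Vec<String> },
    /// response.output_item.done
    OutputItemDone { output_index: usize, item: String },
    /// output_text.delta, function_call_arguments.delta, and similar
    Delta(String),
}

/// Hypothetical accumulator mirroring the strategy described above.
#[derive(Debug, Default)]
pub struct Assembler {
    latest_snapshot: Option<(String, Vec<String>)>,
    items: BTreeMap<usize, String>,
}

impl Assembler {
    pub fn push(&mut self, event: Event) {
        match event {
            // Step 1: keep only the most recent snapshot.
            Event::Snapshot { status, output } => {
                self.latest_snapshot = Some((status, output));
            }
            // Step 2: record the final per-item state, keyed by index.
            Event::OutputItemDone { output_index, item } => {
                self.items.insert(output_index, item);
            }
            // Step 3: deltas are dropped; their content arrives again
            // in the matching OutputItemDone event.
            Event::Delta(_) => {}
        }
    }

    /// Returns (status, output), or None if no snapshot was ever seen.
    pub fn finish(self) -> Option<(String, Vec<String>)> {
        let (status, output) = self.latest_snapshot?;
        if !output.is_empty() {
            // Prefer the output carried by the latest snapshot.
            return Some((status, output));
        }
        // Fallback: rebuild the output from per-item events in index order.
        Some((status, self.items.into_values().collect()))
    }
}
```

In this sketch a BTreeMap keeps the fallback items ordered by output_index even when the done events arrive out of order. Returning None when no snapshot was seen is a simplification of this example, not documented behavior of the codec.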

Internal state lives behind Arc<Mutex<...>> so the &self-produced collector and finalizer closures share access. Each instance is single-use because LlmFinalizerFn consumes the finalize step.
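
The shared-state pattern described here can be sketched as follows. The signatures of LlmCollectorFn and LlmFinalizerFn are not shown on this page, so CollectorFn, FinalizerFn, and SketchCodec below are hypothetical stand-ins; the sketch only assumes that the collector can be called repeatedly and that the finalizer is called once.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins for LlmCollectorFn and LlmFinalizerFn.
pub type CollectorFn = Box<dyn FnMut(&str) + Send>;
pub type FinalizerFn = Box<dyn FnOnce() -> Vec<String> + Send>;

/// Hypothetical codec whose accumulator lives behind Arc<Mutex<...>>.
#[derive(Default)]
pub struct SketchCodec {
    state: Arc<Mutex<Vec<String>>>,
}

impl SketchCodec {
    pub fn new() -> Self {
        Self::default()
    }

    /// Built from &self: the closure owns a clone of the Arc, not the codec.
    pub fn collector(&self) -> CollectorFn {
        let state = Arc::clone(&self.state);
        Box::new(move |chunk: &str| {
            state
                .lock()
                .expect("state mutex poisoned")
                .push(chunk.to_owned());
        })
    }

    /// FnOnce: calling the finalizer consumes it and drains the shared state.
    pub fn finalizer(&self) -> FinalizerFn {
        let state = Arc::clone(&self.state);
        Box::new(move || {
            let mut guard = state.lock().expect("state mutex poisoned");
            std::mem::take(&mut *guard)
        })
    }
}
```

Because both closures hold clones of the same Arc, anything pushed through the collector is visible to the finalizer, even though each was produced from a shared &self reference. Making the finalizer an FnOnce means the compiler rejects a second call, which matches the single-use behavior described above.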

Implementations

impl OpenAIResponsesStreamingCodec

new

pub fn new() -> Self

Creates a fresh streaming codec with empty accumulator state.

Trait Implementations

impl Default for OpenAIResponsesStreamingCodec

default

fn default() -> Self

impl StreamingCodec for OpenAIResponsesStreamingCodec

collector

fn collector(&self) -> LlmCollectorFn

finalizer

fn finalizer(&self) -> LlmFinalizerFn