nemoguardrails.rails.llm.buffer

Module Contents

Classes

| Name | Description |
| --- | --- |
| BufferStrategy | Abstract base class for buffer strategies in streaming output rails. |
| ChunkBatch | Represents a batch of processed chunks from a buffer strategy. |
| RollingBuffer | A rolling buffer strategy for streaming output rails processing. |

Functions

| Name | Description |
| --- | --- |
| get_buffer_strategy | Create a buffer strategy from the given configuration. |

Data

__all__

API

class nemoguardrails.rails.llm.buffer.BufferStrategy()
Abstract

Abstract base class for buffer strategies in streaming output rails.

This class defines the interface for buffer strategies that manage how streaming chunks are buffered and processed for output rails. Concrete implementations should handle the accumulation and yielding of chunks in a way that optimizes output rails processing while maintaining streaming performance.

The interface separates concerns:

  • Buffer management logic (process_stream)
  • Chunk representation formatting (format_chunks)

nemoguardrails.rails.llm.buffer.BufferStrategy.__call__(
streaming_handler
) -> typing.AsyncGenerator[nemoguardrails.rails.llm.buffer.ChunkBatch, None]
async

Callable interface that makes a strategy instance directly invocable.

The call delegates to the process_stream method and can be extended to add common functionality such as validation, logging, or error handling.

Parameters:

streaming_handler

An async iterator that yields individual string chunks from the LLM stream.
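The delegation described above can be illustrated with a small self-contained sketch. It does not import the library: `SketchStrategy`, `PassThrough`, and `fake_llm_stream` are hypothetical stand-ins, and the `ChunkBatch` defined here only mirrors the two documented field names.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import AsyncGenerator, List, NamedTuple


class ChunkBatch(NamedTuple):
    # Stand-in that mirrors the two documented fields.
    processing_context: List[str]
    user_output_chunks: List[str]


class SketchStrategy(ABC):
    """Hypothetical stand-in for the documented interface, not the library class."""

    @abstractmethod
    def process_stream(self, streaming_handler) -> AsyncGenerator[ChunkBatch, None]:
        ...

    async def __call__(self, streaming_handler):
        # Delegate to process_stream; shared validation or logging could go here.
        async for batch in self.process_stream(streaming_handler):
            yield batch


class PassThrough(SketchStrategy):
    """Toy strategy: every incoming chunk becomes its own batch."""

    async def process_stream(self, streaming_handler):
        async for chunk in streaming_handler:
            yield ChunkBatch(processing_context=[chunk], user_output_chunks=[chunk])


async def fake_llm_stream():
    # Assumed input shape: an async iterator of string chunks.
    for chunk in ["Hello", " ", "world"]:
        yield chunk


async def collect():
    strategy = PassThrough()
    # Calling the instance is equivalent to calling process_stream directly.
    return [batch async for batch in strategy(fake_llm_stream())]


batches = asyncio.run(collect())
```

Because `__call__` is itself an async generator, callers iterate over `strategy(stream)` with `async for` exactly as they would over `strategy.process_stream(stream)`.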

nemoguardrails.rails.llm.buffer.BufferStrategy.format_chunks(
chunks: typing.List[str]
) -> str
abstract

Format chunks into a string representation for user consumption.

This method defines how chunks should be formatted into a string representation. Different strategies might join chunks differently (e.g., preserving spaces, adding separators, etc.).

Parameters:

chunks
List[str]

List of chunk tokens to be formatted.

Returns: str

String representation of the chunks ready for consumers.

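nemoguardrails.rails.llm.buffer.BufferStrategy.from_config(
config: nemoguardrails.rails.llm.config.OutputRailsStreamingConfig
)
classmethod abstract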

Create a buffer strategy instance from configuration.

Parameters:

config
OutputRailsStreamingConfig

Configuration object containing buffer strategy parameters.

Returns: BufferStrategy

A configured buffer strategy instance.

nemoguardrails.rails.llm.buffer.BufferStrategy.process_stream(
streaming_handler
) -> typing.AsyncGenerator[nemoguardrails.rails.llm.buffer.ChunkBatch, None]
async abstract

Process streaming chunks and yield chunk batches.

This is the main method that concrete buffer strategies must implement. It defines how chunks from the streaming handler should be buffered, processed, and yielded as ChunkBatch objects.

Parameters:

streaming_handler

An async iterator that yields individual string chunks from the LLM stream.

class nemoguardrails.rails.llm.buffer.ChunkBatch()

Bases: NamedTuple

Represents a batch of processed chunks from a buffer strategy.

This class contains the raw chunk data from buffer processing. For string representation of chunks, use the buffer strategy’s format_chunks() method.

processing_context: List[str]
user_output_chunks: List[str]
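Since ChunkBatch is a NamedTuple, its fields can be read by name or unpacked by position. The sketch below uses a local stand-in with the same two field names rather than importing the library, and the sample values are invented. The idea that the processing context holds earlier chunks plus the new ones is an assumption based on the RollingBuffer description.

```python
from typing import List, NamedTuple


class ChunkBatch(NamedTuple):
    # Local stand-in mirroring the documented field names and types.
    processing_context: List[str]
    user_output_chunks: List[str]


batch = ChunkBatch(
    processing_context=["The", " quick", " brown"],  # assumed: context plus new chunks
    user_output_chunks=[" brown"],                   # assumed: only the new chunks
)

# Access by field name.
text_for_rails = "".join(batch.processing_context)

# Or unpack by position, as with any tuple.
context, new_chunks = batch
```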
class nemoguardrails.rails.llm.buffer.RollingBuffer(
buffer_context_size: int = 5,
buffer_chunk_size: int = 10
)

Bases: BufferStrategy

A rolling buffer strategy for streaming output rails processing.

This strategy accumulates incoming chunks in a buffer and yields them in batches when the buffer reaches the specified chunk size. It maintains context from previous chunks to ensure continuity in processing output rails.

The buffer operates by:

  1. Accumulating incoming chunks until reaching the chunk size threshold
  2. Yielding a processing buffer (with context) and new chunks to process
  3. Retaining context tokens for the next processing round
  4. Yielding any remaining chunks at the end of the stream
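The four steps above can be sketched in plain Python. This is one possible reading of the description, not the library's implementation: `rolling_batches` and `demo_stream` are hypothetical names, batches are plain tuples rather than ChunkBatch objects, and the exact windowing in RollingBuffer may differ (for example in whether the chunk size counts context chunks).

```python
import asyncio
from typing import AsyncIterator, List, Tuple


async def rolling_batches(
    stream: AsyncIterator[str], context_size: int = 5, chunk_size: int = 10
) -> AsyncIterator[Tuple[List[str], List[str]]]:
    context: List[str] = []  # chunks carried over from earlier batches
    pending: List[str] = []  # new chunks not yet yielded

    async for chunk in stream:
        pending.append(chunk)  # step 1: accumulate
        if len(pending) >= chunk_size:
            # Step 2: yield (context + new chunks, new chunks).
            yield context + pending, pending
            # Step 3: keep the trailing chunks as context for the next round.
            context = (context + pending)[-context_size:] if context_size > 0 else []
            pending = []

    if pending:
        # Step 4: flush whatever is left at the end of the stream.
        yield context + pending, pending


async def demo_stream():
    for chunk in ["a", "b", "c", "d", "e", "f", "g"]:
        yield chunk


async def collect():
    return [b async for b in rolling_batches(demo_stream(), context_size=2, chunk_size=3)]


result = asyncio.run(collect())
```

With seven input chunks, a context size of 2, and a chunk size of 3, this sketch produces three batches. The second and third each start with the last two chunks of the preceding window.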

Parameters:

buffer_context_size
int, defaults to 5

Number of tokens carried over from previous chunks to provide context for continuity.

buffer_chunk_size
int, defaults to 10

Number of tokens in each processing chunk. This determines the size of the token blocks on which output rails are applied.

total_yielded = 0

nemoguardrails.rails.llm.buffer.RollingBuffer.format_chunks(
chunks: typing.List[str]
) -> str

Generate string representation of chunks preserving original token format.

The RollingBuffer strategy preserves the original token format by joining chunks without modification, maintaining spaces and formatting as they appeared in the original LLM output.

Parameters:

chunks
List[str]

List of chunk tokens to be formatted.

Returns: str

String representation preserving original token spacing and format.
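Joining chunks without modification can be illustrated with a plain string join. The function below is a standalone sketch and assumes an empty-string separator. It is not the library method, and the sample tokens are invented.

```python
from typing import List


def format_chunks(chunks: List[str]) -> str:
    # Concatenate as-is so that any whitespace carried by each token survives.
    return "".join(chunks)


tokens = ["Hello", ",", " world", "!"]
text = format_chunks(tokens)
```

Joining with `" "` instead would insert extra spaces wherever a token already carries its own leading whitespace, which is why a separator-free join keeps the original formatting.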

nemoguardrails.rails.llm.buffer.RollingBuffer.from_config(
config: nemoguardrails.rails.llm.config.OutputRailsStreamingConfig
)
classmethod

Create a RollingBuffer instance from a streaming configuration.

Parameters:

config
OutputRailsStreamingConfig

Configuration object containing context_size and chunk_size parameters.

Returns: RollingBuffer

A new RollingBuffer instance configured with the provided parameters.
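The classmethod pattern can be sketched with stand-in classes. `SketchStreamingConfig` and `SketchRollingBuffer` are hypothetical, and the mapping of `context_size` and `chunk_size` onto the two constructor arguments is an assumption based on the parameter names documented here.

```python
from dataclasses import dataclass


@dataclass
class SketchStreamingConfig:
    # Hypothetical stand-in exposing only the two fields named in this section.
    context_size: int = 5
    chunk_size: int = 10


class SketchRollingBuffer:
    """Stand-in with the documented constructor parameters and defaults."""

    def __init__(self, buffer_context_size: int = 5, buffer_chunk_size: int = 10):
        self.buffer_context_size = buffer_context_size
        self.buffer_chunk_size = buffer_chunk_size

    @classmethod
    def from_config(cls, config: SketchStreamingConfig) -> "SketchRollingBuffer":
        # Assumed mapping: config field names onto constructor arguments.
        return cls(
            buffer_context_size=config.context_size,
            buffer_chunk_size=config.chunk_size,
        )


buffer = SketchRollingBuffer.from_config(
    SketchStreamingConfig(context_size=3, chunk_size=8)
)
```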

nemoguardrails.rails.llm.buffer.RollingBuffer.process_stream(
streaming_handler
) -> typing.AsyncGenerator[nemoguardrails.rails.llm.buffer.ChunkBatch, None]
async

Process streaming chunks using rolling buffer strategy.

This method implements the rolling buffer logic, accumulating chunks and yielding them in batches with context for output rails processing. The buffer maintains a sliding window of context tokens for continuity.

Parameters:

streaming_handler

An async iterator that yields individual string chunks from the LLM stream.

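nemoguardrails.rails.llm.buffer.get_buffer_strategy(
config: nemoguardrails.rails.llm.config.OutputRailsStreamingConfig
)

Create a buffer strategy from the given configuration.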

Parameters:

config
OutputRailsStreamingConfig

Configuration object specifying the buffer strategy parameters.

Returns: BufferStrategy

A configured buffer strategy instance. Currently returns a RollingBuffer instance.

nemoguardrails.rails.llm.buffer.__all__ = ['ChunkBatch', 'BufferStrategy', 'RollingBuffer', 'get_buffer_strategy']