nemoguardrails.rails.llm.buffer
API
Abstract base class for buffer strategies in streaming output rails.
This class defines the interface for buffer strategies that manage how streaming chunks are buffered and processed for output rails. Concrete implementations should handle the accumulation and yielding of chunks in a way that optimizes output rails processing while maintaining streaming performance.
The interface separates concerns:
- Buffer management logic (process_stream)
- Chunk representation formatting (format_chunks)
Callable interface that delegates to process_stream.
Subclasses can extend this method to add common functionality such as validation, logging, or error handling.
Parameters:
An async iterator that yields individual string chunks from the LLM stream.
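The split between buffer management, formatting, and the callable entry point can be illustrated with a small, self-contained sketch. It does not import the library: `SketchBufferStrategy`, `PairBuffer`, `fake_llm_stream`, and `collect` are hypothetical names, and batches are plain lists here rather than ChunkBatch objects.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import AsyncIterator, List


class SketchBufferStrategy(ABC):
    """Hypothetical stand-in for the abstract base class described above."""

    @abstractmethod
    def process_stream(
        self, streaming_handler: AsyncIterator[str]
    ) -> AsyncIterator[List[str]]:
        """Buffer management: turn single chunks into batches."""

    @abstractmethod
    def format_chunks(self, chunks: List[str]) -> str:
        """Chunk representation: turn a batch into a string."""

    def __call__(self, streaming_handler: AsyncIterator[str]):
        # Delegate to process_stream. Shared validation or logging
        # could be added here without touching the subclasses.
        return self.process_stream(streaming_handler)


class PairBuffer(SketchBufferStrategy):
    """Toy strategy (an assumption for this example): batches of two chunks."""

    async def process_stream(self, streaming_handler):
        batch: List[str] = []
        async for chunk in streaming_handler:
            batch.append(chunk)
            if len(batch) == 2:
                yield batch
                batch = []
        if batch:
            # Flush whatever is left when the stream ends.
            yield batch

    def format_chunks(self, chunks):
        return "".join(chunks)


async def fake_llm_stream():
    # Stands in for the LLM stream: yields one string chunk at a time.
    for token in ["Hello", " ", "world"]:
        yield token


async def collect() -> List[str]:
    strategy = PairBuffer()
    # Calling the instance is equivalent to calling process_stream.
    return [strategy.format_chunks(b) async for b in strategy(fake_llm_stream())]
```

Running `asyncio.run(collect())` drives the toy strategy through its callable interface and formats each batch as it arrives.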
Format chunks into a string representation for user consumption.
This method defines how chunks are formatted into a string representation. Different strategies might join chunks differently (e.g., preserving spaces or adding separators).
Parameters:
List of chunk tokens to be formatted.
Returns: str
String representation of the chunks ready for consumers.
Create a buffer strategy instance from configuration.
Parameters:
Configuration object containing buffer strategy parameters.
Returns: BufferStrategy
A configured buffer strategy instance.
Process streaming chunks and yield chunk batches.
This is the main method that concrete buffer strategies must implement. It defines how chunks from the streaming handler should be buffered, processed, and yielded as ChunkBatch objects.
Parameters:
An async iterator that yields individual string chunks from the LLM stream.
Bases: NamedTuple
Represents a batch of processed chunks from a buffer strategy.
This class contains the raw chunk data from buffer processing. For a string representation of the chunks, use the buffer strategy’s format_chunks() method.
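As a rough illustration of a NamedTuple-based batch, the sketch below defines a stand-in type. The field names `processing_context` and `user_output_chunks` are assumptions made for this example (this section does not list the fields), and `SketchChunkBatch` is a hypothetical name.

```python
from typing import List, NamedTuple


class SketchChunkBatch(NamedTuple):
    # Field names are assumptions for illustration only.
    processing_context: List[str]  # retained context plus new chunks, for the rails
    user_output_chunks: List[str]  # only the new chunks, for the user


batch = SketchChunkBatch(
    processing_context=["The", " quick", " brown"],
    user_output_chunks=[" brown"],
)

# NamedTuple instances unpack like plain tuples.
context, new_chunks = batch

# The batch holds raw chunks; turning them into text is a separate step.
user_text = "".join(new_chunks)
```

Keeping the batch as raw lists leaves the choice of string formatting to the strategy, which matches the separation described above.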
Bases: BufferStrategy
A rolling buffer strategy for streaming output rails processing.
This strategy accumulates incoming chunks in a buffer and yields them in batches when the buffer reaches the specified chunk size. It maintains context from previous chunks to ensure continuity in processing output rails.
The buffer operates by:
- Accumulating incoming chunks until reaching the chunk size threshold
- Yielding a processing buffer (with context) and new chunks to process
- Retaining context tokens for the next processing round
- Yielding any remaining chunks at the end of the stream
Parameters:
Context size: number of tokens carried over from previous chunks to provide context for continuity. Defaults to 5.
Chunk size: number of tokens in each processing chunk. This determines the size of the token blocks on which output rails are applied. Defaults to 10.
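The four steps listed above can be sketched as a standalone async generator. This is one reading of the description, not the library's implementation: `rolling_batches` and `demo` are hypothetical names, the threshold is counted in newly arrived chunks, and each batch is a plain `(processing_buffer, new_chunks)` tuple.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


async def rolling_batches(
    stream: AsyncIterator[str],
    context_size: int = 5,
    chunk_size: int = 10,
) -> AsyncIterator[Tuple[List[str], List[str]]]:
    buffer: List[str] = []   # retained context followed by new chunks
    new_since_yield = 0      # chunks received since the last batch

    async for chunk in stream:
        buffer.append(chunk)
        new_since_yield += 1
        if new_since_yield == chunk_size:
            # Yield the processing buffer (with context) and the new chunks.
            yield list(buffer), buffer[-new_since_yield:]
            # Retain only the trailing context for the next round.
            buffer = buffer[-context_size:] if context_size > 0 else []
            new_since_yield = 0

    if new_since_yield:
        # End of stream: flush the remaining chunks with their context.
        yield list(buffer), buffer[-new_since_yield:]


async def demo() -> List[Tuple[List[str], List[str]]]:
    async def tokens():
        for t in "abcdefg":
            yield t

    return [pair async for pair in rolling_batches(tokens(), context_size=2, chunk_size=3)]
```

With a context size of 2 and a chunk size of 3, the second batch in `demo()` carries the last two tokens of the first batch ahead of its three new tokens, so a rail that inspects the processing buffer sees text that spans the batch boundary.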
Generate string representation of chunks preserving original token format.
The RollingBuffer strategy preserves the original token format by joining chunks without modification, maintaining spaces and formatting as they appeared in the original LLM output.
Parameters:
List of chunk tokens to be formatted.
Returns: str
String representation preserving original token spacing and format.
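A minimal sketch of format-preserving joining, assuming that "without modification" means plain concatenation with an empty separator. The standalone `format_chunks` function below is a hypothetical illustration, not the library method.

```python
from typing import List


def format_chunks(chunks: List[str]) -> str:
    # Concatenate as-is: tokens already carry their own leading spaces.
    return "".join(chunks)


tokens = ["Hello", ",", " world", "!"]
formatted = format_chunks(tokens)

# Joining with a separator would alter the original spacing.
spaced = " ".join(tokens)
```

Because LLM tokens usually include their own whitespace, inserting a separator would add spaces that were never in the model output.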
Create a RollingBuffer instance from a streaming configuration.
Parameters:
Configuration object containing context_size and chunk_size parameters.
Returns: RollingBuffer
A new RollingBuffer instance configured with the provided parameters.
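The configuration-to-instance mapping might look like the sketch below. `SketchStreamingConfig` and `SketchRollingBuffer` are hypothetical stand-ins: only the `context_size` and `chunk_size` field names come from the description above, and the constructor parameter names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SketchStreamingConfig:
    # Hypothetical config object; defaults mirror the documented ones.
    context_size: int = 5
    chunk_size: int = 10


class SketchRollingBuffer:
    def __init__(self, context_size: int = 5, chunk_size: int = 10):
        # Constructor parameter names are assumptions for this example.
        self.context_size = context_size
        self.chunk_size = chunk_size

    @classmethod
    def from_config(cls, config: SketchStreamingConfig) -> "SketchRollingBuffer":
        # Copy the two sizes from the config into a new instance.
        return cls(context_size=config.context_size, chunk_size=config.chunk_size)


default_buffer = SketchRollingBuffer.from_config(SketchStreamingConfig())
custom_buffer = SketchRollingBuffer.from_config(
    SketchStreamingConfig(context_size=20, chunk_size=100)
)
```

A classmethod constructor keeps callers independent of the constructor signature, so the configuration schema and the buffer class can evolve separately.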
Process streaming chunks using rolling buffer strategy.
This method implements the rolling buffer logic, accumulating chunks and yielding them in batches with context for output rails processing. The buffer maintains a sliding window of context tokens for continuity.
Parameters:
An async iterator that yields individual string chunks from the LLM stream.
Create a buffer strategy from the given configuration.
Parameters:
Configuration object specifying the buffer strategy parameters.
Returns: BufferStrategy
A configured buffer strategy instance. Currently returns a RollingBuffer instance.