nemoguardrails.embeddings.basic

Module Contents

Classes

Name                  Description
BasicEmbeddingsIndex  Basic implementation of an embeddings index.

Data

EmbeddingMatrix

log

API

class nemoguardrails.embeddings.basic.BasicEmbeddingsIndex(
embedding_model: str = 'sentence-transformers/all-...,
embedding_engine: str = 'SentenceTransformers',
embedding_params: typing.Optional[typing.Dict[str, typing.Any]] = None,
index: typing.Optional[nemoguardrails.embeddings.basic.EmbeddingMatrix] = None,
cache_config: typing.Optional[typing.Union[nemoguardrails.rails.llm.config.EmbeddingsCacheConfig, typing.Dict[str, typing.Any]]] = None,
search_threshold: float = float('inf'),
use_batching: bool = False,
max_batch_size: int = 10,
max_batch_hold: float = 0.01
)

Bases: EmbeddingsIndex

Basic implementation of an embeddings index.

By default, it uses the sentence-transformers/all-MiniLM-L6-v2 model to compute embeddings; embedding_model and embedding_engine select a different one. Search is an exact cosine nearest-neighbor lookup over a NumPy matrix of L2-normalized embeddings, so results involve no approximation.

_cache_config
= EmbeddingsCacheConfig(**cache_config)
_current_batch_finished_event
Optional[Event] = None
_current_batch_full_event
Optional[Event] = None
_current_batch_submitted
Event = asyncio.Event()
_embedding_size
= 0
_embeddings
List[List[float]] = []
_index
Optional[EmbeddingMatrix] = None
_items
List[IndexItem] = []
_model
Optional[EmbeddingModel] = None
_req_idx
int = 0
_req_queue
Dict[int, str] = {}
_req_results
Dict[int, List[float]] = {}
cache_config

Get the cache configuration.

embedding_params
= embedding_params or {}
embedding_size

Get the size of the embeddings.

embeddings

Get the computed embeddings.

embeddings_index
Optional[EmbeddingMatrix]

Get the current embeddings index.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._batch_get_embeddings(
text: str
) -> typing.List[float]
async

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._get_embeddings(
texts: typing.List[str]
) -> typing.List[typing.List[float]]
async

Compute embeddings for a list of texts.

Parameters:

texts
List[str]

The list of texts to compute embeddings for.

Returns: List[List[float]]

List[List[float]]: The computed embeddings.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._init_model()

Initialize the model used for computing the embeddings.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._run_batch()
async

Runs the current batch of embeddings.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._validate_index(
index: typing.Any,
path: typing.Optional[str] = None
) -> nemoguardrails.embeddings.basic.EmbeddingMatrix
staticmethod

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.add_item(
item: nemoguardrails.embeddings.index.IndexItem
)
async

Add a single item to the index.

Parameters:

item
IndexItem

The item to add to the index.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.add_items(
items: typing.List[nemoguardrails.embeddings.index.IndexItem]
)
async

Add multiple items to the index at once.

Parameters:

items
List[IndexItem]

The list of items to add to the index.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.build()
async

Builds the embeddings index.

Stores an L2-normalized float32 matrix of the computed embeddings. Because every row has unit length, the dot product between a normalized query and a row equals their cosine similarity. search ranks by this exact cosine value and converts it to a score compatible with the previous Annoy-based index for thresholding.
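The normalization step described above can be sketched with plain NumPy. This is an illustration only, not the library's code: normalize_rows and cosine_to_angular are hypothetical helper names, and the score conversion assumes Annoy's "angular" distance, sqrt(2 * (1 - cosine)), which this page does not spell out.

```python
import numpy as np


def normalize_rows(vectors: np.ndarray) -> np.ndarray:
    """Return a float32 copy of `vectors` with every row scaled to unit L2 norm."""
    matrix = np.asarray(vectors, dtype=np.float32)
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    norms[norms == 0] = 1.0
    return matrix / norms


def cosine_to_angular(cosine: np.ndarray) -> np.ndarray:
    """Assumed conversion: Annoy's "angular" distance, sqrt(2 * (1 - cosine))."""
    # Clip tiny negative values caused by float32 rounding before the sqrt.
    return np.sqrt(np.clip(2.0 * (1.0 - cosine), 0.0, None))


# Three toy "embeddings" (hypothetical data) and a query pointing the same
# way as the first row.
raw = np.array([[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]])
index_matrix = normalize_rows(raw)
query = normalize_rows(np.array([[6.0, 8.0]]))[0]

# With unit-length rows and a unit-length query, a single matrix-vector
# product yields the cosine similarity to every stored row.
cosines = index_matrix @ query
distances = cosine_to_angular(cosines)
```

Under this assumed conversion, a higher cosine maps to a smaller distance, so a distance-style threshold keeps its meaning even though ranking happens on the cosine value.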

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.load(
path: str
) -> None

Restore a previously persisted index from disk.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.save(
path: str
) -> None

Persist the built index to disk as a NumPy .npy file.
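For reference, a round trip through the .npy format looks roughly like the following. It uses np.save and np.load directly on a hypothetical matrix and temporary path; whether save and load perform extra validation or store anything besides the matrix is not stated here, so treat this as a sketch of the file format only.

```python
import os
import tempfile

import numpy as np

# A tiny stand-in for a built index: two L2-normalized float32 rows.
matrix = np.array([[0.6, 0.8], [1.0, 0.0]], dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "index.npy")  # hypothetical file name
    np.save(path, matrix)
    # allow_pickle=False makes NumPy refuse pickled object arrays, which is
    # a sensible default when the file is expected to hold plain floats.
    restored = np.load(path, allow_pickle=False)

# The .npy format preserves shape and dtype, so the values match exactly.
same_values = np.array_equal(matrix, restored)
```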

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.search(
text: str,
max_results: int = 20,
threshold: typing.Optional[float] = None
) -> typing.List[nemoguardrails.embeddings.index.IndexItem]
async

Search for the max_results items closest to the given text.

Parameters:

text
str

The text to search for.

max_results
int, defaults to 20

The maximum number of results to return. Defaults to 20.

Returns: List[IndexItem]

List[IndexItem]: The closest items found.
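The ranking that search performs can be approximated by the sketch below. It is not the library's implementation: top_k is a hypothetical function, the score is assumed to be an Annoy-style angular distance, and threshold is assumed to be an upper bound on that distance (consistent with the float('inf') default of search_threshold, which would then disable filtering).

```python
from typing import List, Optional, Tuple

import numpy as np


def top_k(
    matrix: np.ndarray,
    query: np.ndarray,
    max_results: int = 20,
    threshold: Optional[float] = None,
) -> List[Tuple[int, float]]:
    """Return (row index, angular distance) pairs, closest first.

    Assumes the rows of `matrix` and `query` are already L2-normalized.
    """
    cosines = matrix @ query
    # Highest cosine first; a stable sort keeps insertion order on ties.
    order = np.argsort(-cosines, kind="stable")[:max_results]
    results = []
    for i in order:
        # Assumed score: Annoy-style angular distance sqrt(2 * (1 - cosine)).
        distance = float(np.sqrt(max(0.0, 2.0 * (1.0 - float(cosines[i])))))
        if threshold is not None and distance > threshold:
            continue  # too far from the query under the assumed semantics
        results.append((int(i), distance))
    return results


# Hypothetical normalized index with three rows and a query equal to row 0.
matrix = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]], dtype=np.float32)
query = np.array([1.0, 0.0], dtype=np.float32)

hits = top_k(matrix, query, max_results=2)
close_hits = top_k(matrix, query, threshold=0.5)
```

Because every row is compared against the query, the result is exact; the cost is one matrix-vector product per search, which grows linearly with the number of indexed items.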

nemoguardrails.embeddings.basic.EmbeddingMatrix = NDArray[np.float32]
nemoguardrails.embeddings.basic.log = logging.getLogger(__name__)