nemoguardrails.embeddings.basic | NVIDIA NeMo Guardrails Library Developer Guide

Module Contents

Classes

Name	Description
`BasicEmbeddingsIndex`	Basic implementation of an embeddings index.

Data

EmbeddingMatrix

log

API

class nemoguardrails.embeddings.basic.BasicEmbeddingsIndex(
    embedding_model: str = 'sentence-transformers/all-...,
    embedding_engine: str = 'SentenceTransformers',
    embedding_params: typing.Optional[typing.Dict[str, typing.Any]] = None,
    index: typing.Optional[nemoguardrails.embeddings.basic.EmbeddingMatrix] = None,
    cache_config: typing.Optional[typing.Union[nemoguardrails.rails.llm.config.EmbeddingsCacheConfig, typing.Dict[str, typing.Any]]] = None,
    search_threshold: float = float('inf'),
    use_batching: bool = False,
    max_batch_size: int = 10,
    max_batch_hold: float = 0.01
)

Bases: EmbeddingsIndex

Basic implementation of an embeddings index.

By default, it uses the sentence-transformers/all-MiniLM-L6-v2 model to compute embeddings. Search is an exact cosine nearest-neighbor lookup over a NumPy matrix of L2-normalized embeddings, so results involve no approximation.
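To make the search strategy concrete, here is a minimal NumPy sketch of exact cosine top-k search over L2-normalized rows. The helper names `normalize_rows` and `top_k_cosine` are hypothetical and are not part of the `nemoguardrails` API; the library's own implementation may differ in its details.

```python
import numpy as np


def normalize_rows(matrix):
    """Return a float32 copy of `matrix` with every row scaled to unit L2 norm."""
    m = np.asarray(matrix, dtype=np.float32)
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows untouched instead of dividing by zero
    return m / norms


def top_k_cosine(matrix, query, k):
    """Exact search: score every row against the query and keep the k best.

    `matrix` is assumed to be row-normalized already (hypothetical helper).
    """
    q = np.asarray(query, dtype=np.float32)
    q = q / max(np.linalg.norm(q), 1e-12)
    sims = matrix @ q                 # dot product of unit vectors = cosine similarity
    order = np.argsort(-sims)[:k]     # indices sorted by descending similarity
    return [(int(i), float(sims[i])) for i in order]


rows = normalize_rows([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
hits = top_k_cosine(rows, [2.0, 0.1], k=2)
```

Because every row is scored, the cost of a query grows linearly with the number of indexed items, which is the usual trade-off of exact search against approximate indexes.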

_cache_config

= EmbeddingsCacheConfig(**cache_config)

_current_batch_finished_event

Optional[Event] = None

_current_batch_full_event

Optional[Event] = None

_current_batch_submitted

Event = asyncio.Event()

_embedding_size

= 0

_embeddings

List[List[float]] = []

_index

Optional[EmbeddingMatrix] = None

_items

List[IndexItem] = []

_model

Optional[EmbeddingModel] = None

_req_idx

int = 0

_req_queue

Dict[int, str] = {}

_req_results

Dict[int, List[float]] = {}

cache_config

Get the cache configuration.

embedding_params

= embedding_params or {}

embedding_size

Get the size of the embeddings.

embeddings

Get the computed embeddings.

embeddings_index

Optional[EmbeddingMatrix]

Get the current embedding index.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._batch_get_embeddings(
    text: str
) -> typing.List[float]

async

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._get_embeddings(
    texts: typing.List[str]
) -> typing.List[typing.List[float]]

async

Compute embeddings for a list of texts.

Parameters:

texts

List[str]

The list of texts to compute embeddings for.

Returns: List[List[float]]

List[List[float]]: The computed embeddings.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._init_model()

Initialize the model used for computing the embeddings.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._run_batch()

async

Runs the current batch of embeddings.
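This reference does not describe the batching logic itself. The following is a generic micro-batching sketch with `asyncio`, written under the assumption that `max_batch_size` caps the number of texts per batch and `max_batch_hold` is the longest time, in seconds, that a partial batch waits before it runs. The `MicroBatcher` class and its `_embed` stand-in are hypothetical and do not reproduce the library's internals.

```python
import asyncio
from typing import List, Optional


class MicroBatcher:
    """Hypothetical helper that groups single-text requests into batches."""

    def __init__(self, max_batch_size: int = 10, max_batch_hold: float = 0.01):
        self.max_batch_size = max_batch_size
        self.max_batch_hold = max_batch_hold
        self._pending: list = []                      # (text, future) pairs
        self._timer: Optional[asyncio.Task] = None
        self.batch_sizes: List[int] = []              # recorded for inspection only

    async def _embed(self, texts: List[str]) -> List[List[float]]:
        # Stand-in for a real embedding model: one number per text.
        await asyncio.sleep(0)
        return [[float(len(t))] for t in texts]

    async def get(self, text: str) -> List[float]:
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((text, fut))
        if len(self._pending) >= self.max_batch_size:
            # Batch is full: stop the hold timer and run immediately.
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
            await self._run_batch()
        elif self._timer is None:
            # First request of a new batch: start the hold timer.
            self._timer = asyncio.create_task(self._hold_then_run())
        return await fut

    async def _hold_then_run(self) -> None:
        await asyncio.sleep(self.max_batch_hold)
        self._timer = None
        await self._run_batch()

    async def _run_batch(self) -> None:
        # Swap the pending list out before awaiting so new requests start a new batch.
        batch, self._pending = self._pending, []
        if not batch:
            return
        self.batch_sizes.append(len(batch))
        vectors = await self._embed([text for text, _ in batch])
        for (_, fut), vec in zip(batch, vectors):
            fut.set_result(vec)


async def demo():
    batcher = MicroBatcher(max_batch_size=10, max_batch_hold=0.01)
    texts = [f"text-{i}" for i in range(25)]
    results = await asyncio.gather(*(batcher.get(t) for t in texts))
    return results, batcher.batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo())[1])
```

The design choice here is a trade-off: a larger hold window yields fuller batches and fewer model calls, at the cost of added latency for the first request in each batch.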

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex._validate_index(
    index: typing.Any,
    path: typing.Optional[str] = None
) -> nemoguardrails.embeddings.basic.EmbeddingMatrix

staticmethod

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.add_item(
    item: nemoguardrails.embeddings.index.IndexItem
)

async

Add a single item to the index.

Parameters:

item

IndexItem

The item to add to the index.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.add_items(
    items: typing.List[nemoguardrails.embeddings.index.IndexItem]
)

async

Add multiple items to the index at once.

Parameters:

items

List[IndexItem]

The list of items to add to the index.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.build()

async

Builds the embeddings index.

Stores an L2-normalized float32 matrix of the computed embeddings. Because rows are normalized, the dot product between a normalized query and a row equals their cosine similarity. `search` ranks items by this exact cosine value, then converts it to the score scale used by the previous Annoy-based implementation before applying the threshold.
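For context on the Annoy-compatible score, Annoy documents its angular distance as sqrt(2 - 2·cos(u, v)), which is the Euclidean distance between the two unit vectors. The sketch below checks that identity numerically. The function name `annoy_angular_distance` is hypothetical, and how the library maps this distance onto the value compared against the threshold is not specified in this reference, so that step is deliberately left out.

```python
import numpy as np


def annoy_angular_distance(cosine: float) -> float:
    """Convert a cosine similarity to Annoy's angular distance, sqrt(2 - 2*cos)."""
    c = float(np.clip(cosine, -1.0, 1.0))  # guard against rounding slightly outside [-1, 1]
    return float(np.sqrt(2.0 - 2.0 * c))


u = np.array([3.0, 4.0])
v = np.array([4.0, 3.0])
u_hat = u / np.linalg.norm(u)
v_hat = v / np.linalg.norm(v)

cosine = float(u_hat @ v_hat)                     # dot product of unit vectors
distance = annoy_angular_distance(cosine)         # distance derived from the cosine
euclidean = float(np.linalg.norm(u_hat - v_hat))  # direct distance between unit vectors
```

Since the conversion is monotonic in the cosine value, ranking by cosine and ranking by this distance produce the same ordering; only the scale used for thresholding changes.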

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.load(
    path: str
) -> None

Restore a previously persisted index from disk.

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.save(
    path: str
) -> None

Persist the built index to disk as a NumPy .npy file.
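The save and load pair can be pictured as a plain NumPy round trip. This is a sketch assuming the persisted artifact is a single two-dimensional float32 array. The helpers `save_matrix` and `load_matrix` are hypothetical, and the shape and dtype check only illustrates the kind of validation a loader might perform; it is not taken from `_validate_index`.

```python
import os
import tempfile

import numpy as np


def save_matrix(matrix, path: str) -> None:
    """Write the matrix to `path` in NumPy's .npy format (hypothetical helper)."""
    np.save(path, np.asarray(matrix, dtype=np.float32))


def load_matrix(path: str) -> np.ndarray:
    """Read a .npy file back and check that it looks like an embedding matrix."""
    matrix = np.load(path, allow_pickle=False)  # refuse pickled payloads
    if matrix.ndim != 2 or matrix.dtype != np.float32:
        raise ValueError(f"Unexpected index shape or dtype in {path}")
    return matrix


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.npy")
    original = np.array([[0.6, 0.8], [1.0, 0.0]], dtype=np.float32)
    save_matrix(original, path)
    restored = load_matrix(path)
```

Note that `np.save` appends the `.npy` suffix when the given path lacks one, so passing a path that already ends in `.npy` keeps the save and load paths identical.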

nemoguardrails.embeddings.basic.BasicEmbeddingsIndex.search(
    text: str,
    max_results: int = 20,
    threshold: typing.Optional[float] = None
) -> typing.List[nemoguardrails.embeddings.index.IndexItem]

async

Search for the closest max_results items.

Parameters:

text

str

The text to search for.

max_results

int, defaults to 20

The maximum number of results to return. Defaults to 20.

Returns: List[IndexItem]

List[IndexItem]: The closest items found.

nemoguardrails.embeddings.basic.EmbeddingMatrix = NDArray[np.float32]

nemoguardrails.embeddings.basic.log = logging.getLogger(__name__)