aiq.retriever.nemo_retriever.retriever#
Attributes#
Exceptions#
Common base class for all non-exit exceptions.
Classes#
Client for retrieving document chunks from a Nemo Retriever service.
Abstract base class for a Document retrieval system.
Functions#
Module Contents#
- logger#
Bases: aiq.retriever.models.RetrieverError
Common base class for all non-exit exceptions.
Initialize self. See help(type(self)) for accurate signature.
- class NemoRetriever( )#
Bases: aiq.retriever.interface.AIQRetriever
Client for retrieving document chunks from a Nemo Retriever service.
- base_url = ''#
- timeout = 60#
- _search_func#
- api_key#
- _bound_params = []#
- bind(**kwargs) → None#
Bind default values to the search method. Cannot bind the ‘query’ parameter.
- Args:
kwargs (dict): Key value pairs corresponding to the default values of search parameters.
- get_unbound_params() → list[str]#
Returns a list of unbound parameters which will need to be passed to the search function.
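The relationship between bind and get_unbound_params can be illustrated with a minimal sketch. This is not the NemoRetriever implementation: the class BindableSearch, its placeholder _search method, and the parameter names collection_name and top_k are assumptions made for illustration, and the use of functools.partial is one possible way to hold bound defaults.

```python
import functools
import inspect


class BindableSearch:
    """Hypothetical stand-in; not the real NemoRetriever implementation."""

    def __init__(self):
        self._search_func = self._search
        self._bound_params = []

    def _search(self, query, collection_name, top_k):
        # Placeholder search: echoes its arguments instead of calling a service.
        return {"query": query, "collection_name": collection_name, "top_k": top_k}

    def bind(self, **kwargs) -> None:
        # 'query' must be supplied on every call, so it cannot be given a default.
        if "query" in kwargs:
            raise ValueError("Cannot bind the 'query' parameter.")
        self._search_func = functools.partial(self._search_func, **kwargs)
        self._bound_params.extend(k for k in kwargs if k not in self._bound_params)

    def get_unbound_params(self) -> list[str]:
        # Parameters of the search function that have no bound default yet.
        params = inspect.signature(self._search).parameters
        return [name for name in params if name not in self._bound_params]

    def search(self, query, **kwargs):
        return self._search_func(query=query, **kwargs)
```

In this sketch, calling bind(top_k=3) leaves query and collection_name unbound, so a caller only has to pass those two to search.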
- async get_collections(client) → list[Collection]#
Get a list of all available collections as pydantic Collection objects.
- async get_collection_by_name(collection_name, client) → Collection#
Retrieve a collection using its name. Returns the first collection found if the name is ambiguous.
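The "first collection found" behaviour for ambiguous names can be sketched as follows. The Collection dataclass and the first_by_name helper are hypothetical stand-ins, and returning None when nothing matches is an assumption; the documentation above does not say what the real method does in that case.

```python
from dataclasses import dataclass


@dataclass
class Collection:
    """Hypothetical stand-in for the pydantic Collection model."""
    id: str
    name: str


def first_by_name(collections, collection_name):
    # next() stops at the first match, so later duplicates are ignored.
    # Returning None for "no match" is an assumption of this sketch.
    return next((c for c in collections if c.name == collection_name), None)
```

If two collections share a name, the result depends on the order in which the collections are listed, so callers that need a specific one should disambiguate by another field.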
- class NemoLangchainRetriever(/, **data: Any)#
Bases: langchain_core.retrievers.BaseRetriever, pydantic.BaseModel
Abstract base class for a Document retrieval system.
A retrieval system is defined as something that can take string queries and return the most ‘relevant’ Documents from some source.
Usage:
A retriever follows the standard Runnable interface and should be used via the standard Runnable methods invoke, ainvoke, batch, and abatch.
Implementation:
When implementing a custom retriever, the class should implement the _get_relevant_documents method to define the logic for retrieving documents.
Optionally, an async native implementation can be provided by overriding the _aget_relevant_documents method.
Example: A retriever that returns the first 5 documents from a list of documents
    from langchain_core.documents import Document
    from langchain_core.retrievers import BaseRetriever
    from typing import List

    class SimpleRetriever(BaseRetriever):
        docs: List[Document]
        k: int = 5

        def _get_relevant_documents(self, query: str) -> List[Document]:
            """Return the first k documents from the list of documents"""
            return self.docs[:self.k]

        async def _aget_relevant_documents(self, query: str) -> List[Document]:
            """(Optional) async native implementation."""
            return self.docs[:self.k]
Example: A simple retriever based on a scikit-learn vectorizer
    from typing import Any, List

    from langchain_core.documents import Document
    from langchain_core.retrievers import BaseRetriever
    from pydantic import BaseModel
    from sklearn.metrics.pairwise import cosine_similarity

    class TFIDFRetriever(BaseRetriever, BaseModel):
        vectorizer: Any
        docs: List[Document]
        tfidf_array: Any
        k: int = 4

        class Config:
            arbitrary_types_allowed = True

        def _get_relevant_documents(self, query: str) -> List[Document]:
            # Ip -- (n_docs,x), Op -- (n_docs,n_Feats)
            query_vec = self.vectorizer.transform([query])
            # Op -- (n_docs,1) -- Cosine Sim with each doc
            results = cosine_similarity(self.tfidf_array, query_vec).reshape((-1,))
            # Indices of the k highest similarities, best match first
            return [self.docs[i] for i in results.argsort()[-self.k :][::-1]]
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- client: NemoRetriever#
- abstractmethod _get_relevant_documents(query, *, run_manager, **kwargs)#
Get documents relevant to a query.
- Args:
query: String to find relevant documents for.
run_manager: The callback handler to use.
- Returns:
List of relevant documents.
- async _aget_relevant_documents(query, *, run_manager, **kwargs)#
Asynchronously get documents relevant to a query.
- Args:
query: String to find relevant documents for.
run_manager: The callback handler to use.
- Returns:
List of relevant documents
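The general shape of a retriever that holds a client and fetches documents asynchronously can be sketched with the standard library alone. Everything here is hypothetical: Doc stands in for langchain_core.documents.Document, FakeClient and its search method stand in for a real retrieval client, and the result keys text and source are invented for illustration. The sketch shows the adapter pattern only and does not describe how NemoLangchainRetriever is implemented.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class FakeClient:
    """Hypothetical client; a real one would call a retrieval service."""

    async def search(self, query):
        return [{"text": f"chunk about {query}", "source": "demo"}]


class AdapterRetriever:
    def __init__(self, client):
        self.client = client

    async def _aget_relevant_documents(self, query):
        # Await the client, then convert raw results into document objects.
        results = await self.client.search(query)
        return [Doc(r["text"], {"source": r["source"]}) for r in results]


docs = asyncio.run(AdapterRetriever(FakeClient())._aget_relevant_documents("gpus"))
```

Keeping the conversion step inside the retriever means the client can stay unaware of the document type used by the calling framework.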