---
layout: overview
slug: nemo-curator/nemo_curator/models/client/llm_client
title: nemo_curator.models.client.llm_client
---
## Module Contents
### Classes
| Name | Description |
| --------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| [`AsyncLLMClient`](#nemo_curator-models-client-llm_client-AsyncLLMClient) | Interface for a client that connects to an LLM inference server and makes requests asynchronously. |
| [`ConversationFormatter`](#nemo_curator-models-client-llm_client-ConversationFormatter) | Represents a way of formatting a conversation with an LLM so that it can respond appropriately. |
| [`GenerationConfig`](#nemo_curator-models-client-llm_client-GenerationConfig) | Configuration class for LLM generation parameters. |
| [`LLMClient`](#nemo_curator-models-client-llm_client-LLMClient) | Interface for a client that connects to an LLM inference server and makes requests synchronously. |
### API
```python
class nemo_curator.models.client.llm_client.AsyncLLMClient(
max_concurrent_requests: int = 5,
max_retries: int = 3,
base_delay: float = 1.0
)
```
Abstract
Interface for a client that connects to an LLM inference server
and makes requests asynchronously.
```python
nemo_curator.models.client.llm_client.AsyncLLMClient._query_model_impl(
messages: collections.abc.Iterable,
model: str,
conversation_formatter: nemo_curator.models.client.llm_client.ConversationFormatter | None = None,
generation_config: nemo_curator.models.client.llm_client.GenerationConfig | dict | None = None
) -> list[str]
```
async
abstract
Internal implementation of `query_model` without the retry/concurrency logic.
Subclasses should implement this method instead of `query_model`.
```python
nemo_curator.models.client.llm_client.AsyncLLMClient.query_model(
messages: collections.abc.Iterable,
model: str,
conversation_formatter: nemo_curator.models.client.llm_client.ConversationFormatter | None = None,
generation_config: nemo_curator.models.client.llm_client.GenerationConfig | dict | None = None
) -> list[str]
```
async
Query the model with automatic retry and concurrency control.
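The wrapper's behavior can be pictured with a self-contained sketch (the class and method bodies below are illustrative stand-ins, not NeMo Curator's actual implementation): an `asyncio.Semaphore` caps in-flight requests at `max_concurrent_requests`, and each request is retried up to `max_retries` times with exponential backoff starting at `base_delay` seconds.

```python
import asyncio


class SketchAsyncClient:
    """Illustrative stand-in for AsyncLLMClient's retry/concurrency wrapper."""

    def __init__(self, max_concurrent_requests: int = 5, max_retries: int = 3, base_delay: float = 1.0):
        self._semaphore = asyncio.Semaphore(max_concurrent_requests)
        self.max_retries = max_retries
        self.base_delay = base_delay

    async def _query_model_impl(self, messages, model: str) -> list[str]:
        # A real subclass would call the inference server here; we fake a reply.
        return [f"reply from {model}"]

    async def query_model(self, messages, model: str) -> list[str]:
        async with self._semaphore:  # concurrency control
            for attempt in range(self.max_retries + 1):
                try:
                    return await self._query_model_impl(messages, model)
                except Exception:
                    if attempt == self.max_retries:
                        raise
                    # Exponential backoff: base_delay * 2**attempt seconds.
                    await asyncio.sleep(self.base_delay * 2**attempt)


async def main() -> list[str]:
    client = SketchAsyncClient()
    return await client.query_model([{"role": "user", "content": "hi"}], "my-model")


print(asyncio.run(main()))
```

Because subclasses override only `_query_model_impl`, every concrete client inherits the same retry and concurrency policy for free.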
```python
nemo_curator.models.client.llm_client.AsyncLLMClient.setup() -> None
```
abstract
Set up the client.
```python
class nemo_curator.models.client.llm_client.ConversationFormatter()
```
Abstract
Represents a way of formatting a conversation with an LLM
so that it can respond appropriately.
```python
nemo_curator.models.client.llm_client.ConversationFormatter.format_conversation(
conv: list[dict]
) -> str
```
abstract
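As a rough illustration of the contract, a hypothetical formatter (the class name and template below are inventions for this sketch, not part of NeMo Curator) turns a list of role/content dicts into a single prompt string:

```python
class SimpleChatFormatter:
    """Hypothetical ConversationFormatter: renders role/content turns as plain text."""

    def format_conversation(self, conv: list[dict]) -> str:
        lines = [f"{turn['role']}: {turn['content']}" for turn in conv]
        lines.append("assistant:")  # cue the model to produce the next turn
        return "\n".join(lines)


conv = [
    {"role": "system", "content": "You are concise."},
    {"role": "user", "content": "Hello!"},
]
prompt = SimpleChatFormatter().format_conversation(conv)
print(prompt)
```

A real formatter would emit whatever chat template the target model was trained on.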
```python
class nemo_curator.models.client.llm_client.GenerationConfig(
max_tokens: int | None = 2048,
n: int | None = 1,
seed: int | None = 0,
stop: str | None | list[str] = None,
stream: bool = False,
temperature: float | None = 0.0,
top_k: int | None = None,
top_p: float | None = 0.95,
extra_kwargs: dict | None = None
)
```
Dataclass
Configuration class for LLM generation parameters.
```python
class nemo_curator.models.client.llm_client.LLMClient()
```
Abstract
Interface for a client that connects to an LLM inference server
and makes requests synchronously.
```python
nemo_curator.models.client.llm_client.LLMClient.query_model(
messages: collections.abc.Iterable,
model: str,
conversation_formatter: nemo_curator.models.client.llm_client.ConversationFormatter | None = None,
generation_config: nemo_curator.models.client.llm_client.GenerationConfig | dict | None = None
) -> list[str]
```
abstract
```python
nemo_curator.models.client.llm_client.LLMClient.setup() -> None
```
abstract
Set up the client.
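A minimal synchronous client following this interface might look like the hypothetical sketch below (the class name, `ready` attribute, and echo behavior are inventions for illustration; a real subclass would talk to an inference server):

```python
class EchoLLMClient:
    """Hypothetical synchronous client following the LLMClient interface."""

    def setup(self) -> None:
        # A real client would open an HTTP session or load credentials here.
        self.ready = True

    def query_model(self, messages, model: str) -> list[str]:
        # A real client would send `messages` to the inference server;
        # this sketch just echoes the last message's content.
        last = list(messages)[-1]["content"]
        return [f"{model} echoes: {last}"]


client = EchoLLMClient()
client.setup()
print(client.query_model([{"role": "user", "content": "ping"}], "demo-model"))
```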