---
layout: overview
slug: nemo-curator/nemo_curator/models/client/llm_client
title: nemo_curator.models.client.llm_client
---
## Module Contents
### Classes
| Name | Description |
| --------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| [`AsyncLLMClient`](#nemo_curator-models-client-llm_client-AsyncLLMClient) | Interface for a client that connects to an LLM inference server and makes requests asynchronously. |
| [`ConversationFormatter`](#nemo_curator-models-client-llm_client-ConversationFormatter) | Represents a way of formatting a conversation with an LLM so that it can respond appropriately. |
| [`GenerationConfig`](#nemo_curator-models-client-llm_client-GenerationConfig) | Configuration class for LLM generation parameters. |
| [`LLMClient`](#nemo_curator-models-client-llm_client-LLMClient) | Interface for a client that connects to an LLM inference server and makes requests synchronously. |
### API
```python
class nemo_curator.models.client.llm_client.AsyncLLMClient(
max_concurrent_requests: int = 5,
max_retries: int = 3,
base_delay: float = 1.0
)
```
Abstract
Interface for a client that connects to an LLM inference server
and makes requests asynchronously.
```python
nemo_curator.models.client.llm_client.AsyncLLMClient._query_model_impl(
messages: collections.abc.Iterable,
model: str,
conversation_formatter: nemo_curator.models.client.llm_client.ConversationFormatter | None = None,
generation_config: nemo_curator.models.client.llm_client.GenerationConfig | dict | None = None
) -> list[str]
```
async
abstract
Internal implementation of `query_model` without the retry/concurrency logic.
Subclasses should implement this method instead of `query_model`.
```python
nemo_curator.models.client.llm_client.AsyncLLMClient.query_model(
messages: collections.abc.Iterable,
model: str,
conversation_formatter: nemo_curator.models.client.llm_client.ConversationFormatter | None = None,
generation_config: nemo_curator.models.client.llm_client.GenerationConfig | dict | None = None
) -> list[str]
```
async
Query the model with automatic retry and concurrency control.
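The wrapper's behavior can be pictured with a self-contained sketch (the class and method bodies below are illustrative stand-ins, not NeMo Curator's actual implementation): an `asyncio.Semaphore` caps in-flight requests at `max_concurrent_requests`, and each request is retried up to `max_retries` times with exponential backoff starting at `base_delay` seconds.

```python
import asyncio


class SketchAsyncClient:
    """Illustrative stand-in for AsyncLLMClient's retry/concurrency wrapper."""

    def __init__(self, max_concurrent_requests: int = 5, max_retries: int = 3, base_delay: float = 1.0):
        self._semaphore = asyncio.Semaphore(max_concurrent_requests)
        self.max_retries = max_retries
        self.base_delay = base_delay

    async def _query_model_impl(self, messages, model: str) -> list[str]:
        # A real subclass would call the inference server here; we fake a reply.
        return [f"reply from {model}"]

    async def query_model(self, messages, model: str) -> list[str]:
        async with self._semaphore:  # concurrency control
            for attempt in range(self.max_retries + 1):
                try:
                    return await self._query_model_impl(messages, model)
                except Exception:
                    if attempt == self.max_retries:
                        raise
                    # Exponential backoff: base_delay * 2**attempt seconds.
                    await asyncio.sleep(self.base_delay * 2**attempt)


async def main() -> list[str]:
    client = SketchAsyncClient()
    return await client.query_model([{"role": "user", "content": "hi"}], "my-model")


print(asyncio.run(main()))
```

Because subclasses override only `_query_model_impl`, every concrete client inherits the same retry and concurrency policy for free.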
```python
nemo_curator.models.client.llm_client.AsyncLLMClient.setup() -> None
```
abstract
Set up the client.
```python
class nemo_curator.models.client.llm_client.ConversationFormatter()
```
Abstract
Represents a way of formatting a conversation with an LLM
so that it can respond appropriately.
```python
nemo_curator.models.client.llm_client.ConversationFormatter.format_conversation(
conv: list[dict]
) -> str
```
abstract
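As a rough illustration of the contract, a hypothetical formatter (the class name and template below are inventions for this sketch, not part of NeMo Curator) turns a list of role/content dicts into a single prompt string:

```python
class SimpleChatFormatter:
    """Hypothetical ConversationFormatter: renders role/content turns as plain text."""

    def format_conversation(self, conv: list[dict]) -> str:
        lines = [f"{turn['role']}: {turn['content']}" for turn in conv]
        lines.append("assistant:")  # cue the model to produce the next turn
        return "\n".join(lines)


conv = [
    {"role": "system", "content": "You are concise."},
    {"role": "user", "content": "Hello!"},
]
prompt = SimpleChatFormatter().format_conversation(conv)
print(prompt)
```

A real formatter would emit whatever chat template the target model was trained on.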
```python
class nemo_curator.models.client.llm_client.GenerationConfig(
max_tokens: int | None = 2048,
n: int | None = 1,
seed: int | None = 0,
stop: str | None | list[str] = None,
stream: bool = False,
temperature: float | None = 0.0,
top_k: int | None = None,
top_p: float | None = 0.95,
extra_kwargs: dict | None = None
)
```
Dataclass
Configuration class for LLM generation parameters.
```python
class nemo_curator.models.client.llm_client.LLMClient()
```
Abstract
Interface for a client that connects to an LLM inference server
and makes requests synchronously.
```python
nemo_curator.models.client.llm_client.LLMClient.query_model(
messages: collections.abc.Iterable,
model: str,
conversation_formatter: nemo_curator.models.client.llm_client.ConversationFormatter | None = None,
generation_config: nemo_curator.models.client.llm_client.GenerationConfig | dict | None = None
) -> list[str]
```
abstract
```python
nemo_curator.models.client.llm_client.LLMClient.setup() -> None
```
abstract
Set up the client.
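A minimal synchronous client following this interface might look like the hypothetical sketch below (the class name, `ready` attribute, and echo behavior are inventions for illustration; a real subclass would talk to an inference server):

```python
class EchoLLMClient:
    """Hypothetical synchronous client following the LLMClient interface."""

    def setup(self) -> None:
        # A real client would open an HTTP session or load credentials here.
        self.ready = True

    def query_model(self, messages, model: str) -> list[str]:
        # A real client would send `messages` to the inference server;
        # this sketch just echoes the last message's content.
        last = list(messages)[-1]["content"]
        return [f"{model} echoes: {last}"]


client = EchoLLMClient()
client.setup()
print(client.query_model([{"role": "user", "content": "ping"}], "demo-model"))
```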