nemo_curator.models.vllm_model

Module Contents

Classes

Name             Description
LLM              -
SamplingParams   -
VLLMModel        Generic vLLM language model wrapper for text generation.

Data

VLLM_AVAILABLE

API

class nemo_curator.models.vllm_model.LLM()
class nemo_curator.models.vllm_model.SamplingParams()
class nemo_curator.models.vllm_model.VLLMModel(
model: str,
max_model_len: int | None = None,
tensor_parallel_size: int | None = None,
max_num_batched_tokens: int = 4096,
temperature: float = 0.7,
top_p: float = 0.8,
top_k: int = 20,
min_p: float = 0.0,
max_tokens: int | None = None,
cache_dir: str | None = None
)

Bases: ModelInterface

Generic vLLM language model wrapper for text generation.
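A hedged usage sketch of the wrapper's lifecycle follows. The model name and parameter values are illustrative choices, not defaults the module mandates, and the try/except lets the snippet degrade gracefully where nemo-curator (and its vLLM dependency) is not installed.

```python
# Hedged usage sketch: model name and parameter values are illustrative.
# The try/except keeps the snippet runnable even where nemo-curator
# (and its vLLM dependency) is not installed.
try:
    from nemo_curator.models.vllm_model import VLLM_AVAILABLE, VLLMModel
except ImportError:
    VLLM_AVAILABLE, VLLMModel = False, None

if VLLM_AVAILABLE:
    model = VLLMModel(
        model="Qwen/Qwen2.5-7B-Instruct",  # illustrative checkpoint
        tensor_parallel_size=1,
        max_num_batched_tokens=4096,
        temperature=0.7,
        top_p=0.8,
        max_tokens=512,
    )
    model.setup()  # builds the underlying LLM and sampling parameters
    print(model.generate(["Summarize vLLM in one sentence."]))
else:
    print("vLLM is not available; skipping generation.")
```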

_final_max_model_len: int | None = None
_is_qwen3: bool = False
_llm: LLM | None = None
_sampling_params: SamplingParams | None = None
model_id_names: list[str]

Return the list of model identifiers.

nemo_curator.models.vllm_model.VLLMModel.generate(
prompts: list[str]
) -> list[str]

Generate text from prompts.

Parameters:

prompts
list[str]

List of prompt strings or list of message dicts (for chat template).

Returns: list[str]

List of generated text strings.

Raises:

  • RuntimeError: If the model is not set up or generation fails.
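The setup-before-generate contract above can be sketched with a stand-in class, so the example runs without vLLM or a GPU. FakeVLLMModel is hypothetical and only mirrors the documented interface; it is not the module's actual implementation.

```python
# Stand-in illustrating the documented contract: generate() raises
# RuntimeError until setup() has initialized the underlying engine.
# FakeVLLMModel is hypothetical; it only mirrors the interface above.
class FakeVLLMModel:
    def __init__(self, model: str) -> None:
        self.model = model
        self._llm = None  # populated by setup(), as in the real class

    def setup(self) -> None:
        self._llm = object()  # the real class builds a vllm.LLM here

    def generate(self, prompts: list[str]) -> list[str]:
        if self._llm is None:
            raise RuntimeError("Model is not set up. Call setup() first.")
        return [f"completion for: {p}" for p in prompts]


model = FakeVLLMModel("illustrative-model")
try:
    model.generate(["hi"])  # fails: setup() has not run yet
except RuntimeError as err:
    print(err)

model.setup()
print(model.generate(["hi"]))
```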
nemo_curator.models.vllm_model.VLLMModel.get_tokenizer() -> typing.Any

Get the tokenizer from the LLM instance.

nemo_curator.models.vllm_model.VLLMModel.setup() -> None

Set up the vLLM model and sampling parameters.

nemo_curator.models.vllm_model.VLLM_AVAILABLE = True
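VLLM_AVAILABLE is presumably set by an optional-import guard, which would also explain the bare LLM and SamplingParams entries above. A minimal sketch of that common pattern follows; the stub bodies are assumptions, not the module's actual code.

```python
# Sketch of the optional-dependency pattern that likely sets VLLM_AVAILABLE:
# when vllm cannot be imported, lightweight stubs stand in for LLM and
# SamplingParams so annotations like "LLM | None" still resolve.
try:
    from vllm import LLM, SamplingParams
    VLLM_AVAILABLE = True
except ImportError:
    class LLM:  # stub used only when vllm is absent
        pass

    class SamplingParams:
        pass

    VLLM_AVAILABLE = False
```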