Copyright © 2026, NVIDIA Corporation.


nemo_curator.models.vllm_model


Module Contents

Classes

Name            Description
LLM             -
SamplingParams  -
VLLMModel       Generic vLLM language model wrapper for text generation.

Data

VLLM_AVAILABLE

API

class nemo_curator.models.vllm_model.LLM()
class nemo_curator.models.vllm_model.SamplingParams()
class nemo_curator.models.vllm_model.VLLMModel(
    model: str,
    max_model_len: int | None = None,
    tensor_parallel_size: int | None = None,
    max_num_batched_tokens: int = 4096,
    temperature: float = 0.7,
    top_p: float = 0.8,
    top_k: int = 20,
    min_p: float = 0.0,
    max_tokens: int | None = None,
    cache_dir: str | None = None
)

Bases: ModelInterface

Generic vLLM language model wrapper for text generation.

_final_max_model_len: int | None = None
_is_qwen3: bool = False
_llm: LLM | None = None
_sampling_params: SamplingParams | None = None
model_id_names: list[str]

Return the model identifier.
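
As an illustration of the constructor signature above, the following sketch collects the documented defaults into a keyword-argument dict and rejects names that are not in the signature. The helper `build_vllm_model_kwargs`, the constant `VLLM_MODEL_DEFAULTS`, and the model name `my-org/my-model` are hypothetical and not part of NeMo Curator; only the parameter names and default values come from the signature.

```python
from typing import Any

# Defaults copied from the documented VLLMModel signature.
VLLM_MODEL_DEFAULTS: dict[str, Any] = {
    "max_model_len": None,
    "tensor_parallel_size": None,
    "max_num_batched_tokens": 4096,
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "min_p": 0.0,
    "max_tokens": None,
    "cache_dir": None,
}


def build_vllm_model_kwargs(model: str, **overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(VLLM_MODEL_DEFAULTS)
    if unknown:
        raise TypeError(f"Unexpected VLLMModel arguments: {sorted(unknown)}")
    return {"model": model, **VLLM_MODEL_DEFAULTS, **overrides}


# "my-org/my-model" is a placeholder model identifier.
kwargs = build_vllm_model_kwargs("my-org/my-model", temperature=0.2, max_tokens=256)

# With NeMo Curator and vLLM installed, the dict could be passed on as:
#   from nemo_curator.models.vllm_model import VLLMModel
#   llm = VLLMModel(**kwargs)
```

Keeping the overrides in a plain dict makes it easy to log or serialize the exact sampling configuration used for a run.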

nemo_curator.models.vllm_model.VLLMModel.generate(
    prompts: list[str]
) -> list[str]

Generate text from prompts.

Parameters:

prompts: list[str]

List of prompt strings or list of message dicts (for chat template).

Returns: list[str]

List of generated text strings.

Raises:

  • RuntimeError: If the model is not set up or generation fails.

nemo_curator.models.vllm_model.VLLMModel.get_tokenizer() -> typing.Any

Get the tokenizer from the LLM instance.
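
One possible use of the returned tokenizer is checking prompt lengths before generation. The sketch below is a minimal example under an assumption: the documented return type is only `typing.Any`, so it assumes the object exposes `encode(text)` returning a list of token IDs, as Hugging Face tokenizers do. The helpers `count_prompt_tokens` and `prompts_over_budget` and the `WhitespaceTokenizer` stand-in are hypothetical names used for illustration.

```python
from typing import Any


def count_prompt_tokens(tokenizer: Any, prompts: list[str]) -> list[int]:
    """Return the token count of each prompt.

    Assumes tokenizer.encode(text) returns a list of token IDs; this is an
    assumption, since get_tokenizer() is documented as returning typing.Any.
    """
    return [len(tokenizer.encode(prompt)) for prompt in prompts]


def prompts_over_budget(
    tokenizer: Any, prompts: list[str], max_prompt_tokens: int
) -> list[int]:
    """Return the indices of prompts longer than max_prompt_tokens."""
    counts = count_prompt_tokens(tokenizer, prompts)
    return [i for i, n in enumerate(counts) if n > max_prompt_tokens]


class WhitespaceTokenizer:
    """Stand-in tokenizer for demonstration only: one token per word."""

    def encode(self, text: str) -> list[int]:
        return list(range(len(text.split())))


tokenizer = WhitespaceTokenizer()
prompts = ["Summarize this.", "one two three four five six"]
too_long = prompts_over_budget(tokenizer, prompts, max_prompt_tokens=4)
```

With a real model, `tokenizer` would instead come from `get_tokenizer()` on a `VLLMModel` instance after `setup()` has run.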

nemo_curator.models.vllm_model.VLLMModel.setup() -> None

Set up the vLLM model and sampling parameters.
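
The documented contract is two-step: `setup()` prepares the model, and `generate()` raises `RuntimeError` if the model is not set up or generation fails. The sketch below shows one way a caller might handle that. `TextGenerator`, `generate_or_empty`, and `EchoModel` are hypothetical names; `EchoModel` is a stand-in that mimics the contract so the example runs without vLLM or a GPU.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """The two documented methods this sketch relies on."""

    def setup(self) -> None: ...

    def generate(self, prompts: list[str]) -> list[str]: ...


def generate_or_empty(model: TextGenerator, prompts: list[str]) -> list[str]:
    """Call generate(); on RuntimeError return one empty string per prompt."""
    try:
        return model.generate(prompts)
    except RuntimeError as err:
        print(f"generation failed: {err}")
        return [""] * len(prompts)


class EchoModel:
    """Stand-in with the same two-step contract; not part of NeMo Curator."""

    def __init__(self) -> None:
        self._ready = False

    def setup(self) -> None:
        self._ready = True

    def generate(self, prompts: list[str]) -> list[str]:
        if not self._ready:
            raise RuntimeError("Model is not set up; call setup() first.")
        return [f"echo: {p}" for p in prompts]


model = EchoModel()
before = generate_or_empty(model, ["hello"])  # setup() has not been called yet
model.setup()
after = generate_or_empty(model, ["hello"])
```

Swapping `EchoModel()` for a `VLLMModel(...)` instance should leave the calling code unchanged, since only `setup()` and `generate()` are used.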

nemo_curator.models.vllm_model.VLLM_AVAILABLE = True
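
A flag like `VLLM_AVAILABLE`, together with the argument-less `LLM()` and `SamplingParams()` entries above, is typical of an optional-dependency guard. The sketch below shows that general pattern; it is an assumption about how such a module might be structured, not a copy of the NeMo Curator source, and `require_vllm` is a hypothetical helper.

```python
try:
    # Real classes when vLLM is installed.
    from vllm import LLM, SamplingParams

    VLLM_AVAILABLE = True
except ImportError:
    VLLM_AVAILABLE = False

    class LLM:
        """Placeholder so type hints still resolve when vLLM is missing."""

    class SamplingParams:
        """Placeholder so type hints still resolve when vLLM is missing."""


def require_vllm() -> None:
    """Hypothetical helper: fail early with a clear message if vLLM is absent."""
    if not VLLM_AVAILABLE:
        raise ImportError("vLLM is required for this model but is not installed.")
```

Callers can check the flag before constructing a model and skip or fall back when vLLM is not installed.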