nemo_export.vllm.model_config

Module Contents

Classes

NemoModelConfig

This class pretends to be a vllm.config.ModelConfig (with extra fields) but skips some of its initialization code and initializes the configuration from a NeMo checkpoint instead.

API

class nemo_export.vllm.model_config.NemoModelConfig(
nemo_checkpoint: str,
model_dir: str,
model_type: str,
tokenizer_mode: str,
dtype: Union[str, torch.dtype],
seed: int,
revision: Optional[str] = None,
override_neuron_config: Optional[Dict[str, Any]] = None,
code_revision: Optional[str] = None,
rope_scaling: Optional[dict] = None,
rope_theta: Optional[float] = None,
tokenizer_revision: Optional[str] = None,
max_model_len: Optional[int] = None,
quantization: Optional[str] = None,
quantization_param_path: Optional[str] = None,
enforce_eager: bool = False,
max_seq_len_to_capture: Optional[int] = 8192,
max_logprobs: int = 5,
disable_sliding_window: bool = False,
disable_cascade_attn: bool = False,
use_async_output_proc: bool = False,
disable_mm_preprocessor_cache: bool = False,
logits_processor_pattern: Optional[str] = None,
override_pooler_config: Optional[vllm.config.PoolerConfig] = None,
override_generation_config: Optional[Dict[str, Any]] = None,
enable_sleep_mode: bool = False,
model_impl: Union[str, vllm.config.ModelImpl] = ModelImpl.AUTO,
)

Bases: vllm.config.ModelConfig

This class pretends to be a vllm.config.ModelConfig (with extra fields) but skips some of its initialization code and initializes the configuration from a NeMo checkpoint instead.
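The "pretend to be the base class but skip its initialization" pattern can be sketched in plain Python. The classes below are hypothetical stand-ins, not the vLLM or NeMo classes; the sketch only illustrates how a subclass can keep the base type while filling in the fields itself.

```python
class BaseConfig:
    """Hypothetical stand-in for a base config with initialization we want to avoid."""

    def __init__(self, model: str) -> None:
        # Imagine this constructor resolves `model` against a remote hub.
        raise RuntimeError("remote lookup is not available in this sketch")

    def describe(self) -> str:
        # Inherited behaviour that only relies on the attributes being set.
        return f"{self.model} ({self.dtype})"


class CheckpointConfig(BaseConfig):
    """Hypothetical subclass that populates the fields from a local checkpoint."""

    def __init__(self, checkpoint: str, dtype: str) -> None:
        # Deliberately does NOT call super().__init__(); the attributes the
        # base class methods rely on are set by hand instead.
        self.model = checkpoint
        self.dtype = dtype
        # Extra field that the base class does not know about.
        self.checkpoint = checkpoint


config = CheckpointConfig("/checkpoints/model.nemo", "bfloat16")
```

Code that checks `isinstance(config, BaseConfig)` or calls inherited methods keeps working, which is the point of subclassing here. The cost is that the subclass must set every attribute the inherited methods expect.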

Initialization

static _change_paths_to_absolute_paths(
tokenizer_config: Dict[Any, Any],
nemo_checkpoint: pathlib.Path,
) -> Dict[Any, Any]

Converts the paths of local tokenizer files to absolute paths. Used for NeMo 2.0.

Parameters:
  • tokenizer_config (dict) – Parameters for instantiating the tokenizer.

  • nemo_checkpoint (path) – Path to the NeMo 2.0 checkpoint.

Returns:

Updated tokenizer config.

Return type:

dict
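A minimal sketch of this kind of path rewriting, under assumptions: the function name, the key names in `PATH_KEYS`, and the rule that relative paths are anchored at the checkpoint directory are all hypothetical and are not taken from the NeMo source.

```python
from pathlib import Path
from typing import Any, Dict

# Hypothetical key names; the real tokenizer config may use different ones.
PATH_KEYS = ("model_path", "vocab_file", "merges_file")


def make_paths_absolute(tokenizer_config: Dict[Any, Any], checkpoint: Path) -> Dict[Any, Any]:
    """Return a copy of the config with relative tokenizer paths anchored at `checkpoint`."""
    updated = dict(tokenizer_config)  # shallow copy; the input dict is left untouched
    for key in PATH_KEYS:
        value = updated.get(key)
        # Only rewrite string values that are not already absolute.
        if isinstance(value, str) and not Path(value).is_absolute():
            updated[key] = str(checkpoint.absolute() / value)
    return updated


example = {"library": "sentencepiece", "model_path": "tokenizer/tokenizer.model"}
resolved = make_paths_absolute(example, Path("/checkpoints/llama"))
```

Values that are not paths (here `library`) pass through unchanged, as do paths that are already absolute.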

_load_hf_arguments(
nemo_config: Dict[str, Any],
) -> Dict[str, Any]

Maps argument names used in NeMo to their corresponding names in HF.
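A name mapping of this kind can be sketched with a lookup table. The table below is illustrative only: the pairs are assumptions chosen for the example and may not match the mapping the method actually applies, and `map_to_hf_names` is a hypothetical helper.

```python
from typing import Any, Dict

# Illustrative NeMo -> HF name pairs (assumed for this sketch, not taken from the source).
NEMO_TO_HF = {
    "num_layers": "num_hidden_layers",
    "hidden_size": "hidden_size",
    "num_attention_heads": "num_attention_heads",
    "ffn_hidden_size": "intermediate_size",
}


def map_to_hf_names(nemo_config: Dict[str, Any]) -> Dict[str, Any]:
    """Copy the values of known NeMo keys into a dict keyed by their HF names."""
    return {
        hf_name: nemo_config[nemo_name]
        for nemo_name, hf_name in NEMO_TO_HF.items()
        if nemo_name in nemo_config  # keys missing from the checkpoint are skipped
    }


hf_args = map_to_hf_names({"num_layers": 32, "ffn_hidden_size": 11008, "unrelated": True})
```

Keys that have no entry in the table are dropped rather than passed through, so the result contains only names the HF side is expected to understand.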

try_get_generation_config(*args, **kwargs)

Prevent vLLM from trying to load a generation config.
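One way to suppress such a lookup is to override the method with a stub. The sketch below uses hypothetical stand-in classes; returning an empty dict is an assumption about what a stub of this kind would return, not a statement about the actual implementation.

```python
from typing import Any, Dict


class BaseConfig:
    """Hypothetical stand-in for a config class that reads a generation config from disk."""

    def try_get_generation_config(self) -> Dict[str, Any]:
        # Imagine this looks for a generation config file next to the model.
        raise FileNotFoundError("generation_config.json")


class CheckpointConfig(BaseConfig):
    """Hypothetical subclass for checkpoints that ship no generation config."""

    def try_get_generation_config(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        # Accept and ignore any arguments, and report "no overrides" (assumed to be
        # an empty dict) so that callers fall back to their own defaults.
        return {}


generation_config = CheckpointConfig().try_get_generation_config("ignored", trust=True)
```

Accepting `*args, **kwargs` keeps the stub compatible even if the base method's signature changes between library versions.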