nemo_export.vllm.model_config
Module Contents
Classes
NemoModelConfig: This class pretends to be a vllm.config.ModelConfig (with extra fields), but it skips some of that class's initialization code and initializes the configuration from a NeMo checkpoint instead.
API
- class nemo_export.vllm.model_config.NemoModelConfig(
- nemo_checkpoint: str,
- model_dir: str,
- model_type: str,
- tokenizer_mode: str,
- dtype: Union[str, torch.dtype],
- seed: int,
- revision: Optional[str] = None,
- override_neuron_config: Optional[Dict[str, Any]] = None,
- code_revision: Optional[str] = None,
- rope_scaling: Optional[dict] = None,
- rope_theta: Optional[float] = None,
- tokenizer_revision: Optional[str] = None,
- max_model_len: Optional[int] = None,
- quantization: Optional[str] = None,
- quantization_param_path: Optional[str] = None,
- enforce_eager: bool = False,
- max_seq_len_to_capture: Optional[int] = 8192,
- max_logprobs: int = 5,
- disable_sliding_window: bool = False,
- disable_cascade_attn: bool = False,
- use_async_output_proc: bool = False,
- disable_mm_preprocessor_cache: bool = False,
- logits_processor_pattern: Optional[str] = None,
- override_pooler_config: Optional[vllm.config.PoolerConfig] = None,
- override_generation_config: Optional[Dict[str, Any]] = None,
- enable_sleep_mode: bool = False,
- model_impl: Union[str, vllm.config.ModelImpl] = ModelImpl.AUTO,
)
Bases:
vllm.config.ModelConfig
This class pretends to be a vllm.config.ModelConfig (with extra fields), but it skips some of that class's initialization code and initializes the configuration from a NeMo checkpoint instead.
Initialization
- static _change_paths_to_absolute_paths(
- tokenizer_config: Dict[Any, Any],
- nemo_checkpoint: pathlib.Path,
)
Converts the paths of local tokenizers to absolute paths. Used for NeMo 2.0.
- Parameters:
tokenizer_config (dict) – Parameters for instantiating the tokenizer.
nemo_checkpoint (path) – Path to the NeMo2 checkpoint.
- Returns:
Updated tokenizer config.
- Return type:
dict
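The path rewriting described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `make_tokenizer_paths_absolute`, the key names in `PATH_KEYS`, and the `context` subdirectory are all assumptions made for the example.

```python
from pathlib import Path
from typing import Any, Dict

# Assumption: which config keys may hold tokenizer paths is not stated on this
# page; these names are illustrative only.
PATH_KEYS = ("model_path", "pretrained_model_name")


def make_tokenizer_paths_absolute(
    tokenizer_config: Dict[Any, Any],
    nemo_checkpoint: Path,
    subdir: str = "context",  # assumption: hypothetical location of tokenizer files
) -> Dict[Any, Any]:
    """Return a copy of the config with existing local paths made absolute."""
    updated = dict(tokenizer_config)
    base = Path(nemo_checkpoint) / subdir
    for key in PATH_KEYS:
        value = updated.get(key)
        if not value:
            continue
        candidate = base / str(value)
        # Only rewrite values that point at something inside the checkpoint;
        # anything else (for example a hub model name) is left untouched.
        if candidate.exists():
            updated[key] = str(candidate.resolve())
    return updated
```

Values that do not resolve to a file or directory inside the checkpoint are passed through unchanged, so a remote tokenizer name would survive the call. The sketch returns a copy; whether the real method mutates its argument is not stated here.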
- _load_hf_arguments(
- nemo_config: Dict[str, Any],
)
Maps argument names used in NeMo to their corresponding names in Hugging Face (HF).
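A name mapping of this kind can be sketched with a plain lookup table. The table below is hypothetical: the specific NeMo and HF key pairs are assumptions chosen for illustration, and the function name `map_nemo_to_hf` does not exist in the library.

```python
from typing import Any, Dict

# Hypothetical mapping from NeMo-style names to HF-style names.
# The real method may cover different keys or transform values as well.
NEMO_TO_HF = {
    "hidden_size": "hidden_size",
    "ffn_hidden_size": "intermediate_size",
    "num_layers": "num_hidden_layers",
    "num_attention_heads": "num_attention_heads",
    "num_query_groups": "num_key_value_heads",
}


def map_nemo_to_hf(nemo_config: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known NeMo keys to their HF counterparts; drop unknown keys."""
    return {
        hf_name: nemo_config[nemo_name]
        for nemo_name, hf_name in NEMO_TO_HF.items()
        if nemo_name in nemo_config
    }
```

For example, `map_nemo_to_hf({"num_layers": 32, "ffn_hidden_size": 11008})` yields a dictionary keyed by `num_hidden_layers` and `intermediate_size`. Dropping unmapped keys is a design choice of this sketch, not a documented behavior.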
- try_get_generation_config(*args, **kwargs)
Prevent vLLM from trying to load a generation config.
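The override pattern behind this method can be illustrated without importing vLLM. In the sketch below, `BaseConfig` and `CheckpointConfig` are stand-in classes invented for the example, and returning an empty dictionary is an assumption; this page does not state what the real method returns.

```python
from typing import Any, Dict


class BaseConfig:
    """Stand-in for a parent config class; not vLLM's actual ModelConfig."""

    def try_get_generation_config(self) -> Dict[str, Any]:
        # Pretend the parent looks for a generation config on disk, which a
        # NeMo checkpoint would not provide.
        raise FileNotFoundError("generation_config.json not found")


class CheckpointConfig(BaseConfig):
    """Stand-in for a subclass that suppresses the lookup."""

    def try_get_generation_config(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        # Accept and ignore any arguments so callers written against the
        # parent signature keep working. The empty dict is an assumption.
        return {}
```

Accepting `*args, **kwargs` mirrors the signature shown above and keeps the override tolerant of changes to the parent method's parameters.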