nemo_export.vllm.model_converters#

Module Contents#

Classes#

ModelConverter

Abstract class that defines the interface for a converter that implements model-specific conversion functions for deploying NeMo checkpoints on vLLM.

LlamaConverter

MixtralConverter

GemmaConverter

Starcoder2Converter

Functions#

register_model_converter

Establishes a mapping from a short model type to a class that converts the model from NeMo format to a vLLM-compatible format.

get_model_converter

Returns an instance of the model conversion class for the given model type, or None if no converter is registered for that type.

Data#

API#

class nemo_export.vllm.model_converters.ModelConverter(model_type: str)[source]#

Bases: abc.ABC

Abstract class that defines the interface for a converter that implements model-specific conversion functions for deploying NeMo checkpoints on vLLM.

Initialization

abstractmethod get_architecture() Optional[str][source]#

Returns the HF architecture name for the current model, such as 'LlamaForCausalLM'.

convert_config(nemo_model_config: dict, hf_config: dict) None[source]#

Implements any custom HF configuration adjustments in the 'hf_config' dict that are necessary for this model after the common translation takes place in NemoModelConfig's constructor.

abstractmethod convert_weights(
nemo_model_config: dict,
state_dict: dict,
) Generator[Tuple[str, torch.tensor], None, None][source]#

Returns or yields a sequence of (name, tensor) tuples that contain model weights in the HF format.

requires_bos_token() bool[source]#

Returns True if the model requires a 'bos' token to be used at the beginning of the input sequence.

NeMo checkpoints do not store this information.
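The interface described above can be illustrated with a small, self-contained sketch. It does not import `nemo_export`: `ModelConverterSketch` is a stand-in that only mirrors the documented method names and signatures, and `ToyConverter`, the architecture name `ToyForCausalLM`, the config keys, and the weight names are all hypothetical. Plain Python lists stand in for tensors so the example runs without PyTorch, and the base-class defaults shown here (a no-op `convert_config`, `requires_bos_token` returning False) are assumptions rather than documented behavior.

```python
from abc import ABC, abstractmethod
from typing import Any, Generator, Optional, Tuple


class ModelConverterSketch(ABC):
    """Stand-in mirroring the documented ModelConverter interface (not the real class)."""

    def __init__(self, model_type: str):
        # Assumption: the converter remembers the short model type it was created for.
        self.model_type = model_type

    @abstractmethod
    def get_architecture(self) -> Optional[str]:
        """Return the HF architecture name, e.g. 'LlamaForCausalLM'."""

    def convert_config(self, nemo_model_config: dict, hf_config: dict) -> None:
        # Assumed default: no model-specific adjustments.
        pass

    @abstractmethod
    def convert_weights(
        self, nemo_model_config: dict, state_dict: dict
    ) -> Generator[Tuple[str, Any], None, None]:
        """Yield (name, tensor) tuples in the HF naming scheme."""

    def requires_bos_token(self) -> bool:
        # Assumed default; subclasses override when a 'bos' token is needed.
        return False


class ToyConverter(ModelConverterSketch):
    """Hypothetical converter for an imaginary 'toy' model type."""

    def get_architecture(self) -> Optional[str]:
        return "ToyForCausalLM"  # hypothetical architecture name

    def convert_config(self, nemo_model_config: dict, hf_config: dict) -> None:
        # Adjust hf_config in place; the method returns None by design.
        hf_config["tie_word_embeddings"] = nemo_model_config.get("share_embeddings", False)

    def convert_weights(self, nemo_model_config: dict, state_dict: dict):
        # Hypothetical name mapping from checkpoint keys to HF keys.
        renames = {"embedding.weight": "model.embed_tokens.weight"}
        for nemo_name, tensor in state_dict.items():
            yield renames.get(nemo_name, nemo_name), tensor

    def requires_bos_token(self) -> bool:
        return True


converter = ToyConverter("toy")
hf_config = {}
converter.convert_config({"share_embeddings": True}, hf_config)
weights = dict(converter.convert_weights({}, {"embedding.weight": [0.1, 0.2]}))
```

Writing `convert_weights` as a generator lets a caller consume one (name, tensor) pair at a time instead of building the full converted state dict up front.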

class nemo_export.vllm.model_converters.LlamaConverter(model_type: str)[source]#

Bases: nemo_export.vllm.model_converters.ModelConverter

get_architecture()[source]#

convert_weights(nemo_model_config, state_dict)[source]#

requires_bos_token()[source]#

class nemo_export.vllm.model_converters.MixtralConverter(model_type: str)[source]#

Bases: nemo_export.vllm.model_converters.ModelConverter

get_architecture()[source]#

convert_weights(nemo_model_config, state_dict)[source]#

requires_bos_token()[source]#

class nemo_export.vllm.model_converters.GemmaConverter(model_type: str)[source]#

Bases: nemo_export.vllm.model_converters.ModelConverter

get_architecture()[source]#

convert_weights(nemo_model_config, state_dict)[source]#

requires_bos_token()[source]#

class nemo_export.vllm.model_converters.Starcoder2Converter(model_type: str)[source]#

Bases: nemo_export.vllm.model_converters.ModelConverter

get_architecture()[source]#

convert_config(nemo_model_config, hf_config)[source]#

convert_weights(nemo_model_config, state_dict)[source]#

nemo_export.vllm.model_converters._MODEL_CONVERTERS = None#

nemo_export.vllm.model_converters.register_model_converter(model_type, cls)[source]#

Establishes a mapping from a short model type to a class that converts the model from NeMo format to a vLLM-compatible format.

nemo_export.vllm.model_converters.get_model_converter(
model_type,
) Optional[nemo_export.vllm.model_converters.ModelConverter][source]#

Returns an instance of the model conversion class for the given model type, or None if no converter is registered for that type.
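The register/lookup behavior described for these two functions can be sketched as a plain dictionary registry. This is an illustration of the pattern only, not the module's actual code: `_REGISTRY`, `register_converter_sketch`, `get_converter_sketch`, `ConverterSketch`, and `LlamaSketch` are hypothetical names, and it is an assumption that a lookup constructs the class with the model type as its only argument.

```python
from typing import Dict, Optional, Type


class ConverterSketch:
    """Hypothetical stand-in for a converter class taking a model_type argument."""

    def __init__(self, model_type: str):
        self.model_type = model_type


class LlamaSketch(ConverterSketch):
    """Hypothetical stand-in for a model-specific converter."""


# Maps a short model type to the converter class registered for it.
_REGISTRY: Dict[str, Type[ConverterSketch]] = {}


def register_converter_sketch(model_type: str, cls: Type[ConverterSketch]) -> None:
    # Registering the same model type again replaces the earlier class.
    _REGISTRY[model_type] = cls


def get_converter_sketch(model_type: str) -> Optional[ConverterSketch]:
    cls = _REGISTRY.get(model_type)
    if cls is None:
        # Unknown model type: signal "no converter" rather than raising.
        return None
    # Build a new converter instance for this lookup.
    return cls(model_type)


register_converter_sketch("llama", LlamaSketch)
found = get_converter_sketch("llama")
missing = get_converter_sketch("unknown")
```

Because an unregistered type yields None instead of an exception, callers in this sketch need to check the result before using it.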