core.models.huggingface.module#

Module Contents#

Classes#

HuggingFaceModule

Base module for Hugging Face models.

AutoHuggingFaceModel

Wrapper around the Hugging Face AutoModel class.

Functions#

get_hf_model_type

Get the Hugging Face model type.

build_hf_model

Build a Hugging Face wrapper model from a config and a model path.

API#

class core.models.huggingface.module.HuggingFaceModule(config)#

Bases: megatron.core.transformer.module.MegatronModule

Base module for Hugging Face models.

Initialization

set_input_tensor(input_tensor)#

Dummy function for set_input_tensor

__setattr__(name: str, value)#

Sets the average_gradients_across_tp_domain attribute to True on all parameters so that, during finalize_model_grads, this module's gradients are all-reduced across tensor-parallel ranks. This keeps replicated weights synchronized and prevents drift caused by nondeterminism in HF models, which can produce slightly different gradients in replicated copies of a model given the same inputs.
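The tagging behavior described above can be illustrated with a minimal, dependency-free sketch. It is not the library's implementation: Param, SubModule, and TaggingModule are hypothetical stand-ins for a parameter, a submodule, and the wrapper module, and the sketch assumes the flag is applied whenever a submodule is assigned as an attribute.

```python
class Param:
    """Hypothetical stand-in for a trainable parameter."""

    def __init__(self, name):
        self.name = name


class SubModule:
    """Hypothetical stand-in for a submodule that owns parameters."""

    def __init__(self, *params):
        self._params = list(params)

    def parameters(self):
        return iter(self._params)


class TaggingModule:
    """Hypothetical wrapper: flags the parameters of any assigned submodule."""

    def __setattr__(self, name, value):
        # Perform the normal attribute assignment first.
        super().__setattr__(name, value)
        # Assumption: only submodules carry parameters that need the flag.
        if isinstance(value, SubModule):
            for param in value.parameters():
                param.average_gradients_across_tp_domain = True


wrapper = TaggingModule()
wrapper.model = SubModule(Param("weight"), Param("bias"))
wrapper.note = "plain attributes are stored untouched"

tagged = [p.average_gradients_across_tp_domain for p in wrapper.model.parameters()]
```

In this sketch a later gradient-finalization step could look for the flag on each parameter to decide which gradients to average across ranks. The flag itself only marks parameters and performs no communication.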

class core.models.huggingface.module.AutoHuggingFaceModel(config)#

Bases: core.models.huggingface.module.HuggingFaceModule

Wrapper around the Hugging Face AutoModel class.

Initialization

forward(*args, **kwargs)#

Forward function
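The forward(*args, **kwargs) signature suggests that arguments are passed through to the wrapped model. The following is a rough sketch of that delegation pattern under that assumption. EchoModel and ForwardingWrapper are hypothetical names used only for illustration and do not reflect the actual class internals.

```python
class EchoModel:
    """Hypothetical stand-in for a loaded Hugging Face model."""

    def __call__(self, input_ids, attention_mask=None):
        # Return the inputs so the pass-through is easy to observe.
        return {"input_ids": input_ids, "attention_mask": attention_mask}


class ForwardingWrapper:
    """Hypothetical wrapper that forwards every argument unchanged."""

    def __init__(self, model):
        self.model = model

    def forward(self, *args, **kwargs):
        # Positional and keyword arguments go straight to the wrapped model.
        return self.model(*args, **kwargs)


wrapper = ForwardingWrapper(EchoModel())
output = wrapper.forward([1, 2, 3], attention_mask=[1, 1, 0])
```

A pass-through signature like this lets the wrapper stay agnostic to the argument names each underlying model expects.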

core.models.huggingface.module.get_hf_model_type(model_path)#

Get the Hugging Face model type.

core.models.huggingface.module.build_hf_model(config, model_path)#

Build a Hugging Face wrapper model from a config and a model path.