core.models.huggingface.module
Module Contents
Classes
| HuggingFaceModule | Basic module for huggingface. |
| AutoHuggingFaceModel | Wrapper for HuggingFace AutoModel |
Functions
| get_hf_model_type | Get the Huggingface model type. |
| build_hf_model | Builds Huggingface wrapper model given config and model path. |
API
- class core.models.huggingface.module.HuggingFaceModule(config)
Bases: megatron.core.transformer.module.MegatronModule
Basic module for huggingface.
Initialization
- set_input_tensor(input_tensor)
Dummy function for set_input_tensor
- __setattr__(name: str, value)
Sets the average_gradients_across_tp_domain attribute to True on all parameters, so that finalize_model_grads performs an all-reduce on this module’s gradients across tensor-parallel ranks. This keeps replicated weights synchronized and prevents drift caused by non-determinism in HF models, which can produce slightly different gradients in replicated copies given the same inputs.
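The tagging pattern described above can be illustrated with a minimal, framework-free sketch. It is not the actual implementation: Param, SubModel, and TaggingModule are hypothetical stand-ins for a parameter, a wrapped model, and the wrapper module, and the check used to recognize a sub-model (the presence of a callable parameters() method) is an assumption made for illustration.

```python
class Param:
    """Hypothetical stand-in for a trainable parameter."""

    def __init__(self, name):
        self.name = name


class SubModel:
    """Hypothetical stand-in for a wrapped model that exposes parameters()."""

    def __init__(self, names):
        self._params = [Param(n) for n in names]

    def parameters(self):
        return iter(self._params)


class TaggingModule:
    """Flags the parameters of any sub-model assigned as an attribute."""

    def __setattr__(self, name, value):
        # Store the attribute as usual first.
        object.__setattr__(self, name, value)
        # Assumption: anything with a callable parameters() is a sub-model.
        params = getattr(value, "parameters", None)
        if callable(params):
            for p in params():
                # Mark the parameter so a later gradient-finalization step
                # can pick it out for the cross-rank all-reduce.
                p.average_gradients_across_tp_domain = True


wrapper = TaggingModule()
wrapper.model = SubModel(["weight", "bias"])
wrapper.label = "plain attributes are stored untouched"

tagged = [p.average_gradients_across_tp_domain for p in wrapper.model.parameters()]
```

Hooking the assignment itself means every sub-model attached to the wrapper is tagged automatically, without each subclass having to remember to do so.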
- class core.models.huggingface.module.AutoHuggingFaceModel(config)
Bases: core.models.huggingface.module.HuggingFaceModule
Wrapper for HuggingFace AutoModel
Initialization
- forward(*args, **kwargs)
Forward function
- core.models.huggingface.module.get_hf_model_type(model_path)
Get the Huggingface model type.
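How this function derives the type is not documented here. As a rough illustration only, one plausible approach is to read the model_type field that Hugging Face checkpoints conventionally store in config.json. The helper name read_model_type, the local-directory input, and the sample value "qwen2" are all assumptions for this sketch; the real function may use a different mechanism and return a different form.

```python
import json
import tempfile
from pathlib import Path


def read_model_type(model_dir):
    """Return the model_type field from config.json in a local checkpoint directory.

    Hypothetical helper for illustration; not part of this module's API.
    """
    config = json.loads((Path(model_dir) / "config.json").read_text())
    return config["model_type"]


# Build a throwaway checkpoint directory holding only a config.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text(json.dumps({"model_type": "qwen2"}))
    detected = read_model_type(tmp)
```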
- core.models.huggingface.module.build_hf_model(config, model_path)
Builds Huggingface wrapper model given config and model path.