bridge.models.distillation_provider#
Module Contents#
Classes#
- DistillationProvider: Provider for Megatron Core GPT models in distillation mode.
Functions#
- convert_to_distillation_provider: Convert a given model provider to a DistillationProvider.
Data#
API#
- bridge.models.distillation_provider.logger#
`getLogger(…)`
- class bridge.models.distillation_provider.DistillationProvider(*args, **kwargs)#
Bases: megatron.bridge.models.transformer_config.TransformerConfig

Provider for Megatron Core GPT models in distillation mode.

Please use convert_to_distillation_provider() to create an instance of this class.

Initialization
- teacher: Optional[megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider]#
None
- kd_config: Optional[megatron.bridge.training.post_training.distillation.ModelOptDistillConfig]#
None
- __post_init__()#
- provide(pre_process=None, post_process=None, vp_stage=None)#
Configure and instantiate a ModelOpt DistillationModel based on this configuration.
- Parameters:
pre_process – Whether to include pre-processing in the model, defaults to first pipeline stage
post_process – Whether to include post-processing in the model, defaults to last pipeline stage
vp_stage – Virtual pipeline stage
- Returns:
Configured ModelOpt DistillationModel instance
- Return type:
MCoreGPTModel
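The defaults described above (pre-processing on the first pipeline stage, post-processing on the last) can be sketched as follows. `resolve_process_flags`, its arguments, and the rank-based mechanism are illustrative assumptions for this page, not the actual implementation:

```python
def resolve_process_flags(pp_rank, pp_world_size, pre_process=None, post_process=None):
    """Illustrative helper (not part of the real API): resolve the
    documented defaults of provide()'s pre_process/post_process flags."""
    if pre_process is None:
        # Default: include pre-processing (e.g. embeddings) only on the
        # first pipeline stage.
        pre_process = pp_rank == 0
    if post_process is None:
        # Default: include post-processing (e.g. the output layer) only
        # on the last pipeline stage.
        post_process = pp_rank == pp_world_size - 1
    return pre_process, post_process

# First of four pipeline stages: pre-processing only.
print(resolve_process_flags(0, 4))  # (True, False)
```

Passing explicit booleans overrides either default independently.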
- to_cfg_dict() → dict[str, Any]#
Custom serialization method that saves this provider as the equivalent original provider class.

Used by _ConfigContainerBase to serialize the main ConfigContainer to YAML. There is no need to restore a DistillationProvider from the run config file, as it can always be re-converted using the original student provider.
- Returns:
Dictionary representation of this provider class
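A minimal, self-contained sketch of this serialization behavior. `ProviderSketch`, `DistillationProviderSketch`, and `to_cfg_dict_sketch` are hypothetical stand-ins for the real megatron.bridge classes, used only to illustrate the "save as the original provider" idea:

```python
from dataclasses import dataclass, asdict
from typing import Any, Optional


@dataclass
class ProviderSketch:
    # Hypothetical stand-in for the original (student) provider fields.
    hidden_size: int = 512
    num_layers: int = 12


@dataclass
class DistillationProviderSketch(ProviderSketch):
    # Distillation-only fields, mirroring `teacher` and `kd_config`.
    teacher: Optional[ProviderSketch] = None
    kd_config: Optional[dict] = None


def to_cfg_dict_sketch(provider: DistillationProviderSketch) -> dict[str, Any]:
    # Save the config as the *equivalent original provider*: drop the
    # distillation-only fields so the YAML run config never needs to
    # restore a DistillationProvider.
    cfg = asdict(provider)
    cfg.pop("teacher", None)
    cfg.pop("kd_config", None)
    return cfg


dp = DistillationProviderSketch(hidden_size=256, teacher=ProviderSketch())
print(to_cfg_dict_sketch(dp))  # {'hidden_size': 256, 'num_layers': 12}
```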
- __setattr__(name, value)#
- bridge.models.distillation_provider.convert_to_distillation_provider(student_provider: megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider, teacher_provider: megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider, kd_config: Optional[megatron.bridge.training.post_training.distillation.ModelOptDistillConfig] = None)#

Convert a given model provider to a DistillationProvider.
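The conversion pattern can be sketched in plain Python. The dataclasses and the `convert_to_distillation_provider_sketch` helper below are illustrative assumptions standing in for the real megatron.bridge providers; the real function additionally wires up the ModelOpt distillation machinery:

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class StudentProviderSketch:
    # Hypothetical stand-in for GPTModelProvider / MambaModelProvider.
    hidden_size: int = 512
    num_layers: int = 12


@dataclass
class DistillationProviderSketch(StudentProviderSketch):
    # Extra fields carried by the distillation provider.
    teacher: Optional[StudentProviderSketch] = None
    kd_config: Optional[dict] = None


def convert_to_distillation_provider_sketch(student, teacher, kd_config=None):
    # Copy the student's fields onto the distillation provider, then
    # attach the teacher provider and the optional KD config.
    provider = DistillationProviderSketch(**asdict(student))
    provider.teacher = teacher
    provider.kd_config = kd_config
    return provider


student = StudentProviderSketch(hidden_size=256)
teacher = StudentProviderSketch(hidden_size=1024, num_layers=24)
dp = convert_to_distillation_provider_sketch(student, teacher)
```

The converted provider keeps the student's configuration verbatim, which is why the run config can always be saved and re-converted from the original student provider.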