bridge.models.distillation_provider#

Module Contents#

Classes#

DistillationProvider

Provider for Megatron Core GPT models in distillation mode.

Functions#

convert_to_distillation_provider

Convert a given model provider to a DistillationProvider.

Data#

logger

API#

bridge.models.distillation_provider.logger#

‘getLogger(…)’

class bridge.models.distillation_provider.DistillationProvider(*args, **kwargs)#

Bases: megatron.bridge.models.transformer_config.TransformerConfig

Provider for Megatron Core GPT models in distillation mode.

Please use convert_to_distillation_provider() to create an instance of this class.

Initialization

teacher: Optional[megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider]#

None

kd_config: Optional[megatron.bridge.training.post_training.distillation.ModelOptDistillConfig]#

None

__post_init__()#

provide(
pre_process=None,
post_process=None,
vp_stage=None,
) → megatron.core.models.gpt.GPTModel#

Configure and instantiate a ModelOpt DistillationModel based on this configuration.

Parameters:
  • pre_process – Whether to include pre-processing in the model. If None, defaults to whether this is the first pipeline stage

  • post_process – Whether to include post-processing in the model. If None, defaults to whether this is the last pipeline stage

  • vp_stage – Virtual pipeline stage

Returns:

Configured ModelOpt DistillationModel instance

Return type:

MCoreGPTModel

to_cfg_dict() → dict[str, Any]#

Custom serialization method that saves this provider in a form equivalent to the original provider class.

Used by _ConfigContainerBase to serialize the main ConfigContainer to YAML. There is no need to restore a DistillationProvider from the run config file, as it can always be re-converted using the original student provider.

Returns:

Dictionary representation of this provider class
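The serialization idea described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `ToyStudentProvider`, `ToyDistillationProvider`, and the `_target_` key are hypothetical names chosen for illustration. The sketch only shows how a wrapper might emit a dictionary that looks like the original student provider, leaving out the distillation-only fields.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class ToyStudentProvider:
    """Hypothetical stand-in for a student provider (not a real library class)."""

    num_layers: int = 2
    hidden_size: int = 64


@dataclass
class ToyDistillationProvider(ToyStudentProvider):
    """Hypothetical wrapper that adds distillation-only fields."""

    teacher: Optional[Any] = None
    kd_config: Optional[Any] = None

    def to_cfg_dict(self) -> dict[str, Any]:
        # Record the *student* class as the target (the "_target_" key is an
        # assumption), so the saved config describes the original provider.
        base = ToyStudentProvider
        result: dict[str, Any] = {
            "_target_": f"{base.__module__}.{base.__qualname__}"
        }
        # Copy only the fields the student class declares; the teacher and
        # kd_config fields are intentionally left out of the output.
        for field in fields(base):
            result[field.name] = getattr(self, field.name)
        return result


provider = ToyDistillationProvider(num_layers=4, teacher=object())
cfg = provider.to_cfg_dict()
```

Because the dictionary names the student class and carries only its fields, a run config written this way can be reloaded as a plain student provider and then converted again if distillation is needed.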

__setattr__(name, value)#

bridge.models.distillation_provider.convert_to_distillation_provider(
student_provider: megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider,
teacher_provider: megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider,
kd_config: Optional[megatron.bridge.training.post_training.distillation.ModelOptDistillConfig] = None,
) → bridge.models.distillation_provider.DistillationProvider#

Convert a given model provider to a DistillationProvider.
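A minimal usage sketch follows, based only on the signature above. The import path `megatron.bridge.models.distillation_provider` is an assumption inferred from the fully qualified names on this page, and `make_distillation_provider` is a hypothetical helper name. The import is deferred so the helper can be defined even where the library is not installed.

```python
from typing import Any, Optional


def make_distillation_provider(
    student_provider: Any,
    teacher_provider: Any,
    kd_config: Optional[Any] = None,
) -> Any:
    """Hypothetical helper that wraps convert_to_distillation_provider()."""
    # Assumed import path; adjust it if the package layout differs.
    from megatron.bridge.models.distillation_provider import (
        convert_to_distillation_provider,
    )

    # Arguments are passed in the documented order: student, teacher, kd_config.
    return convert_to_distillation_provider(
        student_provider, teacher_provider, kd_config
    )
```

The student and teacher arguments are expected to be GPTModelProvider or MambaModelProvider instances, as the signature indicates. Leaving `kd_config` as None relies on whatever default behavior the library defines for that case.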