`bridge.models.distillation_provider`#

Module Contents#

Classes#

DistillationProvider

Provider for Megatron Core GPT models in distillation mode.

Functions#

convert_to_distillation_provider

Convert a given model provider to a DistillationProvider.

Data#

logger

API#

bridge.models.distillation_provider.logger#: ‘getLogger(…)’

class bridge.models.distillation_provider.DistillationProvider(*args, **kwargs)#

Bases: megatron.bridge.models.transformer_config.TransformerConfig

Provider for Megatron Core GPT models in distillation mode.

Please use convert_to_distillation_provider() to create an instance of this class.

Initialization

teacher: Optional[megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider]#: None

kd_config: Optional[megatron.bridge.training.post_training.distillation.ModelOptDistillConfig]#: None

__post_init__()#

provide( pre_process=None, post_process=None, vp_stage=None, ) → megatron.core.models.gpt.GPTModel#

Configure and instantiate a ModelOpt DistillationModel based on this configuration.

Parameters:

pre_process – Whether to include pre-processing in the model. If not given, defaults to True on the first pipeline stage.
post_process – Whether to include post-processing in the model. If not given, defaults to True on the last pipeline stage.
vp_stage – Virtual pipeline stage

Returns:

Configured ModelOpt DistillationModel instance

Return type:

MCoreGPTModel
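
The default handling of pre_process and post_process described in the parameter list can be sketched as follows. This is a minimal illustration under stated assumptions, not the provider's actual code: `resolve_stage_flags` is a hypothetical helper, and the `is_first_stage` / `is_last_stage` booleans stand in for whatever pipeline-parallel state the real implementation queries.

```python
from typing import Optional, Tuple


def resolve_stage_flags(
    pre_process: Optional[bool],
    post_process: Optional[bool],
    is_first_stage: bool,
    is_last_stage: bool,
) -> Tuple[bool, bool]:
    """Hypothetical helper: fill in unset flags from the pipeline position."""
    # An explicit True/False passed by the caller is kept unchanged.
    if pre_process is None:
        pre_process = is_first_stage
    if post_process is None:
        post_process = is_last_stage
    return pre_process, post_process


# On a first (but not last) pipeline stage with nothing specified,
# only pre-processing is enabled.
flags = resolve_stage_flags(None, None, is_first_stage=True, is_last_stage=False)
```

Taking the stage position as plain booleans keeps the sketch independent of any distributed runtime, so it can be run on a single process.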

to_cfg_dict() → dict[str, Any]#

Custom serialization method that saves this provider in a form equivalent to the original provider class.

Used by _ConfigContainerBase to serialize the main ConfigContainer to YAML. There is no need to restore a DistillationProvider from the run config file, as it can always be re-converted using the original student provider.

Returns:

Dictionary representation of this provider class
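
The idea of serializing a converted provider as if it were still the original student provider can be illustrated with plain dataclasses. Everything in this sketch is an assumption made for illustration: `StudentProvider` and `DistillProvider` are hypothetical stand-ins, and the `_target_` key is an assumed convention for recording the class to rebuild. It does not reproduce the library's implementation.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class StudentProvider:
    """Hypothetical stand-in for the original student provider."""

    num_layers: int = 2
    hidden_size: int = 64


@dataclass
class DistillProvider(StudentProvider):
    """Hypothetical stand-in for a provider converted to distillation mode."""

    teacher: Optional[StudentProvider] = None
    kd_config: Optional[dict] = None

    def to_cfg_dict(self) -> dict[str, Any]:
        # Record the student class as the thing to rebuild, and emit only
        # the student's own fields; teacher and kd_config are left out.
        base = StudentProvider
        cfg: dict[str, Any] = {"_target_": f"{base.__module__}.{base.__qualname__}"}
        for f in fields(base):
            cfg[f.name] = getattr(self, f.name)
        return cfg


provider = DistillProvider(num_layers=4, teacher=StudentProvider(num_layers=8))
cfg = provider.to_cfg_dict()
```

Because the dictionary describes only the student, a run config written this way can be loaded as a regular provider and converted again when distillation is needed.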

__setattr__(name, value)#

bridge.models.distillation_provider.convert_to_distillation_provider( student_provider: megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider, teacher_provider: megatron.bridge.models.gpt_provider.GPTModelProvider | megatron.bridge.models.mamba.mamba_provider.MambaModelProvider, kd_config: Optional[megatron.bridge.training.post_training.distillation.ModelOptDistillConfig] = None, ) → bridge.models.distillation_provider.DistillationProvider#

Convert a given model provider to a DistillationProvider.
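
A call to convert_to_distillation_provider might look like the sketch below. `make_distillation_provider` is a hypothetical wrapper introduced only for illustration, and the import path assumes the module is installed under the `megatron.bridge` package, as the fully qualified names in the signature suggest. The student and teacher providers are expected to be already-configured GPTModelProvider or MambaModelProvider instances; how they are built is outside the scope of this example.

```python
from typing import Any, Optional


def make_distillation_provider(
    student_provider: Any,
    teacher_provider: Any,
    kd_config: Optional[Any] = None,
) -> Any:
    """Hypothetical wrapper around convert_to_distillation_provider."""
    # Imported lazily so this sketch can be loaded without Megatron Bridge
    # installed. Assumption: the module lives under `megatron.bridge`.
    from megatron.bridge.models.distillation_provider import (
        convert_to_distillation_provider,
    )

    # kd_config is optional in the documented signature and defaults to None.
    return convert_to_distillation_provider(
        student_provider,
        teacher_provider,
        kd_config=kd_config,
    )
```

After conversion, the result exposes the teacher and kd_config attributes listed above, and its provide() method builds the distillation model.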
