bridge.models.deepseek.deepseek_v3_bridge
Module Contents
Classes
DeepSeekV3Bridge | Megatron Bridge for DeepSeek-V3.
API
- class bridge.models.deepseek.deepseek_v3_bridge.DeepSeekV3Bridge
Bases: megatron.bridge.models.conversion.model_bridge.MegatronModelBridge
Megatron Bridge for DeepSeek-V3.
- provider_bridge(
    hf_pretrained: megatron.bridge.models.hf_pretrained.causal_lm.PreTrainedCausalLM,
  )
- mapping_registry() → megatron.bridge.models.conversion.mapping_registry.MegatronMappingRegistry
- maybe_modify_converted_hf_weight(
    task: megatron.bridge.models.conversion.model_bridge.WeightConversionTask,
    converted_weights_dict: Dict[str, torch.Tensor],
    hf_state_dict: Mapping[str, torch.Tensor],
  )
Add rotary embedding inverse frequency parameter if needed.
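As an illustration of what "adding an inverse frequency parameter if needed" can look like, here is a minimal sketch. It is not the bridge's actual implementation. The helper names `rotary_inv_freq` and `add_inv_freq_if_needed`, the key pattern `model.layers.{i}.self_attn.rotary_emb.inv_freq`, the rotary dimension of 64, and the base of 10000 are all assumptions made for the example. Plain Python lists stand in for `torch.Tensor` values so the snippet runs without PyTorch.

```python
from typing import Dict, List, Mapping


def rotary_inv_freq(dim: int, base: float = 10000.0) -> List[float]:
    """Standard RoPE inverse frequencies: base ** (-i / dim) for even i in [0, dim)."""
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]


def add_inv_freq_if_needed(
    converted_weights: Dict[str, List[float]],
    hf_state_dict: Mapping[str, object],
    layer_idx: int,
    rope_dim: int = 64,  # assumed rotary dimension, for illustration only
) -> Dict[str, List[float]]:
    """Add the layer's inv_freq entry when the reference checkpoint has one."""
    # Hypothetical key pattern; the real checkpoint layout may differ.
    key = f"model.layers.{layer_idx}.self_attn.rotary_emb.inv_freq"
    # Only add the buffer if the reference state dict expects it and the
    # converted weights do not already contain it.
    if key in hf_state_dict and key not in converted_weights:
        converted_weights[key] = rotary_inv_freq(rope_dim)
    return converted_weights


# Usage: the reference checkpoint lists inv_freq for layer 0, so it is added.
reference = {"model.layers.0.self_attn.rotary_emb.inv_freq": None}
weights = add_inv_freq_if_needed({}, reference, layer_idx=0)
print(len(weights["model.layers.0.self_attn.rotary_emb.inv_freq"]))
```

The point of the check is that `inv_freq` is derived from configuration rather than learned, so a converter can regenerate it instead of mapping it from a Megatron parameter, and should skip it when the target checkpoint format does not store it.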