bridge.models.gemma.modules#
Module Contents#
Classes#
EmbeddingScalingMixin | A mixin class for scaling embeddings in Megatron GPT. The scaling is applied only if the configuration (accessible via self.config) has apply_embedding_scaling set to True.
Functions#
extend_instance | Apply mixins to a class instance after creation.
API#
- bridge.models.gemma.modules.extend_instance(obj, mixin)#
Apply mixins to a class instance after creation
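One common way to apply a mixin to an already-constructed object is to rebind the object's class to a dynamically created subclass that lists the mixin first. The sketch below illustrates that technique in plain Python. It is an assumption about how such a helper can work, not a copy of this module's implementation, and the names extend_instance_sketch, Greeter, and ShoutMixin are hypothetical.

```python
def extend_instance_sketch(obj, mixin):
    """Hypothetical stand-in for extend_instance: rebind obj's class in place."""
    base_cls = obj.__class__
    # Listing the mixin first puts it ahead of the original class in the MRO,
    # so its methods run first and can reach the originals via super().
    obj.__class__ = type(base_cls.__name__, (mixin, base_cls), {})


class Greeter:
    def greet(self):
        return "hello"


class ShoutMixin:
    def greet(self):
        # Delegate to the original class, then post-process its result.
        return super().greet().upper() + "!"


g = Greeter()
extend_instance_sketch(g, ShoutMixin)
print(g.greet())  # HELLO!
print(isinstance(g, Greeter), isinstance(g, ShoutMixin))  # True True
```

Only the instance passed in is affected. Other Greeter objects keep the original class and behavior.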
- class bridge.models.gemma.modules.EmbeddingScalingMixin#
Bases: torch.nn.Module
A mixin class for scaling embeddings in Megatron GPT. The scaling is applied only if the configuration (accessible via self.config) includes apply_embedding_scaling set to True.
- forward(**kwargs)#
Forward pass that scales the output embeddings from the forward method of the superclass by the square root of the hidden size specified in the configuration.
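The described behavior can be sketched without any dependencies. In the toy example below, plain Python lists stand in for tensors, and ToyEmbedding, ScalingMixinSketch, and ScaledToyEmbedding are hypothetical names. Treating a missing apply_embedding_scaling flag as False is also an assumption made for this sketch. The real class operates on torch tensors inside a torch.nn.Module.

```python
import math
from types import SimpleNamespace


class ToyEmbedding:
    """Hypothetical embedding layer: returns a vector of ones per token."""

    def __init__(self, config):
        self.config = config

    def forward(self, **kwargs):
        return [[1.0] * self.config.hidden_size for _ in kwargs["input_ids"]]


class ScalingMixinSketch:
    """Sketch of the described behavior, not the library's class."""

    def forward(self, **kwargs):
        embeddings = super().forward(**kwargs)
        # Scale only when the config opts in (missing flag treated as False,
        # which is an assumption of this sketch).
        if not getattr(self.config, "apply_embedding_scaling", False):
            return embeddings
        scale = math.sqrt(self.config.hidden_size)
        return [[value * scale for value in row] for row in embeddings]


class ScaledToyEmbedding(ScalingMixinSketch, ToyEmbedding):
    """The mixin comes first so its forward wraps ToyEmbedding.forward."""


config = SimpleNamespace(hidden_size=16, apply_embedding_scaling=True)
out = ScaledToyEmbedding(config).forward(input_ids=[3, 7])
print(out[0][:2])  # [4.0, 4.0]
```

With hidden_size set to 16 the scale factor is 4, so every unit entry becomes 4.0. Setting apply_embedding_scaling to False returns the superclass output unchanged.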