core.models.retro.encoder_spec#
Specs for Retro encoder.
Module Contents#
Functions#
| Function | Description |
| --- | --- |
| `get_retro_encoder_layer_te_spec` | Retro encoder TE spec (uses Transformer Engine components). |
| `get_retro_encoder_layer_local_spec` | Retro encoder local spec (uses Megatron-Core components). |
| `get_retro_encoder_block_spec` | Retro encoder block spec. |
API#
- core.models.retro.encoder_spec.get_retro_encoder_layer_te_spec() → megatron.core.transformer.ModuleSpec#
Retro encoder TE spec (uses Transformer Engine components).
A Retro encoder layer uses custom attention, bias-dropout-add, and layernorm operators to encode neighboring chunks that are retrieved from the chunk database. Each operator is responsible for iterating over the retrieved chunks and processing them individually.
- Returns:
A module spec using Transformer Engine modules.
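As a usage sketch (assuming Megatron-Core is installed and Transformer Engine is available), the returned `ModuleSpec` can be inspected before a model is built:

```python
from megatron.core.models.retro.encoder_spec import get_retro_encoder_layer_te_spec

# Build the Transformer Engine-backed layer spec for the Retro encoder.
layer_spec = get_retro_encoder_layer_te_spec()

# A ModuleSpec carries the layer class to instantiate plus the submodules
# (the attention, bias-dropout-add, and layernorm operators described above).
print(layer_spec.module)
print(layer_spec.submodules)
```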
- core.models.retro.encoder_spec.get_retro_encoder_layer_local_spec() → megatron.core.transformer.ModuleSpec#
Retro encoder local spec (uses Megatron-Core components).
A Retro encoder layer uses custom attention, bias-dropout-add, and layernorm operators to encode neighboring chunks that are retrieved from the chunk database. Each operator is responsible for iterating over the retrieved chunks and processing them individually.
- Returns:
A module spec using local (Megatron-Core) modules.
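A minimal sketch of choosing between the TE and local variants; the `choose_encoder_layer_spec` helper is hypothetical and only wraps the two documented calls:

```python
from megatron.core.transformer import ModuleSpec
from megatron.core.models.retro.encoder_spec import (
    get_retro_encoder_layer_local_spec,
    get_retro_encoder_layer_te_spec,
)

def choose_encoder_layer_spec(use_transformer_engine: bool) -> ModuleSpec:
    # Hypothetical helper: use the TE-backed spec when Transformer Engine
    # is available, otherwise fall back to Megatron-Core's local modules.
    if use_transformer_engine:
        return get_retro_encoder_layer_te_spec()
    return get_retro_encoder_layer_local_spec()

layer_spec = choose_encoder_layer_spec(use_transformer_engine=False)
```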
- core.models.retro.encoder_spec.get_retro_encoder_block_spec(
- config: megatron.core.models.retro.config.RetroConfig,
- use_transformer_engine: bool,
- ) → megatron.core.transformer.transformer_block.TransformerBlockSubmodules#
Retro encoder block spec.
The Retro encoder block consists of one customized Retro encoder layer (layer 1); all subsequent layers are standard GPT layers.
- Parameters:
config (RetroConfig) – Retro config.
use_transformer_engine (bool) – If True, use Transformer Engine (instead of local modules).
- Returns:
Transformer block submodules for the given spec.
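A sketch of building the full encoder block spec; the `RetroConfig` values below are placeholders, and real runs derive them from training arguments:

```python
from megatron.core.models.retro.config import RetroConfig
from megatron.core.models.retro.encoder_spec import get_retro_encoder_block_spec

# Placeholder config: only the core transformer sizes are set explicitly;
# Retro-specific fields (e.g. retro_encoder_num_layers) keep their defaults.
config = RetroConfig(
    num_layers=12,
    hidden_size=768,
    num_attention_heads=12,
)

# Layer 1 of the returned block is the customized Retro encoder layer;
# the remaining layers follow the standard GPT layer spec.
block_submodules = get_retro_encoder_block_spec(config, use_transformer_engine=False)
print(block_submodules.layer_specs)
```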