core.models.retro.encoder_spec

Specs for Retro encoder.

Module Contents

Functions

get_retro_encoder_layer_te_spec

Retro encoder TE spec (uses Transformer Engine components).

get_retro_encoder_layer_local_spec

Retro encoder local spec (uses Megatron-Core components).

get_retro_encoder_block_spec

Retro encoder block spec.

API

core.models.retro.encoder_spec.get_retro_encoder_layer_te_spec() → megatron.core.transformer.ModuleSpec

Retro encoder TE spec (uses Transformer Engine components).

A Retro encoder layer uses custom attention, bias-dropout-add, and layernorm operators to encode neighboring chunks that are retrieved from the chunk database. Each operator is responsible for iterating the retrieved chunks and processing them individually.

Returns:

A module spec using Transformer Engine modules.

core.models.retro.encoder_spec.get_retro_encoder_layer_local_spec() → megatron.core.transformer.ModuleSpec

Retro encoder local spec (uses Megatron-Core components).

A Retro encoder layer uses custom attention, bias-dropout-add, and layernorm operators to encode neighboring chunks that are retrieved from the chunk database. Each operator is responsible for iterating the retrieved chunks and processing them individually.

Returns:

A module spec using local (Megatron-Core) modules.
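Both getters return a `ModuleSpec`, which pairs a module class with its construction parameters and nested submodule specs. The sketch below uses a simplified stand-in for `megatron.core.transformer.ModuleSpec` and hypothetical marker classes for the custom Retro encoder operators (attention, bias-dropout-add, layernorm) described above; names and structure are illustrative assumptions, not the exact Megatron-Core definitions.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Simplified stand-in for megatron.core.transformer.ModuleSpec.
# The real class bundles a module class, construction params, and
# nested submodule specs; this mimic is for illustration only.
@dataclass
class ModuleSpec:
    module: Any                       # module class to instantiate
    params: dict = field(default_factory=dict)
    submodules: Optional[Any] = None  # nested specs (e.g. attention, MLP)

# Hypothetical marker classes standing in for the custom Retro
# encoder operators that iterate over the retrieved chunks.
class RetroEncoderCrossAttention: ...
class RetroEncoderBiasDropoutAdd: ...
class RetroEncoderLayerNorm: ...

# Sketch of what a Retro encoder layer spec bundles: custom
# attention, bias-dropout-add, and layernorm operators.
layer_spec = ModuleSpec(
    module=object,  # placeholder for the transformer-layer class
    submodules={
        "cross_attention": ModuleSpec(module=RetroEncoderCrossAttention),
        "cross_attn_bda": ModuleSpec(module=RetroEncoderBiasDropoutAdd),
        "pre_mlp_layernorm": ModuleSpec(module=RetroEncoderLayerNorm),
    },
)
```

The spec pattern keeps model structure declarative: swapping Transformer Engine components for local ones changes only which classes the spec names, not the layer-assembly code.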

core.models.retro.encoder_spec.get_retro_encoder_block_spec(
config: megatron.core.models.retro.config.RetroConfig,
use_transformer_engine: bool,
) → megatron.core.transformer.transformer_block.TransformerBlockSubmodules

Retro encoder block spec.

The Retro encoder block consists of one customized Retro encoder layer (layer 1); all following layers are standard GPT layers.

Parameters:
  • config (RetroConfig) – Retro config.

  • use_transformer_engine (bool) – If True, use Transformer Engine (instead of local modules).

Returns:

Transformer block submodules for the given spec.
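The layer layout described above (one customized Retro encoder layer first, standard GPT layers after) can be sketched with simplified stand-ins. `get_retro_encoder_block_spec_sketch`, its `num_layers` parameter, and the label names are assumptions for illustration; the real function takes a `RetroConfig` and returns specs built from actual module classes.

```python
from dataclasses import dataclass, field
from typing import List

# Simplified stand-ins; the real types live in megatron.core.transformer.
@dataclass
class ModuleSpec:
    name: str  # illustrative label only


@dataclass
class TransformerBlockSubmodules:
    layer_specs: List[ModuleSpec] = field(default_factory=list)


def get_retro_encoder_block_spec_sketch(
    num_layers: int, use_transformer_engine: bool
) -> TransformerBlockSubmodules:
    """Hedged sketch: layer 1 is the customized Retro encoder layer,
    and the remaining layers are standard GPT layers."""
    retro_layer = ModuleSpec(
        name="retro_encoder_te" if use_transformer_engine else "retro_encoder_local"
    )
    gpt_layer = ModuleSpec(
        name="gpt_te" if use_transformer_engine else "gpt_local"
    )
    # Layer 1 is customized; layers 2..num_layers are standard GPT layers.
    specs = [retro_layer] + [gpt_layer] * (num_layers - 1)
    return TransformerBlockSubmodules(layer_specs=specs)


block = get_retro_encoder_block_spec_sketch(num_layers=4, use_transformer_engine=True)
# block.layer_specs[0] is the Retro encoder layer; the rest are GPT layers.
```

The `use_transformer_engine` flag only selects which component family each layer spec names; the one-custom-layer-then-GPT-layers structure is the same either way.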