nemo_automodel.components.models.gpt2
GPT-2 model utility wrappers for NeMo Automodel.
The canonical way to instantiate a GPT-2 with custom sizes is to pass a
transformers.GPT2Config into NeMoAutoModelForCausalLM.from_config. For
YAML-driven workflows, however, specifying the entire nested config can be
verbose. This module provides a single-level builder function that exposes
the most common GPT-2 hyper-parameters directly.
Example (YAML):
Module Contents
Classes
Functions
Data
API
Bases: Module
Multi-head self-attention with a causal mask.
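To make the causal mask concrete, here is a dependency-free sketch of single-head scaled dot-product attention in which position t can only see positions 0..t. It is an illustration only, not the module's PyTorch implementation: the function name `causal_attention` is hypothetical, and the real class is multi-head and applies learned projections that are omitted here.

```python
import math


def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of T vectors (lists of floats). Hypothetical helper:
    no learned projections, no multi-head split, no dropout.
    """
    d = len(q[0])
    out = []
    for t, q_t in enumerate(q):
        # Causal mask: score only the current and earlier positions.
        scores = [
            sum(a * b for a, b in zip(q_t, k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the visible value vectors.
        out.append([
            sum(w * v[s][j] for s, w in enumerate(weights))
            for j in range(len(v[0]))
        ])
    return out


q = k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
result = causal_attention(q, k, v)
# The first position sees only itself, so its output equals v[0].
```

Because the first position attends to nothing but itself, changing later value vectors leaves its output unchanged; that is the property the mask exists to guarantee.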
Bases: Module
Minimal GPT-2 Causal-LM with tied input/output embeddings.
Parameter initialization following GPT-2 conventions.
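The tied input/output embeddings mentioned in the class description can be illustrated without any framework: one table is used both to look up token vectors and to score a hidden state against every vocabulary entry. The names `embed` and `lm_logits` and the toy 3-token table are hypothetical; the actual model shares a tensor between its embedding and output layers rather than passing lists around.

```python
def embed(table, token_id):
    """Input side: look up the vector for a token."""
    return table[token_id]


def lm_logits(table, hidden):
    """Output side: score a hidden state against every row of the SAME table."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in table]


# Toy vocabulary of 3 tokens with 2-dimensional embeddings (illustrative values).
table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = embed(table, 2)
scores = lm_logits(table, hidden)
```

Tying means there is one set of vocabulary weights instead of two, so an update to a token's embedding also changes how that token is scored at the output.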
Bases: Module
GPT-2 feed-forward network (Linear → GELU → Linear).
Bases: Module
A single transformer block (LN → Attn → Add → LN → MLP → Add).
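The ordering LN → Attn → Add → LN → MLP → Add is the pre-normalization residual pattern. The sketch below shows only that data flow for a single token vector, treating the two sub-layers as arbitrary callables. The names `layer_norm`, `add` and `block` are hypothetical, and the layer norm here has no learned scale or shift, unlike a real PyTorch layer.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]


def add(x, y):
    """Element-wise residual addition."""
    return [a + b for a, b in zip(x, y)]


def block(x, attn, mlp):
    """LN -> Attn -> Add -> LN -> MLP -> Add."""
    x = add(x, attn(layer_norm(x)))  # first residual branch
    x = add(x, mlp(layer_norm(x)))   # second residual branch
    return x


def zeros(h):
    """Stand-in sub-layer that contributes nothing."""
    return [0.0] * len(h)


x = [1.0, 2.0, 3.0]
passthrough = block(x, zeros, zeros)
```

With sub-layers that output zeros the block returns its input unchanged, which shows that the residual path carries the signal and each sub-layer only adds a correction to it.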
Instantiate and return a pure-PyTorch GPT-2 language model.
The function intentionally keeps the same signature as the original wrapper so that existing YAML/CLI configurations continue to work. Extra keyword arguments are silently ignored.
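The "flat arguments in, extras ignored" behaviour can be sketched as follows. Everything here is an assumption made for illustration: `TinyGPT2Config` and `build_model_sketch` are hypothetical names, the defaults are the GPT-2 small sizes rather than this module's documented defaults, and the sketch returns a config object instead of a model.

```python
from dataclasses import dataclass


@dataclass
class TinyGPT2Config:
    """Hypothetical stand-in for the nested model configuration."""
    vocab_size: int = 50257
    n_positions: int = 1024
    n_embd: int = 768
    n_layer: int = 12
    n_head: int = 12


def build_model_sketch(*, vocab_size=50257, n_positions=1024, n_embd=768,
                       n_layer=12, n_head=12, **extra):
    """Single-level builder: common sizes are plain keyword arguments.

    `extra` soaks up any unrecognized keys (for example, leftovers from an
    older YAML file) and is deliberately never used.
    """
    return TinyGPT2Config(
        vocab_size=vocab_size,
        n_positions=n_positions,
        n_embd=n_embd,
        n_layer=n_layer,
        n_head=n_head,
    )


# An unknown key such as `bos_token_id` is accepted and dropped.
cfg = build_model_sketch(n_layer=2, n_embd=64, n_head=4, bos_token_id=0)
```

Swallowing unknown keys keeps old configurations loadable, at the cost that a misspelled key (say `n_layers`) is also accepted without warning, so a stricter builder might log what it discards.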