NeMo-AutoModel - Home NeMo-AutoModel - Home

NeMo-AutoModel

  • GitHub

nemo_automodel.components.models#

Convenience model builders for NeMo Automodel.

Currently includes:

  • build_gpt2_model – returns a GPT-2 causal language model (Flash-Attention-2 by default).
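As a minimal sketch of how the builder might be located, the snippet below imports build_gpt2_model from this package and prints its signature rather than assuming one, since this page does not document the builder's parameters. The import is guarded so the snippet also runs where NeMo Automodel is not installed; the fallback message is illustrative only.

```python
import inspect

# Guarded import: the package may not be installed in every environment.
try:
    from nemo_automodel.components.models import build_gpt2_model
except ImportError:
    build_gpt2_model = None

if build_gpt2_model is None:
    print("NeMo Automodel is not installed; nothing to inspect.")
else:
    # Print the real signature instead of guessing at parameter names.
    print(inspect.signature(build_gpt2_model))
```

Inspecting the signature first avoids relying on argument names that this page does not state.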

Subpackages#

  • nemo_automodel.components.models.deepseek_v3
    • nemo_automodel.components.models.deepseek_v3.layers
    • nemo_automodel.components.models.deepseek_v3.model
    • nemo_automodel.components.models.deepseek_v3.state_dict_adapter
  • nemo_automodel.components.models.gpt_oss
    • nemo_automodel.components.models.gpt_oss.layers
    • nemo_automodel.components.models.gpt_oss.model
    • nemo_automodel.components.models.gpt_oss.state_dict_adapter

Submodules#

  • nemo_automodel.components.models.gpt2

Package Contents#

Data#

__all__

API#

nemo_automodel.components.models.__all__#

['build_gpt2_model']
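The effect of this `__all__` value on wildcard imports can be illustrated with a small stand-in module. The sketch below does not use the real package: `demo_models`, `other_public_name`, and the placeholder builder body are hypothetical names introduced only to show standard Python behavior, namely that `from module import *` binds only the names listed in `__all__`.

```python
import sys
import types

# Hypothetical stand-in module; it mimics only the __all__ value shown above.
demo = types.ModuleType("demo_models")
demo.__all__ = ["build_gpt2_model"]


def build_gpt2_model():
    """Placeholder body; the real builder's behavior is not reproduced here."""
    return "gpt2-placeholder"


def other_public_name():
    """Public-looking helper that is deliberately left out of __all__."""
    return None


demo.build_gpt2_model = build_gpt2_model
demo.other_public_name = other_public_name
sys.modules["demo_models"] = demo

# Run a wildcard import in a fresh namespace and collect what it bound.
namespace = {}
exec("from demo_models import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['build_gpt2_model']
```

Names outside `__all__` remain reachable through an explicit import; the list only limits what a wildcard import pulls in.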


Copyright © 2025, NVIDIA Corporation.