Important

NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Refer to the NeMo 2.0 overview for information on getting started.

SpeechLLM API

Model Classes

class nemo.collections.nlp.models.language_modeling.megatron_base_model.MegatronBaseModel(*args: Any, **kwargs: Any)

Bases: nemo.collections.nlp.models.nlp_model.NLPModel

Megatron base class. All NeMo Megatron models inherit from this class.

  • Initialize the model parallel world for NeMo.

  • Turn on all of the NVIDIA optimizations.

  • If cfg.tokenizer is available, load the tokenizer and pad the vocabulary to the correct size for tensor model parallelism.

  • If using the distributed optimizer, configure it to be compatible with O2-level optimizations and/or model parallelism.

  • Perform gradient clipping: grad_clip_pl_default triggers the PyTorch Lightning default implementation, with_distributed_adam triggers the distributed optimizer’s implementation, megatron_amp_O2 triggers gradient clipping on the main grads, and otherwise gradient clipping is performed on the model grads.
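Two of the steps above can be illustrated with a minimal sketch in plain Python. It is not NeMo's implementation: the helper names `padded_vocab_size` and `select_grad_clip_strategy`, the default divisor of 128, and the use of a plain dict with boolean flags are all assumptions made for illustration. The precedence of the gradient clipping flags is assumed to follow the order in which they are listed above.

```python
def padded_vocab_size(orig_vocab_size: int,
                      divisible_by: int = 128,
                      tensor_model_parallel_size: int = 1) -> int:
    """Round the vocabulary size up so it splits evenly across tensor-parallel ranks.

    Hypothetical helper: the divisor default (128) is an assumption.
    """
    multiple = divisible_by * tensor_model_parallel_size
    # Ceiling division, then scale back up to the next multiple.
    return ((orig_vocab_size + multiple - 1) // multiple) * multiple


def select_grad_clip_strategy(flags: dict) -> str:
    """Pick a gradient clipping path from boolean flags.

    Hypothetical helper: flag names come from the text above; the
    returned labels are illustrative only.
    """
    if flags.get("grad_clip_pl_default", False):
        return "pytorch_lightning_default"
    if flags.get("with_distributed_adam", False):
        return "distributed_optimizer"
    if flags.get("megatron_amp_O2", False):
        return "main_grads"
    return "model_grads"


# A 50257-token vocabulary split across 2 tensor-parallel ranks.
vocab = padded_vocab_size(50257, divisible_by=128, tensor_model_parallel_size=2)
strategy = select_grad_clip_strategy({"megatron_amp_O2": True})
```

With these inputs the vocabulary is padded to a multiple of 256, so each of the two ranks holds an equally sized shard of the embedding table.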

__init__(cfg: omegaconf.dictconfig.DictConfig, trainer: pytorch_lightning.trainer.trainer.Trainer, no_lm_init=True)

Base class from which all NeMo models should inherit

Parameters
  • cfg (DictConfig) –

    Configuration object. The cfg object may optionally contain the following sub-configs:

    • train_ds - to instantiate training dataset

    • validation_ds - to instantiate validation dataset

    • test_ds - to instantiate testing dataset

    • optim - to instantiate optimizer with learning rate scheduler

  • trainer (Optional) – PyTorch Lightning Trainer instance
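As a rough sketch of the shape of cfg described above, the following uses a plain nested dict in place of a DictConfig. Only the four sub-config names (train_ds, validation_ds, test_ds, optim) come from this page; every field inside them, every value, and the helper `present_subconfigs` are hypothetical and chosen purely for illustration.

```python
# Sub-config names listed in the parameter description above.
OPTIONAL_SUBCONFIGS = ("train_ds", "validation_ds", "test_ds", "optim")

# Hypothetical configuration: inner keys and values are assumptions,
# not fields required by NeMo.
example_cfg = {
    "train_ds": {"batch_size": 8, "shuffle": True},
    "validation_ds": {"batch_size": 8, "shuffle": False},
    "optim": {"name": "adam", "lr": 1e-4},
}


def present_subconfigs(cfg: dict) -> list:
    """Return the optional sub-configs that this cfg actually defines."""
    return [name for name in OPTIONAL_SUBCONFIGS if name in cfg]


def missing_subconfigs(cfg: dict) -> list:
    """Return the optional sub-configs that this cfg leaves out."""
    return [name for name in OPTIONAL_SUBCONFIGS if name not in cfg]


found = present_subconfigs(example_cfg)
absent = missing_subconfigs(example_cfg)
```

Because all four sub-configs are optional, a cfg like this one that omits test_ds is still valid; it simply provides nothing from which to instantiate a test dataset.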

Modules

Dataset Classes