Important
NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Please refer to the NeMo 2.0 overview for information on getting started.
Known Issues
Fixes for the following issues will be released shortly:
Use model.dist_ckpt_load_strictness=log_all when loading checkpoints trained with a Transformer Engine version earlier than v1.10 while running Transformer Engine v1.10 or later, to handle model structure updates (see the sketch after this list).
For data preparation of GPT models, use your own dataset or an online dataset legally approved by your organization.
Race condition in NeMo experiment manager
Mistral & Mixtral tokenizers require Hugging Face login
Gemma, Starcoder, and Falcon 7B export to TRT-LLM works only with a single GPU, and no descriptive error message is shown to the user when the export fails.
The following notebooks have functional issues and will be fixed in the next release:
ASR_with_NeMo.ipynb
ASR_with_Subword_Tokenization.ipynb
AudioTranslationSample.ipynb
Megatron_Synthetic_Tabular_Data_Generation.ipynb
SpellMapper_English_ASR_Customization.ipynb
FastPitch_ChineseTTS_Training.ipynb
NeVA Tutorial.ipynb
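The dist_ckpt_load_strictness setting mentioned above is a Hydra/OmegaConf-style config key. Below is a minimal sketch of applying it programmatically; the config file name and surrounding structure are hypothetical, and only the key and value come from this note. The same key can equally be appended as a command-line override when launching the affected training or fine-tuning script.
```python
# Minimal sketch (assumption): mirror the command-line override
# model.dist_ckpt_load_strictness=log_all by merging it into an experiment config
# with OmegaConf. "megatron_gpt_config.yaml" is a hypothetical config file name.
from omegaconf import OmegaConf

base_cfg = OmegaConf.load("megatron_gpt_config.yaml")   # experiment config (hypothetical path)
override = OmegaConf.from_dotlist(
    ["model.dist_ckpt_load_strictness=log_all"]          # log structure mismatches instead of failing
)
cfg = OmegaConf.merge(base_cfg, override)                # equivalent to passing the override on the CLI
```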
Export
Exporting Llama 70B to vLLM has an out-of-memory issue; root cause analysis is ongoing.
vLLM export does not support LoRA or P-tuning; LoRA support will be added in the next release.
In-framework (PyTorch-level) deployment with 8 GPUs returns an error; the cause is still being investigated.
Multimodal
LITA tutorial (tutorials/multimodal/LITA_Tutorial.ipynb): the data preparation part requires users to manually download the YouMakeup dataset instead of using the provided script.
The additional argument exp_manager.checkpoint_callback_params.save_nemo_on_train_end=True should be added to the NeVA notebook pretraining part to ensure the end-to-end workflow (see the sketch below).
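As a sketch of the NeVA pretraining workaround above, the checkpoint-callback flag can be set on the experiment-manager config before training starts. The surrounding config structure here is a stand-in; only the key exp_manager.checkpoint_callback_params.save_nemo_on_train_end comes from this note.
```python
# Minimal sketch (assumption): set the checkpoint-callback flag on the notebook's
# config object before launching pretraining. The config skeleton is illustrative.
from omegaconf import OmegaConf

cfg = OmegaConf.create({"exp_manager": {"checkpoint_callback_params": {}}})
cfg.exp_manager.checkpoint_callback_params.save_nemo_on_train_end = True  # write the .nemo file at the end of training
print(OmegaConf.to_yaml(cfg))
```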
ASR
Timestamps are misaligned with FastConformer ASR models when using diarization with the ASR decoder. Related issue: #8438