Important

NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Refer to the NeMo 2.0 overview for information on getting started.

Long Context Recipe

Long-context model training enhances the ability of large language models to handle long inputs. Longer sequence lengths benefit many NLP tasks, such as document-level summarization, long-document classification, and long-document question answering. NeMo Framework provides a recipe for training long-context models such as Llama-3, Mixtral, and Nemotron.