Important

NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Refer to the Migration Guide for information on getting started.

Access Long Context Recipe

NeMo 2.0 provides tested recipes for training long-context models. The recipes are available in the NeMo LLM recipes directory.

The following tables show which sequence lengths are supported for each size of the Llama 3, Mixtral, and Nemotron models.

Llama 3

Sequence Length    8B     70B
16k                Yes    Yes
64k                Yes    Yes

Mixtral

Sequence Length    8x3B   8x7B
16k                Yes    Yes
64k                Yes    Yes

Nemotron

Important

Nemotron is not yet supported in NeMo 2.0.
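The support tables above can be captured as a small lookup, which is handy for validating a model and sequence-length choice before launching a job. The following is a minimal sketch, not part of NeMo: the names SUPPORTED_SEQ_LENGTHS, is_supported, and recipe_name are hypothetical, and the "family_size_seqlen" naming pattern returned by recipe_name is an assumption that should be checked against the actual contents of the NeMo LLM recipes directory.

```python
# Hypothetical helper, not part of NeMo. The data mirrors the support tables above.
SUPPORTED_SEQ_LENGTHS = {
    "llama3": {"8b": ("16k", "64k"), "70b": ("16k", "64k")},
    "mixtral": {"8x3b": ("16k", "64k"), "8x7b": ("16k", "64k")},
    "nemotron": {},  # not yet supported in NeMo 2.0
}


def is_supported(family: str, size: str, seq_length: str) -> bool:
    """Return True if the tables list this family/size/sequence-length combination."""
    sizes = SUPPORTED_SEQ_LENGTHS.get(family.lower(), {})
    return seq_length.lower() in sizes.get(size.lower(), ())


def recipe_name(family: str, size: str, seq_length: str) -> str:
    """Build a recipe name such as 'llama3_8b_16k'.

    The naming pattern is an assumption for illustration; confirm it against
    the recipes directory before relying on it.
    """
    if not is_supported(family, size, seq_length):
        raise ValueError(
            f"No long-context recipe listed for {family} {size} at {seq_length}"
        )
    return f"{family.lower()}_{size.lower()}_{seq_length.lower()}"


if __name__ == "__main__":
    print(is_supported("llama3", "70B", "64k"))
    print(recipe_name("Mixtral", "8x7B", "16k"))
```

Keeping the matrix in one dictionary means that a newly supported model, such as a future Nemotron entry, only requires a data change rather than a code change.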