Note
Attention: Dedicated Container for StarCoder2
For StarCoder2 models, please use the nvcr.io/nvidia/nemo:24.05 container. Also check our StarCoder2 playbooks.
Checkpoint Conversion
NVIDIA provides scripts to convert external StarCoder2 checkpoints from HuggingFace format to .nemo format. The .nemo checkpoint is used for SFT, PEFT, and inference. NVIDIA also provides scripts to convert the .nemo format back to HuggingFace format.
Run the container using the following command:
docker run --gpus device=1 --shm-size=2g --net=host --ulimit memlock=-1 --rm -it -v ${PWD}:/workspace -w /workspace -v ${PWD}/results:/results nvcr.io/nvidia/nemo:24.05 bash
Convert the HuggingFace StarCoder2 model to a .nemo model:
python3 /opt/NeMo/scripts/checkpoint_converters/convert_starcoder2_hf_to_nemo.py \
--input_name_or_path /path/to/starcoder2/checkpoints/hf \
--output_path /path/to/starcoder2.nemo
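When the conversion is driven from a script rather than typed by hand, the command above can be assembled programmatically. The following is a minimal Python sketch, not part of NeMo. The helper name `build_hf_to_nemo_command` and the `.nemo` suffix check are illustrative assumptions; only the script path and the two flags are taken from the command above. It builds the argument list and prints it without executing anything.

```python
import shlex
from pathlib import PurePosixPath
from typing import List

# Converter location inside the nvcr.io/nvidia/nemo:24.05 container,
# copied from the command shown in this section.
CONVERTER_DIR = PurePosixPath("/opt/NeMo/scripts/checkpoint_converters")
HF_TO_NEMO = CONVERTER_DIR / "convert_starcoder2_hf_to_nemo.py"


def build_hf_to_nemo_command(hf_dir: str, nemo_file: str) -> List[str]:
    """Return the argv list for the HuggingFace -> .nemo conversion.

    Hypothetical helper: nothing is executed here.
    """
    # Assumption for this sketch: the output file should carry the .nemo suffix.
    if not nemo_file.endswith(".nemo"):
        raise ValueError(f"output path should end with .nemo: {nemo_file!r}")
    return [
        "python3",
        str(HF_TO_NEMO),
        "--input_name_or_path", hf_dir,
        "--output_path", nemo_file,
    ]


if __name__ == "__main__":
    cmd = build_hf_to_nemo_command(
        "/path/to/starcoder2/checkpoints/hf",
        "/path/to/starcoder2.nemo",
    )
    # Print the command in copy-pasteable form; it is not run.
    print(shlex.join(cmd))
```

To actually run the conversion inside the container, the returned list could be passed to `subprocess.run(cmd, check=True)`, which raises an exception if the converter exits with a non-zero status.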
The generated starcoder2.nemo file uses distributed checkpointing and can be loaded with any tensor parallel (tp) or pipeline parallel (pp) combination without reshaping/splitting.
Convert the StarCoder2 .nemo model to HuggingFace format:
python3 /opt/NeMo/scripts/checkpoint_converters/convert_starcoder2_nemo_to_hf.py \
--input_name_or_path /path/to/starcoder2/nemo/checkpoint \
--output_path /path/to/hf/folder
The generated HuggingFace checkpoint folder can be loaded with the HuggingFace transformers pipeline and uploaded to the Hub.
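As an illustration of loading the converted folder, here is a minimal Python sketch. It assumes the `transformers` package is installed, that the output folder contains a `config.json`, and that tokenizer files are also present in that folder. If the tokenizer is missing, it would have to be passed explicitly through the `tokenizer` argument of `pipeline`. The helper names `check_hf_folder` and `load_generator` are hypothetical.

```python
from pathlib import Path


def check_hf_folder(folder: str) -> Path:
    """Fail early if the folder does not look like a HuggingFace checkpoint.

    Assumption for this sketch: a usable checkpoint folder contains config.json.
    """
    path = Path(folder)
    if not (path / "config.json").is_file():
        raise FileNotFoundError(f"no config.json found in {path}")
    return path


def load_generator(folder: str):
    """Build a text-generation pipeline from the converted checkpoint folder."""
    # Imported lazily so the folder check works without transformers installed.
    from transformers import pipeline

    return pipeline("text-generation", model=str(check_hf_folder(folder)))


# Example usage (not executed here, since it needs a real checkpoint):
#   generator = load_generator("/path/to/hf/folder")
#   result = generator("def fibonacci(n):", max_new_tokens=32)
#   print(result[0]["generated_text"])
```

To publish the checkpoint, the loaded model exposes `push_to_hub`, for example `generator.model.push_to_hub("your-org/your-repo")`, where the repository name is a placeholder and a prior login to the HuggingFace Hub is required.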