Step #4: Convert the BigNLP Model from PyTorch to NeMo
The NeMo model may already have been uploaded to the NGC model registry. If it has been, some of this section can be skipped.
Attempt to download the NeMo model and upload it to your workspace using the commands below.
ngc registry model \
--org nv-launchpad-bc \
--team no-team \
download-version "nv-launchpad-bc/bignlp_model:nemo_126m_bf16_o2"
If you receive an error (see below), the model does not yet exist in the registry, and you will need to convert the checkpoint (.ckpt) file created in the previous step.
Error: 'nv-launchpad-bc/bignlp_model:nemo_126m_bf16_o2' could not be found.
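You can also check up front whether the version exists by listing it in the registry. This is only a sketch: the wildcard target and flag placement follow the pattern of the commands above, and the exact listing syntax and output depend on your NGC CLI version (see ngc registry model list --help).
ngc registry model list \
--org nv-launchpad-bc \
--team no-team \
"nv-launchpad-bc/bignlp_model:*"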
If the model downloads successfully, use the following command to upload it to your workspace, and move on to the next step.
ngc workspace upload \
--org nv-launchpad-bc \
--team no-team \
--ace nv-launchpad-bc-iad1 \
--source bignlp_model_vnemo_126m_bf16_o2/megatron_gpt_126m_bf16.nemo \
--destination results/gpt3_126m/convert_nemo/ \
jdoe_workspace
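If you want to script this check, the two commands above can be combined into a small shell wrapper. This is only a sketch that reuses the exact commands and paths from this step, and it assumes the NGC CLI returns a non-zero exit code when the requested model version is missing.
#!/bin/bash
# Sketch: try to pull the NeMo model from the registry; if the pull succeeds,
# push it into the workspace, otherwise fall back to the conversion job below.
set -euo pipefail

if ngc registry model \
    --org nv-launchpad-bc \
    --team no-team \
    download-version "nv-launchpad-bc/bignlp_model:nemo_126m_bf16_o2"; then
  ngc workspace upload \
    --org nv-launchpad-bc \
    --team no-team \
    --ace nv-launchpad-bc-iad1 \
    --source bignlp_model_vnemo_126m_bf16_o2/megatron_gpt_126m_bf16.nemo \
    --destination results/gpt3_126m/convert_nemo/ \
    jdoe_workspace
else
  echo "Model version not found in the registry; run the conversion job below."
fi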
Otherwise, follow the steps below to convert the model. The trained checkpoint (.ckpt) is in PyTorch format, while the inference examples later use the NeMo format.
Convert from PyTorch format to NeMo format.
Set tensor_model_parallel_size to 1 for the 126m model, 2 for the 5b model, and 8 for the 20b model. The NGC CLI command for this conversion is shown below; the job should take no more than five minutes.
ngc batch run \
--name "bignlp_convert_126m_bf16_to_nemo" \
--org nv-launchpad-bc \
--team no-team \
--ace nv-launchpad-bc-iad1 \
--instance dgxa100.80g.8.norm \
--replicas 2 \
--array-type PYTORCH \
--image "nvcr.io/nv-launchpad-bc/bignlp-training:22.02-py3" \
--result /results \
--workspace jdoe_workspace:/mount_workspace:RW \
--total-runtime 5m \
--commandline "\
set -x && \
python3 /opt/bignlp/bignlp-scripts/main.py \
cluster_type=bcp \
run_data_preparation=False \
run_training=False \
run_conversion=True \
run_evaluation=False \
conversion=convert_gpt3 \
bignlp_path=/opt/bignlp/bignlp-scripts \
base_results_dir=/mount_workspace/results \
conversion.run.model_train_name=gpt3_126m \
conversion.run.train_dir=/mount_workspace/results/gpt3_126m \
conversion.run.output_path=/mount_workspace/results/gpt3_126m/convert_nemo \
conversion.run.results_dir=/mount_workspace/results/gpt3_126m/convert_nemo \
conversion.run.nemo_file_name=megatron_gpt_126m_bf16.nemo \
conversion.model.checkpoint_folder=/mount_workspace/results/gpt3_126m/checkpoints \
conversion.model.checkpoint_name=megatron_gpt*-last.ckpt \
conversion.model.vocab_file=/mount_workspace/data/bpe/vocab.json \
conversion.model.merge_file=/mount_workspace/data/bpe/merges.txt \
conversion.model.tensor_model_parallel_size=1 \
> >(tee -a /mount_workspace/results/gpt3_126m/convert_nemo/convert_bf16.log) 2>&1 && \
rsync -P -rvh /mount_workspace/results/gpt3_126m /results"
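To convert one of the larger models, change the model name and the tensor-parallel size together in the command above. The sketch below shows only that mapping; the gpt3_5b and gpt3_20b names follow the gpt3_<size> pattern used above and should be checked against your own training run's directory names.
#!/bin/bash
# Sketch: pick the tensor-parallel size that matches the model being converted.
MODEL_SIZE=126m            # one of: 126m, 5b, 20b

case "${MODEL_SIZE}" in
  126m) TP_SIZE=1 ;;
  5b)   TP_SIZE=2 ;;
  20b)  TP_SIZE=8 ;;
  *)    echo "Unknown model size: ${MODEL_SIZE}" >&2; exit 1 ;;
esac

# Substitute these into the ngc batch run command above, e.g.:
#   conversion.run.model_train_name=gpt3_${MODEL_SIZE}
#   conversion.model.tensor_model_parallel_size=${TP_SIZE}
echo "gpt3_${MODEL_SIZE}: tensor_model_parallel_size=${TP_SIZE}"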
The job results and the workspace should now contain the NeMo model. The NeMo-formatted model can optionally be uploaded to the NGC model registry as well.
(Optional) Download the NeMo-formatted file from the workspace, then upload it to the NGC model registry.
ngc workspace download \
--org nv-launchpad-bc \
--team no-team \
--ace nv-launchpad-bc-iad1 \
--file results/gpt3_126m/convert_nemo/megatron_gpt_126m_bf16.nemo \
jdoe_workspace
(Optional) Use the NGC web UI to upload the downloaded model to the model registry. Refer to the documentation for steps to upload models to the model registry.
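If you prefer the CLI over the web UI, ngc registry model upload-version can push the file to the registry. The command below is only a sketch: it assumes the bignlp_model entry already exists in your org (created via the web UI or ngc registry model create), that the file was downloaded to results/gpt3_126m/convert_nemo/ locally (adjust --source to your actual download location), and that the flag names match your NGC CLI version (check ngc registry model upload-version --help).
ngc registry model upload-version \
--org nv-launchpad-bc \
--team no-team \
--source results/gpt3_126m/convert_nemo/megatron_gpt_126m_bf16.nemo \
"nv-launchpad-bc/bignlp_model:nemo_126m_bf16_o2"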