Parameter Efficient Fine-Tuning (PEFT)

User Guide (Latest Version)

The PEFT stage can execute various PEFT methods, such as P-Tuning, LoRA, Adapters, and IA3, within a single stage by configuring different PEFT schemes. This functionality is implemented using the adapter_mixins framework, which ensures a consistent style. Additionally, the mix-and-match PEFT scheme, like adapter_and_ptuning, can be easily extended to include combinations like ia3_and_ptuning or lora_and_ptuning.
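For orientation, the method is selected by a single configuration value, set on the command line as peft.model.peft.peft_scheme in the examples below. A sketch of the accepted values (the combined scheme names follow the source's own x_and_ptuning pattern; verify exact strings against your launcher version):

```yaml
# Sketch: values accepted by peft_scheme (peft.model.peft.peft_scheme on the CLI).
# Single methods:
#   "ptuning", "lora", "adapter", "ia3"
# Mix-and-match combinations:
#   "adapter_and_ptuning", "ia3_and_ptuning", "lora_and_ptuning"
peft_scheme: "ptuning"
```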

The feature that allowed P-Tuning to insert prompt tokens anywhere in the input is no longer necessary and has been removed to simplify the process.

  1. To specify the configuration for P-Tuning (or LoRA, Adapter, or IA3) learning, include all the run parameters to define the job-specific config:

    run:
      name: ${.task_name}_${.model_train_name}
      time_limit: "04:00:00"
      dependency: "singleton"
      convert_name: convert_nemo
      model_train_name: nemotron
      task_name: "squad"
      results_dir: ${base_results_dir}/${.model_train_name}/ptuning_${.task_name}

  2. To specify which language model checkpoint to load and how it is defined, use the model parameter. For Nemotron 340B, use the following model parallel settings for PEFT:

    model:
      language_model_path: ${base_results_dir}/${peft.run.model_train_name}/${peft.run.convert_name}/nemotron.nemo
      tensor_model_parallel_size: 8
      pipeline_model_parallel_size: 3
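As a quick sanity check on these settings (a sketch, not launcher code): tensor and pipeline parallelism multiply, so one replica of the model at TP=8 and PP=3 spans 24 GPUs, i.e. three 8-GPU nodes, and any remaining GPUs in the job form data-parallel replicas.

```python
# Sketch: verify a GPU allocation can host the model-parallel settings above
# (TP=8, PP=3). Function names are illustrative, not part of the launcher.
def gpus_per_replica(tensor_model_parallel_size: int,
                     pipeline_model_parallel_size: int) -> int:
    """One model replica occupies TP * PP GPUs."""
    return tensor_model_parallel_size * pipeline_model_parallel_size

def data_parallel_size(total_gpus: int, tp: int, pp: int) -> int:
    """Remaining GPUs form data-parallel replicas; total must divide evenly."""
    per_replica = gpus_per_replica(tp, pp)
    if total_gpus % per_replica:
        raise ValueError(f"{total_gpus} GPUs not divisible by TP*PP={per_replica}")
    return total_gpus // per_replica

print(gpus_per_replica(8, 3))        # -> 24 GPUs per replica
print(data_parallel_size(48, 8, 3))  # -> 2 data-parallel replicas on 48 GPUs
```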

  3. Set the configuration for a Slurm cluster in the conf/cluster/bcm.yaml file:

    partition: null
    account: null
    exclusive: True
    gpus_per_task: null
    gpus_per_node: 8
    mem: 0
    overcommit: False
    job_name_prefix: "nemo-megatron-"
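For intuition only, here is a rough sketch of how these keys could translate into sbatch flags when the launcher submits a job; the actual submission logic lives in the launcher, and this mapping is an illustration, not its implementation:

```python
# Illustrative mapping (an assumption, not launcher code) from conf/cluster/bcm.yaml
# keys to Slurm sbatch flags. job_name_prefix is prepended to the stage name.
cluster = {
    "exclusive": True,
    "gpus_per_node": 8,
    "mem": 0,
    "overcommit": False,
    "job_name_prefix": "nemo-megatron-",
}

def sbatch_flags(cfg: dict, task_name: str) -> list[str]:
    flags = []
    if cfg["exclusive"]:
        flags.append("--exclusive")          # no node sharing with other jobs
    flags.append(f"--gpus-per-node={cfg['gpus_per_node']}")
    flags.append(f"--mem={cfg['mem']}")      # mem=0 requests all node memory
    if cfg["overcommit"]:
        flags.append("--overcommit")
    flags.append(f"--job-name={cfg['job_name_prefix']}{task_name}")
    return flags

print(sbatch_flags(cluster, "peft"))
```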

  4. To run only the PEFT pipeline, while excluding the data preparation, training, conversion, and inference pipelines, set the conf/config.yaml file to:

    stages:
      - peft

  5. Next, run the following Python script:

    python3 main.py \
        peft=nemotron/squad \
        stages=["peft"] \
        peft.model.peft.peft_scheme="ptuning" \
        peft.model.megatron_amp_O2=True \
        peft.model.restore_from_path=${LANGUAGE_MODEL_PATH} \
        peft.exp_manager.exp_dir=${BASE_RESULTS_DIR}/${RUN_NAME}/ptuning

To run the P-Tuning learning script on the Base Command Platform, you need to set the cluster_type parameter in the conf/config.yaml file to either bcp or interactive. Alternatively, you can override this setting directly from the command line using Hydra.

To run the P-Tuning pipeline on a converted Nemotron checkpoint, use the following command:

    export HYDRA_FULL_ERROR=1
    export TORCH_CPP_LOG_LEVEL=INFO NCCL_DEBUG=INFO

    TRAIN="[/mount/workspace/databricks-dolly-15k-train.jsonl]"
    VALID="[/mount/workspace/databricks-dolly-15k-val.jsonl]"
    VALID_NAMES="[peft-squad]"
    CONCAT_SAMPLING_PROBS="[1]"
    PEFT_SCHEME="ptuning"
    PEFT_EXP_DIR="/results/nemo_launcher/ptuning"
    LOG_DIR="/results/nemo_launcher/ptuning_log"
    TP_SIZE=8
    PP_SIZE=3

    python3 /opt/NeMo-Framework-Launcher/launcher_scripts/main.py \
        peft=nemotron/squad \
        stages=[peft] \
        cluster_type=interactive \
        launcher_scripts_path=/opt/NeMo-Framework-Launcher/launcher_scripts \
        peft.model.peft.peft_scheme=${PEFT_SCHEME} \
        peft.trainer.precision=bf16 \
        peft.trainer.max_steps=100 \
        peft.trainer.devices=2 \
        peft.trainer.val_check_interval=10 \
        peft.model.megatron_amp_O2=True \
        peft.model.restore_from_path=/mount/workspace/nemotron.nemo \
        peft.model.tensor_model_parallel_size=${TP_SIZE} \
        peft.model.pipeline_model_parallel_size=${PP_SIZE} \
        peft.model.optim.lr=1e-4 \
        peft.model.answer_only_loss=True \
        peft.model.data.train_ds.file_names=${TRAIN} \
        peft.model.data.train_ds.micro_batch_size=1 \
        peft.model.data.train_ds.global_batch_size=32 \
        peft.model.data.train_ds.concat_sampling_probabilities=${CONCAT_SAMPLING_PROBS} \
        peft.model.data.validation_ds.micro_batch_size=1 \
        peft.model.data.validation_ds.global_batch_size=32 \
        peft.model.data.validation_ds.file_names=${VALID} \
        peft.model.data.validation_ds.names=${VALID_NAMES} \
        peft.model.data.test_ds.micro_batch_size=1 \
        peft.model.data.test_ds.global_batch_size=128 \
        peft.model.data.train_ds.num_workers=0 \
        peft.model.data.validation_ds.num_workers=0 \
        peft.model.data.test_ds.num_workers=0 \
        peft.model.data.validation_ds.metric.name=loss \
        peft.model.data.test_ds.metric.name=loss \
        peft.exp_manager.exp_dir=${PEFT_EXP_DIR} \
        peft.exp_manager.explicit_log_dir=${LOG_DIR} \
        peft.exp_manager.resume_if_exists=True \
        peft.exp_manager.resume_ignore_no_checkpoint=True \
        peft.exp_manager.create_checkpoint_callback=True \
        peft.exp_manager.checkpoint_callback_params.monitor=validation_loss

The above command presumes that you’ve mounted the data workspace at /mount/workspace/ and the results workspace at /results. The sample script uses the databricks-dolly-15k dataset.
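The train and validation files passed via peft.model.data.train_ds.file_names and peft.model.data.validation_ds.file_names are JSONL: one JSON object per line. As a sketch, a prompt/completion dataset like databricks-dolly-15k is typically laid out with "input" and "output" fields; verify the exact field names against the data preparation stage you used:

```python
import json

# Sketch of the JSONL layout consumed by the data loaders. The "input"/"output"
# field names are an assumption to check against your converted dataset;
# the file path here is local and illustrative.
records = [
    {"input": "Which is a species of fish? Tope or Rope", "output": "Tope"},
    {"input": "Why can camels survive for long without water?",
     "output": "Camels store fat in their humps, which they metabolize "
               "for water and energy."},
]

with open("dolly-train-sample.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Read back: each line parses independently as one training example.
with open("dolly-train-sample.jsonl") as f:
    parsed = [json.loads(line) for line in f]
print(len(parsed))  # -> 2
```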

For different PEFT jobs, you need to specify different directories for peft.exp_manager.exp_dir. The standard output (stdout) and standard error (stderr) will be redirected to /results/nemo_launcher/ptuning_log, enabling you to download the logs from NVIDIA NGC. You can also add any other parameter to the command to alter its functionality.

Last updated on Jun 19, 2024.