The PEFT stage can execute various PEFT methods, such as P-Tuning, LoRA, Adapters, and IA3, within a single stage by configuring different PEFT schemes. This functionality is implemented using the adapter_mixins framework, which ensures a consistent style. Additionally, the mix-and-match PEFT scheme, like adapter_and_ptuning, can be easily extended to include combinations like ia3_and_ptuning or lora_and_ptuning.
The feature that allowed P-Tuning to insert prompt tokens anywhere in the input is no longer necessary and has been removed to simplify the process.
To specify the configuration for P-Tuning (or LoRA, Adapter, or IA3) learning, include all of the run parameters that define the job-specific config:
run:
  name: ${.task_name}_${.model_train_name}
  time_limit: "04:00:00"
  dependency: "singleton"
  convert_name: convert_nemo
  model_train_name: nemotron
  task_name: "squad"
  results_dir: ${base_results_dir}/${.model_train_name}/ptuning_${.task_name}
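The `${.task_name}` and `${.model_train_name}` entries are OmegaConf-style interpolations resolved relative to the `run` section. As a rough sketch in plain Python (the dictionary mirrors the config above; the `base_results_dir` value is assumed for illustration), the resolved fields look like this:

```python
# Rough sketch of how the relative interpolations above resolve.
# The values mirror the run section; this helper code is illustrative,
# not part of the launcher.
run = {"task_name": "squad", "model_train_name": "nemotron"}
base_results_dir = "/results"  # assumed value for illustration

name = f"{run['task_name']}_{run['model_train_name']}"
results_dir = f"{base_results_dir}/{run['model_train_name']}/ptuning_{run['task_name']}"

print(name)         # squad_nemotron
print(results_dir)  # /results/nemotron/ptuning_squad
```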
To specify which language model checkpoint to load and its definition, use the model parameter. For Nemotron 340B, use the following model parallel settings for PEFT:
model:
  language_model_path: ${base_results_dir}/${peft.run.model_train_name}/${peft.run.convert_name}/nemotron.nemo
  tensor_model_parallel_size: 8
  pipeline_model_parallel_size: 3
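With tensor parallelism 8 and pipeline parallelism 3, a single model replica spans 8 × 3 = 24 GPUs, i.e. three 8-GPU nodes. A minimal sketch of that arithmetic (the helper function is ours, not a launcher API):

```python
def gpus_per_replica(tensor_parallel: int, pipeline_parallel: int) -> int:
    """GPUs needed to hold one model replica under TP x PP parallelism."""
    return tensor_parallel * pipeline_parallel

# Nemotron 340B PEFT settings from above: TP=8, PP=3
gpus = gpus_per_replica(8, 3)
print(gpus)       # 24 GPUs per replica
print(gpus // 8)  # 3 nodes at 8 GPUs per node
```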
Set the configuration for a Slurm cluster in the conf/cluster/bcm.yaml file:
partition: null
account: null
exclusive: True
gpus_per_task: null
gpus_per_node: 8
mem: 0
overcommit: False
job_name_prefix: "nemo-megatron-"
To run only the PEFT pipeline, while excluding the data preparation, training, conversion, and evaluation pipelines, set the conf/config.yaml file to:
stages:
- peft
Next, run the following Python script:
python3 main.py \
peft=nemotron/squad \
stages=["peft"] \
peft.model.peft.peft_scheme="ptuning" \
peft.model.megatron_amp_O2=True \
peft.model.restore_from_path=${LANGUAGE_MODEL_PATH} \
peft.exp_manager.exp_dir=${BASE_RESULTS_DIR}/${RUN_NAME}/ptuning
To run the P-Tuning learning script on the Base Command Platform, set the cluster_type parameter in the conf/config.yaml file to either bcp or interactive. Alternatively, you can override this setting directly from the command line using Hydra.
To run the P-Tuning pipeline on a converted Nemotron checkpoint, use the following command:
export HYDRA_FULL_ERROR=1
export TORCH_CPP_LOG_LEVEL=INFO NCCL_DEBUG=INFO
TRAIN="[/mount/workspace/databricks-dolly-15k-train.jsonl]"
VALID="[/mount/workspace/databricks-dolly-15k-val.jsonl]"
VALID_NAMES="[peft-squad]"
CONCAT_SAMPLING_PROBS="[1]"
PEFT_SCHEME="ptuning"
PEFT_EXP_DIR="/results/nemo_launcher/ptuning"
LOG_DIR="/results/nemo_launcher/ptuning_log"
TP_SIZE=8
PP_SIZE=3
python3 /opt/NeMo-Framework-Launcher/launcher_scripts/main.py \
peft=nemotron/squad \
stages=[peft] \
cluster_type=interactive \
launcher_scripts_path=/opt/NeMo-Framework-Launcher/launcher_scripts \
peft.model.peft.peft_scheme=${PEFT_SCHEME} \
peft.trainer.precision=bf16 \
peft.trainer.max_steps=100 \
peft.trainer.devices=2 \
peft.trainer.val_check_interval=10 \
peft.model.megatron_amp_O2=True \
peft.model.restore_from_path=/mount/workspace/nemotron.nemo \
peft.model.tensor_model_parallel_size=${TP_SIZE} \
peft.model.pipeline_model_parallel_size=${PP_SIZE} \
peft.model.optim.lr=1e-4 \
peft.model.answer_only_loss=True \
peft.model.data.train_ds.file_names=${TRAIN} \
peft.model.data.train_ds.micro_batch_size=1 \
peft.model.data.train_ds.global_batch_size=32 \
peft.model.data.train_ds.concat_sampling_probabilities=${CONCAT_SAMPLING_PROBS} \
peft.model.data.validation_ds.micro_batch_size=1 \
peft.model.data.validation_ds.global_batch_size=32 \
peft.model.data.validation_ds.file_names=${VALID} \
peft.model.data.validation_ds.names=${VALID_NAMES} \
peft.model.data.test_ds.micro_batch_size=1 \
peft.model.data.test_ds.global_batch_size=128 \
peft.model.data.train_ds.num_workers=0 \
peft.model.data.validation_ds.num_workers=0 \
peft.model.data.test_ds.num_workers=0 \
peft.model.data.validation_ds.metric.name=loss \
peft.model.data.test_ds.metric.name=loss \
peft.exp_manager.exp_dir=${PEFT_EXP_DIR} \
peft.exp_manager.explicit_log_dir=${LOG_DIR} \
peft.exp_manager.resume_if_exists=True \
peft.exp_manager.resume_ignore_no_checkpoint=True \
peft.exp_manager.create_checkpoint_callback=True \
peft.exp_manager.checkpoint_callback_params.monitor=validation_loss
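The batch-size overrides above must stay consistent: global_batch_size has to be divisible by micro_batch_size times the data-parallel size, and the quotient is the number of gradient-accumulation steps. A quick sanity-check sketch (the function is illustrative, not a launcher API; the data-parallel size of 1 is an assumption for this example):

```python
def accumulation_steps(global_bs: int, micro_bs: int, dp_size: int) -> int:
    """Gradient-accumulation steps implied by a batch configuration."""
    if global_bs % (micro_bs * dp_size) != 0:
        raise ValueError("global batch size must divide micro_bs * dp_size evenly")
    return global_bs // (micro_bs * dp_size)

# train_ds settings above: global_batch_size=32, micro_batch_size=1,
# assuming a data-parallel size of 1
print(accumulation_steps(32, 1, 1))  # 32
```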
The above command presumes that you have mounted the data workspace at /mount/workspace/ and the results workspace at /results. The sample script uses the databricks-dolly-15k dataset. For different PEFT jobs, specify different directories for peft.exp_manager.exp_dir. Standard output (stdout) and standard error (stderr) are redirected to /results/nemo_launcher/ptuning_log, enabling you to download the logs from NVIDIA NGC. You can also add any other parameter to the command to alter its behavior.
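One simple convention for keeping PEFT jobs separate, sketched below with hypothetical paths, is to derive exp_dir from the PEFT scheme so that checkpoints from different jobs never collide:

```python
# Illustrative only: derive a distinct results directory per PEFT scheme.
base = "/results/nemo_launcher"  # assumed results mount from the example above
schemes = ["ptuning", "lora", "adapter", "ia3"]
exp_dirs = {scheme: f"{base}/{scheme}" for scheme in schemes}
print(exp_dirs["ptuning"])  # /results/nemo_launcher/ptuning
```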