Parameter Efficient Fine-Tuning (PEFT)

To run PEFT, update conf/config.yaml:

```yaml
defaults:
  - peft: llama/squad

stages:
  - peft
```

For CodeLlama, replace llama/squad with code_llama/human_eval.

Execute the launcher pipeline: python3 main.py

Configuration

Default configurations for PEFT with squad can be found in conf/peft/llama/squad.yaml. The fine-tuning configuration is divided into four sections: run, trainer, exp_manager, and model.

```yaml
run:
  name: peft_llama_8b
  time_limit: "04:00:00"
  dependency: "singleton"
  convert_name: convert_nemo
  model_train_name: llama3_8b
  convert_dir: ${base_results_dir}/${peft.run.model_train_name}/${peft.run.convert_name}
  task_name: "squad"
  results_dir: ${base_results_dir}/${.model_train_name}/peft_${.task_name}
```

Set the number of nodes and devices for fine-tuning:

```yaml
trainer:
  num_nodes: 1
  devices: 8
```
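The total number of GPUs used for fine-tuning is the product of these two settings. A minimal sketch (the function name is illustrative, not part of the launcher's API):

```python
def world_size(num_nodes: int, devices: int) -> int:
    """Total number of GPUs used for fine-tuning: nodes x devices per node."""
    return num_nodes * devices

# With the configuration above: 1 node x 8 devices per node
print(world_size(1, 8))  # -> 8
```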

```yaml
model:
  restore_from_path: ${peft.run.convert_dir}/results/megatron_llama.nemo
```

restore_from_path sets the path to the .nemo checkpoint on which to run fine-tuning.

Set the tensor parallel and pipeline parallel sizes for different model sizes.

For 8B/13B PEFT:

```yaml
model:
  tensor_model_parallel_size: 2
  pipeline_model_parallel_size: 1
```

For 70B PEFT:

```yaml
model:
  tensor_model_parallel_size: 8
  pipeline_model_parallel_size: 1
```
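In Megatron-style parallelism, the product of the tensor and pipeline parallel sizes must evenly divide the total number of GPUs; the remaining factor becomes the number of data-parallel replicas. A small sketch of that arithmetic (the helper is illustrative, not part of the launcher):

```python
def data_parallel_size(num_gpus: int, tp: int, pp: int) -> int:
    """Number of data-parallel replicas given tensor/pipeline parallel sizes."""
    model_parallel = tp * pp
    if num_gpus % model_parallel != 0:
        raise ValueError(
            f"{num_gpus} GPUs is not divisible by tp * pp = {model_parallel}"
        )
    return num_gpus // model_parallel

# 8B/13B on one 8-GPU node (tp=2, pp=1): 4 data-parallel replicas
print(data_parallel_size(8, 2, 1))  # -> 4
# 70B on one 8-GPU node (tp=8, pp=1): 1 data-parallel replica
print(data_parallel_size(8, 8, 1))  # -> 1
```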

Set the PEFT-specific configuration:

```yaml
model:
  peft:
    peft_scheme: "lora"
```

peft_scheme sets the fine-tuning scheme to be used. Supported schemes include: lora, adapter, ia3, ptuning.
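A misspelled scheme name is easy to catch before launch. The following sketch mirrors the list of supported schemes above; the helper itself is hypothetical, not a launcher API:

```python
# Schemes listed in the documentation above; illustrative validation only.
SUPPORTED_PEFT_SCHEMES = {"lora", "adapter", "ia3", "ptuning"}

def validate_peft_scheme(scheme: str) -> str:
    """Raise early if peft_scheme is not one of the supported values."""
    if scheme not in SUPPORTED_PEFT_SCHEMES:
        raise ValueError(
            f"Unsupported peft_scheme {scheme!r}; "
            f"choose one of {sorted(SUPPORTED_PEFT_SCHEMES)}"
        )
    return scheme

print(validate_peft_scheme("lora"))  # -> lora
```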

© Copyright 2023-2024, NVIDIA. Last updated on May 17, 2024.