peft/automodel#
This step trains a low-rank adaptation (LoRA) adapter on top of a Hugging Face (HF) format base model by using the NeMo AutoModel library.
The training loop matches sft/automodel, with a LoRA adapter wired in by default to keep large base models practical for adapter tuning.
The step produces a checkpoint_lora artifact that you can merge with the base model by using the convert/merge_lora step.
Syntax#
nemotron steps run peft/automodel \
[-c <config-name-or-path>] \
[-r <run-profile> | -b <batch-profile>] \
[-d] \
[--force-squash] \
[<dotlist-overrides>...] \
[<passthrough-args>...]
Refer to the Nemotron Steps CLI Reference for the shared flag set.
Configuration Files#
The step ships two configuration files under src/nemotron/steps/peft/automodel/config/.
File |
Purpose |
|---|---|
|
Full-shape LoRA tuning on |
|
Short validation run with a small dataset slice and a short training schedule. |
Pass the configuration name with -c:
$ nemotron steps run peft/automodel -c tiny
$ nemotron steps run peft/automodel -c default
Inputs and Outputs#
Direction |
Artifact Type |
Required |
Description |
|---|---|---|---|
Consumes |
|
Yes |
Chat-format JSON Lines with a |
Produces |
|
— |
LoRA adapter weights. Merge the adapter with the base model by using |
Supported Models#
The manifest declares three reference base models.
Other Hugging Face causal language models that load through AutoModelForCausalLM.from_pretrained also work.
Model |
Minimum GPUs |
Default |
Notes |
|---|---|---|---|
|
8 |
Yes |
Mixture-of-experts (MoE) base model used by |
|
2 |
No |
Common dense baseline for single-node LoRA tuning. |
|
1 |
No |
Strong default for single-GPU LoRA tuning. |
Override the base model from the command line:
$ nemotron steps run peft/automodel -c default \
model.pretrained_model_name_or_path=mistralai/Mistral-7B-Instruct-v0.3
Step Parameters#
The manifest declares two LoRA-specific parameters. Pass them as dotlist overrides.
- peft.dim=<n>#
The LoRA rank. Values in the eight-to-thirty-two range work well for most tasks. Raise the rank when the downstream task is harder than the base task.
Default:
16.Example:
peft.dim=32
- peft.alpha=<n>#
The LoRA alpha scaling factor. A value equal to twice the rank works well in practice.
Default:
32.Example:
peft.alpha=64
Frequently used dotlist overrides drawn from the AutoModel recipe include the following.
- step_scheduler.max_steps=<n>#
The maximum number of optimizer steps for the run.
Example:
step_scheduler.max_steps=200
- step_scheduler.global_batch_size=<n>#
The global batch size across all data-parallel workers.
Example:
step_scheduler.global_batch_size=64
- dataset.path_or_dataset_id=<id-or-path>#
The Hugging Face dataset identifier or a local path that resolves to a JSON Lines chat dataset.
Example:
dataset.path_or_dataset_id=/data/my-instructions.jsonl
- peft.dropout=<float>#
The LoRA dropout rate applied during training.
Example:
peft.dropout=0.05
Strategies#
The manifest records two operator strategies for peft/automodel.
When the run targets a single GPU or memory is tight, keep
peft.dimlow, in the eight-to-sixteen range, and prefer a Mistral-class base model.When the adapter is intended for deployment, run
convert/merge_loraafter training to merge the adapter with the base model and produce a standalone Hugging Face checkpoint.
Common Errors#
- oom#
Cause: GPU memory is exhausted during forward or backward passes.
Recovery: reduce
peft.dim, lowerstep_scheduler.local_batch_size, lower the maximum sequence length, or move to a smaller base model.
Command Examples#
Run the tiny validation configuration locally:
$ nemotron steps run peft/automodel -c tiny
Compile the default configuration on a Lepton profile without submitting the job:
$ nemotron steps run peft/automodel -c default -r lepton_peft_automodel --dry-run
Submit an attached LoRA run with a higher adapter rank:
$ nemotron steps run peft/automodel -c default -r lepton_peft_automodel \
peft.dim=32 \
peft.alpha=64 \
step_scheduler.max_steps=500
Submit a detached single-GPU LoRA run on a Slurm profile against a smaller base model:
$ nemotron steps run peft/automodel -c default -b slurm_peft_automodel \
model.pretrained_model_name_or_path=mistralai/Mistral-7B-Instruct-v0.3 \
peft.dim=8 \
peft.alpha=16