About Fine-Tuning
About Fine-Tuning
Learn how to fine-tune models by making requests to NVIDIA NeMo Customizer through the API. Fine-tuned models you have created can be deployed using NVIDIA NIMs.
Fine-Tuning Workflow
At a high level, the fine-tuning workflow consists of the following steps:
- Create a Model Entity pointing to your base model checkpoint (stored as a FileSet).
- Format a compatible dataset.
- Create a customization job referencing the Model Entity.
- Monitor the job until it completes.
- The customization job automatically creates either:
- LoRA jobs: An adapter attached to the original Model Entity
- Full fine-tuning jobs: A new Model Entity with the customized weights
- Deploy the model using the Deployment Management Service.
- Move on to Evaluate the output model.
Model Catalog
Explore the model families and sizes supported by NVIDIA NeMo Customizer.
View the available Llama models in the model catalog.
View the available Llama Nemotron models from NVIDIA, including Nano and Super variants for efficient and advanced instruction tuning.
View the available Phi models from Microsoft, designed for strong reasoning capabilities with efficient deployment.
View the available GPT-OSS models supported for Full SFT customization.
View the available embedding models for question-answering and retrieval tasks.
Task Guides
Perform common fine-tuning tasks.
Create, list, view, and cancel customization jobs.
Create FileSets and Model Entities to prepare base models for customization.
Upload and manage datasets for training.
Tutorials
Follow these tutorials to learn how to accomplish common fine-tuning tasks.
Learn how to format datasets for different model types.
datasets chat-models completion-modelsLearn how to start a LoRA customization job using a custom dataset.
nemo-customizerLearn how to start a SFT customization job using a custom dataset.
nemo-customizerLearn how to compress a larger teacher model into a smaller student model.
nemo-customizer knowledge-distillationLearn how to check job metrics using MLFlow or Weights & Biases.
nemo-customizer mlflow wandbLearn how to optimize the token-per-GPU throughput for a LoRA optimization job.
nemo-customizer wandb sequence-packing