Create Job
Customization jobs are submitted to one of two backends: Automodel or Unsloth. Choose the backend that matches your hardware and training goal, then build that backend’s job spec and submit it. For the full per-backend hyperparameter reference, see Training Configuration.
Prerequisites
Before you can create a customization job, make sure that you have:
- Obtained the base URL of your NeMo Platform.
- Created a FileSet and Model Entity for your base model.
- Uploaded a dataset as a FileSet.
- Determined the training configuration you want to use for the customization job.
- Verified that the platform has sufficient storage for the job. Full SFT jobs require approximately 3× the base model size in free disk space; LoRA jobs require approximately 1.5×. See ft-tut-understand-models for details. If you are also deploying the model from a base checkpoint fileset, plan for ~2.5× model size overall for LoRA.
- Set the NMP_BASE_URL environment variable to your NeMo Platform endpoint.
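The storage multipliers in the checklist above can be turned into a quick estimate before you submit a job. The following is a minimal sketch, not part of any platform SDK: the function name `required_free_bytes` and the regime labels are hypothetical, and only the multipliers (3×, 1.5×, 2.5×) and the `NMP_BASE_URL` variable name come from this page.

```python
import os

# Multipliers come from the prerequisites above; the regime labels are
# illustrative names chosen for this sketch, not platform identifiers.
STORAGE_MULTIPLIER = {
    "full_sft": 3.0,          # full SFT: ~3x base model size
    "lora": 1.5,              # LoRA: ~1.5x base model size
    "lora_with_deploy": 2.5,  # LoRA plus deployment from a base checkpoint fileset
}


def required_free_bytes(model_size_bytes: int, regime: str) -> int:
    """Approximate free disk space needed for a job of the given regime."""
    return int(model_size_bytes * STORAGE_MULTIPLIER[regime])


# Read the platform endpoint from the environment and report if it is missing.
base_url = os.environ.get("NMP_BASE_URL")
if not base_url:
    print("NMP_BASE_URL is not set; export it before submitting a job.")

# Example: a 16 GB base model (decimal gigabytes).
model_size = 16 * 10**9
for regime in STORAGE_MULTIPLIER:
    print(regime, required_free_bytes(model_size, regime) / 10**9, "GB")
```

These figures are approximations, so treat the result as a lower bound rather than an exact requirement.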
Submit an Automodel Job
Build an AutomodelJobInput spec and submit it to the automodel backend. The job runs on the platform’s GPU cluster, and create() returns a handle you can use to poll its status.
Submit an Unsloth Job
The Unsloth backend runs on a single GPU and supports 4-bit / 8-bit quantized loading. Build an UnslothJobInput spec and submit it to the unsloth backend. Note that Unsloth uses its own field names (model.name, dataset.path, batch.per_device_train_batch_size).
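To illustrate how the Unsloth field names nest, here is a sketch of the spec as a plain Python dictionary. Only the three field paths (`model.name`, `dataset.path`, `batch.per_device_train_batch_size`) come from this page; every value is a placeholder, and the real UnslothJobInput may require additional fields (for example, the quantization setting) that are not shown here.

```python
import json

# Field paths follow the names listed above; all values are placeholders.
unsloth_spec = {
    "model": {"name": "my-base-model"},           # hypothetical model name
    "dataset": {"path": "my-training-dataset"},   # hypothetical dataset reference
    "batch": {"per_device_train_batch_size": 2},  # example value only
}

# Print the spec to inspect its shape before mapping it onto the job input.
print(json.dumps(unsloth_spec, indent=2))
```

Check the Customization Job Reference for the complete schema before submitting.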
Knowledge Distillation (Automodel)
Knowledge distillation is an Automodel feature. Set training.training_type to "distillation" and provide a teacher_model that references a second Model Entity. The model field is the student model being trained.
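The relationship between the three fields can be sketched as a dictionary plus a small sanity check. This is an assumption-laden illustration: the field names `model`, `teacher_model`, and `training.training_type` come from this page, but placing `teacher_model` at the top level, the entity names, and the helper `check_distillation_spec` are hypothetical.

```python
# Sketch of a distillation spec. The top-level placement of "teacher_model"
# is an assumption; consult the job reference for the actual schema.
distillation_spec = {
    "model": "student-model-entity",          # student being trained (placeholder)
    "teacher_model": "teacher-model-entity",  # second Model Entity (placeholder)
    "training": {"training_type": "distillation"},
}


def check_distillation_spec(spec: dict) -> list[str]:
    """Return a list of problems found in a distillation spec (empty if none)."""
    problems = []
    if spec.get("training", {}).get("training_type") != "distillation":
        problems.append('training.training_type must be "distillation"')
    if not spec.get("teacher_model"):
        problems.append("teacher_model is required for distillation")
    elif spec.get("teacher_model") == spec.get("model"):
        problems.append("teacher_model must reference a different Model Entity")
    return problems


print(check_distillation_spec(distillation_spec))
```

A local check like this only catches structural mistakes; compatibility between the teacher and student is still validated by the platform.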
See Knowledge Distillation constraints for requirements on model compatibility, tokenizer, and GPU memory.
After Submission
A submitted job runs on the platform’s Jobs service. Manage its lifecycle (polling status, listing, and cancelling) through that service, regardless of which backend you submitted to. See Get Job Status, List Active Jobs, and Cancel a Job.
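Polling typically follows the same loop whichever backend you used. The sketch below is backend-agnostic and does not call the platform SDK: `fetch_status` is a hypothetical callable that you would implement using the Get Job Status call, and the terminal status strings are assumptions that should be replaced with the values the Jobs service actually returns.

```python
import time
from typing import Callable, Iterable


def wait_for_job(
    fetch_status: Callable[[], str],
    terminal: Iterable[str] = ("completed", "failed", "cancelled"),  # assumed names
    interval_s: float = 10.0,
    timeout_s: float = 3600.0,
) -> str:
    """Call fetch_status until it returns a terminal status or the timeout passes."""
    terminal_set = set(terminal)
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in terminal_set:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in status {status!r} after {timeout_s}s")
        time.sleep(interval_s)


# Usage with a stand-in status source instead of a real job handle.
fake_statuses = iter(["pending", "running", "completed"])
print(wait_for_job(lambda: next(fake_statuses), interval_s=0))
```

A fixed interval keeps the example short; for long-running jobs, a longer interval or a backoff reduces load on the service.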
For field-level details of the job spec and W&B or MLflow integration options, see Customization Job Reference.
Training Output
When training completes, the system automatically uploads the trained artifacts to a new FileSet (output.fileset) and creates an output based on the fine-tuning regime:
LoRA Adapters
For LoRA jobs, the adapter is added to the parent Model Entity’s adapters list:
Adapters are enabled by default and are automatically loaded by NIMs serving this model with LoRA support.