nat.plugins.customizer.dpo.config
Configuration classes for DPO training with NeMo Customizer.
This module provides configuration for:
1. DPO Trajectory Builder - collecting preference data from workflows
2. NeMo Customizer TrainerAdapter - submitting DPO training jobs
Classes
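| DPOTrajectoryBuilderConfig | Configuration for the DPO (Direct Preference Optimization) Trajectory Builder. |
| NeMoCustomizerTrainerConfig | Configuration for the NeMo Customizer Trainer. |
| DPOSpecificHyperparameters | DPO-specific hyperparameters for NeMo Customizer. |
| NeMoCustomizerHyperparameters | Hyperparameters for NeMo Customizer training jobs. |
| NIMDeploymentConfig | Configuration for NIM deployment after training. |
| NeMoCustomizerTrainerAdapterConfig | Configuration for the NeMo Customizer TrainerAdapter. |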
Module Contents
- class DPOTrajectoryBuilderConfig(/, **data: Any)
Bases: nat.data_models.finetuning.TrajectoryBuilderConfig
Configuration for the DPO (Direct Preference Optimization) Trajectory Builder.
This builder collects preference pairs from workflows that produce TTC_END intermediate steps carrying TTCEventData. It reads turn_id, candidate_index, score, input (prompt), and output (response) directly from the structured TTCEventData model, so no dictionary key configuration is needed.
The builder groups candidates by turn_id and creates preference pairs based on score differences.
Example YAML configuration:
    trajectory_builders:
      dpo_builder:
        _type: dpo_traj_builder
        ttc_step_name: dpo_candidate_move
        exhaustive_pairs: true
        min_score_diff: 0.05
        max_pairs_per_turn: 5
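The grouping and pairing described above can be sketched in plain Python. This is an illustration under stated assumptions, not the builder's actual implementation: the `Candidate` dataclass and `build_preference_pairs` function are hypothetical names, the arguments only mirror the YAML options, and the behavior when `exhaustive_pairs` is false (pairing only the best and worst candidate) is a guess.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Candidate:
    """Hypothetical stand-in for the fields read from TTCEventData."""
    turn_id: str
    candidate_index: int
    score: float
    prompt: str
    response: str


def build_preference_pairs(candidates, exhaustive_pairs=True,
                           min_score_diff=0.0, max_pairs_per_turn=None):
    """Group candidates by turn_id and emit (chosen, rejected) pairs."""
    by_turn = defaultdict(list)
    for cand in candidates:
        by_turn[cand.turn_id].append(cand)

    pairs = []
    for group in by_turn.values():
        # Sort best-first so every combination is ordered (higher, lower).
        ranked = sorted(group, key=lambda c: c.score, reverse=True)
        if exhaustive_pairs:
            raw = list(combinations(ranked, 2))
        else:
            # Assumption: non-exhaustive mode pairs only best vs. worst.
            raw = [(ranked[0], ranked[-1])] if len(ranked) > 1 else []

        turn_pairs = [
            {
                "prompt": chosen.prompt,
                "chosen": chosen.response,
                "rejected": rejected.response,
                "score_diff": chosen.score - rejected.score,
            }
            for chosen, rejected in raw
            if chosen.score - rejected.score >= min_score_diff
        ]
        # Keep the most clearly separated pairs when a cap is set.
        turn_pairs.sort(key=lambda p: p["score_diff"], reverse=True)
        if max_pairs_per_turn is not None:
            turn_pairs = turn_pairs[:max_pairs_per_turn]
        pairs.extend(turn_pairs)
    return pairs
```

In this sketch a turn with a single candidate yields no pairs, and pairs whose score gap falls below `min_score_diff` are dropped before the per-turn cap is applied.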
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- validate_config() → DPOTrajectoryBuilderConfig
Validate configuration consistency.
- class NeMoCustomizerTrainerConfig(/, **data: Any)
Bases: nat.data_models.finetuning.TrainerConfig
Configuration for the NeMo Customizer Trainer.
This trainer orchestrates DPO data collection and training job submission. Unlike epoch-based trainers, it runs the trajectory builder multiple times to collect data, then submits a single training job to NeMo Customizer.
Example YAML configuration:
    trainers:
      nemo_dpo:
        _type: nemo_customizer_trainer
        num_runs: 5
        wait_for_completion: true
        deduplicate_pairs: true
        max_pairs: 10000
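The collect-then-submit flow described above might look roughly like the following sketch. The function `collect_pairs` and the `run_builder` callable are hypothetical; only the option names `num_runs`, `deduplicate_pairs`, and `max_pairs` come from the example configuration, and the exact deduplication key and truncation order used by the real trainer are assumptions.

```python
def collect_pairs(run_builder, num_runs, deduplicate_pairs=True, max_pairs=None):
    """Run a trajectory builder several times and merge the resulting pairs.

    run_builder is a hypothetical callable: run_index -> list of pair dicts
    with "prompt", "chosen", and "rejected" keys.
    """
    collected = []
    seen = set()
    for run_index in range(num_runs):
        for pair in run_builder(run_index):
            # Assumption: two pairs are duplicates when all three texts match.
            key = (pair["prompt"], pair["chosen"], pair["rejected"])
            if deduplicate_pairs and key in seen:
                continue
            seen.add(key)
            collected.append(pair)
    # Assumption: the cap simply keeps the first max_pairs entries.
    if max_pairs is not None:
        collected = collected[:max_pairs]
    return collected
```

The merged list would then be uploaded once and used for a single training job, rather than training after every run.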
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- class DPOSpecificHyperparameters(/, **data: Any)
Bases: pydantic.BaseModel
DPO-specific hyperparameters for NeMo Customizer.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- class NeMoCustomizerHyperparameters(/, **data: Any)
Bases: pydantic.BaseModel
Hyperparameters for NeMo Customizer training jobs.
These map to the hyperparameters argument in client.customization.jobs.create().
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- training_type: Literal['sft', 'dpo'] = None
- finetuning_type: Literal['lora', 'all_weights'] = None
- dpo: DPOSpecificHyperparameters = None
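Because training_type and finetuning_type are typed as Literal, pydantic rejects any value outside the listed options when the model is built. The stdlib-only sketch below mimics that check for illustration. The function `build_hyperparameters` is hypothetical and is not part of the module, and the real class raises pydantic_core.ValidationError rather than ValueError.

```python
from typing import Literal, get_args

# Allowed values copied from the field annotations above.
TrainingType = Literal["sft", "dpo"]
FinetuningType = Literal["lora", "all_weights"]


def build_hyperparameters(training_type: str, finetuning_type: str) -> dict:
    """Return a plain dict after checking both values against their Literals."""
    if training_type not in get_args(TrainingType):
        raise ValueError(f"training_type must be one of {get_args(TrainingType)}")
    if finetuning_type not in get_args(FinetuningType):
        raise ValueError(f"finetuning_type must be one of {get_args(FinetuningType)}")
    return {"training_type": training_type, "finetuning_type": finetuning_type}
```

A call such as `build_hyperparameters("dpo", "lora")` passes the check, while a misspelled option fails immediately instead of surfacing later as a rejected training job.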
- class NIMDeploymentConfig(/, **data: Any)
Bases: pydantic.BaseModel
Configuration for NIM deployment after training.
These settings are used when deploy_on_completion is True.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- class NeMoCustomizerTrainerAdapterConfig(/, **data: Any)
Bases: nat.data_models.finetuning.TrainerAdapterConfig
Configuration for the NeMo Customizer TrainerAdapter.
This adapter submits DPO/SFT training jobs to NeMo Customizer and optionally deploys the trained model.
Example YAML configuration:
    trainer_adapters:
      nemo_customizer:
        _type: nemo_customizer_trainer_adapter
        entity_host: https://nmp.example.com
        datastore_host: https://datastore.example.com
        namespace: my-project
        customization_config: meta/llama-3.2-1b-instruct@v1.0.0+A100
        hyperparameters:
          training_type: dpo
          epochs: 5
          batch_size: 8
        use_full_message_history: true
        deploy_on_completion: true
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- hyperparameters: NeMoCustomizerHyperparameters = None
- deployment_config: NIMDeploymentConfig = None
- validate_config() → NeMoCustomizerTrainerAdapterConfig
Validate configuration consistency.