nemo_microservices.types.customization.dpo_parameters#
Module Contents#
Classes#
API#
- class nemo_microservices.types.customization.dpo_parameters.DpoParameters(/, **data: typing.Any)#
Bases: nemo_microservices._models.BaseModel
- max_grad_norm: Optional[float]#
None
Maximum gradient norm for gradient clipping during training.
Prevents exploding gradients by scaling down gradients that exceed this threshold. Lower this value (e.g., 0.5) if you observe training instability, NaN losses, or erratic loss spikes. Increase it (e.g., 5.0) if training seems overly conservative or progress is too slow. Typical values range from 0.5 to 5.0.
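The clipping behavior described above can be sketched in plain Python (a minimal illustration with a hypothetical helper, not this library's implementation; training frameworks such as PyTorch provide it via torch.nn.utils.clip_grad_norm_):

```python
import math

def clip_grad_norm(grads, max_grad_norm):
    """Scale gradients down so their global L2 norm is at most max_grad_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_grad_norm:
        scale = max_grad_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

# A gradient vector with norm 5.0 is rescaled to norm 1.0; a vector already
# within the threshold is left untouched.
clipped = clip_grad_norm([3.0, 4.0], max_grad_norm=1.0)
```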
- preference_average_log_probs: Optional[bool]#
None
If set to true, the preference loss uses average log-probabilities, making the loss less sensitive to sequence length. Setting it to false (default) uses total log-probabilities, giving more influence to longer sequences.
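The difference between average and total log-probabilities can be shown with a small sketch (hypothetical helper and made-up numbers, not part of this library):

```python
def sequence_logprob(token_logprobs, average=False):
    """Sequence log-probability from per-token log-probs: total or length-averaged."""
    total = sum(token_logprobs)
    return total / len(token_logprobs) if average else total

short = [-0.5] * 2   # 2 tokens
long = [-0.5] * 10   # 10 tokens with the same per-token quality

# Totals: the longer sequence scores much lower (-5.0 vs -1.0), so it
# dominates the loss. Averages: both score -0.5 regardless of length.
totals = (sequence_logprob(short), sequence_logprob(long))
averages = (sequence_logprob(short, average=True), sequence_logprob(long, average=True))
```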
- preference_loss_weight: Optional[float]#
None
Scales the contribution of the preference loss to the overall training objective.
Increasing this value emphasizes learning from preference comparisons more strongly.
- ref_policy_kl_penalty: Optional[float]#
None
Controls how strongly the trained policy is penalized for deviating from the reference policy. Increasing this value encourages the policy to stay closer to the reference (more conservative learning), while decreasing it allows more freedom to explore user-preferred behavior. This parameter is called beta in the original DPO paper.
- sft_average_log_probs: Optional[bool]#
None
If set to true, the supervised fine-tuning (SFT) loss normalizes by sequence length, treating all examples equally regardless of length. If false (default), longer examples contribute more to the loss.
- sft_loss_weight: Optional[float]#
None
Scales the contribution of the supervised fine-tuning loss.
Setting this to 0 disables SFT entirely, allowing training to focus exclusively on preference-based optimization.
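Taken together, the loss weights and KL penalty combine into a single training objective. The following is a minimal numeric sketch under stated assumptions (a hypothetical function with made-up log-prob values, not the service's implementation), using the standard DPO preference loss:

```python
import math

def dpo_objective(
    logp_chosen, logp_rejected,          # policy log-probs of chosen/rejected responses
    ref_logp_chosen, ref_logp_rejected,  # reference-policy log-probs of the same responses
    sft_loss=0.0,                        # SFT loss on the chosen response
    ref_policy_kl_penalty=0.05,          # "beta" in the DPO paper
    preference_loss_weight=1.0,
    sft_loss_weight=0.0,
):
    # Margin: how much more the policy favors the chosen response over the
    # rejected one, relative to the reference policy.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # Standard DPO preference loss: -log sigmoid(beta * margin).
    preference_loss = -math.log(1.0 / (1.0 + math.exp(-ref_policy_kl_penalty * margin)))
    # sft_loss_weight=0 disables the SFT term, leaving pure preference optimization.
    return preference_loss_weight * preference_loss + sft_loss_weight * sft_loss

# Zero margin (policy agrees with the reference) gives -log(0.5) = log(2) ~ 0.693:
loss = dpo_objective(-10.0, -12.0, -10.0, -12.0)
```

Raising sft_loss_weight above zero anchors the policy to the chosen responses during preference training; setting it to 0 (as above) leaves the objective purely preference-based.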