nemo_automodel.components.training.neftune
NEFTune: Noisy Embeddings Fine-Tuning.
Implements the technique from “NEFTune: Noisy Embeddings Improve Instruction Finetuning” (https://arxiv.org/abs/2310.05914). Adds scaled uniform noise to token embeddings during training to improve generalization, with no additional compute or data overhead.
Module Contents
Classes
Functions
Data
API
Applies NEFTune noise to a model’s embedding layer during training.
NEFTune adds uniform random noise, scaled by alpha / sqrt(seq_len * hidden_dim), to the embedding output, where alpha is noise_alpha. The noise is applied only when the model is in training mode.
Example::

    neftune = NEFTune(noise_alpha=5.0)
    neftune.activate(model)
    # … training loop …
    neftune.deactivate(model)
Parameters:
noise_alpha: Noise magnitude. Higher values add more noise. Typical values are 5 to 15. Set to 0 to disable.
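To give a feel for how the scaling rule alpha / sqrt(seq_len * hidden_dim) behaves across the typical range, here is a minimal standalone sketch. The helper name `neftune_noise_bound` and the example shape (512 tokens, hidden size 4096) are assumptions made for illustration and are not part of this module.

```python
import math


def neftune_noise_bound(noise_alpha: float, seq_len: int, hidden_dim: int) -> float:
    """Return the half-width of the uniform noise interval (hypothetical helper)."""
    return noise_alpha / math.sqrt(seq_len * hidden_dim)


# Compare the typical alpha values for one assumed input shape.
for alpha in (5.0, 10.0, 15.0):
    bound = neftune_noise_bound(alpha, seq_len=512, hidden_dim=4096)
    print(f"alpha={alpha:>4}: noise drawn from [-{bound:.5f}, {bound:.5f}]")
```

Because the bound shrinks with the square root of seq_len * hidden_dim, the same alpha produces smaller per-element noise on longer sequences or wider models.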
Whether NEFTune noise is currently being applied.
Forward hook that adds NEFTune noise to embedding output during training.
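A forward hook of this kind might look roughly like the sketch below. It is not the module's actual implementation: the factory name `make_noise_hook` is hypothetical, and it assumes the embedding output has shape (batch, seq_len, hidden_dim). Only standard PyTorch calls (`register_forward_hook`, `Tensor.uniform_`) are used.

```python
import math

import torch
from torch import nn


def make_noise_hook(noise_alpha: float):
    """Build a forward hook (hypothetical helper, not this module's API)."""

    def hook(module: nn.Module, inputs, output: torch.Tensor):
        # Leave the output untouched in eval mode or when noise is disabled.
        if not module.training or noise_alpha <= 0:
            return None
        seq_len, hidden_dim = output.size(-2), output.size(-1)
        bound = noise_alpha / math.sqrt(seq_len * hidden_dim)
        # Sample uniform noise within the scaled bound and add it to the embeddings.
        noise = torch.empty_like(output).uniform_(-bound, bound)
        return output + noise

    return hook


embedding = nn.Embedding(num_embeddings=100, embedding_dim=16)
handle = embedding.register_forward_hook(make_noise_hook(noise_alpha=5.0))
tokens = torch.randint(0, 100, (2, 8))  # (batch, seq_len)

embedding.train()
noisy = embedding(tokens)  # hook adds noise
embedding.eval()
clean = embedding(tokens)  # hook returns None, so the output is unchanged
handle.remove()
```

Returning None from a forward hook tells PyTorch to keep the original output, which is how the sketch skips noise outside training mode.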
Attach NEFTune noise hook to the model’s input embedding layer.
Parameters:
model: The model whose embeddings will be augmented with noise.
Raises:
RuntimeError: If NEFTune is already active on this model.
ValueError: If the model has no recognizable embedding layer.
Remove the NEFTune noise hook from the model.
Safe to call even if NEFTune is not active (no-op in that case).
Parameters:
model: The model to deactivate NEFTune on.
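The activate/deactivate contract described above (error on double activation, error when no embedding is found, no-op on redundant deactivation) could be sketched as follows. `HookToggle` is a hypothetical stand-in, not the NEFTune class itself: it installs a placeholder hook, looks up the embedding only through `get_input_embeddings()`, and tracks the active state on the toggle object, which is an assumption about how the state is stored.

```python
from typing import Optional

from torch import nn
from torch.utils.hooks import RemovableHandle


class HookToggle:
    """Hypothetical illustration of the hook lifecycle; not the module's class."""

    def __init__(self) -> None:
        self._handle: Optional[RemovableHandle] = None

    @staticmethod
    def _hook(module, inputs, output):
        # Placeholder hook: returning None keeps the output unchanged.
        return None

    def activate(self, model: nn.Module) -> None:
        if self._handle is not None:
            raise RuntimeError("hook is already active")
        getter = getattr(model, "get_input_embeddings", None)
        embedding = getter() if callable(getter) else None
        if embedding is None:
            raise ValueError("model has no recognizable embedding layer")
        self._handle = embedding.register_forward_hook(self._hook)

    def deactivate(self, model: nn.Module) -> None:
        if self._handle is None:
            return  # nothing registered, so this is a no-op
        self._handle.remove()
        self._handle = None
```

Keeping the `RemovableHandle` returned by `register_forward_hook` is what makes removal cheap and idempotent: once the handle is cleared, further `deactivate` calls return immediately.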
Find the input embedding layer on a model.
Checks for a get_input_embeddings() method first (Hugging Face models), then falls back to common attribute names.
Parameters:
model: The model to search.
Returns: Optional[nn.Module]
The embedding module, or None if not found.
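The lookup order described above can be illustrated with a small duck-typed sketch. The function name `find_embedding_layer` and the fallback attribute names in `_CANDIDATE_ATTRS` are assumptions for illustration; this page does not list the names the real function checks.

```python
from typing import Any, Optional

# Assumed fallback names, chosen for illustration only.
_CANDIDATE_ATTRS = ("embed_tokens", "wte", "word_embeddings")


def find_embedding_layer(model: Any) -> Optional[Any]:
    """Return the input embedding layer, or None if nothing matches (sketch)."""
    # Prefer the accessor method when the model provides one.
    getter = getattr(model, "get_input_embeddings", None)
    if callable(getter):
        try:
            layer = getter()
        except NotImplementedError:
            layer = None  # defensive: treat an unimplemented accessor as missing
        if layer is not None:
            return layer
    # Otherwise fall back to the assumed attribute names, in order.
    for name in _CANDIDATE_ATTRS:
        layer = getattr(model, name, None)
        if layer is not None:
            return layer
    return None
```

Returning None rather than raising leaves the decision to the caller, which matches the documented Optional[nn.Module] return type.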