Important
You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.
Neural Modules#
NeMo is built around Neural Modules, conceptual blocks of neural networks that take typed inputs and produce typed outputs. Such modules typically represent data layers, encoders, decoders, language models, loss functions, or methods of combining activations. NeMo makes it easy to combine and re-use these building blocks while providing a level of semantic correctness checking via its neural type system.
Note
All Neural Modules inherit from torch.nn.Module and are therefore compatible with the PyTorch ecosystem.
There are three types of Neural Modules:
Regular modules
Dataset/IterableDataset
Losses
Every Neural Module in NeMo must inherit from the nemo.core.classes.module.NeuralModule class.
- class nemo.core.classes.module.NeuralModule(*args: Any, **kwargs: Any)#
Abstract class offering interface shared between all PyTorch Neural Modules.
Every Neural Module inherits the nemo.core.classes.common.Typing interface and needs to define neural types for its inputs and outputs. This is done by defining two properties: input_types and output_types. Each property should return an ordered dictionary of "port name" -> "port neural type" pairs. Here is an example from the ConvASREncoder class:
@property
def input_types(self):
    return OrderedDict(
        {
            "audio_signal": NeuralType(('B', 'D', 'T'), SpectrogramType()),
            "length": NeuralType(tuple('B'), LengthsType()),
        }
    )

@property
def output_types(self):
    return OrderedDict(
        {
            "outputs": NeuralType(('B', 'D', 'T'), AcousticEncodedRepresentation()),
            "encoded_lengths": NeuralType(tuple('B'), LengthsType()),
        }
    )

@typecheck()
def forward(self, audio_signal, length=None):
    ...
- The code snippet above means that nemo.collections.asr.modules.conv_asr.ConvASREncoder expects two arguments: the first, named audio_signal, of shape [batch, dimension, time], with elements representing spectrogram values; the second, named length, of shape [batch], with elements representing the lengths of the corresponding signals.
- It also means that the .forward(...) and __call__(...) methods each produce two outputs: the first, of shape [batch, dimension, time], but with elements representing the encoded representation (the AcousticEncodedRepresentation class); the second, of shape [batch], corresponding to their lengths.
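To make the "port name" -> "port neural type" contract concrete, here is a minimal, self-contained sketch of the same idea using only the standard library. The typed_ports decorator and ToyEncoder class are hypothetical stand-ins invented for illustration; they only check port names, not the full neural-type compatibility that NeMo's @typecheck() performs.

```python
from collections import OrderedDict
from functools import wraps

# Hypothetical stand-in for NeMo's @typecheck() -- only validates that the
# keyword arguments passed to forward() match the declared input ports.
def typed_ports(input_types):
    def decorator(fn):
        @wraps(fn)
        def wrapper(self, **kwargs):
            unknown = set(kwargs) - set(input_types)
            if unknown:
                raise TypeError(f"Unexpected ports: {sorted(unknown)}")
            return fn(self, **kwargs)
        return wrapper
    return decorator

class ToyEncoder:
    # Ordered "port name" -> "port type" mapping, mirroring input_types above.
    INPUT_TYPES = OrderedDict(audio_signal="SpectrogramType", length="LengthsType")

    @typed_ports(INPUT_TYPES)
    def forward(self, audio_signal=None, length=None):
        return audio_signal, length

enc = ToyEncoder()
print(enc.forward(audio_signal=[1.0, 2.0], length=2))  # known ports: accepted
```

Calling enc.forward(spectrogram=...) would raise a TypeError, since "spectrogram" is not a declared port; NeMo's real type system goes further and also verifies the neural type attached to each tensor.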
Tip
It is good practice to define types and add the @typecheck() decorator to your .forward() method once your module is ready for use by others.
Note
The outputs of the .forward(...) method will always be of type torch.Tensor or a container of tensors, and will work with any other PyTorch code. The type information is attached to every output tensor. If tensors without types are passed to your module, it will not fail; however, the types will not be checked. Thus, it is recommended to define input/output types for all your modules, starting with data layers, and to add the @typecheck() decorator to them.
Note
To temporarily disable type checking, you can enclose your code in a `with typecheck.disable_checks():` block.
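As a rough illustration of how such a toggle behaves, the sketch below implements a generic disable/restore context manager with contextlib. The typecheck_demo class is invented for this example and is not NeMo's typecheck; it only shows the on/off-and-restore pattern the note describes.

```python
from contextlib import contextmanager

class typecheck_demo:
    """Illustrative toggle (NOT NeMo's typecheck): a class-level flag that a
    `with ...disable_checks():` block switches off and then restores."""
    _enabled = True

    @classmethod
    @contextmanager
    def disable_checks(cls):
        previous, cls._enabled = cls._enabled, False
        try:
            yield
        finally:
            cls._enabled = previous  # restored even if the block raises

assert typecheck_demo._enabled
with typecheck_demo.disable_checks():
    assert not typecheck_demo._enabled  # checking is off inside the block
assert typecheck_demo._enabled          # and back on afterwards
```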
Dynamic Layer Freezing#
You can selectively freeze any modules inside a NeMo model by specifying a freezing schedule in the config YAML. Freezing stops all gradient updates to that module, so its weights are not changed during those steps. This can be useful for combating catastrophic forgetting, for example when fine-tuning a large pretrained model on a small dataset.
The default approach is to freeze a module for the first N training steps, but you can also enable freezing for a specific range of steps (for example, from step 20 to step 100), activate freezing from some step N until the end of training, or freeze a module for the entire training run. Dynamic freezing is specified in training steps, not epochs.
To enable freezing, add the following to your config:
model:
  ...
  freeze_updates:
    enabled: true  # set to false if you want to disable freezing
    modules:  # list all of the modules you want to have freezing logic for
      encoder: 200       # module will be frozen for the first 200 training steps
      decoder: [50, -1]  # module will be frozen at step 50 and will remain frozen until training ends
      joint: [10, 100]   # module will be frozen between step 10 and step 100 (step >= 10 and step <= 100)
      transcoder: -1     # module will be frozen for the entire training run
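The three schedule forms above can be read mechanically. The helper below is an illustration of the semantics stated in the config comments (not NeMo's actual freezing code): a plain integer N freezes a module for the first N steps, -1 freezes it for the whole run, and a [start, end] pair freezes it for start <= step <= end, with end == -1 meaning "until training ends".

```python
def is_frozen(schedule, step):
    """Interpret one freeze_updates entry for a given training step.

    Illustrative only, based on the comments in the config above:
      N            -> frozen while step < N
      -1           -> frozen for the entire run
      [start, end] -> frozen while start <= step <= end; end == -1 means no end
    """
    if isinstance(schedule, (list, tuple)):
        start, end = schedule
        return step >= start and (end == -1 or step <= end)
    if schedule == -1:
        return True
    return step < schedule

# Mirror the example config and ask which modules are frozen at step 100:
config = {"encoder": 200, "decoder": [50, -1], "joint": [10, 100], "transcoder": -1}
print({name: is_frozen(s, 100) for name, s in config.items()})
```

At step 100 all four example modules are still frozen; at step 300 only the decoder (open-ended range) and transcoder (whole run) remain frozen.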