Important
NeMo 2.0 is an experimental feature and currently released in the dev container only: nvcr.io/nvidia/nemo:dev. Please refer to the Migration Guide for information on getting started.
Neural Types
Motivation
Neural Types describe the semantics, axis order, and dimensions of a tensor. The purpose of this type system is to catch semantic and dimensionality errors during model creation and facilitate module re-use.
NeuralType
class
Neural Types perform semantic checks for modules and models inputs/outputs. They contain information about:
Semantics of what is stored in the tensors. For example, logits, logprobs, audiosignal, embeddings, etc.
Axes layout, semantic and (optionally) dimensionality. For example:
[Batch, Time, Channel]
Types are implemented in nemo.core.neural_types.NeuralType
class. When you instantiate an instance of this class, you
are expected to include both axes information and element type information.
- class nemo.core.neural_types.NeuralType(axes: Optional[Any] = None, elements_type: Optional[Any] = None, optional: bool = False)
This is the main class which would represent neural type concept. It is used to represent the types of inputs and outputs.
- Parameters
axes (Optional[Tuple]) – a tuple of AxisTypes objects representing the semantics of what varying each axis means You can use a short, string-based form here. For example: (‘B’, ‘C’, ‘H’, ‘W’) would correspond to an NCHW format frequently used in computer vision. (‘B’, ‘T’, ‘D’) is frequently used for signal processing and means [batch, time, dimension/channel].
elements_type (ElementType) – an instance of ElementType class representing the semantics of what is stored inside the tensor. For example: logits (LogitsType), log probabilities (LogprobType), etc.
optional (bool) – By default, this is false. If set to True, it would means that input to the port of this type can be optional.
Type Comparison Results
When comparing two neural types, the following comparison results are generated.
- class nemo.core.neural_types.NeuralTypeComparisonResult(value)
The result of comparing two neural type objects for compatibility. When comparing A.compare_to(B):
Examples
Long vs short notation
NeMo’s NeuralType
class allows you to express axis semantics information in long and short form. Consider these two equivalent types. Both encoder 3 dimensional tensors and both contain elements of type AcousticEncodedRepresentation
(this type is a typical output of ASR encoders).
long_version = NeuralType(
axes=(AxisType(AxisKind.Batch, None), AxisType(AxisKind.Dimension, None), AxisType(AxisKind.Time, None)),
elements_type=AcousticEncodedRepresentation(),
)
short_version = NeuralType(('B', 'D', 'T'), AcousticEncodedRepresentation())
assert long_version.compare(short_version) == NeuralTypeComparisonResult.SAME
Transpose same
Often it is useful to know if a simple transposition will solve type incompatibility. This is the case if the comparison result of two types equals nemo.core.neural_types.NeuralTypeComparisonResult.TRANSPOSE_SAME
.
type1 = NeuralType(axes=('B', 'T', 'C'))
type2 = NeuralType(axes=('T', 'B', 'C'))
assert type1.compare(type2) == NeuralTypeComparisonResult.TRANSPOSE_SAME
assert type2.compare(type1) == NeuralTypeComparisonResult.TRANSPOSE_SAME
Note that in this example, we dropped elements_type
argument of NeuralType
constructor. If not supplied, the element type is VoidType
.
VoidType
for elements
Sometimes it is useful to express that elements’ types don’t matter but axes layout do. VoidType
for elements can be used to express this.
Note
VoidType
is compatible with every other elements’ type but not the other way around. See the following code snippet below for details.
btc_spctr = NeuralType(('B', 'T', 'C'), SpectrogramType())
btc_spct_bad = NeuralType(('B', 'T'), SpectrogramType())
# Note the VoidType for elements here
btc_void = NeuralType(('B', 'T', 'C'), VoidType())
# This is true because VoidType is compatible with every other element type (SpectrogramType in this case)
# And axes layout between btc_void and btc_spctr is the same
assert btc_void.compare(btc_spctr) == NeuralTypeComparisonResult.SAME
# These two types are incompatible because even though VoidType is used for elements on one side,
# the axes layout is different
assert btc_void.compare(btc_spct_bad) == NeuralTypeComparisonResult.INCOMPATIBLE
# Note that even though VoidType is compatible with every other type, other types are not compatible with VoidType!
# It is one-way compatibility
assert btc_spctr.compare(btc_void) == NeuralTypeComparisonResult.INCOMPATIBLE
Element type inheritance
Neural types in NeMo support Python inheritance between element types. Consider an example where you want to develop a Neural Module which performs data augmentation for all kinds of spectrograms.
In ASR, two types of spectrograms are frequently used: mel and mfcc. To express this, we will create 3 classes to express
element’s types: SpectrogramType
, MelSpectrogramType(SpectrogramType)
, MFCCSpectrogramType(SpectrogramType)
.
input = NeuralType(('B', 'D', 'T'), SpectrogramType())
out1 = NeuralType(('B', 'D', 'T'), MelSpectrogramType())
out2 = NeuralType(('B', 'D', 'T'), MFCCSpectrogramType())
# MelSpectrogram and MFCCSpectrogram are not interchangeable.
assert out1.compare(out2) == NeuralTypeComparisonResult.INCOMPATIBLE
assert out2.compare(out1) == NeuralTypeComparisonResult.INCOMPATIBLE
# Type comparison detects that MFCC/MelSpectrogramType is a kind of SpectrogramType and can be accepted.
assert input.compare(out1) == NeuralTypeComparisonResult.GREATER
assert input.compare(out2) == NeuralTypeComparisonResult.GREATER
Custom element types
It is possible to create user-defined element types to express the semantics of elements in your tensors. To do so, the user will need to inherit and implement abstract methods of the nemo.core.neural_types.elements.ElementType
class
- class nemo.core.neural_types.elements.ElementType
Abstract class defining semantics of the tensor elements. We are relying on Python for inheritance checking
Note that element types can be parametrized. Consider this example where it distinguishes between audio sampled at 8Khz and 16Khz.
audio16K = NeuralType(axes=('B', 'T'), elements_type=AudioSignal(16000))
audio8K = NeuralType(axes=('B', 'T'), elements_type=AudioSignal(8000))
assert audio8K.compare(audio16K) == NeuralTypeComparisonResult.SAME_TYPE_INCOMPATIBLE_PARAMS
assert audio16K.compare(audio8K) == NeuralTypeComparisonResult.SAME_TYPE_INCOMPATIBLE_PARAMS
Enforcing dimensions
In addition to specifying tensor layout and elements’ semantics, neural types also allow you to enforce tensor dimensions. The user will have to use long notations to specify dimensions. Short notations only allows you to specify axes semantics and assumes arbitrary dimensions.
type1 = NeuralType(
(AxisType(AxisKind.Batch, 64), AxisType(AxisKind.Time, 10), AxisType(AxisKind.Dimension, 128)),
SpectrogramType(),
)
type2 = NeuralType(('B', 'T', 'C'), SpectrogramType())
# type2 will accept elements of type1 because their axes semantics match and type2 does not care about dimensions
assert type2.compare(type1), NeuralTypeComparisonResult.SAME
# type1 will not accept elements of type2 because it need dimensions to match strictly.
assert type1.compare(type2), NeuralTypeComparisonResult.DIM_INCOMPATIBLE
Generic Axis kind
Sometimes (especially in the case of loss modules) it is useful to be able to specify a “generic” axis kind which will make it
compatible with any other kind of axis. This is easy to express with Neural Types by using nemo.core.neural_types.axes.AxisKind.Any
for axes.
type1 = NeuralType(('B', 'Any', 'Any'), SpectrogramType())
type2 = NeuralType(('B', 'T', 'C'), SpectrogramType())
type3 = NeuralType(('B', 'C', 'T'), SpectrogramType())
# type1 will accept elements of type2 and type3 because it only cares about element kind (SpectrogramType)
# number of axes (3) and that first one corresponds to batch
assert type1.compare(type2) == NeuralTypeComparisonResult.SAME
assert type1.compare(type3) == NeuralTypeComparisonResult.INCOMPATIBLE
Container types
The NeMo-type system understands Python containers (lists). If your module returns a nested list of typed tensors, the way to express it is by using Python list notation and Neural Types together when defining your input/output types.
The example below shows how to express that your module returns single output (“out”) which is list of lists of two dimensional tensors of shape [batch, dimension]
containing logits.
@property
def output_types(self):
return {
"out": [[NeuralType(('B', 'D'), LogitsType())]],
}