Computer Vision (CV)

The collection contains several datasets, modules and losses useful in computer/machine vision tasks.

Models

class nemo.collections.cv.models.mnist_lenet5.MNISTLeNet5(cfg: nemo.collections.cv.models.mnist_lenet5.MNISTLeNet5Config = MNISTLeNet5Config(name=None, dataset=MNISTDatasetConfig(name=None, height=32, width=32, data_folder='~/data/mnist', train=True, download=True), dataloader=DataLoaderConfig(batch_size=64, shuffle=True, sampler=None, batch_sampler=None, num_workers=0, collate_fn=None, pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None, multiprocessing_context=None), module=Config(name=None), optim=MNISTOptimizer(name='novograd', lr=0.01, args=NovogradParams(betas=(0.8, 0.5), eps=1e-08, weight_decay=0, grad_averaging=False, amsgrad=False, luc=False, luc_trust=0.001, luc_eps=1e-08))))[source]

Bases: nemo.core.classes.modelPT.ModelPT

The LeNet-5 convolutional model.

forward(images)[source]

Propagates data by calling the forward method of the LeNet5Module module.

classmethod from_pretrained(name: str)[source]

Not implemented.

input_types

Returns:LeNet5Module input types.
classmethod list_available_models() → Optional[Dict[str, str]][source]

Not implemented.

output_types

Returns:LeNet5Module output types.
classmethod restore_from(restore_path: str)[source]

Not implemented.

save_to(save_path: str)[source]

Not implemented.

setup_test_data(test_data_layer_params: Optional[Dict[KT, VT]] = None)[source]

Not implemented.

setup_training_data(train_data_layer_config: Optional[Dict[KT, VT]] = None)[source]

Creates the dataset, wraps it in a dataloader, and returns the dataloader.

setup_validation_data(val_data_layer_config: Optional[Dict[KT, VT]] = None)[source]

Not implemented.

train_dataloader()[source]

Not implemented.

training_step(batch, what_is_this_input)[source]

Training step: calculates the loss.

class nemo.collections.cv.models.mnist_lenet5.MNISTLeNet5Config(name: Optional[str] = None, dataset: nemo.collections.cv.datasets.mnist_dataset.MNISTDatasetConfig = MNISTDatasetConfig(name=None, height=32, width=32, data_folder='~/data/mnist', train=True, download=True), dataloader: nemo.core.config.pytorch.DataLoaderConfig = DataLoaderConfig(batch_size=64, shuffle=True, sampler=None, batch_sampler=None, num_workers=0, collate_fn=None, pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None, multiprocessing_context=None), module: nemo.core.config.base_config.Config = Config(name=None), optim: nemo.collections.cv.models.mnist_lenet5.MNISTOptimizer = MNISTOptimizer(name='novograd', lr=0.01, args=NovogradParams(betas=(0.8, 0.5), eps=1e-08, weight_decay=0, grad_averaging=False, amsgrad=False, luc=False, luc_trust=0.001, luc_eps=1e-08)))[source]

Bases: nemo.core.config.base_config.Config

Structured config for the LeNet-5 model class; it also contains the parameters of the dataset and dataloader.

dataloader = DataLoaderConfig(batch_size=64, shuffle=True, sampler=None, batch_sampler=None, num_workers=0, collate_fn=None, pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None, multiprocessing_context=None)
dataset = MNISTDatasetConfig(name=None, height=32, width=32, data_folder='~/data/mnist', train=True, download=True)
module = Config(name=None)
optim = MNISTOptimizer(name='novograd', lr=0.01, args=NovogradParams(betas=(0.8, 0.5), eps=1e-08, weight_decay=0, grad_averaging=False, amsgrad=False, luc=False, luc_trust=0.001, luc_eps=1e-08))
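
The nested defaults above can be illustrated with plain Python dataclasses. This is only a sketch: `DatasetCfg`, `OptimCfg` and `ModelCfg` are hypothetical stand-ins that copy a few of the default values listed above, not the NeMo classes themselves, and the way NeMo actually instantiates or overrides its configs is not shown on this page.

```python
from dataclasses import dataclass, field, replace


@dataclass
class DatasetCfg:
    # Hypothetical stand-in for MNISTDatasetConfig; defaults copied from the listing above.
    height: int = 32
    width: int = 32
    data_folder: str = "~/data/mnist"
    train: bool = True
    download: bool = True


@dataclass
class OptimCfg:
    # Hypothetical stand-in for MNISTOptimizer.
    name: str = "novograd"
    lr: float = 0.01


@dataclass
class ModelCfg:
    # Hypothetical stand-in for MNISTLeNet5Config. default_factory gives every
    # instance its own copy of the nested defaults.
    dataset: DatasetCfg = field(default_factory=DatasetCfg)
    optim: OptimCfg = field(default_factory=OptimCfg)


default_cfg = ModelCfg()
# Override one nested value while leaving every other default untouched.
custom_cfg = replace(default_cfg, optim=replace(default_cfg.optim, lr=0.001))
```

Here `custom_cfg` differs from `default_cfg` only in the learning rate; the dataset section and the optimizer name keep their defaults.
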
class nemo.collections.cv.models.mnist_lenet5.MNISTOptimizer(name: str = 'novograd', lr: float = 0.01, args: nemo.core.config.optimizers.NovogradParams = NovogradParams(betas=(0.8, 0.5), eps=1e-08, weight_decay=0, grad_averaging=False, amsgrad=False, luc=False, luc_trust=0.001, luc_eps=1e-08))[source]

Bases: nemo.core.config.base_config.Config

Optimizer setup for novograd

args = NovogradParams(betas=(0.8, 0.5), eps=1e-08, weight_decay=0, grad_averaging=False, amsgrad=False, luc=False, luc_trust=0.001, luc_eps=1e-08)
lr = 0.01
name = 'novograd'
class nemo.collections.cv.models.mnist_lenet5.NovogradScheduler(name: str = 'CosineAnnealing', args: nemo.core.config.schedulers.CosineAnnealingParams = CosineAnnealingParams(last_epoch=-1, warmup_steps=None, warmup_ratio=None, min_lr=0.0), monitor: str = 'val_loss', iters_per_batch: Optional[float] = None, max_steps: Optional[int] = None, reduce_on_plateau: bool = False)[source]

Bases: nemo.core.config.base_config.Config

Scheduler setup for novograd

args = CosineAnnealingParams(last_epoch=-1, warmup_steps=None, warmup_ratio=None, min_lr=0.0)
iters_per_batch = None
max_steps = None
monitor = 'val_loss'
name = 'CosineAnnealing'
reduce_on_plateau = False
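
For orientation, the textbook cosine annealing rule that the scheduler name refers to can be sketched in a few lines. This is an assumption-laden illustration, not NeMo's implementation: the function `cosine_annealing_lr` is hypothetical, warmup (`warmup_steps`, `warmup_ratio`) is ignored, and `max_steps=100` is chosen only for the example since the default above is None.

```python
import math


def cosine_annealing_lr(step: int, max_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Textbook cosine annealing from base_lr down to min_lr (no warmup)."""
    # Clamp progress to [0, 1] so steps past max_steps stay at min_lr.
    progress = min(step, max_steps) / max_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, the midpoint, and the end of a 100-step run.
schedule = [cosine_annealing_lr(s, max_steps=100, base_lr=0.01) for s in (0, 50, 100)]
```

With `base_lr=0.01` and `min_lr=0.0` the rate starts at 0.01, passes through roughly half of that at the midpoint, and reaches `min_lr` at `max_steps`.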

Datasets

class nemo.collections.cv.datasets.mnist_dataset.MNISTDataset(cfg: nemo.collections.cv.datasets.mnist_dataset.MNISTDatasetConfig = MNISTDatasetConfig(name=None, height=28, width=28, data_folder='~/data/mnist', train=True, download=True))[source]

Bases: nemo.core.classes.dataset.Dataset

A “thin wrapper” around torchvision’s MNIST dataset.

__getitem__(index: int)[source]

Returns a single sample.

Parameters:index – index of the sample to return.
__init__(cfg: nemo.collections.cv.datasets.mnist_dataset.MNISTDatasetConfig = MNISTDatasetConfig(name=None, height=28, width=28, data_folder='~/data/mnist', train=True, download=True))[source]

Initializes the MNIST dataset.

Parameters:cfg – Configuration object of type MNISTDatasetConfig.
__len__()[source]
Returns:Length of the dataset.
ix_to_word

Returns:Dictionary mapping target indices (int) to labels (class names as strings) that can be used by other modules.
output_types

Creates definitions of output ports.
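
The interface described above (`__len__`, `__getitem__` and an `ix_to_word` mapping) can be sketched without torchvision. `ToyDigits`, its two samples and the label strings are hypothetical and exist only to show the shape of the interface; the real label names used by MNISTDataset are not listed on this page.

```python
from typing import Dict, List, Sequence, Tuple


class ToyDigits:
    """Hypothetical stand-in for a dataset wrapper over (image, target) pairs."""

    def __init__(self, samples: Sequence[Tuple[List[float], int]], labels: Sequence[str]):
        self._samples = list(samples)
        # Map each target index to a human-readable class name.
        self.ix_to_word: Dict[int, str] = dict(enumerate(labels))

    def __len__(self) -> int:
        return len(self._samples)

    def __getitem__(self, index: int) -> Tuple[List[float], int]:
        return self._samples[index]


toy = ToyDigits(samples=[([0.0, 1.0], 1), ([1.0, 0.0], 0)], labels=["zero", "one"])
image, target = toy[0]
label = toy.ix_to_word[target]  # look up the class name for the returned target
```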

class nemo.collections.cv.datasets.mnist_dataset.MNISTDatasetConfig(name: Optional[str] = None, height: int = 28, width: int = 28, data_folder: str = '~/data/mnist', train: bool = True, download: bool = True)[source]

Bases: nemo.core.config.base_config.Config

Structured config for MNISTDataset class.

Parameters:
  • height – image height (DEFAULT: 28)
  • width – image width (DEFAULT: 28)
  • data_folder – path to the folder with data; may be relative to the user's home directory (DEFAULT: “~/data/mnist”)
  • train – use the train split if True, otherwise the test split (DEFAULT: True)
  • download – whether to download the data (DEFAULT: True)
  • name – Name of the module (DEFAULT: None)
__init__(name: Optional[str] = None, height: int = 28, width: int = 28, data_folder: str = '~/data/mnist', train: bool = True, download: bool = True) → None
data_folder = '~/data/mnist'
download = True
height = 28
train = True
width = 28
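
Because `data_folder` may be given relative to the user's home directory, a consumer of this config presumably has to expand the `~` before touching the file system. A minimal sketch with the standard library follows; whether MNISTDataset resolves the path in exactly this way is an assumption.

```python
import os

data_folder = "~/data/mnist"  # default value from the config above
# Expand the leading "~" to the current user's home directory, then normalize.
resolved = os.path.abspath(os.path.expanduser(data_folder))
```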

Neural Modules

class nemo.collections.cv.modules.lenet5.LeNet5(cfg: nemo.core.config.base_config.Config = Config(name=None))[source]

Bases: nemo.core.classes.module.NeuralModule

Classical LeNet-5 model for MNIST image classification.

__init__(cfg: nemo.core.config.base_config.Config = Config(name=None))[source]

Creates the LeNet-5 model.

Parameters:cfg – Default NeMo config containing name.
forward(images)[source]

Performs the forward step of the LeNet-5 model.

Parameters:images – Batch of images to be classified.
Returns:Batch of predictions.
input_types

Returns definitions of module input ports.

output_types

Returns definitions of module output ports.

classmethod restore_from(restore_path: str)[source]

Not implemented yet. Restore module from serialization.

Parameters:restore_path (str) – path to serialization
save_to(save_path: str)[source]

Not implemented yet. Serialize model.

Parameters:save_path (str) – path to save serialization.
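
To make the "classical LeNet-5" reference concrete, the sketch below traces tensor shapes through the feature extractor of the original LeNet-5 design (5x5 convolutions with 6, 16 and 120 output channels, separated by 2x2 subsampling). It is plain shape arithmetic under that assumption; the exact layers of this NeMo module are not listed on this page, and the helper names `conv_out` and `lenet5_shapes` are hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output size of a convolution or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet5_shapes(size: int = 32):
    """Trace (channels, height, width) through the classical LeNet-5 feature extractor."""
    shapes = [(1, size, size)]
    # (out_channels, kernel, stride); None means a pooling layer that keeps the channel count.
    layers = [(6, 5, 1), (None, 2, 2), (16, 5, 1), (None, 2, 2), (120, 5, 1)]
    channels = 1
    for out_channels, kernel, stride in layers:
        size = conv_out(size, kernel, stride)
        channels = out_channels or channels
        shapes.append((channels, size, size))
    return shapes


shapes = lenet5_shapes(32)
```

For a 32x32 input the trace ends in a 120x1x1 feature map, which the classical design feeds into fully connected layers of 84 and 10 units. A 28x28 input would shrink to 4x4 before the last 5x5 convolution and no longer fit, which may be why the model config shown earlier defaults to 32x32 images while the dataset config defaults to 28x28.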

Losses

class nemo.collections.cv.losses.nll_loss.NLLLoss(name: Optional[str] = None)[source]

Bases: sphinx.ext.autodoc.importer._MockObject, nemo.core.classes.common.Serialization, nemo.core.classes.common.Typing

Class representing a simple NLL loss.

__init__(name: Optional[str] = None)[source]

Constructor.

Parameters:name – Name of the module (DEFAULT: None)
forward(predictions, targets)[source]
input_types

Returns definitions of module input ports.

output_types

Returns definitions of module output ports.
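
As a reminder of what an NLL loss computes, here is a dependency-free sketch of the standard definition. It assumes that `predictions` are per-class log-probabilities and that the loss is averaged over the batch; the function `nll_loss` below is hypothetical, and the reduction and input conventions of this NeMo class are not stated on this page.

```python
import math
from typing import Sequence


def nll_loss(log_probs: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Mean negative log-likelihood of the target classes."""
    # Pick the log-probability assigned to the correct class of each sample.
    picked = [row[t] for row, t in zip(log_probs, targets)]
    return -sum(picked) / len(picked)


# Two samples, two classes; the model gives the true classes 0.9 and 0.8 probability.
batch_log_probs = [[math.log(0.9), math.log(0.1)], [math.log(0.2), math.log(0.8)]]
loss = nll_loss(batch_log_probs, targets=[0, 1])
```

The loss shrinks toward zero as the probability assigned to the correct class approaches one.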