NeMo Speaker Diarization API#

Model Classes#

class nemo.collections.asr.models.ClusteringDiarizer(cfg: omegaconf.DictConfig)[source]#

Bases: nemo.core.classes.common.Model, nemo.collections.asr.parts.mixins.mixins.DiarizationMixin

Inference model class for offline speaker diarization. This class handles the functionality required for diarization: Speech Activity Detection, Segmentation, Embedding Extraction, Clustering, Resegmentation, and Scoring. All parameters are passed through the config file.
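
A minimal instantiation sketch, assuming a diarization inference YAML in the usual NeMo layout (the config path, manifest path, output directory, and cfg.diarizer.* field names below are assumptions, not part of this page):

    from omegaconf import OmegaConf
    from nemo.collections.asr.models import ClusteringDiarizer

    # Load a diarization inference config (placeholder path; use your own YAML).
    cfg = OmegaConf.load("diar_infer.yaml")

    # Point the config at the input manifest and an output directory
    # (assumes the standard cfg.diarizer.* layout of NeMo inference configs).
    cfg.diarizer.manifest_filepath = "input_manifest.json"
    cfg.diarizer.out_dir = "diar_outputs"

    # Every stage (VAD, segmentation, embedding extraction, clustering) is driven by cfg.
    diarizer = ClusteringDiarizer(cfg=cfg)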

diarize(paths2audio_files: Optional[List[str]] = None, batch_size: int = 0)[source]#

Diarize the files provided through paths2audio_files or through the manifest file in the config.

Parameters
  • paths2audio_files (List[str]) – list of paths to the audio files to diarize

  • batch_size (int) – batch size used for speaker embedding extraction and VAD computation
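
A hedged example of running inference, continuing the sketch above (output locations such as RTTM files depend on cfg.diarizer.out_dir; the .wav names are placeholders):

    # Run the full pipeline on the manifest referenced in the config.
    diarizer.diarize()

    # Or pass audio paths directly, with a batch size for embedding extraction and VAD.
    diarizer.diarize(paths2audio_files=["session1.wav", "session2.wav"], batch_size=32)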

classmethod list_available_models()[source]#

Should list all pre-trained models available via the NVIDIA NGC cloud. Note: there is no check that requires model names and aliases to be unique. In the case of a collision, whichever model (or alias) is listed first in the returned list will be instantiated.

Returns

A list of PretrainedModelInfo entries
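
For example, to inspect the returned entries (guarded with or [], since the list may be empty or None for this class; the PretrainedModelInfo fields used below are the standard ones):

    for model_info in ClusteringDiarizer.list_available_models() or []:
        print(model_info.pretrained_model_name, model_info.location)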

path2audio_files_to_manifest(paths2audio_files, manifest_filepath)[source]#
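
A small sketch, continuing the earlier example, of building a diarization manifest from raw audio paths with this helper and pointing the config at it (file names are placeholders):

    audio_files = ["meeting_part1.wav", "meeting_part2.wav"]
    diarizer.path2audio_files_to_manifest(audio_files, "input_manifest.json")
    # The written manifest can then be referenced from cfg.diarizer.manifest_filepath.
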
classmethod restore_from(restore_path: str, override_config_path: Optional[str] = None, map_location: Optional[torch.device] = None, strict: bool = False)[source]#

Restores a model instance (weights and configuration) from a .nemo file; see the sketch after the parameter list.

Parameters
  • restore_path – path to .nemo file from which model should be instantiated

  • override_config_path – path to a yaml config that will override the internal config file or an OmegaConf / DictConfig object representing the model config.

  • map_location – Optional torch.device() to map the instantiated model to a device. By default (None), it will select a GPU if available, falling back to CPU otherwise.

  • strict – Passed to load_state_dict. By default False for this class

  • return_config – If set to true, will return just the underlying config of the restored model as an OmegaConf DictConfig object without instantiating the model.

  • trainer – An optional Trainer object, passed to the model constructor.

  • save_restore_connector – An optional SaveRestoreConnector object that defines the implementation of the restore_from() method.
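
A hedged restore example (the .nemo path is a placeholder):

    import torch

    # Restore a previously saved diarizer; map_location is optional and
    # defaults to GPU if one is available.
    restored = ClusteringDiarizer.restore_from(
        restore_path="clustering_diarizer.nemo",
        map_location=torch.device("cpu"),
    )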

save_to(save_path: str)#

Saves the model instance (weights and configuration) into an EFF archive or .nemo file.

You can use the restore_from method to fully restore the instance from the .nemo file.

A .nemo file is an archive (tar.gz) containing the following:

  • model_config.yaml – the model configuration in .yaml format; you can deserialize this into the cfg argument for the model’s constructor

  • model_weights.ckpt – the model checkpoint

Parameters

save_path – Path to .nemo file where model instance should be saved
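
A minimal save sketch, assuming the diarizer object from the earlier example (the output file name is a placeholder):

    # Archive weights and configuration into a single .nemo file.
    diarizer.save_to("clustering_diarizer.nemo")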

Mixins#

class nemo.collections.asr.parts.mixins.mixins.DiarizationMixin[source]#

Bases: abc.ABC

abstract diarize(paths2audio_files: List[str], batch_size: int = 1) → List[str][source]#

Takes paths to audio files and returns speaker labels.

Parameters

paths2audio_files – paths to the audio fragments to be diarized

Returns

Speaker labels
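
A schematic subclass (not a real NeMo model) showing how the abstract method would be satisfied; the returned label format is illustrative only:

    from typing import List

    from nemo.collections.asr.parts.mixins.mixins import DiarizationMixin


    class MyDiarizer(DiarizationMixin):
        """Toy implementation of the mixin's abstract interface."""

        def diarize(self, paths2audio_files: List[str], batch_size: int = 1) -> List[str]:
            # A real model would run VAD, embedding extraction, and clustering here.
            return [f"{path} speaker_0" for path in paths2audio_files]


    labels = MyDiarizer().diarize(["session1.wav"])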