NeMo Speaker Diarization API

Model Classes

class nemo.collections.asr.models.ClusteringDiarizer(cfg: omegaconf.DictConfig)

Bases: nemo.core.classes.common.Model, nemo.collections.asr.parts.mixins.DiarizationMixin

diarize(paths2audio_files: Optional[List[str]] = None, batch_size: int = 1)

classmethod list_available_models()

Should list all pre-trained models available via the NVIDIA NGC cloud. Note: nothing checks that model names and aliases are unique. In the case of a collision, whichever model (or alias) is listed first in the returned list will be instantiated.

Returns

A list of PretrainedModelInfo entries
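The collision note above can be illustrated with a small sketch. It does not import NeMo: `ModelEntry` is a hypothetical stand-in for `PretrainedModelInfo` with assumed field names, and `resolve_first` is an illustrative helper rather than a NeMo function. It shows only the "first listed wins" behavior that the note describes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelEntry:
    """Hypothetical stand-in for PretrainedModelInfo (field names are assumptions)."""
    name: str
    location: str
    aliases: List[str] = field(default_factory=list)


def resolve_first(entries: List[ModelEntry], requested: str) -> Optional[ModelEntry]:
    """Return the first entry whose name or alias matches the requested string."""
    for entry in entries:
        if requested == entry.name or requested in entry.aliases:
            return entry
    return None


# Two entries that share the alias "shared": a deliberate collision.
entries = [
    ModelEntry(name="model_a", location="loc/1", aliases=["shared"]),
    ModelEntry(name="model_b", location="loc/2", aliases=["shared"]),
]
chosen = resolve_first(entries, "shared")
print(chosen.location)  # loc/1, because model_a is listed first
```

Under this reading, keeping names and aliases unique is the caller's responsibility, since a later entry with a colliding alias is never reached.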

path2audio_files_to_manifest(paths2audio_files)

classmethod restore_from(restore_path: str, override_config_path: Optional[str] = None, map_location: Optional[torch.device] = None, strict: bool = False)

Restores module/model with weights

save_to(save_path: str)

Saves the model instance (weights and configuration) into an EFF archive or a .nemo file.

Use the restore_from method to fully restore the instance from a .nemo file.

A .nemo file is an archive (tar.gz) containing the following:

model_config.yaml - model configuration in .yaml format. You can deserialize this into the cfg argument of the model's constructor
model_weights.ckpt - model checkpoint

Parameters

save_path – Path to .nemo file where model instance should be saved
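Because a .nemo file is described above as a tar archive, its contents can be inspected with the Python standard library alone. The sketch below builds a toy archive with placeholder contents so that it runs without NeMo. The member names are modeled on the list above (the checkpoint file name in particular is an assumption), and `list_archive_members` is an illustrative helper, not part of NeMo.

```python
import io
import os
import tarfile
import tempfile
from typing import List


def list_archive_members(path: str) -> List[str]:
    """Return the sorted member names of a tar archive, compressed or not."""
    # "r:*" lets tarfile detect the compression (gzip here) automatically.
    with tarfile.open(path, "r:*") as archive:
        return sorted(archive.getnames())


# Build a toy gzip-compressed archive; the payloads are placeholders, not real model data.
workdir = tempfile.mkdtemp()
toy_path = os.path.join(workdir, "toy.nemo")
placeholder_files = [
    ("model_config.yaml", b"# placeholder config\n"),
    ("model_weights.ckpt", b"placeholder"),
]
with tarfile.open(toy_path, "w:gz") as archive:
    for name, payload in placeholder_files:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

members = list_archive_members(toy_path)
print(members)  # ['model_config.yaml', 'model_weights.ckpt']
```

The same helper could be pointed at a file written by `save_to` to check what it contains before calling `restore_from`.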

set_vad_model(vad_config)

Mixins

class nemo.collections.asr.parts.mixins.DiarizationMixin

Bases: abc.ABC

abstract diarize(paths2audio_files: List[str], batch_size: int = 1) → List[str]

Takes paths to audio files and returns speaker labels.

Parameters

paths2audio_files – paths to the audio files to be diarized

Returns

Speaker labels
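The abstract contract above can be sketched with a local stand-in class. `DiarizationMixinSketch` mirrors the documented signature but is not the NeMo class, and `SingleSpeakerDiarizer` is a hypothetical toy subclass. The sketch shows that a subclass must implement `diarize` before it can be instantiated.

```python
from abc import ABC, abstractmethod
from typing import List


class DiarizationMixinSketch(ABC):
    """Local stand-in mirroring the documented abstract signature (not the NeMo class)."""

    @abstractmethod
    def diarize(self, paths2audio_files: List[str], batch_size: int = 1) -> List[str]:
        """Take paths to audio files and return speaker labels."""


class SingleSpeakerDiarizer(DiarizationMixinSketch):
    """Hypothetical toy subclass that assigns the same label to every file."""

    def diarize(self, paths2audio_files: List[str], batch_size: int = 1) -> List[str]:
        return ["speaker_0" for _ in paths2audio_files]


labels = SingleSpeakerDiarizer().diarize(["a.wav", "b.wav"])
print(labels)  # ['speaker_0', 'speaker_0']

# The abstract base itself cannot be instantiated.
try:
    DiarizationMixinSketch()
    instantiated = True
except TypeError:
    instantiated = False
print(instantiated)  # False
```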