NeMo Speaker Diarization API
- class nemo.collections.asr.models.ClusteringDiarizer(cfg: omegaconf.DictConfig)
Inference model class for offline speaker diarization. It covers the stages diarization requires: speech activity detection, segmentation, embedding extraction, clustering, resegmentation, and scoring. All parameters are passed through the config file.
- diarize(paths2audio_files: Optional[List[str]] = None, batch_size: int = 0)
Diarizes the files provided through paths2audio_files or through a manifest file.
paths2audio_files (List[str]) – list of paths to the audio files
batch_size (int) – batch size used for speaker embedding extraction and VAD computation
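A minimal sketch of calling diarize, assuming NeMo is installed and that cfg is an already populated diarization config (its contents are not specified here). The helper name run_diarization is hypothetical; only the constructor and the diarize signature documented above are used.

```python
from typing import List, Optional


def run_diarization(cfg, audio_paths: Optional[List[str]] = None, batch_size: int = 0):
    """Build a ClusteringDiarizer from cfg and diarize the given files.

    run_diarization is a hypothetical helper, not part of NeMo.
    """
    # Imported lazily so the helper can be defined without NeMo installed.
    from nemo.collections.asr.models import ClusteringDiarizer

    model = ClusteringDiarizer(cfg=cfg)
    # Assumption: with audio_paths=None the model relies on the manifest
    # file referenced by its config, as the description above suggests.
    model.diarize(paths2audio_files=audio_paths, batch_size=batch_size)
    return model
```

The helper returns the model so that a caller can reuse it for further files without rebuilding it.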
- classmethod list_available_models()
Should list all pre-trained models available via NVIDIA NGC cloud. Note: there is no check that requires model names and aliases to be unique. In the case of a collision, whichever model (or alias) is listed first in the returned list will be instantiated.
Returns a list of PretrainedModelInfo entries.
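The collision rule can be illustrated with a plain first-match lookup. This is a sketch of the described behaviour only, not NeMo's implementation: ModelEntry, resolve_first, and all names and locations below are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelEntry:
    """Hypothetical stand-in for a PretrainedModelInfo entry."""
    name: str
    location: str


def resolve_first(entries: List[ModelEntry], name: str) -> Optional[ModelEntry]:
    """Return the first entry with the requested name, or None if absent."""
    for entry in entries:
        if entry.name == name:
            return entry  # later entries with the same name are never reached
    return None


entries = [
    ModelEntry("vad_model", "location-a"),
    ModelEntry("vad_model", "location-b"),  # same name, shadowed by the entry above
    ModelEntry("speaker_model", "location-c"),
]
chosen = resolve_first(entries, "vad_model")
print(chosen.location)  # location-a
```

Because nothing enforces uniqueness, the order of the returned list decides which duplicate wins.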
- path2audio_files_to_manifest(paths2audio_files, manifest_filepath)
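This entry carries no description. Judging from its name and arguments, it presumably writes a manifest file listing the given audio paths. The sketch below shows one way such a file could be produced as JSON lines; write_manifest is a hypothetical helper, and the field names and default values are assumptions rather than a documented format.

```python
import json
from typing import List


def write_manifest(paths2audio_files: List[str], manifest_filepath: str) -> None:
    """Write one JSON object per line, one line per audio file.

    Hypothetical helper; the field names below are assumptions.
    """
    with open(manifest_filepath, "w", encoding="utf-8") as fp:
        for audio_file in paths2audio_files:
            entry = {
                "audio_filepath": audio_file.strip(),
                "offset": 0.0,
                "duration": None,  # serialized as JSON null
                "text": "-",
                "label": "infer",
            }
            fp.write(json.dumps(entry) + "\n")
```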
- classmethod restore_from(restore_path: str, override_config_path: Optional[str] = None, map_location: Optional[torch.device] = None, strict: bool = False)
Restores a model instance (weights and configuration) from a .nemo file.
restore_path – path to .nemo file from which model should be instantiated
override_config_path – path to a yaml config that will override the internal config file or an OmegaConf / DictConfig object representing the model config.
map_location – Optional torch.device() to map the instantiated model to a device. By default (None), it will select a GPU if available, falling back to CPU otherwise.
strict – Passed to load_state_dict. This description gives the default as True, but the signature above shows strict: bool = False.
return_config – If set to true, will return just the underlying config of the restored model as an OmegaConf DictConfig object without instantiating the model.
trainer – An optional Trainer object, passed to the model constructor.
save_restore_connector – An optional SaveRestoreConnector object that defines the implementation of the restore_from() method.
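A minimal sketch of restoring a diarizer, assuming NeMo is installed. The helper name load_diarizer is hypothetical, and it passes only the four arguments that appear in the signature above; return_config, trainer, and save_restore_connector are left out because that signature does not list them.

```python
def load_diarizer(restore_path, override_config_path=None, map_location=None, strict=False):
    """Restore a ClusteringDiarizer from a .nemo file.

    load_diarizer is a hypothetical helper, not part of NeMo.
    """
    # Imported lazily so the helper can be defined without NeMo installed.
    from nemo.collections.asr.models import ClusteringDiarizer

    return ClusteringDiarizer.restore_from(
        restore_path=restore_path,
        override_config_path=override_config_path,
        # None leaves device selection to the library (GPU if available, else CPU).
        map_location=map_location,
        strict=strict,
    )
```

Passing an explicit torch.device as map_location is the way to force CPU loading on a machine that has a GPU.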
- save_to(save_path: str)
Saves the model instance (weights and configuration) into an EFF archive or .nemo file. Use the restore_from method to fully restore the instance from a .nemo file.
A .nemo file is an archive (tar.gz) containing:
- model_config.yaml – model configuration in .yaml format; it can be deserialized into the cfg argument of the model's constructor
- model_weights.ckpt – model checkpoint
save_path – Path to .nemo file where model instance should be saved
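Since a .nemo file is described as a tar.gz archive, Python's standard tarfile module should be able to list its contents. The sketch below assumes exactly that; list_archive_members is a hypothetical helper, and a real archive may store its members under a path prefix or include additional files.

```python
import tarfile
from typing import List


def list_archive_members(archive_path: str) -> List[str]:
    """Return the member names of a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

Listing the members before calling restore_from is a cheap way to check that a saved file contains both a configuration and a checkpoint.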
- class nemo.collections.asr.parts.mixins.mixins.DiarizationMixin
- abstract diarize(paths2audio_files: List[str], batch_size: int = 1) → List[str]
Takes paths to audio files and returns speaker labels.
paths2audio_files – paths to the audio fragments to be diarized
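Because diarize is abstract, a concrete class has to supply it. The sketch below uses a stand-in abstract base that mirrors the signature above, so it runs without NeMo; DiarizationMixinSketch, SingleSpeakerDiarizer, and the label string are hypothetical and say nothing about how real models assign speakers.

```python
from abc import ABC, abstractmethod
from typing import List


class DiarizationMixinSketch(ABC):
    """Hypothetical stand-in mirroring the abstract diarize signature."""

    @abstractmethod
    def diarize(self, paths2audio_files: List[str], batch_size: int = 1) -> List[str]:
        ...


class SingleSpeakerDiarizer(DiarizationMixinSketch):
    """Toy implementation that assigns the same label to every file."""

    def diarize(self, paths2audio_files: List[str], batch_size: int = 1) -> List[str]:
        return ["speaker_0" for _ in paths2audio_files]


labels = SingleSpeakerDiarizer().diarize(["a.wav", "b.wav"])
print(labels)  # ['speaker_0', 'speaker_0']
```

Instantiating the abstract base directly raises TypeError, which is how a missing diarize implementation surfaces.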