stages.audio.metrics.get_wer#

Module Contents#

Classes#

GetPairwiseWerStage

Computes the pairwise word error rate (WER), expressed as a percentage, for each pair of text and pred_text.

Functions#

API#

class stages.audio.metrics.get_wer.GetPairwiseWerStage#

Bases: nemo_curator.stages.audio.common.LegacySpeechStage

Computes the pairwise word error rate (WER), expressed as a percentage, for each pair of text and pred_text.

WER is measured between data[self.text_key] and data[self.pred_text_key].

Args:
    text_key (str): key of the data entry that holds the utterance transcript. Defaults to “text”.
    pred_text_key (str): key of the data entry that holds the ASR prediction. Defaults to “pred_text”.

Returns: The same entries as in the input manifest, each with the computed WER stored under wer_key.
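
The behaviour described above can be illustrated with a minimal sketch. It is not the implementation used by this module: `edit_distance`, `sketch_wer`, and `add_wer` are hypothetical helpers written for illustration, and the sketch assumes that WER is the word-level Levenshtein distance divided by the number of reference words, times 100. Raising an error on an empty reference is a choice made here, not documented behaviour.

```python
def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Levenshtein distance between two token sequences (hypothetical helper)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sketch_wer(text: str, pred_text: str) -> float:
    """WER as a percentage, assuming whitespace tokenization."""
    ref_words = text.split()
    hyp_words = pred_text.split()
    if not ref_words:
        # Assumption of this sketch: an empty reference is treated as an error.
        raise ValueError("reference text has no words")
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)


def add_wer(entry: dict, text_key: str = "text",
            pred_text_key: str = "pred_text", wer_key: str = "wer") -> dict:
    """Return a copy of the entry with the WER stored under wer_key."""
    out = dict(entry)
    out[wer_key] = sketch_wer(entry[text_key], entry[pred_text_key])
    return out


entry = {"text": "the cat sat on mat", "pred_text": "the cat sit on mat"}
result = add_wer(entry)
```

With one substituted word out of five reference words, `result["wer"]` is 20.0, and the original keys are carried over unchanged.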

pred_text_key: str#

‘pred_text’

process_dataset_entry(data_entry: dict) → list[nemo_curator.tasks.AudioBatch]#
text_key: str#

‘text’

wer_key: str#

‘wer’

stages.audio.metrics.get_wer.get_cer(text: str, pred_text: str) → float#
stages.audio.metrics.get_wer.get_charrate(text: str, duration: float) → float#
stages.audio.metrics.get_wer.get_wordrate(text: str, duration: float) → float#
stages.audio.metrics.get_wer.get_wordrate(text: str, duration: float) → float#
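
The signatures above carry no descriptions, so the following is only a sketch of what such metrics commonly compute, not the module's actual code. It assumes that the character error rate is the character-level edit distance over the reference length (as a percentage), and that the word rate and character rate are counts divided by the duration in seconds. All function names here are hypothetical.

```python
def char_edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance over characters (hypothetical helper)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def sketch_cer(text: str, pred_text: str) -> float:
    """Character error rate as a percentage (assumed definition)."""
    if not text:
        raise ValueError("reference text is empty")
    return 100.0 * char_edit_distance(text, pred_text) / len(text)


def sketch_wordrate(text: str, duration: float) -> float:
    """Words per second, assuming duration is given in seconds."""
    return len(text.split()) / duration


def sketch_charrate(text: str, duration: float) -> float:
    """Characters per second, assuming duration is given in seconds."""
    return len(text) / duration


cer = sketch_cer("abcd", "abed")                        # one substitution in four characters
word_rate = sketch_wordrate("one two three four", 2.0)  # four words over two seconds
char_rate = sketch_charrate("abcd", 2.0)                # four characters over two seconds
```

Rates of this kind are often used to flag entries whose transcript length is implausible for the audio duration; whether the real functions round their results or scale CER by 100 is not stated in this page.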