Inverse Text Normalization¶
Inverse text normalization (ITN) is a part of the Automatic Speech Recognition (ASR) post-processing pipeline. ITN is the task of converting the raw spoken output of the ASR model into its written form to improve text readability.
For example, “in nineteen seventy” -> “in 1970” and “it costs one hundred and twenty three dollars” -> “it costs $123”.
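At its simplest, the task can be pictured as mapping spoken-form spans onto their written forms. The toy sketch below (a plain lookup table, not NeMo's WFST-based implementation) illustrates the input/output contract:

```python
# Toy illustration of the ITN task (NOT NeMo's implementation):
# replace known spoken-form spans with their written form.
SPOKEN_TO_WRITTEN = {
    "nineteen seventy": "1970",
    "one hundred and twenty three dollars": "$123",
}

def toy_itn(text: str) -> str:
    """Rewrite spoken-form spans in `text` into written form."""
    for spoken, written in SPOKEN_TO_WRITTEN.items():
        text = text.replace(spoken, written)
    return text

print(toy_itn("in nineteen seventy"))  # in 1970
print(toy_itn("it costs one hundred and twenty three dollars"))  # it costs $123
```

A real system cannot enumerate all spoken forms, which is why NeMo expresses these rewrites as composable WFST grammars instead of a lookup table.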
NeMo ITN [TEXTPROCESSING-ITN5] is based on WFST grammars [TEXTPROCESSING-ITN3]. We also provide a deployment route to C++ using Sparrowhawk [TEXTPROCESSING-ITN2] – an open-source version of Google Kestrel [TEXTPROCESSING-ITN1]. See Text Processing Deployment for details.
For more details, see the tutorial NeMo/tutorials/text_processing/Inverse_Text_Normalization.ipynb, which can be run in Google Colab.
The base class for every grammar is GraphFst.
This tool is designed as a two-stage application: 1. classification of the input into semiotic tokens and 2. verbalization of those tokens into written form.
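The two stages can be sketched in plain Python. The sketch below is illustrative only (the function and class names are hypothetical, and NeMo implements both stages as WFST grammars rather than hand-written code): stage 1 tags each input token with a semiotic class, and stage 2 renders the tagged tokens in written form.

```python
# Hypothetical sketch of the two-stage design; NeMo's actual
# classifiers and verbalizers are WFST grammars, not this code.
CARDINALS = {"one": 1, "two": 2, "ten": 10, "twenty": 20, "hundred": 100}

def classify(text: str):
    """Stage 1: tag each token with a semiotic class."""
    return [("CARDINAL" if w in CARDINALS else "WORD", w)
            for w in text.split()]

def verbalize(tokens):
    """Stage 2: render tagged tokens in written form."""
    out, num = [], None
    for cls, word in tokens:
        if cls == "CARDINAL":
            value = CARDINALS[word]
            if num is not None and value == 100:
                num *= value      # "one hundred" -> 100
            elif num is not None:
                num += value      # "hundred twenty" -> 120
            else:
                num = value
        else:
            if num is not None:   # flush a finished number
                out.append(str(num))
                num = None
            out.append(word)
    if num is not None:
        out.append(str(num))
    return " ".join(out)

print(verbalize(classify("it costs one hundred twenty one dollars")))
# it costs 121 dollars
```

Keeping classification and verbalization separate lets each semiotic class (cardinal, money, time, …) contribute its own pair of grammars that compose independently.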
For every stage and every semiotic token class there is a corresponding grammar, e.g. taggers.CardinalFst and verbalizers.CardinalFst. Together, they compose the final grammars ClassifyFst and VerbalizeFinalFst, which are compiled into a WFST and used for inference.
Example prediction run:
python run_prediction.py --input=<INPUT_TEXT_FILE> --output=<OUTPUT_PATH> [--verbose]
The input is expected to be lower-cased. Punctuation marks are output with separating spaces after semiotic tokens, e.g. “i see, it is ten o’clock…” -> “I see, it is 10:00 . . .”. Inner-sentence whitespace characters in the input are not preserved.
Data Cleaning for Evaluation¶
Example data cleaning run:

python clean_eval_data.py --input=<INPUT_TEXT_FILE>
Example evaluation run on Google’s text normalization dataset [TEXTPROCESSING-ITN3]:

python run_evaluation.py --input=./en_with_types/output-00001-of-00100 [--cat CLASS_CATEGORY] [--filter]
Peter Ebden and Richard Sproat. The Kestrel TTS text normalization system. Natural Language Engineering, 21(3):333, 2015.
Alexander Gutkin, Linne Ha, Martin Jansche, Knot Pipatsrisawat, and Richard Sproat. TTS for low resource languages: a Bangla synthesizer. In 10th Language Resources and Evaluation Conference. 2016.
Richard Sproat and Navdeep Jaitly. RNN approaches to text normalization: a challenge. arXiv preprint arXiv:1611.00068, 2016.
Yang Zhang, Evelina Bakhturina, Kyle Gorman, and Boris Ginsburg. NeMo inverse text normalization: from development to production. 2021. arXiv:2104.05055.