Important

You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.

Punctuation And Capitalization Models

Automatic Speech Recognition (ASR) systems typically produce text without punctuation or capitalization. Unpunctuated ASR output causes two problems:

It can be difficult to read and understand.
Models for downstream tasks, such as named entity recognition, machine translation, or text-to-speech synthesis, are usually trained on punctuated text, so feeding them raw ASR output can degrade their performance.
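
To make the task concrete, the following is a minimal sketch of restoring punctuation and capitalization from per-word labels. It is not NeMo's implementation or API: the `restore` function and the `(punctuation, capitalize)` label scheme are hypothetical, and the labels are written by hand where a trained model would normally predict them.

```python
def restore(words, labels):
    """Apply one (punctuation, capitalize) label per word.

    The label scheme is hypothetical and chosen only for illustration.
    """
    if len(words) != len(labels):
        raise ValueError("expected exactly one label per word")
    restored = []
    for word, (punct, capitalize) in zip(words, labels):
        if capitalize:
            # Uppercase only the first character; leave the rest untouched.
            word = word[:1].upper() + word[1:]
        # Attach the punctuation mark (possibly empty) that follows the word.
        restored.append(word + punct)
    return " ".join(restored)


# Raw ASR-style output: lowercase, no punctuation.
asr_words = "how are you i am fine".split()

# Hand-written labels standing in for model predictions.
labels = [
    ("", True),    # how  -> How
    ("", False),   # are
    ("?", False),  # you  -> you?
    ("", True),    # i    -> I
    ("", False),   # am
    (".", False),  # fine -> fine.
]

restored_text = restore(asr_words, labels)
print(restored_text)  # How are you? I am fine.
```

Framing the problem as one label per word keeps the restored text aligned with the original transcript, since words are never added, dropped, or reordered.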

NeMo provides two types of punctuation and capitalization models:

Lexical-only model:

Punctuation and Capitalization Model

Lexical and audio model:

Punctuation and Capitalization Lexical Audio Model