stages.text.modifiers.newline_normalizer

Module Contents

Classes

NewlineNormalizer

Replaces 3 or more consecutive newline characters with only 2 newline characters.

Data

THREE_OR_MORE_NEWLINES_REGEX

THREE_OR_MORE_WINDOWS_NEWLINES_REGEX

API

class stages.text.modifiers.newline_normalizer.NewlineNormalizer

Bases: nemo_curator.stages.text.modifiers.doc_modifier.DocumentModifier

Replaces 3 or more consecutive newline characters with only 2 newline characters.

modify_document(text: str) → str

Collapse every run of 3 or more consecutive newline characters in text to exactly 2 newline characters and return the resulting string.
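The behavior described for modify_document can be illustrated with a minimal standalone sketch. It does not import the library: the function name normalize_newlines and the pattern \n{3,} are assumptions made for illustration, since this page elides the module's actual compiled patterns.

```python
import re

# Hypothetical stand-in for the module-level pattern, which this page
# shows only as compile(…). The exact pattern is an assumption.
_THREE_OR_MORE_NEWLINES = re.compile(r"\n{3,}")


def normalize_newlines(text: str) -> str:
    """Collapse each run of 3 or more newlines down to exactly 2."""
    return _THREE_OR_MORE_NEWLINES.sub("\n\n", text)


sample = "Title\n\n\n\nFirst paragraph.\n\nSecond paragraph.\n\n\nEnd."
normalized = normalize_newlines(sample)
print(repr(normalized))
```

Runs of one or two newlines are left untouched, so ordinary line breaks and paragraph breaks survive; only excess blank lines are removed.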

stages.text.modifiers.newline_normalizer.THREE_OR_MORE_NEWLINES_REGEX

‘compile(…)’

stages.text.modifiers.newline_normalizer.THREE_OR_MORE_WINDOWS_NEWLINES_REGEX

‘compile(…)’
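One plausible reason for a separate Windows constant is that a pattern written for bare \n does not match Windows line endings, where each \n is preceded by \r. The sketch below demonstrates this under assumed patterns; UNIX_RUN, WINDOWS_RUN, and collapse_windows_newlines are hypothetical names, and the library's real patterns are not shown on this page.

```python
import re

# Assumed patterns for illustration only; the real constants are elided
# on this page as compile(…).
UNIX_RUN = re.compile(r"\n{3,}")
WINDOWS_RUN = re.compile(r"(?:\r\n){3,}")


def collapse_windows_newlines(text: str) -> str:
    """Collapse each run of 3 or more CRLF pairs down to exactly 2 pairs."""
    return WINDOWS_RUN.sub("\r\n\r\n", text)


windows_text = "a\r\n\r\n\r\n\r\nb"

# The bare-newline pattern finds nothing: every \n is separated by a \r.
unix_match = UNIX_RUN.search(windows_text)

collapsed = collapse_windows_newlines(windows_text)
print(repr(collapsed))
```

Applying both substitutions in sequence would handle documents with either line-ending style, since each pattern leaves the other style's runs unchanged.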