modifiers.newline_normalizer

Module Contents

Classes

NewlineNormalizer

Replaces 3 or more consecutive newline characters with only 2 newline characters.

Data

API

class modifiers.newline_normalizer.NewlineNormalizer

Bases: nemo_curator.modifiers.DocumentModifier

Replaces 3 or more consecutive newline characters with only 2 newline characters.

Initialization

modify_document(text: str) -> str
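
The behavior described above can be sketched in a few lines of standalone Python. This is an illustration, not the library's source: the function name `normalize_newlines` and the constants `UNIX_RUN` and `WINDOWS_RUN` are hypothetical, and the two patterns are assumptions, since this reference shows the module's regexes only as `compile(…)`. The separate handling of `\r\n` runs is inferred from the name `THREE_OR_MORE_WINDOWS_NEWLINES_REGEX`.

```python
import re

# Assumed patterns: the actual regexes are elided in the reference.
UNIX_RUN = re.compile(r"\n{3,}")            # three or more "\n" in a row
WINDOWS_RUN = re.compile(r"(?:\r\n){3,}")   # three or more "\r\n" in a row


def normalize_newlines(text: str) -> str:
    """Collapse runs of 3+ newlines to exactly 2 (hypothetical helper)."""
    text = UNIX_RUN.sub("\n\n", text)
    text = WINDOWS_RUN.sub("\r\n\r\n", text)
    return text


if __name__ == "__main__":
    sample = "First paragraph.\n\n\n\n\nSecond paragraph."
    print(repr(normalize_newlines(sample)))
```

Runs of one or two newlines pass through unchanged, so ordinary paragraph breaks are preserved and only excess blank lines are removed.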

modifiers.newline_normalizer.THREE_OR_MORE_NEWLINES_REGEX

‘compile(…)’

modifiers.newline_normalizer.THREE_OR_MORE_WINDOWS_NEWLINES_REGEX

‘compile(…)’