nemo_curator.stages.text.modifiers.string.quotation_remover

Module Contents

Classes

Name | Description
QuotationRemover | Removes quotations from a document following a few rules.

API

class nemo_curator.stages.text.modifiers.string.quotation_remover.QuotationRemover()

Bases: DocumentModifier

Removes quotations from a document following a few rules:

  • If the document is shorter than 2 characters, it is returned unchanged.
  • If the document starts and ends with a quotation mark and there are no newlines in the document, the quotation marks are removed.
  • If the document starts and ends with a quotation mark and there are newlines in the document, the quotation marks are removed only if the first line does not end with a quotation mark.

nemo_curator.stages.text.modifiers.string.quotation_remover.QuotationRemover.modify_document(text: str) -> str
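
The rules listed for QuotationRemover can be illustrated with a small standalone function. This is a minimal sketch written only from the description above, not the library's implementation: the name remove_outer_quotes and the QUOTE constant are hypothetical, and it assumes that "quotation mark" means the ASCII double quote and that length is measured on the raw, unstripped text.

```python
QUOTE = '"'  # assumption: only the ASCII double quote counts as a quotation mark


def remove_outer_quotes(text: str) -> str:
    """Hypothetical standalone sketch of the documented rules."""
    # Rule 1: documents shorter than 2 characters are returned unchanged.
    if len(text) < 2:
        return text

    # The remaining rules apply only when the document both starts
    # and ends with a quotation mark.
    if not (text.startswith(QUOTE) and text.endswith(QUOTE)):
        return text

    # Rule 2: no newlines, so the surrounding pair is removed.
    if "\n" not in text:
        return text[1:-1]

    # Rule 3: with newlines, the pair is removed only if the first
    # line does not itself end with a quotation mark.
    first_line = text.split("\n", 1)[0]
    if first_line.endswith(QUOTE):
        return text
    return text[1:-1]


if __name__ == "__main__":
    samples = ['"hello world"', '"first\nsecond"', '"first"\n"second"', '"']
    for sample in samples:
        print(repr(sample), "->", repr(remove_outer_quotes(sample)))
```

With the library installed, the corresponding call would presumably be QuotationRemover().modify_document(text), based on the signature shown above. The third sample stays unchanged because its first line is a complete quoted string, so the outer marks are likely two separate quotations rather than one wrapping pair.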