stages.text.modifiers.markdown_remover#

Module Contents#

Classes#

MarkdownRemover

Removes Markdown formatting in a document including bold, italic, underline, and URL text.

Data#

API#

stages.text.modifiers.markdown_remover.MARKDOWN_BOLD_REGEX#

‘\*\*(.*?)\*\*’

stages.text.modifiers.markdown_remover.MARKDOWN_ITALIC_REGEX#

‘\*(.*?)\*’

stages.text.modifiers.markdown_remover.MARKDOWN_LINK_REGEX#

‘\[.*?\]\((.*?)\)’

stages.text.modifiers.markdown_remover.MARKDOWN_UNDERLINE_REGEX#

‘_(.*?)_’
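To make the constants more concrete, here is a minimal sketch of what each pattern matches when passed to `re.sub` with the first capture group as the replacement. The pattern strings are this sketch's reading of the module constants, written out as Python raw strings. The local names (`BOLD`, `ITALIC`, `UNDERLINE`, `LINK`, `strip_pattern`) are hypothetical and are not imported from the library.

```python
import re

# Assumption: local copies of the patterns, not the library's own objects.
BOLD = r"\*\*(.*?)\*\*"       # **text**
ITALIC = r"\*(.*?)\*"         # *text*
UNDERLINE = r"_(.*?)_"        # _text_
LINK = r"\[.*?\]\((.*?)\)"    # [text](url); only the URL is captured


def strip_pattern(pattern: str, line: str) -> str:
    """Replace every match of `pattern` with its first capture group."""
    return re.sub(pattern, r"\1", line)


bold_out = strip_pattern(BOLD, "a **bold** word")
italic_out = strip_pattern(ITALIC, "an *italic* word")
underline_out = strip_pattern(UNDERLINE, "an _underlined_ word")
link_out = strip_pattern(LINK, "see [the docs](https://example.com)")

print(bold_out)       # a bold word
print(italic_out)     # an italic word
print(underline_out)  # an underlined word
print(link_out)       # see https://example.com
```

Because the link pattern captures the part in parentheses, a substitution of this kind keeps the URL and drops the bracketed link text.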

class stages.text.modifiers.markdown_remover.MarkdownRemover#

Bases: nemo_curator.stages.text.modifiers.doc_modifier.DocumentModifier

Removes Markdown formatting in a document including bold, italic, underline, and URL text.

Initialization

modify_document(text: str) str#

Transform the provided value(s) and return the result.
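The behavior described for this class can be approximated with a standalone function, shown here as a rough sketch rather than the library's implementation. It assumes the four patterns are applied line by line, with bold handled before italic so that `**` pairs are not consumed by the single-asterisk pattern. The names `PATTERNS` and `remove_markdown` are hypothetical.

```python
import re

# Assumption: local pattern list in a fixed order; not the library's code.
PATTERNS = [
    r"\*\*(.*?)\*\*",     # **bold**
    r"\*(.*?)\*",         # *italic*
    r"_(.*?)_",           # _underline_
    r"\[.*?\]\((.*?)\)",  # [text](url) -> url
]


def remove_markdown(text: str) -> str:
    """Strip the matched Markdown markers from each line of `text`."""
    cleaned = []
    for line in text.split("\n"):
        for pattern in PATTERNS:
            # Keep only the captured inner text (or URL) of each match.
            line = re.sub(pattern, r"\1", line)
        cleaned.append(line)
    return "\n".join(cleaned)


sample = "**Title**\nSome *emphasis* and a [link](https://example.com)."
result = remove_markdown(sample)
print(result)
# Title
# Some emphasis and a https://example.com.
```

One consequence of a pattern like `_(.*?)_` is that it also matches underscores inside identifiers: in this sketch, `remove_markdown("my_var_name")` returns `myvarname`. Documents that contain code or snake_case names may need to account for that.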