nemo_curator.stages.text.download.html_extractors.base

Module Contents

Classes

Name                      Description
HTMLExtractorAlgorithm    -

API

class nemo_curator.stages.text.download.html_extractors.base.HTMLExtractorAlgorithm()
    Abstract

    NON_SPACED_LANGUAGES

    nemo_curator.stages.text.download.html_extractors.base.HTMLExtractorAlgorithm.extract_text(html: str, stop_words: frozenset[str], language: str) -> list[str] | None
        abstract