stages.text.download.html_extractors.base

Module Contents

Classes

HTMLExtractorAlgorithm

Abstract base class for HTML text extraction algorithms.

API

class stages.text.download.html_extractors.base.HTMLExtractorAlgorithm

Bases: abc.ABC

Abstract base class for HTML text extraction algorithms. Subclasses must implement extract_text.

NON_SPACED_LANGUAGES

‘frozenset(…)’

abstractmethod extract_text(
html: str,
stop_words: frozenset[str],
language: str,
) → list[str] | None
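
The interface above can be illustrated with a minimal sketch of a concrete subclass. Everything below is an assumption made for illustration: the base class is a local stand-in that mirrors the documented signature (it is not imported from the real package), extract_text is assumed to be an instance method, NON_SPACED_LANGUAGES is given an empty placeholder value because its real contents are elided here, and StopWordParagraphExtractor with its "keep paragraphs that contain a stop word" rule is hypothetical rather than an extractor shipped with the library.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from html.parser import HTMLParser


class HTMLExtractorAlgorithm(ABC):
    """Local stand-in mirroring the documented interface (not the real class)."""

    # Placeholder: the real value is elided in the documentation.
    NON_SPACED_LANGUAGES: frozenset[str] = frozenset()

    @abstractmethod
    def extract_text(
        self, html: str, stop_words: frozenset[str], language: str
    ) -> list[str] | None:
        ...


class _ParagraphCollector(HTMLParser):
    """Collects the whitespace-normalized text of each <p> element."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: list[str] = []
        self._buffer: list[str] | None = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None


class StopWordParagraphExtractor(HTMLExtractorAlgorithm):
    """Hypothetical extractor: keeps paragraphs containing a stop word."""

    def extract_text(self, html, stop_words, language):
        collector = _ParagraphCollector()
        collector.feed(html)
        collector.close()
        if language in self.NON_SPACED_LANGUAGES:
            # Whitespace tokenization is not meaningful here, so skip the filter.
            kept = collector.paragraphs
        else:
            kept = [
                p for p in collector.paragraphs
                if any(word in stop_words for word in p.lower().split())
            ]
        # Return None when nothing was extracted, matching list[str] | None.
        return kept or None


page = "<html><body><p>The cat sat on the mat.</p><p>Click here</p></body></html>"
extractor = StopWordParagraphExtractor()
print(extractor.extract_text(page, frozenset({"the", "on"}), "ENGLISH"))
# prints ['The cat sat on the mat.']
```

Returning None rather than an empty list for a page with no usable text is one reading of the `list[str] | None` return type; the actual contract of the real extractors may differ.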