nemo_curator.stages.text.download.wikipedia.download
Module Contents
Classes
API
Bases: DocumentDownloader
Downloads Wikipedia dump files (.bz2) from wikimedia.org.
Download a Wikipedia dump file to the specified path.
Parameters:
url: URL to download
path: Local path to save file
Returns: tuple
A tuple of (success, error_message), where success is a bool. If success is True, error_message is None.
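The (success, error_message) return convention can be illustrated with a minimal standalone sketch. This is not the library's implementation: the function name `download_to_path`, the use of `urllib.request`, and the temporary-file step are all assumptions made for illustration. Only the parameters (`url`, `path`) and the return shape come from the description above.

```python
from __future__ import annotations

import os
import shutil
import urllib.request


def download_to_path(url: str, path: str) -> tuple[bool, str | None]:
    """Hypothetical stand-in: fetch `url` and save it to `path`.

    Returns (True, None) on success, or (False, error_message) on failure.
    """
    # Assumption: write to a temporary file first so a failed transfer
    # never leaves a partial file at the final path.
    tmp_path = path + ".tmp"
    try:
        with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as out:
            shutil.copyfileobj(response, out)
        os.replace(tmp_path, path)
    except (OSError, ValueError) as exc:
        # urllib.error.URLError is a subclass of OSError; ValueError covers
        # malformed URLs. Report the failure instead of raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return False, str(exc)
    return True, None
```

A caller would typically unpack the result, for example `ok, err = download_to_path(url, path)`, and log `err` only when `ok` is False. Returning the error as a value rather than raising lets a batch of downloads continue after a single failure.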
Generate output filename from URL.
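The page does not state which naming scheme is used. As a hedged sketch of one plausible approach, the function below flattens the URL path into a single filename by replacing slashes with hyphens. The function name `output_filename_from_url` and the flattening rule are assumptions, and the URL in the usage line is illustrative only.

```python
from urllib.parse import urlparse


def output_filename_from_url(url: str) -> str:
    """Hypothetical helper: derive a flat local filename from a URL path."""
    # Drop the scheme and host, keep only the path without leading slashes.
    url_path = urlparse(url).path.lstrip("/")
    # Assumption: flatten directory separators so the name is a single file.
    return url_path.replace("/", "-")


# Illustrative URL, not taken from the source page.
print(output_filename_from_url("https://example.org/enwiki/20230501/dump.xml.bz2"))
```

Keeping the directory components in the name means dumps for different languages or dates map to distinct files in the same download directory.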