morpheus.controllers.rss_controller.RSSController
- class RSSController(feed_input, batch_size=128, run_indefinitely=None, enable_cache=False, cache_dir='./.cache/http', cooldown_interval=600, request_timeout=2.0, strip_markup=False)
Bases: object
RSSController handles fetching and processing of RSS feed entries.
- Parameters
- feed_input
The URL or file path of the RSS feed.
- batch_size
Number of feed items to accumulate before creating a DataFrame.
- run_indefinitely
Whether to run the processing indefinitely. If set to True, the controller will continue fetching and processing. If set to False, the controller will stop processing after the feed is fully fetched and processed. If no value is provided and feed_input is a URL, the controller will run indefinitely. Default is None.
- enable_cache
Enable caching of RSS feed request data.
- cache_dir
Cache directory for storing RSS feed request data.
- cooldown_interval
Cooldown interval in seconds if there is a failure in fetching or parsing the feed.
- request_timeout
Request timeout in seconds to fetch the feed.
- strip_markup
When true, strip HTML and XML markup from the content, summary and title fields.
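The defaulting rule for run_indefinitely can be sketched as follows. This is a minimal illustration under stated assumptions, not the Morpheus implementation: `looks_like_url` and `resolve_run_indefinitely` are hypothetical helpers, the URL check via `urllib.parse` is an assumption, and feed_input is treated as a single string.

```python
from typing import Optional
from urllib.parse import urlparse


def looks_like_url(feed_input: str) -> bool:
    """Hypothetical check: an http(s) scheme plus a host counts as a URL."""
    parsed = urlparse(feed_input)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def resolve_run_indefinitely(feed_input: str,
                             run_indefinitely: Optional[bool] = None) -> bool:
    """Honor an explicit setting; otherwise run indefinitely only for URL inputs."""
    if run_indefinitely is not None:
        return run_indefinitely
    return looks_like_url(feed_input)
```

Under this sketch, a file path with no explicit setting yields a single pass over the feed, while a URL keeps being polled unless run_indefinitely=False is passed.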
- Attributes
run_indefinitely
Property that determines whether to run the source indefinitely.
Methods
fetch_dataframes()
Fetch and process RSS feed entries.
get_feed_stats(feed_url)
Get feed url stats.
is_url(feed_input)
Check if the provided url is a valid URL.
parse_feeds()
Parse the RSS feed using the feedparser library.
- fetch_dataframes()
Fetch and process RSS feed entries.
- Raises
- Exception
If there is an error fetching or processing feed entries.
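The role of batch_size in fetch_dataframes can be illustrated with a small batching generator. This is a sketch only: `batch_entries` is a hypothetical helper, entries are modeled as plain dictionaries, and the step that turns each batch into a DataFrame is left out.

```python
from typing import Dict, Iterable, Iterator, List


def batch_entries(entries: Iterable[Dict],
                  batch_size: int = 128) -> Iterator[List[Dict]]:
    """Yield lists of at most batch_size entries; the last batch may be smaller."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[Dict] = []
    for entry in entries:
        batch.append(entry)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Flush whatever remains once the feed is exhausted.
    if batch:
        yield batch
```

Each yielded list could then be handed to a DataFrame constructor, so downstream stages receive bounded chunks rather than one DataFrame per entry.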
- get_feed_stats(feed_url)
Get the stats for a feed URL.
- Parameters
- feed_url
Feed URL that is part of feed_input passed to the constructor.
- Returns
- FeedStats
FeedStats instance for the given feed URL if it exists.
- Raises
- ValueError
If the feed URL is not found in the feed_input provided to the constructor.
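The lookup-or-raise contract of get_feed_stats can be sketched like this. The names are illustrative assumptions: `FeedStatsSketch` is a hypothetical stand-in for FeedStats with made-up fields, and `StatsRegistry` is a hypothetical holder, not part of Morpheus.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class FeedStatsSketch:
    """Hypothetical stand-in for FeedStats; the field names are assumptions."""
    success_count: int = 0
    failure_count: int = 0


class StatsRegistry:
    """Hypothetical holder that keeps one stats record per configured feed URL."""

    def __init__(self, feed_urls: Iterable[str]):
        self._stats: Dict[str, FeedStatsSketch] = {
            url: FeedStatsSketch() for url in feed_urls
        }

    def get_feed_stats(self, feed_url: str) -> FeedStatsSketch:
        # Unknown URLs are rejected instead of silently creating a new record.
        if feed_url not in self._stats:
            raise ValueError(f"Feed URL not found: {feed_url}")
        return self._stats[feed_url]
```

Raising on an unknown URL means a typo in the caller's URL surfaces immediately, rather than returning an empty record that looks like a feed with no activity.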
- classmethod is_url(feed_input)
Check whether the provided string is a valid URL.
- Parameters
- feed_input
The URL string to be checked.
- Returns
- bool
True if the string is a valid URL, False otherwise.
- parse_feeds()
Parse the RSS feed using the feedparser library.
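How the constructor's cooldown_interval might gate repeated parse attempts can be sketched as below. This is an assumption about the general technique, not a description of what parse_feeds does internally: `in_cooldown` is a hypothetical helper, and tracking the last failure as a Unix timestamp is an assumption.

```python
import time
from typing import Optional


def in_cooldown(last_failure: Optional[float],
                cooldown_interval: float = 600,
                now: Optional[float] = None) -> bool:
    """Return True if a feed failed recently enough that it should be skipped."""
    if last_failure is None:
        return False  # the feed has never failed, so there is nothing to wait for
    current = time.time() if now is None else now
    return (current - last_failure) < cooldown_interval
```

The optional `now` argument exists only to make the check deterministic in tests; a caller would normally omit it.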
- property run_indefinitely
Property that determines whether to run the source indefinitely.