morpheus.utils.directory_watcher.DirectoryWatcher#

class DirectoryWatcher(
input_glob,
watch_directory,
max_files,
sort_glob,
recursive,
queue_max_size,
batch_timeout,
should_stop_fn=None,
)[source]#

Bases: object

This class is in responsible of polling for new files in the supplied input glob of directories and forwarding them on to the pipeline for processing.

Parameters:
cmorpheus.config.Config

Pipeline configuration instance.

input_globstr

Input glob pattern to match files to read. For example, /input_dir/*.json would read all files with the ‘json’ extension in the directory input_dir.

watch_directorybool

The watch directory option instructs this stage to not close down once all files have been read. Instead it will read all files that match the ‘input_glob’ pattern, and then continue to watch the directory for additional files. Any new files that are added that match the glob will then be processed.

max_files: int

Max number of files to read. Useful for debugging to limit startup time. Default value of -1 is unlimited.

sort_globbool

If true the list of files matching input_glob will be processed in sorted order.

recursive: bool

If true, events will be emitted for the files in subdirectories matching input_glob.

queue_max_size: int

Maximum queue size to hold the file paths to be processed that match input_glob.

batch_timeout: float

Timeout to retrieve batch messages from the queue.

should_stop_fn: Callable[[], bool]

Function that returns a boolean indicating if the watcher should stop processing files.

Methods

build_node(name, builder)

Build and return the MRC source node

build_node(name, builder)[source]#

Build and return the MRC source node