(Latest Version)
class DirectoryWatcher(input_glob, watch_directory, max_files, sort_glob, recursive, queue_max_size, batch_timeout)[source]

Bases: object

This class is in responsible of polling for new files in the supplied input glob of directories and forwarding them on to the pipeline for processing.


Pipeline configuration instance.


Input glob pattern to match files to read. For example, /input_dir/*.json would read all files with the ‘json’ extension in the directory input_dir.


The watch directory option instructs this stage to not close down once all files have been read. Instead it will read all files that match the ‘input_glob’ pattern, and then continue to watch the directory for additional files. Any new files that are added that match the glob will then be processed.

max_files: int

Max number of files to read. Useful for debugging to limit startup time. Default value of -1 is unlimited.


If true the list of files matching input_glob will be processed in sorted order.

recursive: bool

If true, events will be emitted for the files in subdirectories matching input_glob.

queue_max_size: int

Maximum queue size to hold the file paths to be processed that match input_glob.

batch_timeout: float

Timeout to retrieve batch messages from the queue.


build_node(name, builder)

Build and return the MRC source node

build_node(name, builder)[source]

Build and return the MRC source node

© Copyright 2023, NVIDIA. Last updated on Apr 11, 2023.