Reads [Frame] sequences from a directory representing a collection of streams.
This operator expects
file_rootto contain a set of directories, where each directory represents an extracted video stream. This stream is represented by one file for each frame, sorted lexicographically. Sequences do not cross the stream boundary and only complete sequences are considered, so there is no padding.
Example directory structure:
- file_root - 0 - 00001.png - 00002.png - 00003.png - 00004.png - 00005.png - 00006.png .... - 1 - 00001.png - 00002.png - 00003.png - 00004.png - 00005.png - 00006.png ....
This operator is an analogue of VideoReader working on video frames extracted as separate images. It’s main purpose is for test baseline. For regular usage, the VideoReader is the recommended approach.
This operator allows sequence inputs.
- Supported backends
- Keyword Arguments
file_root (str) – Path to a directory containing streams, where the directories represent streams.
sequence_length (int) – Length of sequence to load for each sample.
bytes_per_sample_hint (int or list of int, optional, default = ) –
Output size hint, in bytes per sample.
If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.
dont_use_mmap (bool, optional, default = False) –
If set to True, the Loader will use plain file I/O instead of trying to map the file in memory.
Mapping provides a small performance benefit when accessing a local file system, but most network file systems, do not provide optimum performance.
nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – The color space of input and output image.
initial_fill (int, optional, default = 1024) –
Size of the buffer that is used for shuffling.
random_shuffleis False, this parameter is ignored.
lazy_init (bool, optional, default = False) – Parse and prepare the dataset metadata only during the first run instead of in the constructor.
num_shards (int, optional, default = 1) –
Partitions the data into the specified number of parts (shards).
This is typically used for multi-GPU or multi-node training.
pad_last_batch (bool, optional, default = False) –
If set to True, pads the shard by repeating the last sample.
If the number of batches differs across shards, this option can cause an entire batch of repeated samples to be added to the dataset.
prefetch_queue_depth (int, optional, default = 1) –
Specifies the number of batches to be prefetched by the internal Loader.
This value should be increased when the pipeline is CPU-stage bound, trading memory consumption for better interleaving with the Loader thread.
preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.
random_shuffle (bool, optional, default = False) –
Determines whether to randomly shuffle data.
A prefetch buffer with a size equal to
initial_fillis used to read data sequentially, and then samples are selected randomly to form a batch.
read_ahead (bool, optional, default = False) –
Determines whether the accessed data should be read ahead.
For large files such as LMDB, RecordIO, or TFRecord, this argument slows down the first access but decreases the time of all of the following accesses.
seed (int, optional, default = -1) –
If not provided, it will be populated based on the global seed of the pipeline.
shard_id (int, optional, default = 0) – Index of the shard to read.
skip_cached_images (bool, optional, default = False) –
If set to True, the loading data will be skipped when the sample is in the decoder cache.
In this case, the output of the loader will be empty.
step (int, optional, default = 1) – Distance between first frames of consecutive sequences.
stick_to_shard (bool, optional, default = False) –
Determines whether the reader should stick to a data shard instead of going through the entire dataset.
If decoder caching is used, it significantly reduces the amount of data to be cached, but might affect accuracy of the training.
stride (int, optional, default = 1) – Distance between consecutive frames in a sequence.
tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.