- nvidia.dali.fn.readers.coco(*inputs, **kwargs)
Reads data from a COCO dataset that is composed of a directory with images and annotation JSON files.
This reader produces the following outputs:
images, bounding_boxes, labels, ((polygons, vertices) | (pixelwise_masks)), (image_ids)
images Each sample contains image data with layout
HWC (height, width, channels).
bounding_boxes Each sample can have an arbitrary
number M of bounding boxes, each described by 4 coordinates:
[[x_0, y_0, w_0, h_0], [x_1, y_1, w_1, h_1] ... [x_M, y_M, w_M, h_M]]
[l, t, r, b] format if requested (see
labels Each bounding box is associated with an integer label representing a category identifier:
[label_0, label_1, ..., label_M]
polygons and vertices (Optional, present if polygon_masks is set to True) If polygon_masks is enabled, two extra outputs describe the masks as sets of polygons. Each mask contains an arbitrary number of polygons P, each associated with a mask index in the range [0, M) and composed of a group of V vertices. The output polygons describes the polygons as follows:
[[mask_idx_0, start_vertex_idx_0, end_vertex_idx_0], [mask_idx_1, start_vertex_idx_1, end_vertex_idx_1], ... [mask_idx_P, start_vertex_idx_P, end_vertex_idx_P]]
where mask_idx is the index of the mask that the polygon belongs to, in the range [0, M), and start_vertex_idx and end_vertex_idx define the range of indices of vertices, as they appear in the output vertices, belonging to this polygon. Each sample in vertices contains the list of vertices that compose the different polygons in the sample, as 2D coordinates:
[[x_0, y_0], [x_1, y_1], ... [x_V, y_V]]
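The relationship between the two outputs can be illustrated with a small sketch in plain Python. It is not DALI code: the helper name polygons_per_mask and the sample values are hypothetical, and it assumes that end_vertex_idx is exclusive, which matches the [mask_id, 0, 3] example given for a three-vertex polygon under the masks argument.

```python
def polygons_per_mask(polygons, vertices):
    """Group the vertex lists of all polygons by mask index.

    polygons: rows of [mask_idx, start_vertex_idx, end_vertex_idx]
    vertices: rows of [x, y]
    Assumption: end_vertex_idx is exclusive (half-open range).
    """
    grouped = {}
    for mask_idx, start, end in polygons:
        grouped.setdefault(mask_idx, []).append(vertices[start:end])
    return grouped


# Hypothetical sample: mask 0 is a triangle, mask 1 is a quadrilateral.
polygons = [[0, 0, 3], [1, 3, 7]]
vertices = [
    [10.0, 10.0], [20.0, 10.0], [15.0, 20.0],
    [30.0, 30.0], [40.0, 30.0], [40.0, 40.0], [30.0, 40.0],
]

masks = polygons_per_mask(polygons, vertices)
for mask_idx, polys in masks.items():
    print(mask_idx, [len(p) for p in polys])
```

A mask made of several disjoint regions would simply contribute several rows with the same mask_idx.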
pixelwise_masks (Optional, present if argument
pixelwise_masks is set to True) Contains image-like data, same shape and layout as
images, representing a pixelwise segmentation mask.
image_ids (Optional, present if argument
image_ids is set to True) One element per sample, representing an image identifier.
- Supported backends
- Keyword Arguments:
annotations_file (str, optional, default = ‘’) – List of paths to the JSON annotations files.
avoid_class_remapping (bool, optional, default = False) –
If set to True, class ID values are returned directly as they are defined in the manifest file.
Otherwise, class ID values are mapped to consecutive values in the range from 1 to the number of classes, disregarding the exact values from the manifest (0 is reserved for a special background class).
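The default remapping can be pictured with the following sketch. It is only an illustration: build_class_remap is a hypothetical helper, the category IDs are made up, and ordering the IDs by sorting them is an assumption, since the order the reader uses is not stated here.

```python
def build_class_remap(category_ids):
    """Map arbitrary category IDs to consecutive labels starting at 1.

    Label 0 is left unused, mirroring the reserved background class.
    Assumption: IDs are ordered by sorting; the reader's actual ordering
    is not specified in this section.
    """
    unique_ids = sorted(set(category_ids))
    return {cat_id: new_id for new_id, cat_id in enumerate(unique_ids, start=1)}


# Hypothetical, non-contiguous category IDs from an annotations file.
remap = build_class_remap([1, 2, 3, 5, 7, 90])
labels = [remap[c] for c in [90, 5, 1]]
print(remap)
print(labels)
```

With avoid_class_remapping set to True, the original IDs (90, 5, 1 in this made-up case) would be returned unchanged.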
bytes_per_sample_hint (int or list of int, optional, default = ) –
Output size hint, in bytes per sample.
If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.
dont_use_mmap (bool, optional, default = False) –
If set to True, the Loader will use plain file I/O instead of trying to map the file in memory.
Mapping provides a small performance benefit when accessing a local file system, but most network file systems do not provide optimum performance.
file_root (str, optional) –
Path to a directory that contains the data files.
If a file list is not provided, this argument is required.
image_ids (bool, optional, default = False) – If set to True, the image IDs will be produced in an extra output.
images (str or list of str, optional) –
A list of image paths.
If provided, it specifies the images that will be read. The images will be read in the same order as they appear in the list, and in case of duplicates, multiple copies of the relevant samples will be produced.
If left unspecified or set to None, all images listed in the annotation file are read exactly once, ordered by their image id.
The paths to be kept should match exactly those in the annotations file.
Note: This argument is mutually exclusive with
include_iscrowd (bool, optional, default = True) – If set to True, annotations marked as
iscrowd=1 are included as well.
initial_fill (int, optional, default = 1024) –
Size of the buffer that is used for shuffling.
If random_shuffle is False, this parameter is ignored.
lazy_init (bool, optional, default = False) – Parse and prepare the dataset metadata only during the first run instead of in the constructor.
ltrb (bool, optional, default = False) –
If set to True, bboxes are returned as [left, top, right, bottom].
If set to False, the bboxes are returned as [x, y, width, height].
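The two box layouts carry the same information, as this small sketch shows. The helper names are hypothetical, and the code assumes that x and y denote the top-left corner of the box, as in the COCO annotation format.

```python
def xywh_to_ltrb(box):
    """Convert [x, y, width, height] to [left, top, right, bottom]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def ltrb_to_xywh(box):
    """Convert [left, top, right, bottom] back to [x, y, width, height]."""
    left, top, right, bottom = box
    return [left, top, right - left, bottom - top]


box_xywh = [10, 20, 30, 40]
box_ltrb = xywh_to_ltrb(box_xywh)
print(box_ltrb)
print(ltrb_to_xywh(box_ltrb))
```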
masks (bool, optional, default = False) –
Enable polygon masks.
Use polygon_masks instead. Note that the polygon format has changed from mask_id, start_coord, end_coord to mask_id, start_vertex, end_vertex, where start_coord and end_coord are total numbers of coordinates, effectively start_coord = 2 * start_vertex and end_coord = 2 * end_vertex. Example: A polygon with vertices [[x0, y0], [x1, y1], [x2, y2]] would be represented as [mask_id, 0, 6] when using the deprecated argument and as [mask_id, 0, 3] when using the new argument.
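Code that still holds polygon rows in the deprecated coordinate-based format could convert them with a sketch like the one below. The function name is hypothetical; the arithmetic follows directly from start_coord = 2 * start_vertex and end_coord = 2 * end_vertex.

```python
def coords_to_vertex_ranges(polygons_deprecated):
    """Convert [mask_id, start_coord, end_coord] rows to
    [mask_id, start_vertex, end_vertex] rows.

    Each vertex contributes two coordinates (x and y), so the coordinate
    offsets are halved.
    """
    converted = []
    for mask_id, start_coord, end_coord in polygons_deprecated:
        if start_coord % 2 or end_coord % 2:
            raise ValueError("coordinate offsets must be even")
        converted.append([mask_id, start_coord // 2, end_coord // 2])
    return converted


# The triangle from the example above, followed by a second 4-vertex polygon.
old_rows = [[0, 0, 6], [1, 6, 14]]
new_rows = coords_to_vertex_ranges(old_rows)
print(new_rows)
```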
num_shards (int, optional, default = 1) –
Partitions the data into the specified number of parts (shards).
This is typically used for multi-GPU or multi-node training.
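As a rough illustration of sharding, the sketch below splits a dataset into contiguous, nearly equal index ranges. The exact boundaries the reader uses are not stated in this section, so the formula and the helper name shard_range are assumptions made for the example.

```python
def shard_range(dataset_size, shard_id, num_shards):
    """Return the half-open index range [start, end) of one shard.

    Assumption: shards are contiguous and sizes differ by at most one
    sample; this is a sketch, not the reader's documented behavior.
    """
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    start = dataset_size * shard_id // num_shards
    end = dataset_size * (shard_id + 1) // num_shards
    return start, end


# Hypothetical dataset of 10 samples split across 3 workers.
ranges = [shard_range(10, shard_id, 3) for shard_id in range(3)]
print(ranges)
```

In a multi-GPU job, each process would typically pass its own rank as shard_id and the total number of processes as num_shards.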
pad_last_batch (bool, optional, default = False) –
If set to True, pads the shard by repeating the last sample.
If the number of batches differs across shards, this option can cause an entire batch of repeated samples to be added to the dataset.
pixelwise_masks (bool, optional, default = False) – If true, segmentation masks are read and returned as pixel-wise masks. This argument is mutually exclusive with
polygon_masks (bool, optional, default = False) – If set to True, segmentation mask polygons are read in the form of two outputs:
polygons and vertices. This argument is mutually exclusive with
prefetch_queue_depth (int, optional, default = 1) –
Specifies the number of batches to be prefetched by the internal Loader.
This value should be increased when the pipeline is CPU-stage bound, trading memory consumption for better interleaving with the Loader thread.
preprocessed_annotations (str, optional, default = ‘’) – Path to the directory with meta files that contain preprocessed COCO annotations.
preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.
random_shuffle (bool, optional, default = False) –
Determines whether to randomly shuffle data.
A prefetch buffer with a size equal to
initial_fill is used to read data sequentially, and then samples are selected randomly to form a batch.
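The buffered shuffling described above can be sketched in plain Python as follows. This is a simplified model rather than the Loader's implementation: the function name, the seed handling, and the policy of refilling a picked slot with the next sequential sample are assumptions.

```python
import itertools
import random

_EXHAUSTED = object()  # marker for "no more samples to read"


def buffered_shuffle(samples, initial_fill, seed=0):
    """Yield samples in a locally shuffled order using a bounded buffer.

    The buffer is filled sequentially with up to initial_fill samples.
    Each step picks a random buffer slot, yields its sample, and refills
    the slot with the next sequential sample while any remain.
    """
    if initial_fill < 1:
        raise ValueError("initial_fill must be at least 1")
    rng = random.Random(seed)
    source = iter(samples)
    buffer = list(itertools.islice(source, initial_fill))
    while buffer:
        idx = rng.randrange(len(buffer))
        picked = buffer[idx]
        replacement = next(source, _EXHAUSTED)
        if replacement is _EXHAUSTED:
            # Nothing left to read: shrink the buffer instead.
            buffer[idx] = buffer[-1]
            buffer.pop()
        else:
            buffer[idx] = replacement
        yield picked


shuffled = list(buffered_shuffle(range(10), initial_fill=4))
print(shuffled)
```

A larger buffer mixes samples from more distant positions in the dataset, at the cost of more memory; a buffer of size 1 degenerates to sequential reading.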
ratio (bool, optional, default = False) – If set to True, the returned bbox and mask polygon coordinates are relative to the image dimensions.
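A minimal sketch of what relative coordinates mean for a box in [x, y, width, height] layout is shown below. The helper name is hypothetical, and dividing horizontal values by the image width and vertical values by the image height is an assumption about how "relative to the image dimensions" is applied.

```python
def to_relative_xywh(box, image_width, image_height):
    """Scale an absolute [x, y, width, height] box into the [0, 1] range.

    Assumption: x and width are divided by the image width, y and height
    by the image height.
    """
    x, y, w, h = box
    return [x / image_width, y / image_height, w / image_width, h / image_height]


# Hypothetical 200x100 image with a box covering part of it.
relative_box = to_relative_xywh([50, 25, 100, 50], image_width=200, image_height=100)
print(relative_box)
```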
read_ahead (bool, optional, default = False) –
Determines whether the accessed data should be read ahead.
For large files such as LMDB, RecordIO, or TFRecord, this argument slows down the first access but decreases the time of all of the following accesses.
save_preprocessed_annotations (bool, optional, default = False) – If set to True, the operator saves a set of files containing binary representations of the preprocessed COCO annotations.
save_preprocessed_annotations_dir (str, optional, default = ‘’) – Path to the directory in which to save the preprocessed COCO annotations files.
seed (int, optional, default = -1) –
If not provided, it will be populated based on the global seed of the pipeline.
shard_id (int, optional, default = 0) – Index of the shard to read.
shuffle_after_epoch (bool, optional, default = False) – If set to True, the reader shuffles the entire dataset after each epoch.
size_threshold (float, optional, default = 0.1) – If the width or the height, in number of pixels, of a bounding box that represents an instance of an object is lower than this value, the object will be ignored.
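The filtering rule can be expressed as a short sketch. The helper name and sample boxes are hypothetical, and the boxes are assumed to be in [x, y, width, height] layout with sizes in pixels.

```python
def filter_small_boxes(boxes_xywh, labels, size_threshold=0.1):
    """Drop objects whose box width or height is below size_threshold."""
    kept_boxes, kept_labels = [], []
    for box, label in zip(boxes_xywh, labels):
        _, _, width, height = box
        if width < size_threshold or height < size_threshold:
            continue  # object is ignored
        kept_boxes.append(box)
        kept_labels.append(label)
    return kept_boxes, kept_labels


boxes = [[0, 0, 10, 10], [5, 5, 0.05, 20], [1, 1, 3, 0.0]]
labels_in = [1, 2, 3]
kept_boxes, kept_labels = filter_small_boxes(boxes, labels_in)
print(kept_boxes, kept_labels)
```

A sample whose objects are all filtered out ends up with no object instances, which is the case that skip_empty addresses.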
skip_cached_images (bool, optional, default = False) –
If set to True, loading of the data is skipped when the sample is in the decoder cache.
In this case, the output of the loader will be empty.
skip_empty (bool, optional, default = False) – If set to True, the reader skips samples that contain no object instances.
stick_to_shard (bool, optional, default = False) –
Determines whether the reader should stick to a data shard instead of going through the entire dataset.
If decoder caching is used, it significantly reduces the amount of data to be cached, but might affect accuracy of the training.
tensor_init_bytes (int, optional, default = 1048576) – Hint for how much memory to allocate per image.
dump_meta_files (bool) –
dump_meta_files is a deprecated alias for
dump_meta_files_path (str) –
dump_meta_files_path is a deprecated alias for
meta_files_path (str) –
meta_files_path is a deprecated alias for
save_img_ids (bool) –
save_img_ids is a deprecated alias for