TAO v5.5.0

Annotations

The annotation service, part of TAO Data Services, provides tools for manipulating ground-truth labels. The tao dataset annotations command supports the following tasks:

  • convert

  • slice

  • merge

These tasks can be invoked from the TAO Launcher using the following convention on the command line:

tao dataset annotations <sub_task> <args_per_subtask>

Where args_per_subtask are the command-line arguments required for a given subtask. Each subtask is explained in the following sections.
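
When the launcher is driven from a script, the convention above can be assembled programmatically. The following is a minimal sketch, not part of TAO: build_command is a hypothetical helper, and the only arguments it emits are the ones shown in the usage strings on this page (-e and the optional results_dir=<results_dir> override).

```python
import shlex

# The three subtasks listed on this page.
SUBTASKS = ("convert", "slice", "merge")


def build_command(sub_task, spec_path, results_dir=None):
    """Return the argv list for `tao dataset annotations <sub_task>`.

    Hypothetical helper for illustration only; it does not run anything.
    """
    if sub_task not in SUBTASKS:
        raise ValueError(f"unknown subtask: {sub_task!r}")
    argv = ["tao", "dataset", "annotations", sub_task, "-e", spec_path]
    if results_dir is not None:
        # Optional key=value override, as shown in the usage strings.
        argv.append(f"results_dir={results_dir}")
    return argv


cmd = build_command("convert", "/path/to/spec.yaml", results_dir="/path/to/results")
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run on a machine where the TAO Launcher is installed.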

The annotation service supports the KITTI, ODVG, and COCO formats.

The following is a sample spec file for converting a COCO dataset to KITTI format. It has three key components (data, kitti, and coco) as well as a global parameter (results_dir), all of which are described below.

data:
  input_format: "COCO"
  output_format: "KITTI"
  output_dir: "/workspace/output"
kitti:
  image_dir: "/workspace/kitti/images"
  label_dir: "/workspace/kitti/labels"
  mapping: "/workspace/kitti_mapping.json"
coco:
  ann_file: "/workspace/coco.json"
results_dir: "/path/to/results"

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| results_dir | The directory to save the annotation-conversion log to | string | |
| data | The dataset config | Dict | |
| kitti | The KITTI config | Dict | |
| coco | The COCO config | Dict | |
| odvg | The ODVG config | Dict | |

Data Config

The data configuration (data) specifies the source and target formats of the label conversion, as well as the output path. See the Data Annotation Format page for more information about the data formats including KITTI, COCO, and ODVG.

  • Note 1: Direct KITTI to ODVG or ODVG to KITTI conversion is not supported, but you can use COCO as an intermediate format to bridge KITTI and ODVG.

  • Note 2: When input_format and output_format are both COCO, a new COCO annotation file is saved with category IDs remapped to contiguous IDs.
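
To illustrate what remapping to contiguous category IDs means in Note 2, the following is a minimal sketch and not the TAO implementation. The function name remap_category_ids, the starting ID of 1, and the choice to number categories in ascending order of their old IDs are all assumptions made for the example.

```python
def remap_category_ids(coco, start=1):
    """Return a copy of a COCO dict with category IDs renumbered contiguously.

    Assumption: new IDs follow the ascending order of the old IDs and
    begin at `start`. The input dict is left unmodified.
    """
    old_ids = sorted(cat["id"] for cat in coco["categories"])
    id_map = {old: new for new, old in enumerate(old_ids, start=start)}
    categories = [{**cat, "id": id_map[cat["id"]]} for cat in coco["categories"]]
    annotations = [
        {**ann, "category_id": id_map[ann["category_id"]]}
        for ann in coco.get("annotations", [])
    ]
    return {**coco, "categories": categories, "annotations": annotations}


# Toy annotation data with gaps in the category IDs (1, 3, 90).
coco = {
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 3, "name": "car"},
        {"id": 90, "name": "toothbrush"},
    ],
    "annotations": [{"id": 10, "image_id": 1, "category_id": 90, "bbox": [0, 0, 5, 5]}],
}
remapped = remap_category_ids(coco)
print([cat["id"] for cat in remapped["categories"]])
```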

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| input_format | The input data format (“KITTI”, “ODVG”, or “COCO”) | string | |
| output_format | The output data format (“KITTI”, “ODVG”, or “COCO”) | string | |
| output_dir | The path to save the converted annotations | string | |
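
The format constraints above can be checked before launching a conversion. The following is a minimal sketch under the rules stated in Note 1; conversion_route is a hypothetical helper, not a TAO function. It returns the sequence of formats to pass through, inserting COCO as the intermediate step when bridging KITTI and ODVG.

```python
SUPPORTED_FORMATS = {"KITTI", "ODVG", "COCO"}


def conversion_route(input_format, output_format):
    """Return the list of formats a conversion passes through.

    Direct KITTI <-> ODVG conversion is not supported, so that pair is
    routed through COCO (two separate convert runs). Same-format pairs
    other than COCO to COCO are not described on this page; this sketch
    returns them unchanged without claiming they are supported.
    """
    for name in (input_format, output_format):
        if name not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {name!r}")
    if {input_format, output_format} == {"KITTI", "ODVG"}:
        return [input_format, "COCO", output_format]
    return [input_format, output_format]


print(conversion_route("KITTI", "ODVG"))
print(conversion_route("COCO", "KITTI"))
```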

KITTI Config

The KITTI configuration (kitti) specifies the KITTI dataset information.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| image_dir | The image directory | string | |
| label_dir | The label directory | string | |
| project | The project name, which is used as the scene_id when converting to COCO format. The default value is the parent directory name of the image_dir. | string | |
| mapping | A YAML file specifying the category mappings. If this value is not specified, all categories in the label_dir are used. | string | |
| no_skip | If True, do not skip images without any valid annotations. | bool | |
| preserve_hierarchy | If True, preserve the KITTI folder structure. | bool | |
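
For orientation on what a KITTI-to-COCO conversion involves at the box level, the following is a minimal sketch, assuming the standard KITTI 2D detection label layout (class name first, then truncation, occlusion, alpha, and the box as left, top, right, bottom). COCO stores boxes as [x, y, width, height]. The helper kitti_line_to_coco_bbox is hypothetical, and the actual TAO converter may handle additional fields and edge cases.

```python
def kitti_line_to_coco_bbox(line):
    """Convert one KITTI label line to (class_name, COCO-style bbox).

    KITTI fields 4..7 hold the box corners (left, top, right, bottom);
    COCO expects [x, y, width, height].
    """
    fields = line.split()
    name = fields[0]
    left, top, right, bottom = (float(value) for value in fields[4:8])
    return name, [left, top, right - left, bottom - top]


# A toy label line; the 3D fields after the box are zero-filled.
line = "person 0.00 0 0.00 100.00 50.00 180.00 250.00 0 0 0 0 0 0 0"
name, bbox = kitti_line_to_coco_bbox(line)
print(name, bbox)
```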

The following is an example of a category mapping file:

- person:
    - person
    - Person
    - person_group
    - rider
- bag:
    - hand_bag
    - backpack
    - personal_bag
- face:
    - face
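
One way to read this mapping file is as a lookup from each source label to its target category. The following is a minimal sketch under that assumption (each top-level key is a target category and its list holds the source label names folded into it); the build_lookup helper is hypothetical, and the mapping is written inline as the Python structure a YAML loader would produce for the example above.

```python
# The example mapping above, as parsed YAML: a list of one-key dicts.
mapping = [
    {"person": ["person", "Person", "person_group", "rider"]},
    {"bag": ["hand_bag", "backpack", "personal_bag"]},
    {"face": ["face"]},
]


def build_lookup(mapping):
    """Invert the mapping into {source_label: target_category}."""
    lookup = {}
    for entry in mapping:
        for target, sources in entry.items():
            for source in sources:
                if source in lookup and lookup[source] != target:
                    raise ValueError(f"{source!r} is mapped to two categories")
                lookup[source] = target
    return lookup


lookup = build_lookup(mapping)
print(lookup["rider"], lookup["backpack"])
```

Labels that do not appear in the lookup (for example, car) have no target category in this mapping.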

COCO Config

The COCO configuration (coco) specifies the COCO annotation file location.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| ann_file | The annotation file | string | |
| refine_box | Whether to refine boxes with segmentation when converting to KITTI | bool | |
| use_all_categories | Whether to use all categories | bool | |
| add_background | Whether to add background categories | bool | |
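
The page does not describe how refine_box works internally. As one plausible interpretation, offered as an assumption only, a box can be tightened to the extent of the object's COCO polygon segmentation. The following sketch shows that computation; bbox_from_polygons is a hypothetical helper, and RLE-encoded masks are not handled.

```python
def bbox_from_polygons(segmentation):
    """Return the tight [x, y, width, height] box around COCO polygons.

    `segmentation` is a list of flat polygons: [x1, y1, x2, y2, ...].
    """
    xs = [x for polygon in segmentation for x in polygon[0::2]]
    ys = [y for polygon in segmentation for y in polygon[1::2]]
    if not xs:
        raise ValueError("empty segmentation")
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


# Two polygons belonging to the same object.
segmentation = [[10, 20, 40, 20, 40, 60, 10, 60], [35, 50, 55, 50, 55, 70]]
refined = bbox_from_polygons(segmentation)
print(refined)
```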

ODVG Config

The ODVG configuration (odvg) specifies the ODVG annotation file location.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| ann_file | The annotation file | string | |
| labelmap_file | The label map file | string | |

The annotation conversion service can be invoked from the TAO Launcher using the following convention on the command line:

tao dataset annotations convert [-h] -e <experiment spec> [results_dir=<results_dir>]

Required Arguments

  • -e, --experiment_spec_file: The experiment specification file

Optional Arguments

  • -h, --help: Show this help message and exit.

Example

The following is an example of using the convert command in Data Services:

tao dataset annotations convert -e /path/to/spec.yaml

The following is a sample spec file for slicing a COCO annotation file. It has two key components (data and filter) as well as a global parameter (results_dir), all of which are described below.

data:
  annotation_file: /datasets/coco/annotations/instances_val2017.json
filter:
  mode: "category"  # random, number
  num_samples: 10
  split: 5
  excluded_categories:
    - person
results_dir: /output/dir/

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| results_dir | The directory to save the output annotation files and logs to | string | |
| data | The dataset configuration | Dict | |
| filter | The filter configuration | Dict | |

Data Config

The dataset configuration (data) specifies the input format and annotation file.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| format | The configuration format. Currently, only “COCO” is supported. | string | “COCO” |
| annotation_file | The input COCO annotation file | string | |

Filter Config

The filter configuration (filter) specifies how to slice the annotation data, which can be done in one of four modes:

  1. random: Randomly split the annotation file into N partitions or sample the annotation file by a certain percentage

  2. category: Filter annotation labels by the desired categories

  3. number: Pick N samples in order from the annotations

  4. filename: Filter the annotations by their file names
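
As an illustration of the category mode, the following is a minimal sketch of filtering a COCO dict by category name. It is not the TAO implementation: filter_by_category is a hypothetical helper, and keeping the images list untouched (including images left without annotations) is an assumption of the sketch.

```python
def filter_by_category(coco, included=None, excluded=None):
    """Keep only the categories and annotations that pass the filters.

    `included` and `excluded` are optional collections of category names.
    Assumption: images are kept as-is, even if they lose all annotations.
    """
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}

    def keep(name):
        if included is not None and name not in included:
            return False
        if excluded is not None and name in excluded:
            return False
        return True

    categories = [cat for cat in coco["categories"] if keep(cat["name"])]
    annotations = [
        ann for ann in coco["annotations"] if keep(names[ann["category_id"]])
    ]
    return {**coco, "categories": categories, "annotations": annotations}


coco = {
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "car"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 2},
        {"id": 3, "image_id": 1, "category_id": 1},
    ],
}
sliced = filter_by_category(coco, excluded=["person"])
print([ann["id"] for ann in sliced["annotations"]])
```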

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| mode | The filter mode (“random”, “category”, “number”, “filename”) | string | |
| dump_remainder | A flag specifying whether to dump the remainder annotations. This parameter only applies when the mode parameter is set to “random” and the split parameter is a float value. | bool | |
| split | The integer number of splits or the float sampling percentage (in “random” mode) | float or integer | |
| num_samples | The number of annotations to keep (in “number” mode) | integer | |
| included_categories | Categories to keep (in “category” mode) | list | |
| excluded_categories | Categories to exclude (in “category” mode) | list | |
| re_patterns | List of file name patterns to match (in “filename” mode) | list | |
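
The two interpretations of split in random mode, and the pattern matching of filename mode, can be sketched as follows. This is an illustration under stated assumptions, not the TAO implementation: random_split and match_filenames are hypothetical helpers, the float value is treated as a fraction between 0 and 1, the seed parameter is added only for reproducibility, and the second list returned for a float split stands in for what dump_remainder would write out.

```python
import random
import re


def random_split(images, split, seed=0):
    """Integer split: N partitions. Float split: [sample, remainder]."""
    shuffled = list(images)
    random.Random(seed).shuffle(shuffled)
    if isinstance(split, int):
        # Deal the shuffled images round-robin into `split` partitions.
        return [shuffled[i::split] for i in range(split)]
    cut = round(len(shuffled) * split)
    return [shuffled[:cut], shuffled[cut:]]


def match_filenames(images, re_patterns):
    """Keep images whose file_name matches any of the regex patterns."""
    compiled = [re.compile(pattern) for pattern in re_patterns]
    return [
        image for image in images
        if any(pattern.search(image["file_name"]) for pattern in compiled)
    ]


images = [{"id": i, "file_name": f"img_{i:03d}.jpg"} for i in range(10)]
parts = random_split(images, 5)
sample, remainder = random_split(images, 0.3)
matched = match_filenames(images, [r"_00[0-2]\.jpg$"])
print(len(parts), len(sample), len(remainder), len(matched))
```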

The annotation slicing service can be invoked from the TAO Launcher using the following convention on the command line:

tao dataset annotations slice [-h] -e <experiment spec> [results_dir=<results_dir>]

Required Arguments

  • -e, --experiment_spec_file: The experiment specification file

Optional Arguments

  • -h, --help: Show this help message and exit.

Example

The following is an example of using the slice command in Data Services:

tao dataset annotations slice -e /path/to/spec.yaml

The following is a sample spec file for merging COCO annotation files. It has one key component (data) as well as a global parameter (results_dir), both of which are described below.

data:
  format: "COCO"
  annotations:
    - /datasets/part_0.json
    - /datasets/part_1.json
    - /datasets/part_2.json
    - /datasets/part_3.json
    - /datasets/part_4.json
results_dir: /output/dir/

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| results_dir | The directory to save the output annotation file and logs | string | |
| data | The dataset config | Dict | |

Data Config

The data configuration (data) specifies the input format and annotation files.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| format | The configuration format. Currently, only “COCO” is supported. | string | “COCO” |
| annotations | A list of COCO annotation files | list of strings | |

Note

All COCO annotation files must share the same categories.
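
The shared-categories requirement can be illustrated with a minimal merge sketch. This is not the TAO implementation: merge_coco is a hypothetical helper, and renumbering image and annotation IDs to avoid collisions between files is an assumption of the sketch.

```python
def merge_coco(datasets):
    """Merge COCO dicts that share the same categories.

    Assumption: image and annotation IDs are renumbered from 1 so that
    IDs from different files cannot collide.
    """
    def category_key(dataset):
        return sorted((cat["id"], cat["name"]) for cat in dataset["categories"])

    first = datasets[0]
    for dataset in datasets[1:]:
        if category_key(dataset) != category_key(first):
            raise ValueError("all annotation files must share the same categories")

    merged = {"images": [], "annotations": [], "categories": list(first["categories"])}
    next_image_id, next_ann_id = 1, 1
    for dataset in datasets:
        image_id_map = {}
        for image in dataset["images"]:
            image_id_map[image["id"]] = next_image_id
            merged["images"].append({**image, "id": next_image_id})
            next_image_id += 1
        for ann in dataset["annotations"]:
            merged["annotations"].append(
                {**ann, "id": next_ann_id, "image_id": image_id_map[ann["image_id"]]}
            )
            next_ann_id += 1
    return merged


categories = [{"id": 1, "name": "person"}]
part_0 = {
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1}],
    "categories": categories,
}
part_1 = {
    "images": [{"id": 1, "file_name": "b.jpg"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1}],
    "categories": categories,
}
merged = merge_coco([part_0, part_1])
print([image["file_name"] for image in merged["images"]])
```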

The annotation merging service can be invoked from the TAO Launcher using the following convention on the command line:

tao dataset annotations merge [-h] -e <experiment spec> [results_dir=<results_dir>]

Required Arguments

  • -e, --experiment_spec_file: The experiment specification file

Optional Arguments

  • -h, --help: Show this help message and exit.

Example

The following is an example of using the merge command in Data Services:

tao dataset annotations merge -e /path/to/spec.yaml

© Copyright 2024, NVIDIA. Last updated on Oct 15, 2024.