Annotations
The annotation service, which is part of TAO Data Services, provides tools for manipulating ground-truth labels. tao dataset annotations supports the following tasks:
convert
slice
merge
These tasks can be invoked from the TAO Launcher using the following convention on the command-line:
tao dataset annotations <sub_task> <args_per_subtask>
Where args_per_subtask are the command-line arguments required for a given subtask. Each subtask is explained in the following sections.
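The convention above can be sketched as a small Python helper that assembles the command line for any of the three subtasks. This is only an illustration: the helper name build_annotations_command is hypothetical and not part of TAO, and the -e flag and results_dir=<results_dir> override are taken from the usage lines in the subtask sections of this page.

```python
import shlex
from typing import List, Optional

# Subtasks listed on this page.
SUBTASKS = ("convert", "slice", "merge")


def build_annotations_command(
    sub_task: str,
    spec_path: str,
    results_dir: Optional[str] = None,
) -> List[str]:
    """Assemble `tao dataset annotations <sub_task> -e <spec>` as an argv list.

    Hypothetical helper for illustration; it only builds the argument list
    and does not call TAO itself.
    """
    if sub_task not in SUBTASKS:
        raise ValueError(f"unknown subtask {sub_task!r}; expected one of {SUBTASKS}")
    cmd = ["tao", "dataset", "annotations", sub_task, "-e", spec_path]
    if results_dir is not None:
        # Optional key=value override, as shown in the usage lines.
        cmd.append(f"results_dir={results_dir}")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # on machines without the TAO Launcher installed.
    print(shlex.join(build_annotations_command("convert", "/path/to/spec.yaml")))
```

On a machine where the TAO Launcher is installed, the returned list could be passed to subprocess.run(cmd, check=True).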
The following is a sample spec file for converting a COCO dataset to KITTI format.
It has three key components (data, kitti, and coco) as well as a global parameter (results_dir), all of which are described below.
data:
  input_format: "COCO"
  output_format: "KITTI"
  output_dir: "/workspace/output"
kitti:
  image_dir: "/workspace/kitti/images"
  label_dir: "/workspace/kitti/labels"
  mapping: "/workspace/kitti_mapping.json"
coco:
  ann_file: "/workspace/coco.json"
results_dir: "/path/to/results"
Field | Description | Data Type and Constraints | Recommended/Typical Value |
results_dir | The directory to save the annotation-conversion log to | string | – |
data | The dataset config | Dict | – |
kitti | The KITTI config | Dict | – |
coco | The COCO config | Dict | – |
odvg | The ODVG config | Dict | – |
Data Config
The data configuration (data) specifies the source and target formats of the label conversion, as well as the output path.
See the Data Annotation Format page for more information about the data formats including KITTI, COCO, and ODVG.
Note 1: Direct KITTI to ODVG or ODVG to KITTI conversion is not supported, but you can use COCO as an intermediate format to bridge KITTI and ODVG.
Note 2: When input_format and output_format are both COCO, a new COCO annotation file is saved with category IDs remapped to contiguous IDs.
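The remapping described in Note 2 can be illustrated with a short Python sketch. It is not the TAO implementation: the function name remap_category_ids is hypothetical, and both the starting ID (1) and the ordering (ascending original ID) are assumptions, since this page does not state them.

```python
import copy
from typing import Any, Dict


def remap_category_ids(coco: Dict[str, Any], start_id: int = 1) -> Dict[str, Any]:
    """Return a copy of a COCO dict whose category IDs are contiguous.

    Assumptions (not confirmed by the TAO documentation): new IDs start at
    `start_id` and follow the ascending order of the original IDs.
    """
    out = copy.deepcopy(coco)  # leave the caller's data untouched
    ordered = sorted(out["categories"], key=lambda cat: cat["id"])
    old_to_new = {
        cat["id"]: new_id for new_id, cat in enumerate(ordered, start=start_id)
    }
    for cat in out["categories"]:
        cat["id"] = old_to_new[cat["id"]]
    for ann in out["annotations"]:
        ann["category_id"] = old_to_new[ann["category_id"]]
    return out


if __name__ == "__main__":
    sparse = {
        "images": [{"id": 1, "file_name": "0001.jpg"}],
        "categories": [{"id": 1, "name": "person"}, {"id": 90, "name": "toothbrush"}],
        "annotations": [{"id": 1, "image_id": 1, "category_id": 90}],
    }
    print(remap_category_ids(sparse)["categories"])
```

Sparse IDs such as 1 and 90 become 1 and 2, and every annotation's category_id is rewritten to match.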
Field | Description | Data Type and Constraints | Recommended/Typical Value |
input_format | The input data format (“KITTI”, “ODVG”, or “COCO”) | string | |
output_format | The output data format (“KITTI”, “ODVG”, or “COCO”) | string | |
output_dir | The path to save the converted annotations | string | |
KITTI Config
The KITTI configuration (kitti) specifies the KITTI dataset information.
Field | Description | Data Type and Constraints | Recommended/Typical Value |
image_dir | The image directory | string | |
label_dir | The label directory | string | |
project | The project name, which is used as the scene_id when converting to COCO format. The default value is the parent directory name of image_dir. | string | |
mapping | A YAML file specifying the category mappings. If this value is not specified, all categories in the label_dir are used. | string | |
no_skip | If True, do not skip images without any valid annotations. | bool | |
preserve_hierarchy | If True, preserve the KITTI folder structure. | bool | |
The following is an example of a category mapping file:
- person:
    - person
    - Person
    - person_group
    - rider
- bag:
    - hand_bag
    - backpack
    - personal_bag
- face:
    - face
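To show how a mapping file like this one could be applied, the following Python sketch inverts it into a lookup from raw label name to target category. It is an illustration under stated assumptions, not TAO's code: the mapping is written as a Python literal that mirrors the YAML above (in practice it could be loaded with a YAML parser), the names build_lookup and map_label are hypothetical, matching is assumed to be case-sensitive, and labels absent from the mapping are assumed to be dropped.

```python
from typing import Dict, List, Optional

# Python literal mirroring the example mapping file above.
CATEGORY_MAPPING: List[Dict[str, List[str]]] = [
    {"person": ["person", "Person", "person_group", "rider"]},
    {"bag": ["hand_bag", "backpack", "personal_bag"]},
    {"face": ["face"]},
]


def build_lookup(mapping: List[Dict[str, List[str]]]) -> Dict[str, str]:
    """Invert {target: [sources]} entries into a {source: target} lookup."""
    lookup: Dict[str, str] = {}
    for entry in mapping:
        for target, sources in entry.items():
            for source in sources:
                if source in lookup and lookup[source] != target:
                    raise ValueError(
                        f"{source!r} is mapped to both {lookup[source]!r} and {target!r}"
                    )
                lookup[source] = target
    return lookup


def map_label(raw_label: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return the target category, or None for unmapped labels.

    Assumption: unmapped labels are dropped; TAO's behavior may differ.
    """
    return lookup.get(raw_label)


if __name__ == "__main__":
    lookup = build_lookup(CATEGORY_MAPPING)
    for raw in ("rider", "backpack", "car"):
        print(raw, "->", map_label(raw, lookup))
```

With this mapping, rider and Person both collapse into person, while a label such as car has no target category.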
COCO Config
The COCO configuration (coco) specifies the COCO annotation file location.
Field | Description | Data Type and Constraints | Recommended/Typical Value |
ann_file | The annotation file | string | |
refine_box | Whether to refine boxes with segmentation when converting to KITTI | bool | |
use_all_categories | Whether to use all categories | bool | |
add_background | Whether to add background categories | bool | |
ODVG Config
The ODVG configuration (odvg) specifies the ODVG annotation file location.
Field | Description | Data Type and Constraints | Recommended/Typical Value |
ann_file | The annotation file | string | |
labelmap_file | The label map file | string | |
The annotation conversion service can be invoked from the TAO Launcher using the following convention on the command-line:
tao dataset annotations convert [-h] -e <experiment spec>
                                [results_dir=<results_dir>]
Required Arguments
-e, --experiment_spec_file: The experiment specification file
Optional Arguments
-h, --help: Show this help message and exit.
Example
The following is an example of using the convert command in Data Services:
tao dataset annotations convert -e /path/to/spec.yaml
The following is a sample spec file for slicing a COCO annotation file.
It has two key components (data and filter) as well as a global parameter (results_dir), all of which are described below.
data:
  annotation_file: /datasets/coco/annotations/instances_val2017.json
filter:
  mode: "category" # random, number
  num_samples: 10
  split: 5
  excluded_categories:
    - person
results_dir: /output/dir/
Field | Description | Data Type and Constraints | Recommended/Typical Value |
results_dir | The directory to save the output annotation files and logs to | string | – |
data | The dataset configuration | Dict | – |
filter | The filter configuration | Dict | – |
Data Config
The dataset configuration (data) specifies the input format and annotation file.
Field | Description | Data Type and Constraints | Recommended/Typical Value |
format | The configuration format. Currently, only “COCO” is supported. | string | “COCO” |
annotation_file | The input COCO annotation file | string | |
Filter Config
The filter configuration (filter) specifies how to slice the annotation data, which can be done in one of four modes:
random: Randomly split the annotation file into N partitions, or sample the annotation file by a given percentage
category: Filter annotation labels by the desired categories
number: Pick N samples in order from the annotations
filename: Filter the annotations by their file names
Field | Description | Data Type and Constraints | Recommended/Typical Value |
mode | The filter mode (“random”, “category”, “number”, or “filename”) | string | |
dump_remainder | A flag specifying whether to dump the remaining annotations. This parameter only applies when the mode parameter is set to “random” and the split parameter is a float value. | bool | |
split | The integer number of splits or the float sampling percentage (in “random” mode) | float or integer | |
num_samples | The number of annotations to keep (in “number” mode) | integer | |
included_categories | Categories to keep (in “category” mode) | list | |
excluded_categories | Categories to exclude (in “category” mode) | list | |
re_patterns | List of file name patterns to match (in “filename” mode) | list | |
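The four filter modes can be sketched in plain Python on an in-memory COCO dictionary. This is a minimal illustration rather than the TAO implementation, and it rests on several assumptions that this page does not confirm: a "sample" is treated as one image together with its annotations, a float split is treated as a fraction between 0 and 1, re_patterns are treated as regular expressions searched against file_name, and category mode leaves the images and categories lists unchanged. All function names are hypothetical.

```python
import random
import re
from typing import Any, Dict, List, Optional, Sequence, Union

Coco = Dict[str, Any]


def _subset(coco: Coco, images: List[dict]) -> Coco:
    """Keep only the given images and the annotations that reference them."""
    image_ids = {img["id"] for img in images}
    kept = [ann for ann in coco["annotations"] if ann["image_id"] in image_ids]
    return {**coco, "images": images, "annotations": kept}


def slice_number(coco: Coco, num_samples: int) -> Coco:
    """'number' mode: take the first N images in file order (assumption)."""
    return _subset(coco, coco["images"][:num_samples])


def slice_category(
    coco: Coco,
    included: Optional[Sequence[str]] = None,
    excluded: Optional[Sequence[str]] = None,
) -> Coco:
    """'category' mode: keep or drop annotations by category name."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}

    def keep(ann: dict) -> bool:
        name = names[ann["category_id"]]
        if included and name not in included:
            return False
        return not (excluded and name in excluded)

    return {**coco, "annotations": [a for a in coco["annotations"] if keep(a)]}


def slice_filename(coco: Coco, re_patterns: Sequence[str]) -> Coco:
    """'filename' mode: keep images whose file_name matches any pattern."""
    compiled = [re.compile(pattern) for pattern in re_patterns]
    images = [
        img for img in coco["images"]
        if any(rx.search(img["file_name"]) for rx in compiled)
    ]
    return _subset(coco, images)


def slice_random(
    coco: Coco,
    split: Union[int, float],
    dump_remainder: bool = False,
    seed: int = 0,
) -> List[Coco]:
    """'random' mode: integer split -> N partitions; float split -> one sample."""
    images = list(coco["images"])
    random.Random(seed).shuffle(images)
    if isinstance(split, int):
        return [_subset(coco, images[i::split]) for i in range(split)]
    cut = round(len(images) * split)  # float split assumed to be a fraction
    parts = [_subset(coco, images[:cut])]
    if dump_remainder:
        parts.append(_subset(coco, images[cut:]))
    return parts
```

For example, slice_category(coco, excluded=["person"]) mirrors the excluded_categories setting in the sample spec above, and slice_random(coco, 5) corresponds to split: 5 in random mode.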
The annotation slicing service can be invoked from the TAO Launcher using the following convention on the command-line:
tao dataset annotations slice [-h] -e <experiment spec>
                              [results_dir=<results_dir>]
Required Arguments
-e, --experiment_spec_file: The experiment specification file
Optional Arguments
-h, --help: Show this help message and exit.
Example
The following is an example of using the slice command in Data Services:
tao dataset annotations slice -e /path/to/spec.yaml
The following is a sample spec file for merging COCO annotation files.
It has one key component (data) as well as a global parameter (results_dir), both of which are described below.
data:
  format: "COCO"
  annotations:
    - /datasets/part_0.json
    - /datasets/part_1.json
    - /datasets/part_2.json
    - /datasets/part_3.json
    - /datasets/part_4.json
results_dir: /output/dir/
Field | Description | Data Type and Constraints | Recommended/Typical Value |
results_dir | The directory to save the output annotation file and logs | string | – |
data | The dataset config | Dict | – |
Data Config
The data configuration (data) specifies the input format and the annotation files to merge.
Field | Description | Data Type and Constraints | Recommended/Typical Value |
format | The configuration format. Currently, only “COCO” is supported. | string | “COCO” |
annotations | A list of COCO annotation files | list of strings | |
All COCO annotation files must share the same categories.
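The shared-categories requirement can be made concrete with a short Python sketch of a merge. It is an illustration, not the TAO implementation: the function name merge_coco is hypothetical, and renumbering image and annotation IDs to avoid collisions between files is an assumption, since this page does not describe how the tool handles overlapping IDs.

```python
from typing import Any, Dict, List, Tuple

Coco = Dict[str, Any]


def _category_key(coco: Coco) -> List[Tuple[int, str]]:
    """Order-independent fingerprint of a file's categories."""
    return sorted((cat["id"], cat["name"]) for cat in coco["categories"])


def merge_coco(parts: List[Coco]) -> Coco:
    """Concatenate COCO dicts that share the same categories.

    Assumption: image and annotation IDs are renumbered from 1 so that
    IDs reused across input files cannot collide.
    """
    if not parts:
        raise ValueError("at least one annotation file is required")
    reference = _category_key(parts[0])
    merged: Coco = {
        "images": [],
        "annotations": [],
        "categories": list(parts[0]["categories"]),
    }
    next_image_id = 1
    next_ann_id = 1
    for index, part in enumerate(parts):
        if _category_key(part) != reference:
            raise ValueError(f"file {index} does not share the same categories")
        image_id_map: Dict[int, int] = {}
        for img in part["images"]:
            image_id_map[img["id"]] = next_image_id
            merged["images"].append({**img, "id": next_image_id})
            next_image_id += 1
        for ann in part["annotations"]:
            merged["annotations"].append(
                {**ann, "id": next_ann_id, "image_id": image_id_map[ann["image_id"]]}
            )
            next_ann_id += 1
    return merged
```

Checking the categories before concatenating means a mismatched file is rejected up front instead of producing annotations whose category_id values point at the wrong class.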
The annotation merging service can be invoked from the TAO Launcher using the following convention on the command-line:
tao dataset annotations merge [-h] -e <experiment spec>
                              [results_dir=<results_dir>]
Required Arguments
-e, --experiment_spec_file: The experiment specification file
Optional Arguments
-h, --help: Show this help message and exit.
Example
The following is an example of using the merge command in Data Services:
tao dataset annotations merge -e /path/to/spec.yaml