Annotations#
The annotation service, which is part of TAO Data Services, offers tools for manipulating ground-truth labels. It supports the following actions:
convert
slice
merge
Each subtask is explained in the following sections. In TAO 7.0 these tasks
are driven by natural-language prompts to your agent; for DAFT-formatted
datasets specifically, the data/tao-convert-dataset-format skill handles conversion
between supported DAFT formats. For non-DAFT formats (COCO, KITTI, etc.) and
for slicing or merging, the agent scripts the work directly.
Supported Data Formats#
The annotation service supports the KITTI, ODVG, COCO, and AICity/OVPKL formats.
Configuring a Specification File for Annotation Conversion#
A specification file for converting a COCO dataset to KITTI format has three key components (data, kitti, and coco) as well as a global parameter (results_dir).
The table below describes all top-level fields, including the odvg and aicity components used when those formats are involved.
Field | Description | Data Type and Constraints | Recommended/Typical Value
results_dir | The directory to save the annotation-conversion log to | string | –
data | The dataset config | Dict | –
kitti | The KITTI config | Dict | –
coco | The COCO config | Dict | –
odvg | The ODVG config | Dict | –
aicity | The AICity config | Dict | –
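For orientation, the nesting of these fields can be sketched as a Python dictionary. The field names come from the tables in this section; every path and value is a hypothetical placeholder, and which kitti fields matter when KITTI is the target format is an assumption of this sketch.

```python
# Sketch of a COCO-to-KITTI conversion spec as a nested dictionary.
# Field names follow the tables in this section; all values are hypothetical.
conversion_spec = {
    "results_dir": "/results/convert",  # where the conversion log is saved
    "data": {
        "input_format": "COCO",
        "output_format": "KITTI",
        "output_dir": "/results/convert/kitti_labels",
    },
    "coco": {
        "ann_file": "/data/coco/annotations/instances_train2017.json",
    },
    "kitti": {
        # Assumption: these flags are shown only to illustrate the nesting.
        "no_skip": False,
        "preserve_hierarchy": False,
    },
}

# Top-level keys listed in the table above.
KNOWN_TOP_LEVEL_KEYS = {"results_dir", "data", "kitti", "coco", "odvg", "aicity"}


def unknown_top_level_keys(spec):
    """Return top-level keys that do not appear in the table above, sorted."""
    return sorted(key for key in spec if key not in KNOWN_TOP_LEVEL_KEYS)
```

A check like `unknown_top_level_keys(conversion_spec)` is a cheap way to catch a misspelled section name before handing the spec to a script.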
Data Config#
The data configuration (data) specifies the source and target formats of the label conversion, as well as the output path.
Refer to the Data Annotation Format page for more information about the data formats including KITTI, COCO, and ODVG.
Note 1: Direct KITTI to ODVG or ODVG to KITTI conversion is not supported, but you can use COCO as an intermediate format to bridge KITTI and ODVG.
Note 2: When input_format and output_format are both COCO, a new COCO annotation file is saved with category IDs remapped to contiguous IDs.
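The remapping in Note 2 can be sketched as follows. This is a minimal illustration, not the service's code: it assumes the new IDs start at 1 and follow the ascending order of the original IDs, neither of which is stated in this section. The function name is hypothetical.

```python
import copy


def remap_category_ids(coco):
    """Return a copy of a COCO dict whose category IDs are contiguous.

    Assumption: new IDs start at 1 and follow the ascending order of the
    original IDs; the service's actual numbering may differ.
    """
    remapped = copy.deepcopy(coco)
    old_ids = sorted(cat["id"] for cat in remapped["categories"])
    id_map = {old: new for new, old in enumerate(old_ids, start=1)}
    for cat in remapped["categories"]:
        cat["id"] = id_map[cat["id"]]
    for ann in remapped["annotations"]:
        ann["category_id"] = id_map[ann["category_id"]]
    return remapped


# Tiny hypothetical dataset with sparse category IDs (1 and 18).
sparse_coco = {
    "images": [{"id": 7, "file_name": "a.jpg"}],
    "categories": [{"id": 18, "name": "dog"}, {"id": 1, "name": "person"}],
    "annotations": [
        {"id": 1, "image_id": 7, "category_id": 18, "bbox": [0, 0, 5, 5]}
    ],
}
contiguous_coco = remap_category_ids(sparse_coco)
```

The input dictionary is left untouched, so the original file contents can still be compared against the remapped result.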
Field | Description | Data Type and Constraints | Recommended/Typical Value
input_format | The input data format (“KITTI”, “ODVG”, or “COCO”) | string |
output_format | The output data format (“KITTI”, “ODVG”, or “COCO”) | string |
output_dir | The path to save the converted annotations | string |
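The constraint in Note 1 (no direct KITTI/ODVG conversion, with COCO as the bridge) can be sketched as a small planning helper. The function name is hypothetical, and the set of accepted formats simply mirrors the input_format and output_format descriptions above. Same-format requests other than COCO to COCO are not covered in this section, so the sketch passes them through as a single hop.

```python
# Formats named in the input_format / output_format descriptions above.
SUPPORTED_FORMATS = {"KITTI", "ODVG", "COCO"}
# Direct pairs that Note 1 says are not supported.
UNSUPPORTED_DIRECT = {("KITTI", "ODVG"), ("ODVG", "KITTI")}


def plan_conversion(input_format, output_format):
    """Return the list of (source, target) hops for a requested conversion."""
    src, dst = input_format.upper(), output_format.upper()
    if src not in SUPPORTED_FORMATS or dst not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported pair: {input_format} -> {output_format}")
    if (src, dst) in UNSUPPORTED_DIRECT:
        # Bridge through COCO, as suggested in Note 1.
        return [(src, "COCO"), ("COCO", dst)]
    return [(src, dst)]
```

For example, `plan_conversion("KITTI", "ODVG")` yields two hops, each of which would need its own spec and output directory.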
KITTI Config#
The KITTI configuration (kitti) specifies the KITTI dataset information.
Field | Description | Data Type and Constraints | Recommended/Typical Value
image_dir | The image directory | string |
label_dir | The label directory | string |
project | The project name, which is used as the | string |
mapping | A YAML file specifying the category mappings. If this value is not specified, all categories in the | string |
no_skip | If True, do not skip images without any valid annotations. | bool |
preserve_hierarchy | If True, preserve the KITTI folder structure. | bool |
The following is an example of a category mapping file:
- person:
    - person
    - Person
    - person_group
    - rider
- bag:
    - hand_bag
    - backpack
    - personal_bag
- face:
    - face
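One way a script might consume such a mapping is to invert it into a lookup table. The sketch below assumes the file parses to a list of single-key dictionaries, and that each key is the target category while the listed values are the source labels folded into it; that direction is inferred from the example, not stated. The function name is hypothetical.

```python
# The example mapping above, written as the structure a YAML parser
# would be expected to return (assumption: list of single-key dicts).
category_mapping = [
    {"person": ["person", "Person", "person_group", "rider"]},
    {"bag": ["hand_bag", "backpack", "personal_bag"]},
    {"face": ["face"]},
]


def build_label_lookup(mapping):
    """Invert the mapping into a source-label -> target-category dict."""
    lookup = {}
    for entry in mapping:
        for target, sources in entry.items():
            for source in sources:
                if source in lookup and lookup[source] != target:
                    raise ValueError(f"{source!r} is mapped to two categories")
                lookup[source] = target
    return lookup


label_lookup = build_label_lookup(category_mapping)
```

With this lookup, a label such as `rider` resolves to `person`, and a label missing from the lookup can be skipped or reported, depending on what the conversion needs.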
COCO Config#
The COCO configuration (coco) specifies the COCO annotation file location.
Field | Description | Data Type and Constraints | Recommended/Typical Value
ann_file | The annotation file | string |
refine_box | Whether to refine boxes with segmentation when converting to KITTI | bool |
use_all_categories | Whether to use all categories | bool |
add_background | Whether to add background categories | bool |
ODVG Config#
The ODVG configuration (odvg) specifies the ODVG annotation file location.
Field | Description | Data Type and Constraints | Recommended/Typical Value
ann_file | The annotation file | string |
labelmap_file | The label map file | string |
AICity Config#
The AICity configuration (aicity) specifies the AICity dataset information.
Field | Description | Data Type and Constraints | Recommended/Typical Value
root | Path to the dataset root directory | string |
version | Dataset version identifier | string | “2025”
split | Dataset split (“train”, “val”, or “test”) | string | “train”
class_config | Class configuration dictionary that lists classes, mappings, attributes, and valid ID ranges | Dict |
recentering | The model is trained in BEV coordinates for a given BEV group. When enabled, 3D bounding boxes are shifted from OV coordinates to BEV coordinates. The origin of the BEV coordinates is at the center of the BEV group. This is required to help the model converge and achieve the desired accuracy. | bool |
rgb_format | File format/extension of RGB videos or images | string | “mp4”
depth_format | File format/extension of depth data | string | “h5”
camera_grouping_mode | Strategy for grouping cameras (e.g., “random”) | string | “random, training”
num_frames | Number of frames to load per sequence. If set to -1, all frames are loaded. | integer | 9000
anchor_init_config | Anchor-initialization configuration dictionary (see below) | Dict |
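The recentering option describes a translation of box positions so that the center of the BEV group becomes the origin. A minimal sketch of that idea follows. It assumes the group center is already known, that boxes are stored as (x, y, z, width, length, height, yaw) tuples, and that all three axes are shifted; none of these details is specified in this section, and the function name is hypothetical.

```python
def recenter_boxes(boxes, center):
    """Translate box positions so that `center` becomes the origin.

    Each box is a hypothetical (x, y, z, width, length, height, yaw) tuple;
    only x, y, and z change, while size and yaw are left as they are.
    """
    cx, cy, cz = center
    return [(x - cx, y - cy, z - cz, *rest) for (x, y, z, *rest) in boxes]


# One box at (10, 4, 1) and a group center at (8, 4, 0), both made up.
ov_boxes = [(10.0, 4.0, 1.0, 2.0, 4.0, 1.5, 0.0)]
bev_boxes = recenter_boxes(ov_boxes, center=(8.0, 4.0, 0.0))
```

Because the operation is a pure translation, subtracting the negated center restores the original coordinates.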
Anchor-Initialization Configuration#
The anchor-initialization configuration (anchor_init_config) specifies the anchor-initialization parameters.
Field | Description | Data Type and Constraints | Recommended/Typical Value
num_anchor | Number of anchor points to generate | integer | 900
detection_range | Detection range for anchor points | float | -1
sample_ratio | Sample ratio for anchor points | float | 1
output_file_name | Name of the output file for anchor points | string | “anchor_init_kmeans900.npy”
Running the Annotation Conversion#
Ask your agent to perform the conversion. For example:
“Convert the COCO annotations at s3://my-bucket/coco/annotations/instances_train2017.json to KITTI format. Images are at s3://my-bucket/coco/train2017/. Write the converted labels to s3://my-bucket/coco-as-kitti/.”
For DAFT-formatted datasets, the agent invokes the data/tao-convert-dataset-format
skill; for other formats it scripts the conversion directly using the
spec keys described above.
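As one concrete piece of such a script, the core coordinate change from COCO to KITTI is turning a COCO box ([x, y, width, height]) into the left, top, right, bottom corners of a 15-field KITTI label line. The sketch below is illustrative only and is not the service's implementation: the function name is hypothetical, and writing zeros for the fields that COCO does not provide is an assumption.

```python
def coco_bbox_to_kitti_line(category_name, bbox):
    """Format one COCO box ([x, y, width, height]) as a KITTI label line.

    Only the class name and the 2D box are filled in. Truncation, occlusion,
    alpha, 3D dimensions, location, and rotation are written as zeros,
    which is an assumption of this sketch.
    """
    x, y, w, h = bbox
    left, top, right, bottom = x, y, x + w, y + h
    fields = [category_name, "0.00", "0", "0.00"]  # type, truncated, occluded, alpha
    fields += [f"{v:.2f}" for v in (left, top, right, bottom)]
    fields += ["0.00"] * 7  # height, width, length, x, y, z, rotation_y
    return " ".join(fields)


kitti_line = coco_bbox_to_kitti_line("person", [10, 20, 30, 40])
```

KITTI fields are separated by spaces, so a category name that contains spaces would need to be renamed before it is written this way.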
Configuring a Specification File for the Annotation Slicing Service#
A specification file for slicing a COCO annotation file has two key components (data and filter) as well as a global parameter (results_dir), all of which are described below.
Field | Description | Data Type and Constraints | Recommended/Typical Value
results_dir | The directory to save the output annotation files and logs to | string | –
data | The dataset configuration | Dict | –
filter | The filter configuration | Dict | –
Data Config#
The dataset configuration (data) specifies the input format and annotation file.
Field | Description | Data Type and Constraints | Recommended/Typical Value
format | The configuration format. Currently, only “COCO” is supported. | string | “COCO”
annotation_file | The input COCO annotation file | string |
Filter Config#
The filter configuration (filter) specifies how to slice the annotation data, which can be done in one of four modes:
random: Randomly split the annotation file into N partitions or sample the annotation file by a certain percentage
category: Filter annotation labels by the desired categories
number: Pick N samples in order from the annotations
filename: Filter the annotations by their file names
Field | Description | Data Type and Constraints | Recommended/Typical Value
mode | The filter mode (“random”, “category”, “number”, or “filename”) | string |
dump_remainder | A flag specifying whether to dump the remainder annotations. This parameter only applies when the | bool |
split | The integer number of splits or the float sampling percentage (in “random” mode) | float or integer |
num_samples | The number of annotations to keep (in “number” mode) | integer |
included_categories | Categories to keep (in “category” mode) | list |
excluded_categories | Categories to exclude (in “category” mode) | list |
re_patterns | List of file name patterns to match (in “filename” mode) | list |
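The three deterministic modes can be sketched against an in-memory COCO dictionary, driven by a filter dictionary that uses the field names above. This is a rough illustration rather than the service's logic. It assumes that “number” mode counts images rather than individual annotation records, that category names (not IDs) are listed in the config, and that re_patterns are matched with a regular-expression search on file_name. The function name is hypothetical.

```python
import re


def filter_coco(coco, filter_cfg):
    """Apply a "category", "number", or "filename" filter to a COCO dict."""
    mode = filter_cfg["mode"]
    images, anns, cats = coco["images"], coco["annotations"], coco["categories"]

    if mode == "category":
        include = set(filter_cfg.get("included_categories") or [])
        exclude = set(filter_cfg.get("excluded_categories") or [])
        cats = [c for c in cats
                if (not include or c["name"] in include)
                and c["name"] not in exclude]
        keep_cat_ids = {c["id"] for c in cats}
        anns = [a for a in anns if a["category_id"] in keep_cat_ids]
    elif mode == "number":
        # Assumption: samples are images, taken in file order.
        images = images[: filter_cfg["num_samples"]]
    elif mode == "filename":
        patterns = [re.compile(p) for p in filter_cfg["re_patterns"]]
        images = [im for im in images
                  if any(p.search(im["file_name"]) for p in patterns)]
    else:
        raise ValueError(f"Mode not covered by this sketch: {mode}")

    # Drop annotations whose image was filtered out.
    keep_image_ids = {im["id"] for im in images}
    anns = [a for a in anns if a["image_id"] in keep_image_ids]
    return {"images": images, "annotations": anns, "categories": cats}


# Small hypothetical dataset used to exercise the sketch.
demo_coco = {
    "images": [
        {"id": 1, "file_name": "cam0/0001.jpg"},
        {"id": 2, "file_name": "cam1/0001.jpg"},
        {"id": 3, "file_name": "cam0/0002.jpg"},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "bag"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 2, "category_id": 2},
        {"id": 12, "image_id": 3, "category_id": 1},
    ],
}
only_cam0 = filter_coco(demo_coco, {"mode": "filename", "re_patterns": [r"^cam0/"]})
```

In “category” mode this sketch keeps every image, including images left with no annotations; whether the service drops such images is not stated here.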
Running the Annotation Slicer#
Ask your agent to perform the slice. For example:
“Slice the COCO annotation file at s3://my-bucket/coco/instances_train2017.json into an 80/20 random split. Write both outputs to s3://my-bucket/coco-splits/.”
The agent scripts the slice using the spec keys described above.
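A script for the 80/20 request above might look roughly like the following sketch. It splits by image, uses a fixed seed for repeatability, and returns both the sampled part and the remainder. The function name and seed are hypothetical, and expressing the percentage as a fraction such as 0.8 is an assumption about how the split value would be written.

```python
import random


def random_split(coco, fraction, seed=0):
    """Split a COCO dict by image into a sampled part and a remainder part."""
    image_ids = [im["id"] for im in coco["images"]]
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(image_ids)
    n_first = round(len(image_ids) * fraction)
    first_ids = set(image_ids[:n_first])

    def subset(keep):
        # keep=True selects the sampled part, keep=False the remainder.
        return {
            "images": [im for im in coco["images"]
                       if (im["id"] in first_ids) == keep],
            "annotations": [a for a in coco["annotations"]
                            if (a["image_id"] in first_ids) == keep],
            "categories": coco["categories"],
        }

    return subset(True), subset(False)


# Ten hypothetical images with one annotation each.
ten_images = {
    "images": [{"id": i, "file_name": f"{i:04d}.jpg"} for i in range(10)],
    "annotations": [{"id": i, "image_id": i, "category_id": 1} for i in range(10)],
    "categories": [{"id": 1, "name": "person"}],
}
train_part, val_part = random_split(ten_images, fraction=0.8, seed=42)
```

Splitting on image IDs rather than annotation records keeps all boxes of one image in the same output file.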
Configuring a Specification File for Annotation Merge#
A specification file for merging COCO annotation files has one key component (data) as well as a global parameter (results_dir), both of which are described below.
Field | Description | Data Type and Constraints | Recommended/Typical Value
results_dir | The directory to save the output annotation file and logs | string | –
data | The dataset config | Dict | –
Data Config#
The data configuration (data) specifies the input format and annotation file.
Field | Description | Data Type and Constraints | Recommended/Typical Value
format | The configuration format. Currently, only “COCO” is supported. | string | “COCO”
annotations | A list of COCO annotation files | list |
Note: All COCO annotation files must share the same categories.
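A merge under this constraint can be sketched as follows. The category check mirrors the note above; renumbering image and annotation IDs from 1 so that IDs from different files cannot collide is an assumption of this sketch, as are the function names.

```python
def _category_key(categories):
    """Order-independent fingerprint of a COCO category list."""
    return sorted((c["id"], c["name"]) for c in categories)


def merge_coco(datasets):
    """Merge COCO dicts that share one category list into a single dict.

    Assumption: image and annotation IDs are renumbered from 1 so that
    IDs coming from different files cannot collide.
    """
    if not datasets:
        raise ValueError("Nothing to merge")
    categories = datasets[0]["categories"]
    merged = {"images": [], "annotations": [], "categories": categories}
    next_image_id, next_ann_id = 1, 1
    for ds in datasets:
        if _category_key(ds["categories"]) != _category_key(categories):
            raise ValueError("All COCO files must share the same categories")
        image_id_map = {}
        for im in ds["images"]:
            image_id_map[im["id"]] = next_image_id
            merged["images"].append({**im, "id": next_image_id})
            next_image_id += 1
        for ann in ds["annotations"]:
            merged["annotations"].append(
                {**ann, "id": next_ann_id,
                 "image_id": image_id_map[ann["image_id"]]}
            )
            next_ann_id += 1
    return merged


# Two hypothetical parts that reuse the same image and annotation IDs.
shared_cats = [{"id": 1, "name": "person"}]
part_a = {"images": [{"id": 1, "file_name": "a.jpg"}],
          "annotations": [{"id": 1, "image_id": 1, "category_id": 1}],
          "categories": shared_cats}
part_b = {"images": [{"id": 1, "file_name": "b.jpg"}],
          "annotations": [{"id": 1, "image_id": 1, "category_id": 1}],
          "categories": shared_cats}
merged_coco = merge_coco([part_a, part_b])
```

Failing fast on a category mismatch avoids silently producing a file in which the same category_id means different things for different images.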
Running the Annotation Merge#
Ask your agent to perform the merge. For example:
“Merge s3://my-bucket/coco/train_part_a.json and s3://my-bucket/coco/train_part_b.json into a single COCO file at s3://my-bucket/coco/train_merged.json.”
The agent scripts the merge using the spec keys described above.