Annotations#

The annotation service, part of TAO Data Services, provides tools for manipulating ground-truth labels. It supports the following actions:

convert
slice
merge

Each action is explained in the following sections. In TAO 7.0, these tasks are driven by natural-language prompts to your agent. For DAFT-formatted datasets, the data/tao-convert-dataset-format skill handles conversion between supported DAFT formats. For non-DAFT formats (COCO, KITTI, etc.) and for slicing or merging, the agent scripts the work directly.

Supported Data Formats#

The annotation service supports the KITTI, ODVG, COCO, and AICity/OVPKL formats.

Configuring a Specification File for Annotation Conversion#

A specification file for converting a COCO dataset to KITTI format has three key components (data, kitti, and coco) as well as a global parameter (results_dir). The odvg and aicity components are used when ODVG or AICity data is involved. All fields are described below.

Field	Description	Data Type and Constraints	Recommended/Typical Value
results_dir	The directory to save the annotation-conversion log to	string	–
data	The dataset config	Dict	–
kitti	The KITTI config	Dict	–
coco	The COCO config	Dict	–
odvg	The ODVG config	Dict	–
aicity	The AICity config	Dict	–

Data Config#

The data configuration (data) specifies the source and target formats of the label conversion, as well as the output path. Refer to the Data Annotation Format page for more information about the data formats including KITTI, COCO, and ODVG.

Note 1: Direct KITTI to ODVG or ODVG to KITTI conversion is not supported, but you can use COCO as an intermediate format to bridge KITTI and ODVG.
Note 2: When input_format and output_format are both COCO, a new COCO annotation file is saved with category IDs remapped to contiguous IDs.

Field	Description	Data Type and Constraints	Recommended/Typical Value
input_format	The input data format (“KITTI”, “ODVG”, or “COCO”)	string
output_format	The output data format (“KITTI”, “ODVG”, or “COCO”)	string
output_dir	The path to save the converted annotations	string
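
The two notes above can be illustrated with a short Python sketch. It is not the TAO implementation: the function names `plan_conversion` and `remap_category_ids` are hypothetical, and the starting ID for the contiguous remap (`start_id=1`) is an assumption, since the source does not say whether IDs start at 0 or 1.

```python
SUPPORTED_FORMATS = {"KITTI", "COCO", "ODVG"}


def plan_conversion(input_format, output_format):
    """Return the list of (source, target) hops for a conversion.

    KITTI <-> ODVG has no direct path, so it is routed through COCO.
    """
    src, dst = input_format.upper(), output_format.upper()
    for name in (src, dst):
        if name not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {name}")
    if {src, dst} == {"KITTI", "ODVG"}:
        return [(src, "COCO"), ("COCO", dst)]
    return [(src, dst)]


def remap_category_ids(coco, start_id=1):
    """Return a copy of a COCO dict with contiguous category IDs.

    Assumption: IDs are renumbered in ascending order of the old IDs,
    beginning at `start_id`.
    """
    old_ids = sorted(cat["id"] for cat in coco["categories"])
    id_map = {old: new for new, old in enumerate(old_ids, start=start_id)}
    return {
        **coco,
        "categories": [
            {**cat, "id": id_map[cat["id"]]} for cat in coco["categories"]
        ],
        "annotations": [
            {**ann, "category_id": id_map[ann["category_id"]]}
            for ann in coco["annotations"]
        ],
    }
```

Calling `plan_conversion("KITTI", "ODVG")` yields two hops, which corresponds to running the conversion twice with COCO as the intermediate output.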

KITTI Config#

The KITTI configuration (kitti) specifies the KITTI dataset information.

Field	Description	Data Type and Constraints	Recommended/Typical Value
image_dir	The image directory	string
label_dir	The label directory	string
project	The project name, which is used as the `scene_id` when converting to COCO format. The default value is the parent directory name of the `image_dir`.	string
mapping	A YAML file specifying the category mappings. If this value is not specified, all categories in the `label_dir` are used.	string
no_skip	If True, do not skip images without any valid annotations.	bool
preserve_hierarchy	If True, preserve the KITTI folder structure.	bool

The following is an example of a category mapping file:

- person:
  - person
  - Person
  - person_group
  - rider
- bag:
  - hand_bag
  - backpack
  - personal_bag
- face:
  - face
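
To show how such a mapping might be applied, the following Python sketch inverts it into a lookup from raw label to target category. It assumes each top-level key is the target category and the listed items are the raw labels folded into it. The name `build_label_lookup` is hypothetical, and the mapping is written as the list of single-key dictionaries that a YAML loader would typically produce for the file above.

```python
# The example mapping above, as parsed YAML: a list of single-key dicts.
CATEGORY_MAPPING = [
    {"person": ["person", "Person", "person_group", "rider"]},
    {"bag": ["hand_bag", "backpack", "personal_bag"]},
    {"face": ["face"]},
]


def build_label_lookup(mapping):
    """Invert {target: [raw labels]} entries into {raw label: target}."""
    lookup = {}
    for entry in mapping:
        for target, raw_labels in entry.items():
            for raw in raw_labels:
                # A raw label mapped to two targets would be ambiguous.
                if raw in lookup and lookup[raw] != target:
                    raise ValueError(f"label {raw!r} maps to multiple categories")
                lookup[raw] = target
    return lookup


lookup = build_label_lookup(CATEGORY_MAPPING)
print(lookup["rider"])     # person
print(lookup["backpack"])  # bag
```

Labels absent from the lookup (for example, `car` here) would simply not be converted under this interpretation.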

COCO Config#

The COCO configuration (coco) specifies the COCO annotation file location.

Field	Description	Data Type and Constraints	Recommended/Typical Value
ann_file	The annotation file	string
refine_box	Whether to refine boxes with segmentation when converting to KITTI	bool
use_all_categories	Whether to use all categories	bool
add_background	Whether to add background categories	bool
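
As an illustration of what refining a box with segmentation could mean, the sketch below computes the tightest COCO-style `[x, y, width, height]` box around polygon segmentation. This is an assumption about the intent of `refine_box`, not a description of TAO's behavior. It handles only polygon segmentations (flat `[x1, y1, x2, y2, ...]` lists), not RLE masks, and the function name is hypothetical.

```python
def bbox_from_polygons(segmentation):
    """Return the tight [x, y, width, height] box around COCO polygons."""
    xs, ys = [], []
    for polygon in segmentation:
        xs.extend(polygon[0::2])  # even positions hold x coordinates
        ys.extend(polygon[1::2])  # odd positions hold y coordinates
    if not xs or not ys:
        raise ValueError("segmentation contains no points")
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]
```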

ODVG Config#

The ODVG configuration (odvg) specifies the ODVG annotation file location.

Field	Description	Data Type and Constraints	Recommended/Typical Value
ann_file	The annotation file	string
labelmap_file	The label map file	string

AICity Config#

The AICity configuration (aicity) specifies the AICity dataset information.

Field	Description	Data Type and Constraints	Recommended/Typical Value
root	Path to the dataset root directory	string
version	Dataset version identifier	string	“2025”
split	Dataset split (“train”, “val”, or “test”)	string	“train”
class_config	Class configuration dictionary that lists classes, mappings, attributes, and valid ID ranges	Dict
recentering	The model is trained in BEV coordinates for a given BEV group. When enabled, 3D bounding boxes are shifted from OV coordinates to BEV coordinates, with the BEV origin at the center of the BEV group. This helps the model converge and reach the desired accuracy.	bool
rgb_format	File format/extension of RGB videos or images	string	“mp4”
depth_format	File format/extension of depth data	string	“h5”
camera_grouping_mode	Strategy for grouping cameras (e.g., “random”)	string	“random, training”
num_frames	Number of frames to load per sequence. If set to -1, all frames will be loaded.	integer	9000
anchor_init_config	Anchor-initialization configuration dictionary (see below)	Dict

Anchor-Initialization Configuration#

The anchor-initialization configuration (anchor_init_config) specifies the anchor-initialization parameters.

Field	Description	Data Type and Constraints	Recommended/Typical Value
num_anchor	Number of anchor points to generate	integer	900
detection_range	Detection range for anchor points	float	-1
sample_ratio	Sample ratio for anchor points	float	1
output_file_name	Name of the output file for anchor points	string	“anchor_init_kmeans900.npy”

Running the Annotation Conversion#

Ask your agent to perform the conversion. For example:

“Convert the COCO annotations at s3://my-bucket/coco/annotations/instances_train2017.json to KITTI format. Images are at s3://my-bucket/coco/train2017/ . Write the converted labels to s3://my-bucket/coco-as-kitti/.”

For DAFT-formatted datasets, the agent invokes the data/tao-convert-dataset-format skill; for other formats, it scripts the conversion directly using the spec keys described above.

Configuring a Specification File for the Annotation Slicing Service#

A specification file for slicing a COCO annotation file has two key components (data and filter) as well as a global parameter (results_dir), all of which are described below.

Field	Description	Data Type and Constraints	Recommended/Typical Value
results_dir	The directory to save the output annotation files and logs to	string	–
data	The dataset configuration	Dict	–
filter	The filter configuration	Dict	–

Data Config#

The dataset configuration (data) specifies the input format and annotation file.

Field	Description	Data Type and Constraints	Recommended/Typical Value
format	The configuration format. Currently, only “COCO” is supported.	string	“COCO”
annotation_file	The input COCO annotation file	string

Filter Config#

The filter configuration (filter) specifies how to slice the annotation data, which can be done in one of four modes:

random: Randomly split the annotation file into N partitions or sample the annotation file by a certain percentage
category: Filter annotation labels by the desired categories
number: Pick N samples in order from the annotations
filename: Filter the annotations by their file names

Field	Description	Data Type and Constraints	Recommended/Typical Value
mode	The filter mode (“random”, “category”, “number”, or “filename”)	string
dump_remainder	A flag specifying whether to also dump the remaining annotations. This parameter only applies when the `mode` parameter is set to “random” and the `split` parameter is a float value.	bool
split	The integer number of splits or the float sampling percentage (in “random” mode)	float or integer
num_samples	The number of annotations to keep (in “number” mode)	integer
included_categories	Categories to keep (in “category” mode)	list
excluded_categories	Categories to exclude (in “category” mode)	list
re_patterns	List of file name patterns to match (in “filename” mode)	list
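
The “random” and “filename” modes can be sketched in Python as follows. This is an illustrative approximation, not the TAO slicer. It assumes that sampling is done per image, that a float `split` is a fraction between 0 and 1, and that the remainder returned by `random_sample` is what `dump_remainder` would write out. All function names are hypothetical.

```python
import random
import re


def _subset(coco, image_ids):
    """Keep only the given images and the annotations attached to them."""
    return {
        **coco,
        "images": [im for im in coco["images"] if im["id"] in image_ids],
        "annotations": [
            ann for ann in coco["annotations"] if ann["image_id"] in image_ids
        ],
    }


def random_sample(coco, split, seed=0):
    """Return (sample, remainder) for a float `split` in (0, 1)."""
    if not 0.0 < split < 1.0:
        raise ValueError("split must be a fraction between 0 and 1")
    ids = [im["id"] for im in coco["images"]]
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    cut = round(len(ids) * split)
    return _subset(coco, set(ids[:cut])), _subset(coco, set(ids[cut:]))


def filter_by_filename(coco, re_patterns):
    """Keep images whose file_name matches any of the regex patterns."""
    compiled = [re.compile(pattern) for pattern in re_patterns]
    keep = {
        im["id"]
        for im in coco["images"]
        if any(p.search(im["file_name"]) for p in compiled)
    }
    return _subset(coco, keep)
```

With `split=0.8`, the two returned dictionaries correspond to an 80/20 split like the one requested in the example prompt in the next section.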

Running the Annotation Slicer#

Ask your agent to perform the slice. For example:

“Slice the COCO annotation file at s3://my-bucket/coco/instances_train2017.json into an 80/20 random split. Write both outputs to s3://my-bucket/coco-splits/.”

The agent scripts the slice using the spec keys described above.

Configuring a Specification File for Annotation Merge#

A specification file for merging COCO annotation files has one key component (data) as well as a global parameter (results_dir), both of which are described below.

Field	Description	Data Type and Constraints	Recommended/Typical Value
results_dir	The directory to save the output annotation file and logs to	string	–
data	The dataset config	Dict	–

Data Config#

The data configuration (data) specifies the input format and the annotation files to merge.

Field	Description	Data Type and Constraints	Recommended/Typical Value
format	The configuration format. Currently, only “COCO” is supported.	string	“COCO”
annotations	A list of COCO annotation files	list of strings

Note

All COCO annotation files must share the same categories.
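
The shared-categories requirement can be illustrated with a minimal merge sketch in Python. It is not the TAO implementation. It assumes that "same categories" means identical (id, name) pairs and that image and annotation IDs are renumbered to avoid collisions between files, neither of which is stated in the source. The name `merge_coco` is hypothetical.

```python
def merge_coco(datasets):
    """Merge COCO dicts that share the same categories into one dict."""
    if not datasets:
        raise ValueError("at least one dataset is required")

    def category_key(coco):
        return sorted((cat["id"], cat["name"]) for cat in coco["categories"])

    reference = category_key(datasets[0])
    merged = {
        "images": [],
        "annotations": [],
        "categories": list(datasets[0]["categories"]),
    }
    next_image_id, next_ann_id = 1, 1
    for coco in datasets:
        if category_key(coco) != reference:
            raise ValueError("all COCO annotation files must share the same categories")
        image_id_map = {}  # old image ID -> new image ID, per input file
        for image in coco["images"]:
            image_id_map[image["id"]] = next_image_id
            merged["images"].append({**image, "id": next_image_id})
            next_image_id += 1
        for ann in coco["annotations"]:
            merged["annotations"].append(
                {**ann, "id": next_ann_id, "image_id": image_id_map[ann["image_id"]]}
            )
            next_ann_id += 1
    return merged
```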

Running the Annotation Merge#

Ask your agent to perform the merge. For example:

“Merge s3://my-bucket/coco/train_part_a.json and s3://my-bucket/coco/train_part_b.json into a single COCO file at s3://my-bucket/coco/train_merged.json .”

The agent scripts the merge using the spec keys described above.