Date: 2026-03-03

The annotation service, which is part of TAO Data Services, offers tools for manipulating ground-truth labels. tao dataset annotations supports the following tasks:

convert

slice

merge

These tasks can be invoked from the TAO Launcher using the following convention on the command-line:

tao dataset annotations <sub_task> <args_per_subtask>

Here, args_per_subtask stands for the command-line arguments required by a given subtask. Each subtask is explained in the following sections.

Supported Data Formats

The annotation service supports the KITTI, ODVG, COCO, and AICity/OVPKL formats.

Configuring a Spec File for Annotation Conversion

The following is a sample spec file for converting a COCO dataset to KITTI format. It has three key components (data, kitti, and coco) as well as a global parameter (results_dir), all of which are described below.

FTMS Client

Use the following command to get an experiment spec file for annotation format conversion:

    SPECS=$(tao-client annotations get-spec --action annotation_format_convert --job_type dataset --id $DATASET_ID)

TAO Launcher

    data:
      input_format: "COCO"
      output_format: "KITTI"
      output_dir: "/workspace/output"
    kitti:
      image_dir: "/workspace/kitti/images"
      label_dir: "/workspace/kitti/labels"
      mapping: "/workspace/kitti_mapping.json"
    coco:
      ann_file: "/workspace/coco.json"
    results_dir: "/path/to/results"

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| results_dir | The directory to save the annotation-conversion log to | string | – |
| data | The dataset config | Dict | – |
| kitti | The KITTI config | Dict | – |
| coco | The COCO config | Dict | – |
| odvg | The ODVG config | Dict | – |
| aicity | The AICity config | Dict | – |

Data Config

The data configuration (data) specifies the source and target formats of the label conversion, as well as the output path. See the Data Annotation Format page for more information about the data formats, including KITTI, COCO, and ODVG.

Note 1: Direct conversion between KITTI and ODVG (in either direction) is not supported, but you can use COCO as an intermediate format to bridge the two.
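The two-hop route described in Note 1 can be sketched as a pair of spec dictionaries. This is only an illustration under stated assumptions: `conversion_spec` is a hypothetical helper, all paths are made up, and the name of the COCO file written by the first hop is a guess that must be checked against the actual output directory. The second hop may also need an odvg section, which this page does not specify for ODVG output.

```python
def conversion_spec(input_format, output_format, output_dir, results_dir, **format_configs):
    """Build a spec dict using only field names documented on this page.

    Hypothetical helper, not part of TAO. Extra keyword arguments become
    per-format sections such as kitti={...} or coco={...}.
    """
    if {input_format, output_format} == {"KITTI", "ODVG"}:
        raise ValueError("KITTI <-> ODVG is not direct; convert through COCO")
    spec = {
        "data": {
            "input_format": input_format,
            "output_format": output_format,
            "output_dir": output_dir,
        },
        "results_dir": results_dir,
    }
    spec.update(format_configs)
    return spec


# Hop 1: KITTI -> COCO (paths are placeholders).
hop_1 = conversion_spec(
    "KITTI", "COCO", "/workspace/coco_out", "/workspace/results/hop_1",
    kitti={"image_dir": "/workspace/kitti/images",
           "label_dir": "/workspace/kitti/labels"},
)

# ASSUMPTION: the file name produced by hop 1 is not documented here.
intermediate_coco = "/workspace/coco_out/annotations.json"

# Hop 2: COCO -> ODVG, reading the file produced by hop 1.
hop_2 = conversion_spec(
    "COCO", "ODVG", "/workspace/odvg_out", "/workspace/results/hop_2",
    coco={"ann_file": intermediate_coco},
)
```

Each dictionary would then be written out as a YAML spec file and passed to a separate convert run.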

Note 2: When input_format and output_format are both COCO, a new COCO annotation file is saved with category IDs remapped to contiguous IDs.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| input_format | The input data format (“KITTI”, “ODVG”, or “COCO”) | string | |
| output_format | The output data format (“KITTI”, “ODVG”, or “COCO”) | string | |
| output_dir | The path to save the converted annotations | string | |

KITTI Config

The KITTI configuration (kitti) specifies the KITTI dataset information.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| image_dir | The image directory | string | |
| label_dir | The label directory | string | |
| project | The project name, which is used as the scene_id when converting to COCO format. The default value is the parent directory name of image_dir. | string | |
| mapping | A YAML file specifying the category mappings. If this value is not specified, all categories in label_dir are used. | string | |
| no_skip | If True, do not skip images without any valid annotations. | bool | |
| preserve_hierarchy | If True, preserve the KITTI folder structure. | bool | |

The following is an example of a category mapping file:

    - person:
      - person
      - Person
      - person_group
      - rider
    - bag:
      - hand_bag
      - backpack
      - personal_bag
    - face:
      - face

COCO Config

The COCO configuration (coco) specifies the COCO annotation file location.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| ann_file | The annotation file | string | |
| refine_box | Whether to refine boxes with segmentation when converting to KITTI | bool | |
| use_all_categories | Whether to use all categories | bool | |
| add_background | Whether to add background categories | bool | |

ODVG Config

The ODVG configuration (odvg) specifies the ODVG annotation file location.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| ann_file | The annotation file | string | |
| labelmap_file | The label map file | string | |

AICity Config

The AICity configuration (aicity) specifies the AICity dataset information.
| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| root | Path to the dataset root directory | string | |
| version | Dataset version identifier | string | “2025” |
| split | Dataset split (“train”, “val”, or “test”) | string | “train” |
| class_config | Class configuration dictionary that lists classes, mappings, attributes, and valid ID ranges | Dict | |
| recentering | The model is trained in BEV coordinates for a given BEV group. When enabled, 3D bounding boxes are shifted from OV coordinates to BEV coordinates, with the BEV origin at the center of the BEV group. This is required to help with convergence and to achieve the desired model accuracy. | bool | |
| rgb_format | File format/extension of RGB videos or images | string | “mp4” |
| depth_format | File format/extension of depth data | string | “h5” |
| camera_grouping_mode | Strategy for grouping cameras (e.g., “random”) | string | “random, training” |
| num_frames | Number of frames to load per sequence. If set to -1, all frames are loaded. | integer | 9000 |
| anchor_init_config | Anchor-initialization configuration dictionary (see below) | Dict | |

Anchor-Initialization Configuration

The anchor-initialization configuration (anchor_init_config) specifies the anchor-initialization parameters.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| num_anchor | Number of anchor points to generate | integer | 900 |
| detection_range | Detection range for anchor points | float | -1 |
| sample_ratio | Sample ratio for anchor points | float | 1 |
| output_file_name | Name of the output file for anchor points | string | “anchor_init_kmeans900.npy” |
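To make the category mapping file from the KITTI Config section more concrete, the sketch below inverts such a mapping into a lookup from raw label to target category. It is not the converter's actual code: `build_lookup` and `map_labels` are hypothetical names, the mapping is written as the Python structure a YAML parser would produce, and dropping labels that are absent from the mapping is an assumption.

```python
# Parsed form of the example mapping file: a list of single-key dicts,
# each mapping a target category to the raw labels folded into it.
CATEGORY_MAPPING = [
    {"person": ["person", "Person", "person_group", "rider"]},
    {"bag": ["hand_bag", "backpack", "personal_bag"]},
    {"face": ["face"]},
]


def build_lookup(mapping):
    """Invert the mapping into {raw_label: target_category}."""
    lookup = {}
    for entry in mapping:
        for target, raw_labels in entry.items():
            for raw in raw_labels:
                if raw in lookup and lookup[raw] != target:
                    raise ValueError(f"{raw!r} is mapped to two categories")
                lookup[raw] = target
    return lookup


def map_labels(raw_labels, lookup):
    """Rename known labels. ASSUMPTION: unmapped labels are dropped."""
    return [lookup[label] for label in raw_labels if label in lookup]


lookup = build_lookup(CATEGORY_MAPPING)
mapped = map_labels(["Person", "backpack", "car", "rider"], lookup)
print(mapped)
```

Here "Person" and "rider" both collapse into "person", "backpack" becomes "bag", and "car" is left out because no target category lists it.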

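The COCO-to-COCO case in Note 2 (category IDs remapped to contiguous IDs) can be illustrated with a few lines of plain Python. This is a minimal sketch of the idea, not the service's implementation: `remap_category_ids` is a hypothetical function, and both the starting ID of 1 and the ordering by original ID are assumptions.

```python
def remap_category_ids(coco, start_id=1):
    """Return a copy of `coco` with category IDs start_id, start_id + 1, ...

    Categories keep their relative order by original ID (an assumption).
    """
    ordered = sorted(coco["categories"], key=lambda c: c["id"])
    old_to_new = {c["id"]: start_id + i for i, c in enumerate(ordered)}
    return {
        **coco,
        "categories": [{**c, "id": old_to_new[c["id"]]} for c in ordered],
        "annotations": [
            {**a, "category_id": old_to_new[a["category_id"]]}
            for a in coco["annotations"]
        ],
    }


# Toy annotation file with gaps in the category IDs (1, 3, 90).
toy = {
    "images": [{"id": 7, "file_name": "a.jpg"}],
    "categories": [
        {"id": 90, "name": "face"},
        {"id": 1, "name": "person"},
        {"id": 3, "name": "bag"},
    ],
    "annotations": [
        {"id": 1, "image_id": 7, "category_id": 90},
        {"id": 2, "image_id": 7, "category_id": 3},
    ],
}

remapped = remap_category_ids(toy)
print([(c["id"], c["name"]) for c in remapped["categories"]])
print([a["category_id"] for a in remapped["annotations"]])
```

The input dictionary is left untouched, so the original and remapped files can be compared side by side.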
Running the Annotation Conversion

The annotation conversion service can be invoked using the following conventions on the command line:

FTMS Client

    DS_FORMAT_CONVERT_JOB_ID=$(tao-client annotations dataset-run-action --action annotations_format_convert --id $DATASET_ID --specs "$SPECS")

TAO Launcher

    tao dataset annotations convert [-h] -e <experiment spec> [results_dir=<results_dir>]

Required Arguments

-e, --experiment_spec_file: The experiment specification file

Optional Arguments

-h, --help: Show this help message and exit.

Example

The following is an example of using the convert command in Data Services:

    tao dataset annotations convert -e /path/to/spec.yaml
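When the launcher is driven from a script, the command line above can be assembled as an argument list. The following is a small sketch, assuming the `tao` executable is on the PATH; `build_command` is a hypothetical helper, and it covers only the convert and slice subtasks, whose command lines are shown on this page.

```python
import shlex

# Subtasks whose "-e <spec>" syntax is documented on this page.
SUB_TASKS = ("convert", "slice")


def build_command(sub_task, spec_file, results_dir=None):
    """Return the argv list for `tao dataset annotations <sub_task>`."""
    if sub_task not in SUB_TASKS:
        raise ValueError(f"unsupported subtask: {sub_task!r}")
    argv = ["tao", "dataset", "annotations", sub_task, "-e", spec_file]
    if results_dir is not None:
        # Optional key=value override, as in the usage line.
        argv.append(f"results_dir={results_dir}")
    return argv


argv = build_command("convert", "/path/to/spec.yaml", results_dir="/path/to/results")
print(shlex.join(argv))
# To execute it: subprocess.run(argv, check=True)
```

Passing a list rather than a single string avoids shell-quoting problems when paths contain spaces.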

Configuring a Spec File for the Annotation Slicing Service

The following is a sample spec file for slicing a COCO annotation file. It has two key components (data and filter) as well as a global parameter (results_dir), all of which are described below.

FTMS Client

Use the following command to get an experiment spec file for annotation slicing:

    SLICE_SPECS=$(tao-client annotations get-spec --action annotation_slice --job_type dataset --id $DATASET_ID)

TAO Launcher

    data:
      annotation_file: /datasets/coco/annotations/instances_val2017.json
    filter:
      mode: "category"  # random, number
      num_samples: 10
      split: 5
      excluded_categories:
        - person
    results_dir: /output/dir/

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| results_dir | The directory to save the output annotation files and logs to | string | – |
| data | The dataset configuration | Dict | – |
| filter | The filter configuration | Dict | – |

Data Config

The dataset configuration (data) specifies the input format and annotation file.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| format | The configuration format. Currently, only “COCO” is supported. | string | “COCO” |
| annotation_file | The input COCO annotation file | string | |

Filter Config

The filter configuration (filter) specifies how to slice the annotation data, which can be done in one of four modes:

random: Randomly split the annotation file into N partitions or sample the annotation file by a certain percentage

category: Filter annotation labels by the desired categories

number: Pick N samples in order from the annotations

filename: Filter the annotations by their file names

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| mode | The filter mode (“random”, “category”, “number”, or “filename”) | string | |
| dump_remainder | A flag specifying whether to dump the remaining annotations. This parameter applies only when mode is set to “random” and split is a float value. | bool | |
| split | The integer number of splits or the float sampling percentage (in “random” mode) | float or integer | |
| num_samples | The number of annotations to keep (in “number” mode) | integer | |
| included_categories | Categories to keep (in “category” mode) | list | |
| excluded_categories | Categories to exclude (in “category” mode) | list | |
| re_patterns | List of file name patterns to match (in “filename” mode) | list | |
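The four filter modes can be pictured with a toy COCO dictionary. The sketch below is an interpretation of the descriptions above, not the slicer's code; all function names are hypothetical. It assumes that random mode partitions images, that a float split is a fraction between 0 and 1, and that file name patterns are regular expressions.

```python
import random
import re


def _keep_images(coco, image_ids):
    """Keep only the given images and the annotations attached to them."""
    image_ids = set(image_ids)
    return {
        **coco,
        "images": [im for im in coco["images"] if im["id"] in image_ids],
        "annotations": [a for a in coco["annotations"] if a["image_id"] in image_ids],
    }


def slice_random(coco, split, seed=0):
    """Integer split: N partitions. Float split: one sampled subset."""
    ids = [im["id"] for im in coco["images"]]
    random.Random(seed).shuffle(ids)
    if isinstance(split, int):
        return [_keep_images(coco, ids[i::split]) for i in range(split)]
    return [_keep_images(coco, ids[: round(len(ids) * split)])]


def slice_category(coco, included=None, excluded=None):
    """Keep annotations whose category passes the include/exclude lists."""
    names = {c["id"]: c["name"] for c in coco["categories"]}

    def wanted(name):
        if included is not None and name not in included:
            return False
        return name not in (excluded or [])

    return {
        **coco,
        "categories": [c for c in coco["categories"] if wanted(c["name"])],
        "annotations": [a for a in coco["annotations"] if wanted(names[a["category_id"]])],
    }


def slice_number(coco, num_samples):
    """Keep the first num_samples annotations, in file order."""
    return {**coco, "annotations": coco["annotations"][:num_samples]}


def slice_filename(coco, re_patterns):
    """Keep images whose file name matches any of the patterns."""
    regexes = [re.compile(p) for p in re_patterns]
    ids = [im["id"] for im in coco["images"]
           if any(r.search(im["file_name"]) for r in regexes)]
    return _keep_images(coco, ids)


# Toy data: four images from two cameras, one annotation per image.
coco = {
    "images": [{"id": i, "file_name": f"cam{i % 2}_{i:03d}.jpg"} for i in range(4)],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "bag"}],
    "annotations": [
        {"id": 10 + i, "image_id": i, "category_id": 1 + i % 2} for i in range(4)
    ],
}

parts = slice_random(coco, 2)
no_person = slice_category(coco, excluded=["person"])
first_three = slice_number(coco, 3)
cam0_only = slice_filename(coco, [r"^cam0_"])
```

A fixed seed keeps the random partitions reproducible, which is convenient when the same split has to be regenerated later.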

Running the Annotation Slicer

The annotation slicing service can be invoked using the following conventions on the command line:

FTMS Client

    DS_SLICE_JOB_ID=$(tao-client annotations dataset-run-action --action annotations_slice --id $DATASET_ID --specs "$SLICE_SPECS")

TAO Launcher

    tao dataset annotations slice [-h] -e <experiment spec> [results_dir=<results_dir>]

Required Arguments

-e, --experiment_spec_file: The experiment specification file

Optional Arguments

-h, --help: Show this help message and exit.

Example

The following is an example of using the slice command in Data Services:

    tao dataset annotations slice -e /path/to/spec.yaml

Configuring a Spec File for Annotation Merge

The following is a sample spec file for merging COCO annotation files. It has one key component (data) as well as a global parameter (results_dir), both of which are described below.

FTMS Client

Use the following command to get an experiment spec file for annotation merging:

    MERGE_SPECS=$(tao-client annotations get-spec --action annotation_merge --job_type dataset --id $DATASET_ID)

TAO Launcher

    data:
      format: "COCO"
      annotations:
        - /datasets/part_0.json
        - /datasets/part_1.json
        - /datasets/part_2.json
        - /datasets/part_3.json
        - /datasets/part_4.json
    results_dir: /output/dir/

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| results_dir | The directory to save the output annotation file and logs to | string | – |
| data | The dataset config | Dict | – |

Data Config

The data configuration (data) specifies the input format and annotation files.

| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| format | The configuration format. Currently, only “COCO” is supported. | string | “COCO” |
| annotations | A list of COCO annotation files | list | |

Note: All COCO annotation files must share the same categories.
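The requirement that all files share the same categories can be checked before merging. The sketch below shows one way a merge could work on in-memory COCO dictionaries. It is an assumption-laden illustration, not the service's behavior: `merge_coco` is a hypothetical function, image IDs are assumed to be unique across parts, and annotation IDs are renumbered to avoid collisions.

```python
def merge_coco(parts):
    """Concatenate COCO dicts after checking that their categories match."""
    if not parts:
        raise ValueError("nothing to merge")

    def category_key(coco):
        return sorted((c["id"], c["name"]) for c in coco["categories"])

    reference = category_key(parts[0])
    merged = {
        "images": [],
        "annotations": [],
        "categories": list(parts[0]["categories"]),
    }
    seen_images = set()
    next_ann_id = 1
    for index, part in enumerate(parts):
        if category_key(part) != reference:
            raise ValueError(f"part {index} has different categories")
        for image in part["images"]:
            if image["id"] in seen_images:
                raise ValueError(f"duplicate image id {image['id']} in part {index}")
            seen_images.add(image["id"])
            merged["images"].append(image)
        for ann in part["annotations"]:
            # ASSUMPTION: annotation IDs are renumbered from 1.
            merged["annotations"].append({**ann, "id": next_ann_id})
            next_ann_id += 1
    return merged


categories = [{"id": 1, "name": "person"}]
part_0 = {
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1}],
    "categories": categories,
}
part_1 = {
    "images": [{"id": 2, "file_name": "b.jpg"}],
    "annotations": [{"id": 1, "image_id": 2, "category_id": 1}],
    "categories": categories,
}

merged = merge_coco([part_0, part_1])
print(len(merged["images"]), [a["id"] for a in merged["annotations"]])
```

Failing early on a category mismatch is a deliberate choice here: silently merging files with different category tables would produce annotations whose category_id values mean different things.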