Auto-Label#

Annotating an image dataset is tedious and time-consuming, especially for segmentation tasks: drawing a good polygon around an object can take 10 times longer than drawing a bounding box. The Auto-Label service of TAO Data Services is designed to reduce the time spent annotating an image dataset. Currently, this service supports:

* Automatically generating bounding box annotations from category names or referring expressions.
* Automatically generating instance segmentation masks from ground-truth bounding boxes.

Data Input for Auto-Label#

The Auto-Label service expects the ground-truth annotations for a directory of images to be stored in a COCO-format JSON file.
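As a rough illustration of what such a file contains, the sketch below builds a minimal COCO-style annotation structure in Python. The file name, image size, category, and box values are made-up placeholders, and only the commonly used COCO fields are shown; the exact set of fields the service requires is not specified here.

```python
import json

# Minimal COCO-style ground truth: one image, one category, one box.
# All concrete values below are illustrative placeholders.
coco = {
    "images": [
        {"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480}
    ],
    "categories": [{"id": 1, "name": "person"}],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # refers to images[].id
            "category_id": 1,    # refers to categories[].id
            # COCO boxes are [x_min, y_min, width, height] in pixels.
            "bbox": [100.0, 120.0, 200.0, 150.0],
            "area": 200.0 * 150.0,
            "iscrowd": 0,
        }
    ],
}

# Serialize to the JSON text that would be saved next to the images.
coco_json = json.dumps(coco, indent=2)
print(coco_json)
```

Each annotation points back to an image and a category by ID, so one JSON file can describe every image in the directory.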

Configuring Spec File for Auto-Label#

Parameter	Datatype	Description
mal	collection	The configuration of MAL
grounding_dino	collection	The configuration of Grounding DINO
gpu_ids	list	Indices of GPUs to use
num_gpus	int32	Number of GPUs to use
batch_size	int32	Batch size
num_workers	int32	Number of workers for dataloader
results_dir	string	Result directory
autolabel_type	string	Type of auto-labeling to run (“mal” or “grounding_dino”)
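
To show how these parameters might fit together, here is a hedged Python sketch that assembles a spec as a nested dictionary and renders it as simple YAML text. The field names come from the tables on this page, but the nesting, the example values, and the helper `to_yaml` are assumptions made for illustration; consult a reference spec file for the authoritative layout.

```python
def to_yaml(node, indent=0):
    """Render nested dicts, lists of scalars, and scalars as simple YAML lines."""
    pad = "  " * indent
    lines = []
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 1))
        elif isinstance(value, list):
            lines.append(f"{pad}{key}:")
            lines.extend(f"{pad}  - {item}" for item in value)
        elif isinstance(value, bool):
            # YAML booleans are lowercase.
            lines.append(f"{pad}{key}: {str(value).lower()}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Hypothetical values; paths and class names are placeholders.
spec = {
    "autolabel_type": "grounding_dino",
    "gpu_ids": [0, 1],
    "num_gpus": 2,
    "batch_size": 4,
    "num_workers": 8,
    "results_dir": "/path/to/results",
    "grounding_dino": {
        "checkpoint": "/path/to/checkpoint",
        "visualize": True,
        "dataset": {
            "image_dir": ["/path/to/images"],
            "class_names": ["person", "car"],
        },
    },
}

spec_text = "\n".join(to_yaml(spec))
print(spec_text)
```

The hand-rolled emitter only keeps the sketch dependency-free; a YAML library would normally handle quoting and more complex values.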

Grounding DINO Configuration#

Field	value_type	Description
`model`	collection	The GroundingDINO model config
`train`	collection	The data source for testing:
`dataset`	collection	The data source for inference: image_dir (the list of directories that contain the inference images); class_names (the list of classes to auto-label); noun_chunk_path (the JSONL file that stores noun chunks); augmentation (the GroundingDINO augmentation config)
`results_dir`	string	Result directory
`iteration_scheduler`	string	The list of iteration schedules. The default is one iteration with a confidence threshold of 0.5. Each subsequent iteration skips classes/noun chunks that have already been detected.
`visualize`	bool	Flag to enable visualization of bounding boxes
`checkpoint`	string	Grounding DINO model checkpoint path

The process of using Grounding DINO to iteratively auto-label an image dataset is described as follows:

1. Run a single forward pass of the candidate images through a Grounding DINO model, which generates bounding box annotations for the list of grounded noun chunks or class names.
2. Aggregate the labels from this iteration with the labels from the previous iteration. The aggregation clusters similar annotations using a method such as NMS or DBSCAN.
3. Terminate the iterative labeling process when a predefined criterion is met, such as:
- The current iteration number exceeds the maximum number of iterations.
- All the classes in the input list of noun chunks and class names have corresponding labels, and no new labels have been added across iterations.
4. If the termination condition isn’t met, trigger another forward pass through the open-vocabulary model inferencer, this time at a lower confidence threshold. The rate at which the confidence threshold decreases is determined by the confidence threshold annealing scheduler (“confidence annealing”), which could use stepwise annealing, exponential decay, or cosine annealing.
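
The iterative process above can be sketched in plain Python. This is not the TAO implementation: the function names, the `(class_name, score, box)` label layout, the threshold values, and the `fake_detect` stand-in for the model are all hypothetical, and greedy per-class NMS is used as just one of the possible aggregation methods.

```python
import math


def stepwise(start, step, floor):
    """Lower the threshold by a fixed step per iteration, never below floor."""
    return lambda i: max(floor, start - step * i)


def exponential(start, decay, floor):
    """Multiply the threshold by `decay` per iteration, never below floor."""
    return lambda i: max(floor, start * decay ** i)


def cosine(start, floor, max_iters):
    """Cosine annealing from start (i = 0) to floor (i = max_iters - 1)."""
    def schedule(i):
        t = min(i, max_iters - 1) / max(1, max_iters - 1)
        return floor + 0.5 * (start - floor) * (1 + math.cos(math.pi * t))
    return schedule


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(labels, iou_thr=0.5):
    """Greedy per-class NMS over (class_name, score, box) tuples."""
    kept = []
    for lab in sorted(labels, key=lambda l: l[1], reverse=True):
        if all(lab[0] != k[0] or iou(lab[2], k[2]) < iou_thr for k in kept):
            kept.append(lab)
    return kept


def auto_label(detect, class_names, schedule, max_iters):
    """detect(names, threshold) stands in for one model forward pass."""
    labels = []
    for i in range(max_iters):  # upper bound on the number of iterations
        found = {l[0] for l in labels}
        remaining = [c for c in class_names if c not in found]
        merged = nms(labels + detect(remaining, schedule(i)))
        added = len(merged) > len(labels)
        labels = merged
        covered = {l[0] for l in labels} >= set(class_names)
        if covered and not added:  # all classes labeled, nothing new
            break
    return labels


def fake_detect(names, threshold):
    """Hypothetical detector returning fixed candidates above the threshold."""
    candidates = [("person", 0.62, (10, 10, 50, 80)),
                  ("dog", 0.34, (60, 40, 90, 70))]
    return [c for c in candidates if c[0] in names and c[1] >= threshold]


labels = auto_label(fake_detect, ["person", "dog"],
                    stepwise(0.5, 0.1, 0.2), max_iters=5)
print(labels)
```

In this toy run the "dog" candidate is only picked up once the threshold has been annealed below its score, which is the motivation for lowering the threshold across iterations.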

MAL Configuration#

See the Mask Auto Labeler (MAL) documentation for more information about creating a spec file.

Running the Auto-Label Tool#

The Auto-Label service supports the following tasks:

generate - Generates pseudo-labels based on the input bounding boxes

The Auto-Label service can be invoked from the TAO Launcher using the following convention on the command-line:

tao dataset auto_label generate [-h] -e <experiment spec>
                                [results_dir=<results_dir>]
                                [num_gpus=<num_gpus>]

Required Arguments#

-e, --experiment_spec_file: The experiment specification file

Optional Arguments#

results_dir: The directory where results are written.
num_gpus: The number of GPUs to use for inference. The default value is 1.
-h, --help: Show this help message and exit.

Here’s an example of using the Auto-Label generate command with an MAL model:

tao dataset auto_label generate -e /path/to/spec.yaml num_gpus=2
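
When the command is launched from a script, the argument list can be assembled programmatically. The sketch below follows the command-line convention shown above; the helper name `build_generate_command` is hypothetical, and the snippet only builds and prints the command rather than running it.

```python
import shlex


def build_generate_command(spec_path, results_dir=None, num_gpus=None):
    """Build the argv list for `tao dataset auto_label generate`."""
    cmd = ["tao", "dataset", "auto_label", "generate", "-e", spec_path]
    # Optional overrides use the key=value form shown in the usage text.
    if results_dir is not None:
        cmd.append(f"results_dir={results_dir}")
    if num_gpus is not None:
        cmd.append(f"num_gpus={num_gpus}")
    return cmd


cmd = build_generate_command("/path/to/spec.yaml", num_gpus=2)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```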