Auto-Label
Annotating an image dataset is tedious and time-consuming, especially for segmentation tasks: drawing a good polygon around an object can take 10 times longer than drawing a bounding box. The Auto-Label service of TAO Data Services is designed to reduce the time spent annotating an image dataset. Currently, this service supports:
* Automatically generating bounding box annotations given category names or referring expressions.
* Automatically generating instance segmentation masks given the groundtruth bounding boxes.
The Auto-Label service expects that the groundtruth annotation of a directory of images is stored in a COCO-format JSON file.
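For reference, a minimal sketch of a COCO-format annotation file is shown below, built with Python's standard json module. The file name, image size, category, and box values are made-up placeholders; only the top-level keys (images, annotations, categories) and the [x, y, width, height] box layout follow the COCO convention.

```python
import json

# Hypothetical example values; only the key layout follows the COCO convention.
coco = {
    "images": [
        {"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480},
    ],
    "categories": [
        {"id": 1, "name": "car"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # refers to images[].id
            "category_id": 1,    # refers to categories[].id
            # COCO boxes are [x_min, y_min, width, height] in pixels.
            "bbox": [100.0, 120.0, 200.0, 150.0],
            "area": 200.0 * 150.0,
            "iscrowd": 0,
        },
    ],
}


def to_coco_json(data):
    """Serialize the annotation dictionary to a JSON string."""
    return json.dumps(data, indent=2)


coco_json = to_coco_json(coco)
```

Writing coco_json to a file next to the image directory would give a file with the structure the service expects as input.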
Parameter | Datatype | Description |
---|---|---|
mal | collection | The configuration of MAL |
grounding_dino | collection | The configuration of Grounding DINO |
gpu_ids | list | Indices of GPUs to use |
num_gpus | int32 | Number of GPUs to use |
batch_size | int32 | Batch size |
num_workers | int32 | Number of workers for dataloader |
results_dir | string | Result directory |
autolabel_type | string | Type of auto-labeling to run (“mal” or “grounding_dino”) |
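As a rough illustration of how these top-level parameters fit together, the sketch below represents a spec as a Python dictionary and checks that autolabel_type names one of the two documented backends and that the matching configuration section is present. The validate_spec helper and all values are hypothetical and are not part of the TAO API.

```python
VALID_TYPES = ("mal", "grounding_dino")


def validate_spec(spec):
    """Return a list of problems found in a spec dictionary (empty if none)."""
    problems = []
    autolabel_type = spec.get("autolabel_type")
    if autolabel_type not in VALID_TYPES:
        problems.append(f"autolabel_type must be one of {VALID_TYPES}")
    elif autolabel_type not in spec:
        # The selected backend needs its own configuration collection.
        problems.append(f"missing '{autolabel_type}' configuration section")
    batch_size = spec.get("batch_size", 1)
    if not isinstance(batch_size, int) or batch_size < 1:
        problems.append("batch_size must be a positive integer")
    return problems


# Hypothetical spec values, using only the parameter names from the table.
spec = {
    "autolabel_type": "grounding_dino",
    "gpu_ids": [0],
    "num_gpus": 1,
    "batch_size": 4,
    "num_workers": 8,
    "results_dir": "/results/auto_label",
    "grounding_dino": {"visualize": True},
}

problems = validate_spec(spec)
```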
Grounding DINO Configuration
Field | value_type | Description |
---|---|---|
model | collection | The GroundingDINO model config |
train | collection | The data source for testing: |
dataset | collection | The data source for inference: image_dir (the list of directories that contain the inference images); class_names (the list of classes to run auto-labeling on); noun_chunk_path (the JSONL file that stores noun chunks); augmentation (the GroundingDINO augmentation config) |
results_dir | string | Result directory |
iteration_scheduler | string | The list of iteration schedules. The default is one iteration with a confidence threshold of 0.5. Each subsequent iteration eliminates classes/noun chunks that have already been detected. |
visualize | bool | Flag to enable visualization of bounding boxes |
checkpoint | string | Grounding DINO model checkpoint path |
The process of using Grounding DINO to iteratively auto-label an image dataset is described as follows:
1. A single forward pass of the candidate images is run through a Grounding DINO model, which generates bounding box annotations for the list of grounded noun chunks or class names.
2. The labels from this iteration are aggregated with the labels from the previous iteration. The aggregation clusters similar annotations using a method such as NMS or DBSCAN.
3. The iterative labeling process terminates when a predefined criterion is met, such as:
   * The current iteration number exceeds the maximum number of iterations.
   * All the classes in the input list of noun chunks and class names have corresponding labels, and no new labels have been added across iterations.
4. If the termination condition isn’t met, another forward pass through the open-vocabulary model is triggered, this time at a lower confidence threshold. The rate at which the confidence threshold decreases is determined by the confidence threshold annealing scheduler (“confidence annealing”), which could use stepwise annealing, exponential decay, or cosine annealing.
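The loop described above can be sketched in plain Python. This is an illustrative approximation, not the TAO implementation: the function names, default values (other than the 0.5 starting threshold), and the fake_detect stand-in for the Grounding DINO forward pass are all assumptions, and greedy per-class NMS is used as the aggregation step.

```python
import math


def stepwise_threshold(i, start=0.5, step=0.1, floor=0.1):
    """Drop the threshold by a fixed step each iteration."""
    return max(floor, start - step * i)


def exponential_threshold(i, start=0.5, decay=0.8, floor=0.1):
    """Multiply the threshold by a constant decay factor each iteration."""
    return max(floor, start * decay ** i)


def cosine_threshold(i, max_iters, start=0.5, floor=0.1):
    """Half-cosine from start (i = 0) down to floor (i = max_iters - 1)."""
    t = i / max(1, max_iters - 1)
    return floor + 0.5 * (start - floor) * (1 + math.cos(math.pi * t))


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(labels, iou_thresh=0.5):
    """Greedy per-class NMS: keep the highest-scoring box of each overlapping group."""
    kept = []
    for cand in sorted(labels, key=lambda d: d["score"], reverse=True):
        if all(k["name"] != cand["name"] or iou(k["box"], cand["box"]) < iou_thresh
               for k in kept):
            kept.append(cand)
    return kept


def auto_label(detect, names, max_iters=5, schedule=stepwise_threshold):
    """Iterate until every name has a label and nothing new is added, or max_iters."""
    labels, remaining = [], list(names)
    for i in range(max_iters):
        threshold = schedule(i)  # lower confidence threshold on each pass
        new = [d for d in detect(remaining) if d["score"] >= threshold]
        merged = nms(labels + new)  # aggregate with previous labels
        added = len(merged) > len(labels)
        labels = merged
        found = {d["name"] for d in labels}
        remaining = [n for n in names if n not in found]  # drop detected names
        if not remaining and not added:
            break
    return labels


def fake_detect(names):
    """Hypothetical stand-in for a Grounding DINO forward pass."""
    scores = {"car": (0.9, (10, 10, 50, 50)), "person": (0.35, (60, 20, 90, 80))}
    return [{"name": n, "score": scores[n][0], "box": scores[n][1]}
            for n in names if n in scores]


labels = auto_label(fake_detect, ["car", "person"])
```

With the stepwise schedule, the hypothetical "person" detection (score 0.35) is rejected at thresholds 0.5 and 0.4 and accepted once the threshold reaches 0.3. To use the cosine schedule, pass something like schedule=lambda i: cosine_threshold(i, max_iters=5).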
MAL Configuration
See the Mask Auto Labeler (MAL) documentation for more information about creating a spec file.
The Auto-Label service supports the following tasks:
* generate: Generates pseudo-labels based on the input bounding boxes
The Auto-Label service can be invoked from the TAO Launcher using the following convention on the command-line:
tao dataset auto_label generate [-h] -e <experiment spec>
[results_dir=<results_dir>]
[num_gpus=<num_gpus>]
Required Arguments
-e, --experiment_spec_file
: The experiment specification file
Optional Arguments
num_gpus
: The number of GPUs to use for inference. The default value is 1.
-h, --help
: Show this help message and exit.
Here’s an example of using the Auto-Label generate command with an MAL model:
tao dataset auto_label generate -e /path/to/spec.yaml num_gpus=2
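When the command is launched from a script, the same invocation can be assembled programmatically. The sketch below only builds the argument list following the convention shown above; build_generate_command is a hypothetical helper, not part of the TAO Launcher, and actually running the command requires the launcher to be installed.

```python
import shlex


def build_generate_command(spec_path, results_dir=None, num_gpus=None):
    """Build the argument list for `tao dataset auto_label generate`."""
    cmd = ["tao", "dataset", "auto_label", "generate", "-e", spec_path]
    # Optional overrides are passed as key=value pairs after the spec file.
    if results_dir is not None:
        cmd.append(f"results_dir={results_dir}")
    if num_gpus is not None:
        cmd.append(f"num_gpus={num_gpus}")
    return cmd


cmd = build_generate_command("/path/to/spec.yaml", num_gpus=2)
command_line = shlex.join(cmd)

# To launch it for real (assumes the TAO Launcher is on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```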