Mask Auto Labeler
Date: 2023-07-27
Mask Auto Labeler (MAL) is a high-quality, transformer-based mask auto-labeling framework for instance segmentation using only box annotations. It supports the following tasks:
train
evaluate
inference
These tasks may be invoked from the TAO Toolkit Launcher using the following convention on the command line:
tao model mal <sub_task> <args_per_subtask>
where args_per_subtask are the command-line arguments required for a given subtask. Each of these subtasks is explained in detail below.
Below is a sample MAL spec file. It has five components (model, inference, evaluate, dataset, and train) as well as several global parameters, which are described below. The spec file is written in YAML.
gpus: [0, 1]
strategy: 'ddp_sharded'
results_dir: '/path/to/result/dir'
dataset:
  train_ann_path: '/datasets/coco/annotations/instances_train2017.json'
  train_img_dir: '/datasets/coco/raw-data/train2017'
  val_ann_path: '/coco/annotations/instances_val2017.json'
  val_img_dir: '/datasets/coco/raw-data/val2017'
  load_mask: True
  crop_size: 512
inference:
  ann_path: '/dataset/sample.json'
  img_dir: '/dataset/sample_dir'
  label_dump_path: '/dataset/sample_output.json'
model:
  arch: 'vit-mae-base/16'
train:
  num_epochs: 10
  batch_size: 4
  use_amp: True
| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| gpus | A list of GPU indices to use | List of int | |
| strategy | The distributed training strategy | string | 'ddp_sharded' |
| num_nodes | The number of nodes in multinode training | Unsigned int | – |
| checkpoint | Either a pretrained model or a MAL checkpoint to load | string | – |
| results_dir | The directory to save experiment results to | string | – |
| dataset | The dataset config | Dict | – |
| train | The training config | Dict | – |
| model | The model config | Dict | – |
| evaluate | The evaluation config | Dict | – |
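The data types in the table above can be checked before launching a job. The following is a minimal sketch, assuming the spec file has already been parsed into a Python dict. The helper name check_global_fields is hypothetical and is not part of TAO, and treating every field as optional is an assumption of this sketch.

```python
# Hypothetical helper, not part of TAO. Expected types come from the table above.
EXPECTED_TYPES = {
    "gpus": list,
    "strategy": str,
    "num_nodes": int,
    "checkpoint": str,
    "results_dir": str,
    "dataset": dict,
    "train": dict,
    "model": dict,
    "evaluate": dict,
}


def check_global_fields(spec):
    """Return a list of problems found in the top-level fields of a parsed spec."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in spec:
            continue  # assumption: every field is optional in this sketch
        value = spec[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or isinstance(value, bool):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    gpus = spec.get("gpus", [])
    if isinstance(gpus, list):
        for g in gpus:
            if not isinstance(g, int) or isinstance(g, bool) or g < 0:
                problems.append("gpus: every entry must be a non-negative int")
                break
    num_nodes = spec.get("num_nodes")
    if isinstance(num_nodes, int) and not isinstance(num_nodes, bool) and num_nodes < 0:
        problems.append("num_nodes: must be an unsigned int")
    return problems


sample = {
    "gpus": [0, 1],
    "strategy": "ddp_sharded",
    "results_dir": "/path/to/result/dir",
    "dataset": {},
    "model": {},
    "train": {},
}
print(check_global_fields(sample))
```

An empty list means the top-level fields match the documented types. The check says nothing about whether the paths exist or whether TAO accepts the values.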
Dataset Config
The dataset configuration (dataset) defines the data source and input size.
| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| train_ann_path | The path to the training annotation JSON file | string | |
| val_ann_path | The path to the validation annotation JSON file | string | |
| train_img_dir | The path to the training image directory | string | |
| val_img_dir | The path to the validation image directory | string | |
| crop_size | The effective input size of the model | Unsigned int | 512 |
| load_mask | A flag specifying whether to load the segmentation mask from the JSON file | boolean | |
| min_obj_size | The minimum object size for training | float | 2048 |
| max_obj_size | The maximum object size for training | float | 1e10 |
| num_workers_per_gpu | The number of workers to load data for each GPU | Unsigned int | |
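The effect of min_obj_size and max_obj_size can be illustrated with a small filter. This is a sketch under stated assumptions: it treats "object size" as the box area in pixels (width times height of a COCO-style [x, y, w, h] box) and treats both bounds as inclusive. The source does not confirm either point, and filter_by_object_size is a hypothetical name.

```python
def filter_by_object_size(annotations, min_obj_size=2048, max_obj_size=1e10):
    """Keep annotations whose box area lies within [min_obj_size, max_obj_size].

    Assumption: size means box area in pixels; the real MAL rule may differ.
    """
    kept = []
    for ann in annotations:
        _, _, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        area = w * h
        if min_obj_size <= area <= max_obj_size:
            kept.append(ann)
    return kept


anns = [
    {"id": 1, "bbox": [0, 0, 32, 32]},  # area 1024, below the typical minimum
    {"id": 2, "bbox": [5, 5, 64, 64]},  # area 4096, kept
]
kept_ids = [a["id"] for a in filter_by_object_size(anns)]
print(kept_ids)
```

Under this reading, the typical minimum of 2048 excludes very small boxes from training.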
Model Config
The model configuration (model) defines the model architecture.
| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| arch | The backbone architecture, chosen from the supported backbones | string | vit-mae-base/16 |
| frozen_stages | The indices of the frozen blocks | List[int] | -1 |
| mask_head_num_convs | The number of conv layers in the mask head | Unsigned int | 4 |
| mask_head_hidden_channel | The number of conv channels in the mask head | Unsigned int | 256 |
| mask_head_out_channel | The number of output channels in the mask head | Unsigned int | 256 |
| teacher_momentum | The momentum of the teacher model | float | 0.996 |
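In teacher-student training, a teacher momentum usually controls an exponential moving average (EMA) of the student weights. The sketch below shows that common update rule on plain Python lists. Whether MAL applies teacher_momentum in exactly this form is an assumption, and ema_update is a hypothetical name.

```python
def ema_update(teacher, student, momentum=0.996):
    """Return new teacher weights as an EMA of the student weights.

    Assumed rule: teacher <- momentum * teacher + (1 - momentum) * student.
    """
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


teacher_weights = [1.0, 0.0]
student_weights = [0.0, 1.0]
teacher_weights = ema_update(teacher_weights, student_weights)
print(teacher_weights)
```

With a momentum close to 1, such as the typical 0.996, the teacher changes slowly and smooths out step-to-step noise in the student.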
Train Config
The training configuration (train) specifies the parameters for the training process.
| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| num_epochs | The number of epochs | Unsigned int | 10 |
| save_every_k_epoch | Save a checkpoint every K epochs | Unsigned int | 1 |
| val_interval | The validation interval | Unsigned int | 1 |
| batch_size | The training batch size | Unsigned int | |
| use_amp | A flag specifying whether to use mixed precision | boolean | True |
| optim_momentum | The momentum of the AdamW optimizer | float | 0.9 |
| lr | The learning rate | float | 0.0000015 |
| min_lr_rate | The minimum learning rate ratio | float | 0.2 |
| wd | The weight decay | float | 0.0005 |
| warmup_epochs | The number of epochs for warmup | Unsigned int | 1 |
| crf_kernel_size | The kernel size of the mean field approximation | Unsigned int | 3 |
| crf_num_iter | The number of iterations to run mask refinement | Unsigned int | 100 |
| loss_mil_weight | The weight of multiple instance learning loss | float | 4 |
| loss_crf_weight | The weight of conditional random field loss | float | 0.5 |
| results_dir | The directory to save training results | string | |
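The sketch below illustrates how lr, min_lr_rate, warmup_epochs, and the two loss weights could interact. It rests on several assumptions the source does not confirm: min_lr_rate is read as a fraction of lr, the schedule is a linear warmup followed by cosine decay, and the total loss is a weighted sum of the two terms. The function names are hypothetical.

```python
import math


def lr_at_epoch(epoch, num_epochs=10, warmup_epochs=1, lr=1.5e-6, min_lr_rate=0.2):
    """Learning rate at a (possibly fractional) epoch under an assumed schedule."""
    floor = lr * min_lr_rate  # assumed meaning of the minimum learning rate ratio
    if warmup_epochs > 0 and epoch < warmup_epochs:
        return lr * epoch / warmup_epochs  # linear warmup from 0 to lr
    decay_span = max(num_epochs - warmup_epochs, 1)
    progress = min((epoch - warmup_epochs) / decay_span, 1.0)
    # Cosine decay from lr down to the floor.
    return floor + 0.5 * (lr - floor) * (1.0 + math.cos(math.pi * progress))


def total_loss(loss_mil, loss_crf, loss_mil_weight=4.0, loss_crf_weight=0.5):
    """Assumed combination: weighted sum of the MIL and CRF loss terms."""
    return loss_mil_weight * loss_mil + loss_crf_weight * loss_crf


for epoch in (0, 1, 5, 10):
    print(epoch, lr_at_epoch(epoch))
print(total_loss(1.0, 2.0))
```

Under these assumptions, the learning rate peaks at lr when warmup ends and settles at lr * min_lr_rate by the last epoch.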
Evaluation Config
The evaluation configuration (evaluate) specifies the parameters for the validation during training as well as the standalone evaluation.
| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| batch_size | The evaluation batch size | Unsigned int | |
| use_mixed_model_test | A flag specifying whether to evaluate with a mixed model | boolean | False |
| use_teacher_test | A flag specifying whether to evaluate with the teacher model | boolean | False |
| results_dir | The directory to save the evaluation log | string | |
Inference Config
The inference configuration (inference) specifies the parameters for generating pseudo masks given the ground-truth bounding boxes in COCO format.
| Field | Description | Data Type and Constraints | Recommended/Typical Value |
|---|---|---|---|
| ann_path | The path to the annotation JSON file | string | |
| img_dir | The image directory | string | |
| label_dump_path | The path to save the output JSON file with pseudo masks | string | |
| batch_size | The inference batch size | Unsigned int | |
| load_mask | A flag specifying whether to load masks if the annotation file has them | boolean | False |
| results_dir | The directory to save the inference log | string | |
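Because inference takes ground-truth boxes in COCO format, it can help to see what a box-only annotation file looks like. The following sketch builds a minimal COCO-style structure with the standard images, annotations, and categories keys. The helper name, the file name sample.jpg, and the category name are hypothetical, and whether MAL requires any additional keys is not confirmed by the source.

```python
import json


def make_box_annotations(image_name, width, height, boxes, category_name="object"):
    """Build a minimal COCO-style dict with box-only annotations.

    boxes is a list of [x, y, w, h] in pixels, the COCO bounding-box convention.
    """
    images = [{"id": 1, "file_name": image_name, "width": width, "height": height}]
    categories = [{"id": 1, "name": category_name}]
    annotations = [
        {
            "id": index + 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": list(box),
            "area": box[2] * box[3],
            "iscrowd": 0,
        }
        for index, box in enumerate(boxes)
    ]
    return {"images": images, "annotations": annotations, "categories": categories}


coco = make_box_annotations("sample.jpg", 640, 480, [[10, 20, 100, 50]])
text = json.dumps(coco, indent=2)
print(text)
```

Writing text to the file named by ann_path, with the image placed under img_dir, gives inference a box-only input in the expected general shape.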
Train the MAL model using this command:
tao model mal train [-h] -e <experiment_spec_file>
                         [-r <results_dir>]
                         [--gpus <num_gpus>]
Required Arguments
-e, --experiment_spec_file: The experiment specification file
Optional Arguments
--gpus: The number of GPUs to use for training. The default value is 1.
-h, --help: Show this help message and exit.
Sample Usage
Here’s an example of using the train command on a MAL model:
tao model mal train --gpus 2 -e /path/to/spec.yaml
To run evaluation for a MAL model, use this command:
tao model mal evaluate [-h] -e <experiment_spec_file>
                            [-r <results_dir>]
                            [--gpus <num_gpus>]
Required Arguments
-e, --experiment_spec_file: The experiment specification file
Optional Arguments
--gpus: The number of GPUs to use for evaluation. The default value is 1.
-h, --help: Show this help message and exit.
The inference tool for MAL networks can be used to generate pseudo masks.
Here’s an example of using this tool:
tao model mal inference [-h] -e <experiment_spec_file>
                             [-r <results_dir>]
                             [--gpus <num_gpus>]
Required Arguments
-e, --experiment_spec_file: The experiment specification file
Optional Arguments
--gpus: The number of GPUs to use for inference. The default value is 1.
-h, --help: Show this help message and exit.
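The three subtasks share the same argument pattern, so a script can assemble the command line from one helper. This is a minimal sketch that uses only the flags shown in the usage lines above (-e, -r, --gpus). The helper name build_mal_command is hypothetical, and the sketch only builds the argument list without running anything.

```python
import shlex

SUBTASKS = ("train", "evaluate", "inference")


def build_mal_command(subtask, spec_path, gpus=1, results_dir=None):
    """Return the argument list for a `tao model mal <subtask>` invocation."""
    if subtask not in SUBTASKS:
        raise ValueError(f"unknown subtask: {subtask!r}")
    command = ["tao", "model", "mal", subtask, "-e", spec_path, "--gpus", str(gpus)]
    if results_dir is not None:
        command += ["-r", results_dir]
    return command


train_cmd = build_mal_command("train", "/path/to/spec.yaml", gpus=2)
print(shlex.join(train_cmd))
```

The resulting list can be passed to subprocess.run on a machine where the TAO Toolkit Launcher is installed.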