Mask Auto Labeler#
Mask Auto Labeler (MAL) is a high-quality, transformer-based mask auto-labeling framework for instance segmentation using only box annotations. It supports the following tasks:
train
evaluate
inference
Creating a Configuration File#
| Parameter | Datatype | Default | Description | Supported Values |
| --- | --- | --- | --- | --- |
| model | dict config | – | The configuration of the model architecture | |
| dataset | dict config | – | The configuration of the dataset | |
| train | dict config | – | The configuration of the training task | |
| evaluate | dict config | – | The configuration of the evaluation task | |
| inference | dict config | – | The configuration of the inference task | |
| | string | None | The encryption key to encrypt and decrypt model files | |
| | string | /results | The directory where experiment results are saved | |
| | string | 'ddp' | The distributed training strategy | 'ddp', 'fsdp' |
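As a rough illustration of how the top-level settings fit together, the sketch below assembles them into a plain Python dictionary. The section names `model`, `dataset`, `train`, `evaluate`, and `inference` come from the sections that follow; the key names `encryption_key`, `results_dir`, and `strategy` are placeholders assumed for this example, because the table above does not show them.

```python
# Supported values for the distributed training strategy, per the table above.
ALLOWED_STRATEGIES = {"ddp", "fsdp"}


def make_experiment_config(strategy="ddp", results_dir="/results", encryption_key=None):
    """Build a top-level config skeleton (key names partly hypothetical)."""
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unsupported strategy: {strategy!r}")
    return {
        # The five sub-configurations are left empty here; each one is
        # described in its own section of this page.
        "model": {},
        "dataset": {},
        "train": {},
        "evaluate": {},
        "inference": {},
        # Placeholder key names (assumed, not confirmed by the table).
        "encryption_key": encryption_key,
        "results_dir": results_dir,
        "strategy": strategy,
    }


config = make_experiment_config()
```

Validating `strategy` up front is a design choice of this sketch: it surfaces a typo before any training job is launched.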
Dataset Config#
The dataset configuration (dataset) defines the data source and input size.
| Field | Datatype | Default | Description | Supported Values |
| --- | --- | --- | --- | --- |
| | string | – | The path to the training annotation JSON file | |
| | string | – | The path to the validation annotation JSON file | |
| | string | – | The path to the training image directory | |
| | string | – | The path to the validation image directory | |
| | unsigned int | 512 | The effective input size of the model | |
| | boolean | True | A flag specifying whether to load the segmentation mask from the JSON file | |
| | float | 2048 | The minimum object size for training | |
| | float | 1e10 | The maximum object size for training | |
| | unsigned int | | The number of workers to load data for each GPU | |
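The minimum and maximum object sizes act as a filter on which annotated objects are used for training. The sketch below shows one plausible reading, assuming that "size" means bounding-box area in pixels (width × height of a COCO-style `[x, y, width, height]` box) and that both bounds are inclusive. Neither assumption is confirmed by the table, and the function name is hypothetical.

```python
def filter_by_object_size(annotations, min_size=2048.0, max_size=1e10):
    """Keep annotations whose box area lies within [min_size, max_size].

    Assumes COCO-style boxes: [x, y, width, height]. Treating "object
    size" as box area is an assumption made for this sketch.
    """
    kept = []
    for ann in annotations:
        _, _, width, height = ann["bbox"]
        if min_size <= width * height <= max_size:
            kept.append(ann)
    return kept


annotations = [
    {"id": 1, "bbox": [10, 10, 30, 30]},  # area 900: below the default minimum
    {"id": 2, "bbox": [0, 0, 64, 64]},    # area 4096: kept
]
kept = filter_by_object_size(annotations)
```

With the default bounds, only the second annotation survives, since 900 is less than 2048.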
Model Config#
The model configuration (model) defines the model architecture.
| Field | Datatype | Default | Description | Supported Values |
| --- | --- | --- | --- | --- |
| | string | vit-mae-base/16 | The backbone architecture. Supported backbones include the following: | |
| | List[int] | [-1] | The indices of the frozen blocks | |
| | unsigned int | 4 | The number of conv layers in the mask head | |
| | unsigned int | 256 | The number of conv channels in the mask head | |
| | unsigned int | 256 | The number of output channels in the mask head | |
| | float | 0.996 | The momentum of the teacher model | |
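The teacher momentum is most naturally read as the coefficient of an exponential moving average (EMA) that tracks the student weights. The sketch below shows that update on plain lists of floats; the EMA interpretation and the function name are assumptions for illustration, not a description of the actual MAL implementation.

```python
def ema_update(teacher, student, momentum=0.996):
    """Return teacher weights after one EMA step toward the student.

    teacher_new = momentum * teacher + (1 - momentum) * student
    """
    return [
        momentum * t + (1.0 - momentum) * s
        for t, s in zip(teacher, student)
    ]


teacher = [1.0, 0.0]
student = [0.0, 1.0]
teacher = ema_update(teacher, student)  # moves 0.4% of the way to the student
```

A momentum close to 1, such as the default 0.996, makes the teacher change slowly, so it averages the student over many steps.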
Train Config#
The training configuration (train) specifies the parameters for the training process.
| Parameter | Datatype | Default | Description | Supported Values |
| --- | --- | --- | --- | --- |
| | unsigned int | 1 | The number of GPUs to use for distributed training | >0 |
| | List[int] | [0] | The indices of the GPUs to use for distributed training | |
| | unsigned int | 1234 | The random seed for random, numpy, and torch | >0 |
| | unsigned int | 10 | The total number of epochs to run the experiment | >0 |
| | unsigned int | 1 | The epoch interval at which the checkpoints are saved | >0 |
| | unsigned int | 1 | The epoch interval at which the validation is run | >0 |
| | string | | The intermediate PyTorch Lightning checkpoint to resume training from | |
| | string | /results/train | The directory to save training results | |
| | unsigned int | | The training batch size | |
| | boolean | True | A flag specifying whether to use mixed precision | |
| | float | 0.9 | The momentum of the AdamW optimizer | |
| | float | 0.0000015 | The learning rate | |
| | float | 0.2 | The minimum learning rate ratio | |
| | float | 0.0005 | The weight decay | |
| | unsigned int | 1 | The number of epochs for warmup | |
| | unsigned int | 3 | The kernel size of the mean field approximation | |
| | unsigned int | 100 | The number of iterations to run mask refinement | |
| | float | 4 | The weight of multiple instance learning loss | |
| | float | 0.5 | The weight of conditional random field loss | |
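To show how the learning rate, minimum learning rate ratio, and warmup epochs could interact, the sketch below uses a linear warmup followed by cosine decay down to `base_lr * min_lr_ratio`. The cosine shape is an assumption; the table only states the three values. The second function combines the two loss terms as a weighted sum, which is likewise an assumed reading of the loss weights. All function and parameter names are hypothetical.

```python
import math


def lr_at_epoch(epoch, base_lr=1.5e-6, min_lr_ratio=0.2,
                warmup_epochs=1, num_epochs=10):
    """Learning rate at a given epoch (assumed warmup + cosine schedule)."""
    if epoch < warmup_epochs:
        # Linear warmup: reach base_lr at the end of the warmup phase.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (epoch - warmup_epochs) / max(1, num_epochs - warmup_epochs)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    # Decay from base_lr down to base_lr * min_lr_ratio.
    return base_lr * (min_lr_ratio + (1.0 - min_lr_ratio) * cosine)


def total_loss(mil_loss, crf_loss, mil_weight=4.0, crf_weight=0.5):
    """Weighted sum of the MIL and CRF loss terms (assumed combination)."""
    return mil_weight * mil_loss + crf_weight * crf_loss


schedule = [lr_at_epoch(e) for e in range(10)]
```

Under these assumptions the rate never drops below 20% of the base learning rate, which is what a minimum ratio of 0.2 suggests.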
Evaluation Config#
The evaluation configuration (evaluate) specifies the parameters for the validation during training as well as the standalone evaluation.
| Field | Datatype | Default | Description | Supported Values |
| --- | --- | --- | --- | --- |
| | string | | The path to the PyTorch model to evaluate | |
| | string | /results/evaluate | The directory to save evaluation results | |
| | unsigned int | 1 | The number of GPUs to use for distributed evaluation | >0 |
| | List[int] | [0] | The indices of the GPUs to use for distributed evaluation | |
| | unsigned int | | The evaluation batch size | |
| | boolean | False | A flag specifying whether to evaluate with a mixed model | |
| | boolean | False | A flag specifying whether to evaluate with the teacher model | |
Inference Config#
The inference configuration (inference) specifies the parameters for generating pseudo masks from ground-truth bounding boxes in COCO format.
| Field | Datatype | Default | Description | Supported Values |
| --- | --- | --- | --- | --- |
| | string | | The path to the PyTorch model to use for inference | |
| | string | /results/inference | The directory to save inference results | |
| | unsigned int | 1 | The number of GPUs to use for distributed inference | >0 |
| | List[int] | [0] | The indices of the GPUs to use for distributed inference | |
| | string | | The path to the annotation JSON file | |
| | string | | The image directory | |
| | string | | The path to save the output JSON file with pseudo masks | |
| | unsigned int | | The inference batch size | |
| | boolean | False | A flag specifying whether to load masks if the annotation file has them | |
Running Inference#
The inference tool for MAL networks can be used to generate pseudo masks.
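Because inference starts from box-only annotations in COCO format, it can help to see what such a file looks like. The sketch below builds a minimal COCO-style annotation dictionary with bounding boxes and no `segmentation` entries, then serializes it to JSON. The helper name, file name, and category are made up for illustration; only the general COCO layout (`images`, `annotations`, `categories`, and `bbox` as `[x, y, width, height]`) is standard.

```python
import json


def make_box_only_coco(images, boxes, categories):
    """Build a COCO-style dict containing boxes but no segmentation masks.

    images:     list of (image_id, file_name, width, height)
    boxes:      list of (image_id, category_id, [x, y, width, height])
    categories: list of (category_id, name)
    """
    return {
        "images": [
            {"id": i, "file_name": name, "width": w, "height": h}
            for i, name, w, h in images
        ],
        "annotations": [
            {
                "id": ann_id,
                "image_id": image_id,
                "category_id": category_id,
                "bbox": bbox,
                "area": bbox[2] * bbox[3],  # box area, since no mask exists yet
                "iscrowd": 0,
            }
            for ann_id, (image_id, category_id, bbox) in enumerate(boxes, start=1)
        ],
        "categories": [{"id": c, "name": name} for c, name in categories],
    }


coco = make_box_only_coco(
    images=[(1, "example.jpg", 640, 480)],   # hypothetical image
    boxes=[(1, 1, [100, 120, 64, 48])],
    categories=[(1, "object")],              # hypothetical category
)
coco_json = json.dumps(coco, indent=2)
```

A file of this shape is the kind of input the annotation path refers to; the output JSON would then carry the generated pseudo masks alongside the boxes.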