SegFormer
SegFormer is an NVIDIA-developed semantic-segmentation model that is included in the TAO Toolkit. SegFormer supports the following tasks:
train
evaluate
inference
export
These tasks can be invoked from the TAO Toolkit Launcher using the following convention on the command-line:
tao model segformer <sub_task> <args_per_subtask>
where args_per_subtask are the command-line arguments required for a given subtask. Each subtask is explained in detail in the following sections.
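The command convention above can also be assembled programmatically. The following is a minimal Python sketch and is not part of TAO: the helper name build_tao_command is hypothetical, and it only formats the flags and key=value overrides that appear on this page without validating them.

```python
import shlex


def build_tao_command(sub_task, spec_file, flags=None, overrides=None):
    """Build the argument list for `tao model segformer <sub_task>`.

    Hypothetical helper for illustration only. It formats arguments but
    does not check that they are valid for the given subtask.
    """
    allowed = ("train", "evaluate", "inference", "export")
    if sub_task not in allowed:
        raise ValueError(f"unsupported subtask: {sub_task!r}")
    cmd = ["tao", "model", "segformer", sub_task, "-e", str(spec_file)]
    # Short flags such as -r <results_dir> or -g <num_gpus>.
    for flag, value in (flags or {}).items():
        cmd += [flag, str(value)]
    # Overrides such as evaluate.checkpoint=<path> or results_dir=<path>.
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    return cmd


train_cmd = build_tao_command(
    "train", "spec.yaml", flags={"-r": "results", "-g": 1}
)
print(shlex.join(train_cmd))
```

The returned list can be passed to subprocess.run as-is, which avoids shell quoting problems when paths contain spaces.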
SegFormer requires the data to be provided as image and mask folders. See the Data Annotation Format page for more information about the input data format for SegFormer.
Configuration for Custom Dataset
This documentation shows example configurations and commands for training on the ISBI dataset (ISBI Challenge: Segmentation of neuronal structures in EM stacks), which is used for binary segmentation. For more details, refer to the example notebook in TAO Computer Vision samples.
The ISBI dataset contains grayscale images, so input_type is set to grayscale. For RGB input images, set input_type to rgb instead of grayscale.
Configure the img_norm_cfg mean and standard deviation based on your input dataset.
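One way to obtain dataset-specific values for img_norm_cfg is to compute per-channel statistics over the training images. The sketch below is an illustration under an assumption, namely that mean and std are per-channel statistics of the raw 0-255 pixel values; the function name channel_mean_std is hypothetical, and reading pixels from image files is left out so that the example stays self-contained.

```python
import math


def channel_mean_std(pixels):
    """Return (mean, std) per channel for an iterable of pixel tuples.

    `pixels` yields equal-length tuples of 0-255 values, e.g. (r, g, b).
    """
    count = 0
    sums = None
    sq_sums = None
    for px in pixels:
        if sums is None:
            sums = [0.0] * len(px)
            sq_sums = [0.0] * len(px)
        for c, value in enumerate(px):
            sums[c] += value
            sq_sums[c] += value * value
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    mean = [s / count for s in sums]
    # Population standard deviation: sqrt(E[x^2] - E[x]^2), clamped at 0.
    std = [
        math.sqrt(max(sq / count - m * m, 0.0))
        for sq, m in zip(sq_sums, mean)
    ]
    return mean, std


# Toy data: four grayscale pixels replicated to three channels.
toy_pixels = [(0, 0, 0), (255, 255, 255), (0, 0, 0), (255, 255, 255)]
mean, std = channel_mean_std(toy_pixels)
print(mean, std)
```

For a real dataset, the same function can be fed the pixels of every training image in turn; the resulting lists map directly onto the mean and std entries of img_norm_cfg.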
Here is an example spec file for training a SegFormer model with an mit_b5 backbone on an ISBI dataset.
train:
  exp_config:
    manual_seed: 49
  checkpoint_interval: 200
  logging_interval: 50
  max_iters: 1000
  resume_training_checkpoint_path: null
  validate: True
  validation_interval: 500
  trainer:
    find_unused_parameters: True
    sf_optim:
      lr: 0.00006
model:
  input_height: 512
  input_width: 512
  pretrained_model_path: null
  backbone:
    type: "mit_b5"
dataset:
  data_root: /tlt-pytorch
  input_type: "grayscale"
  img_norm_cfg:
    mean:
      - 127.5
      - 127.5
      - 127.5
    std:
      - 127.5
      - 127.5
      - 127.5
    to_rgb: True
  train_dataset:
    img_dir:
      - /data/images/train
    ann_dir:
      - /data/masks/train
    pipeline:
      augmentation_config:
        random_crop:
          cat_max_ratio: 0.75
        resize:
          ratio_range:
            - 0.5
            - 2.0
        random_flip:
          prob: 0.5
  val_dataset:
    img_dir: /data/images/val
    ann_dir: /data/masks/val
  palette:
    - seg_class: foreground
      rgb:
        - 0
        - 0
        - 0
      label_id: 0
      mapping_class: foreground
    - seg_class: background
      rgb:
        - 255
        - 255
        - 255
      label_id: 1
      mapping_class: background
  repeat_data_times: 500
  batch_size: 4
  workers_per_gpu: 1
The SegFormer training experiment specification consists of three main components:
train
dataset
model
The train config contains the parameters related to training. They are described as follows:
Parameter | Datatype | Default | Description | Supported Values |
exp_config | Dict | None | The experiment-level settings. Contains manual_seed (int, default 49): the random seed that makes training deterministic | – |
max_iters | int | 10 | The maximum number of iterations (steps) for which training is run | |
checkpoint_interval | int | 1 | The interval, in steps, at which checkpoints are saved | |
logging_interval | int | 10 | The interval, in steps, at which experiment logs are saved. The logs are saved in the logs directory. | |
resume_training_checkpoint_path | str | None | The path to the checkpoint from which to resume training | |
validate | bool | False | A flag to enable validation during training | |
validation_interval | int | | The interval, in iterations, at which validation is performed during training | |
trainer | Dict | None | The parameters required by the MMSeg trainer. Contains find_unused_parameters (bool, default False) | – |
sf_optim
sf_optim:
  lr: 0.00006
  betas:
    - 0.0
    - 0.999
  paramwise_cfg:
    pos_block:
      decay_mult: 0.0
    norm:
      decay_mult: 0.0
    head:
      lr_mut: 10.0
  weight_decay: 5e-4
Parameter | Datatype | Default | Description | Supported Values |
lr | float | 0.00006 | The learning rate | >=0.0 |
betas | List[float] | [0.0, 0.9] | The beta parameters in the Adam optimizer | >=0.0 |
paramwise_cfg | Dict | None | The per-parameter-group configuration for the Adam optimizer, with entries such as pos_block, norm, and head as in the sf_optim example | – |
weight_decay | float | 5e-4 | The weight decay hyper-parameter for regularization | >=0.0 |
lr_config
lr_config:
  warmup_iters: 1500
  warmup_ratio: 1e-6
  power: 1.0
  min_lr: 0.0
Parameter | Datatype | Default | Description | Supported Values |
warmup_iters | int | 1500 | The number of iterations or epochs that warmup lasts | >=0.0 |
warmup_ratio | float | 1e-6 | The ratio that sets the LR at the beginning of warmup, which is equal to warmup_ratio * initial_lr | >=0.0 |
power | float | 1.0 | The power to which the multiplying coefficients are raised | >=0.0 |
min_lr | float | 0.0 | The minimum LR for the LR scheduler | >=0.0 |
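To see how the lr_config parameters interact, the following Python sketch computes a learning rate per iteration. It assumes an MMCV-style polynomial decay with linear warmup, which this page does not confirm; the function name poly_lr_with_warmup is hypothetical, and the default base_lr and max_iters are taken from the example spec.

```python
def poly_lr_with_warmup(
    cur_iter,
    base_lr=6e-5,
    max_iters=1000,
    power=1.0,
    min_lr=0.0,
    warmup_iters=1500,
    warmup_ratio=1e-6,
):
    """Return the learning rate at `cur_iter` (assumed MMCV-style policy)."""
    # Polynomial decay from base_lr down to min_lr over max_iters.
    progress = min(cur_iter, max_iters) / max_iters
    regular_lr = (base_lr - min_lr) * (1.0 - progress) ** power + min_lr
    if cur_iter < warmup_iters:
        # Linear warmup: the scale grows from warmup_ratio at iteration 0
        # to 1.0 at iteration warmup_iters.
        k = (1.0 - cur_iter / warmup_iters) * (1.0 - warmup_ratio)
        return regular_lr * (1.0 - k)
    return regular_lr


for it in (0, 750, 1500, 80000):
    print(it, poly_lr_with_warmup(it, max_iters=160000))
```

Under this assumption, the rate at iteration 0 equals warmup_ratio * initial_lr, as the table states, and it reaches min_lr at max_iters.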
The following example model config provides options to change the SegFormer architecture for training.
model:
  input_height: 512
  input_width: 512
  pretrained_model_path: null
  backbone:
    type: "mit_b5"
The model config is also used during SegFormer evaluation and inference. Its parameters are described as follows:
Parameter | Datatype | Default | Description | Supported Values |
pretrained_model_path | string | None | The optional path to the pretrained backbone file | string to the path |
backbone | Dict | None | A dictionary containing the configurable backbone parameters, such as type | mit_b0, mit_b1 |
 | Dict, Float | None, 0.1 | A dictionary containing the decoder parameters | 256, 512, 768; >=0.0 |
input_width | int | 512 | Input width of the model | >0 |
input_height | int | 512 | Input height of the model | >0 |
The dataset parameter defines the dataset source, training batch size, and augmentation. An example dataset config is provided below.
dataset:
  data_root: /tlt-pytorch
  input_type: "grayscale"
  img_norm_cfg:
    mean:
      - 127.5
      - 127.5
      - 127.5
    std:
      - 127.5
      - 127.5
      - 127.5
    to_rgb: True
  train_dataset:
    img_dir:
      - /data/images/train
    ann_dir:
      - /data/masks/train
    pipeline:
      augmentation_config:
        random_crop:
          cat_max_ratio: 0.75
        resize:
          ratio_range:
            - 0.5
            - 2.0
        random_flip:
          prob: 0.5
  val_dataset:
    img_dir: /data/images/val
    ann_dir: /data/masks/val
  palette:
    - seg_class: foreground
      rgb:
        - 0
        - 0
        - 0
      label_id: 0
      mapping_class: foreground
    - seg_class: background
      rgb:
        - 255
        - 255
        - 255
      label_id: 1
      mapping_class: background
  repeat_data_times: 500
  batch_size: 4
  workers_per_gpu: 1
Parameter | Datatype | Default | Description | Supported Values |
img_norm_cfg | Dict | None | The image normalization config, which contains the mean, std, and to_rgb parameters | >=0, <=255 |
input_type | String | “rgb” | Whether the input type is RGB or grayscale | “rgb”, “grayscale” |
palette | List[Dict] | None | The palette config. Each entry contains seg_class, rgb, label_id, and mapping_class | string |
batch_size | unsigned int | 32 | The batch size for training and validation | >0 |
workers_per_gpu | unsigned int | 8 | The number of parallel workers processing data | >0 |
train_dataset | dict config | None | The parameters that define the training dataset: img_dir and ann_dir (the image and mask folders) and pipeline (a dict config that holds augmentation_config) | – |
val_dataset | dict config | None | The validation config, which contains the img_dir and ann_dir parameters for validation | >=0 |
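The relationship between the rgb and label_id fields of a palette entry can be illustrated with a small lookup. This is a sketch for illustration only and does not describe how TAO reads masks internally; the function names are hypothetical, the palette values are copied from the example config, and mapping_class is ignored here.

```python
def build_color_to_label(palette):
    """Map each palette entry's RGB triple to its label_id."""
    return {tuple(entry["rgb"]): entry["label_id"] for entry in palette}


def mask_to_labels(rgb_mask, color_to_label):
    """Convert rows of (r, g, b) pixels to rows of label ids."""
    return [[color_to_label[tuple(px)] for px in row] for row in rgb_mask]


palette = [
    {"seg_class": "foreground", "rgb": [0, 0, 0],
     "label_id": 0, "mapping_class": "foreground"},
    {"seg_class": "background", "rgb": [255, 255, 255],
     "label_id": 1, "mapping_class": "background"},
]
lookup = build_color_to_label(palette)
labels = mask_to_labels([[(0, 0, 0), (255, 255, 255)]], lookup)
print(labels)
```

A color that is missing from the palette raises a KeyError in this sketch, which is a quick way to spot mask pixels that the palette does not cover.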
augmentation_config
Parameter | Datatype | Default | Description | Supported Values |
random_crop | Dict | None | The random_crop config, which contains parameters such as cat_max_ratio | 0< h,w <= img_ht, img_wd |
resize | Dict, Bool | None, [0.5, 2.0], True | The resize config, which contains configurable parameters such as ratio_range | >=0, True/False |
random_flip | Dict | None | The random_flip config, which contains the prob parameter for flipping augmentation | >=0.0 |
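The cat_max_ratio parameter can be understood as a check on how much of a crop a single class may occupy. The sketch below assumes the MMSeg-style rule, where a crop is accepted only if at least two classes are present and the largest class covers less than cat_max_ratio of the pixels; whether TAO applies exactly this rule is an assumption, and both the function name and the ignore_index default of 255 are hypothetical.

```python
from collections import Counter


def crop_is_acceptable(crop_labels, cat_max_ratio=0.75, ignore_index=255):
    """Return True if no single class dominates the cropped label map.

    Assumed MMSeg-style rule: at least two classes must be present and the
    largest class must cover less than `cat_max_ratio` of the counted pixels.
    """
    counts = Counter(
        label for row in crop_labels for label in row if label != ignore_index
    )
    if len(counts) < 2:
        return False
    total = sum(counts.values())
    return max(counts.values()) / total < cat_max_ratio


balanced = [[0, 1], [1, 0]]  # the largest class covers half of the crop
skewed = [[1, 1], [1, 1]]    # a single class covers the whole crop
print(crop_is_acceptable(balanced), crop_is_acceptable(skewed))
```

In an MMSeg-style pipeline a rejected crop is re-sampled a limited number of times, so a lower cat_max_ratio pushes training crops toward regions that contain more than one class.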
Use the following command to run SegFormer training:
tao model segformer train -e <experiment_spec_file> \
                          -r <results_dir> \
                          -g <num_gpus>
Required Arguments
-e, --experiment_spec_file: The path to the experiment spec file.
-r, --results_dir: The path to a folder where the experiment outputs should be written.
Optional Arguments
-g, --num_gpus: The number of GPUs to be used for training. The default value is 1.
Here’s an example of using the SegFormer training command:
tao model segformer train -e $DEFAULT_SPEC -r $RESULTS_DIR -g $NUM_GPUS
The evaluation metric of SegFormer is the mean IoU (mIoU), which is the intersection over union computed per class and then averaged over the classes. For more details on the mean IoU metric, refer to meanIOU.
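As an illustration of the metric, the following Python sketch computes per-class IoU and the mean IoU for two small label sequences. It is a plain reference implementation for understanding the numbers, not the code that TAO runs; the function names are hypothetical, and classes that appear in neither the prediction nor the ground truth are skipped when averaging.

```python
def per_class_iou(pred, target, num_classes):
    """Return the IoU of each class for two flat, equal-length label lists."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for cls in range(num_classes):
        pairs = list(zip(pred, target))
        intersection = sum(1 for p, t in pairs if p == cls and t == cls)
        union = sum(1 for p, t in pairs if p == cls or t == cls)
        # A class absent from both inputs has no defined IoU.
        ious.append(intersection / union if union else None)
    return ious


def mean_iou(pred, target, num_classes):
    """Average the defined per-class IoU values."""
    ious = [v for v in per_class_iou(pred, target, num_classes) if v is not None]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


target = [0, 0, 1, 1, 1, 1]
pred = [0, 1, 1, 1, 1, 0]
print(per_class_iou(pred, target, 2), mean_iou(pred, target, 2))
```

Here class 0 has an intersection of 1 pixel and a union of 3, class 1 has an intersection of 3 and a union of 5, and the mean IoU is the average of the two ratios.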
Use the following command to run SegFormer evaluation:
tao model segformer evaluate -e <experiment_spec> \
                             -g <num GPUs> \
                             evaluate.checkpoint=<evaluation model> \
                             results_dir=<path to output evaluation results>
Required Arguments
-e, --experiment_spec_file: The experiment spec file to set up the evaluation experiment.
evaluate.checkpoint: The .pth model.
Optional Argument
-g, --num_gpus: The number of GPUs to be used for evaluation. The default value is 1.
Here’s an example of using the SegFormer evaluation command:
tao model segformer evaluate -e $DEFAULT_SPEC -g $NUM_GPUS evaluate.checkpoint=$TRAINED_PTH_MODEL results_dir=$PATH_TO_RESULTS_DIR
The command reports per-class and summary metrics in the following form:
+------------+-------+-------+
| Class | IoU | Acc |
+------------+-------+-------+
| foreground | 37.81 | 44.56 |
| background | 83.81 | 95.51 |
+------------+-------+-------+
Summary:
+--------+-------+-------+-------+
| Scope | mIoU | mAcc | aAcc |
+--------+-------+-------+-------+
| global | 60.81 | 70.03 | 85.26 |
+--------+-------+-------+-------+
...
Use the following command to run inference on SegFormer with the .pth model.
tao model segformer inference -e <experiment_spec>
inference.checkpoint=<inference model>
results_dir=<path to output directory for inference>
The output mask PNG images with class IDs are saved in vis_tao.
The overlaid mask images are saved in mask_tao.
Required Arguments
-e, --experiment_spec: The experiment spec file to set up inference.
inference.checkpoint: The .pth model to perform inference with.
results_dir: The path to save the inference masks and mask-overlaid images to. Inference creates two directories.
Optional Argument
-g, --num_gpus: The number of GPUs to be used for inference. The default value is 1.
Here’s an example of using the SegFormer inference command:
tao model segformer inference -e $DEFAULT_SPEC -g $NUM_GPUS inference.checkpoint=$TRAINED_PTH_MODEL results_dir=$OUTPUT_FOLDER
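After inference finishes, the two output directories can be inspected with a few lines of Python. The sketch below assumes that vis_tao and mask_tao are created directly under results_dir and that the outputs are PNG files; the function name list_inference_outputs is hypothetical, and the demo uses a temporary directory in place of a real results folder.

```python
import tempfile
from pathlib import Path


def list_inference_outputs(results_dir):
    """Return the sorted PNG file names found in vis_tao and mask_tao."""
    root = Path(results_dir)
    outputs = {}
    for sub_dir in ("vis_tao", "mask_tao"):
        folder = root / sub_dir
        if folder.is_dir():
            outputs[sub_dir] = sorted(p.name for p in folder.glob("*.png"))
        else:
            # Assumed layout not found; report the folder as empty.
            outputs[sub_dir] = []
    return outputs


# Demo on a temporary directory that imitates the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "vis_tao").mkdir()
    (Path(tmp) / "vis_tao" / "img_0.png").touch()
    demo = list_inference_outputs(tmp)
print(demo)
```

Comparing the two lists is a simple check that every input image produced both a mask and an overlay.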
Use the following command to export the model.
tao model segformer export [-h] -e <experiment spec file> \
                           results_dir=<path to results dir> \
                           export.checkpoint=<trained pth model to be exported> \
                           export.onnx_file=<onnx path>
Required Arguments
-e, --experiment_spec: The path to an experiment spec file.
results_dir: The path where the logs for export will be saved.
export.checkpoint: The .pth model to be exported.
export.onnx_file: The .onnx file to be stored.
Sample Usage
The following is an example export command:
tao model segformer export -e /path/to/spec.yaml export.checkpoint=/path/to/model.pth export.onnx_file=/path/to/model.onnx results_dir=/path/to/export_result_dir
For deployment, refer to the TAO Deploy documentation.
Refer to the Integrating a SegFormer Model page for more information about deploying a SegFormer model to DeepStream.