Date: 2024-08-26
Visual ChangeNet-Segmentation
Visual ChangeNet-Segmentation is an NVIDIA-developed semantic change segmentation model and is included in the TAO Toolkit. Visual ChangeNet supports the following tasks:
train
evaluate
inference
export
These tasks can be invoked from the TAO Toolkit Launcher using the following convention on the command-line:
tao model visual_changenet <sub_task> <args_per_subtask>
where args_per_subtask are the command-line arguments required for a given subtask. Each subtask is explained in the following sections.
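The convention above can be sketched as a small helper that assembles the argument list for any subtask. This is a minimal illustration and not part of TAO: the name build_tao_command is hypothetical, and the helper only builds the list. Passing the result to subprocess.run would assume the tao launcher is installed and on PATH.

```python
from typing import List, Sequence

# Subtasks listed on this page for visual_changenet.
SUBTASKS = ("train", "evaluate", "inference", "export")


def build_tao_command(sub_task: str, args: Sequence[str] = ()) -> List[str]:
    """Return the argv list for `tao model visual_changenet <sub_task> <args>`.

    Hypothetical helper for illustration only; it does not run anything.
    """
    if sub_task not in SUBTASKS:
        raise ValueError(f"unknown subtask {sub_task!r}; expected one of {SUBTASKS}")
    return ["tao", "model", "visual_changenet", sub_task, *args]


if __name__ == "__main__":
    cmd = build_tao_command("train", ["-e", "spec.yaml", "-r", "results", "task=segment"])
    print(" ".join(cmd))
```

Building a list rather than a single string avoids shell quoting problems if the command is later passed to subprocess.run.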
VisualChangeNet-Segmentation requires the data to be provided as image and mask folders. See the Data Annotation Format page for more information about the input data format for VisualChangeNet-Segmentation.
Configuring a Custom Dataset
This section provides an example configuration and commands for training VisualChangeNet-Segmentation using the dataset format described above for the LEVIR-CD dataset. LEVIR-CD is a large-scale remote sensing dataset for building change detection.
Here is an example spec file for training a VisualChangeNet-Segmentation model with NVIDIA’s FAN Hybrid backbone on the LEVIR-CD dataset using the Data Annotation Format.
encryption_key: tlt_encode
task: segment
train:
  pretrained_model_path: /path/to/pretrained/model.pth
  resume_training_checkpoint_path: null
  segment:
    loss: "ce"
    weights: [0.5, 0.5, 0.5, 0.8, 1.0]
  num_epochs: 350
  num_nodes: 1
  val_interval: 1
  checkpoint_interval: 1
  optim:
    lr: 0.0001
    optim: "adamw"
    policy: "linear"
    momentum: 0.9
    weight_decay: 0.01
    betas: [0.9, 0.999]
results_dir: /path/to/experiment_results
model:
  backbone:
    type: "fan_small_12_p4_hybrid"
    pretrained_backbone_path: null
    freeze_backbone: False
  decode_head:
    feature_strides: [4, 8, 16, 16]
dataset:
  segment:
    dataset: "CNDataset"
    root_dir: /path/to/root/dataset/dir/
    data_name: "LEVIR-CD"
    label_transform: "norm"
    batch_size: 16
    workers: 2
    multi_scale_train: True
    multi_scale_infer: False
    num_classes: 2
    img_size: 256
    image_folder_name: "A"
    change_image_folder_name: "B"
    list_folder_name: 'list'
    annotation_folder_name: "label"
    train_split: "train"
    validation_split: "val"
    label_suffix: .png
    augmentation:
      random_flip:
        vflip_probability: 0.5
        hflip_probability: 0.5
        enable: True
      random_rotate:
        rotate_probability: 0.5
        angle_list: [90, 180, 270]
        enable: True
      random_color:
        brightness: 0.3
        contrast: 0.3
        saturation: 0.3
        hue: 0.3
        enable: True
      with_scale_random_crop:
        enable: True
      with_random_crop: True
      with_random_blur: True
evaluate:
  checkpoint: "???"
  vis_after_n_batches: 10
inference:
  checkpoint: "???"
  vis_after_n_batches: 1
export:
  gpu_id: 0
  checkpoint: "???"
  onnx_file: "???"
  input_width: 256
  input_height: 256
| Parameter | Data Type | Default | Description |
| model | dict config | – | The configuration of the model architecture |
| dataset | dict config | – | The configuration for the dataset detailed in the Config section |
| train | dict config | – | The configuration for training parameters, which is detailed in the Train section |
| results_dir | string | – | The path to save the model experiment log outputs and model checkpoints |
| task | str | segment | A flag to indicate the change detection task. Currently supports two tasks: ‘segment’ and ‘classify’ for segmentation and classification |
| results_dir | string | – | The path to save the model training experiment log outputs and model checkpoints |
| checkpoint_interval | int | 5 | The interval at which the checkpoint needs to be saved |
| resume_training_checkpoint_path | str | None | The path to the checkpoint for resuming training |
| segment | Dict (loss: str) | None (loss: ce) | The segmentation training parameters, such as the loss |
| num_nodes | unsigned int | 1 | The number of nodes. If the value is larger than 1, multi-node is enabled. |
| val_interval | unsigned int | 1 | The epoch interval at which the validation is run |
| checkpoint_interval | int | 1 | The number of steps at which the checkpoint needs to be saved |
| num_epochs | int | 300 | The total number of epochs to run the experiment |
| pretrained_model_path | string | – | The path to the pretrained model checkpoint to initialize the end-to-end model weights. |
| optim | dict config | None | Contains the configurable parameters for the VisualChangeNet optimizer, detailed in the optim section |
optim
optim:
  lr: 0.0001
  optim: "adamw"
  policy: "linear"
  momentum: 0.9
  weight_decay: 0.01
| Parameter | Datatype | Default | Description | Supported Values |
| lr | float | 0.0005 | The learning rate | >=0.0 |
| optim | str | adamw | | |
| policy | str | linear | The learning scheduler | linear/step |
| momentum | float | 0.9 | The momentum for the AdamW optimizer | |
| weight_decay | float | 0.1 | The weight decay coefficient | |
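As a rough illustration of what a linear policy could look like, the sketch below decays the learning rate from lr at epoch 0 to zero at num_epochs. This is an assumption made for illustration: the page does not define the schedule, and the actual TAO implementation may differ (for example, by updating per step or adding warm-up). The function name linear_lr is hypothetical.

```python
def linear_lr(base_lr: float, epoch: int, num_epochs: int) -> float:
    """Linearly decay base_lr to zero over num_epochs (assumed behaviour)."""
    if num_epochs <= 0:
        raise ValueError("num_epochs must be positive")
    epoch = min(max(epoch, 0), num_epochs)  # clamp to the valid range
    return base_lr * (1.0 - epoch / num_epochs)


if __name__ == "__main__":
    # Values taken from the example spec: lr 0.0001, num_epochs 350.
    for epoch in (0, 175, 350):
        print(epoch, linear_lr(0.0001, epoch, 350))
```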
The following example model config provides options to change the VisualChangeNet-Segmentation architecture for training.
model:
  backbone:
    type: "fan_small_12_p4_hybrid"
    pretrained_backbone_path: null
    freeze_backbone: False
  decode_head:
    feature_strides: [4, 8, 16, 16]
    align_corner: False
| Parameter | Datatype | Default | Description | Supported Values |
| backbone | Dict | None | A dictionary containing the following configurable parameters: type, pretrained_backbone_path, freeze_backbone | type: for example, fan_tiny_8_p4_hybrid |
| decode_head | Dict | None | A dictionary containing the following configurable parameters: feature_strides, align_corner | align_corner: True, False |
The dataset parameter defines the dataset source, training batch size, augmentation, and pre-processing. An example dataset is provided below.
dataset:
  segment:
    dataset: "CNDataset"
    root_dir: /path/to/root/dataset/dir/
    data_name: "LEVIR-CD"
    label_transform: "norm"
    batch_size: 16
    workers: 2
    multi_scale_train: True
    multi_scale_infer: False
    num_classes: 2
    img_size: 256
    image_folder_name: "A"
    change_image_folder_name: "B"
    list_folder_name: 'list'
    annotation_folder_name: "label"
    train_split: "train"
    validation_split: "val"
    test_split: "test"
    predict_split: 'predict'
    label_suffix: .png
    augmentation:
      random_flip:
        vflip_probability: 0.5
        hflip_probability: 0.5
        enable: True
      random_rotate:
        rotate_probability: 0.5
        angle_list: [90, 180, 270]
        enable: True
      random_color:
        brightness: 0.3
        contrast: 0.3
        saturation: 0.3
        hue: 0.3
        enable: True
      with_scale_random_crop:
        enable: True
      with_random_crop: True
      with_random_blur: True
| Parameter | Datatype | Default | Description | Supported Values |
| segment | Dict | – | The segment contains dataset config for the segmentation dataloader detailed in the segment section. | |
| classify | Dict | – | The classify contains dataset config for the classification dataloader | |
segment
| Parameter | Datatype | Default | Description | Supported Values |
| dataset | Dict | CNDataset | The dataloader supported for segmentation | CNDataset |
| root_dir | str | – | The root directory path where the dataset is located. | |
| data_name | str | LEVIR-CD | The dataset identifier | LEVIR-CD, LandSCD, custom |
| batch_size | int | 32 | The number of samples per batch | >0 |
| workers | int | 2 | The number of worker processes for data loading | >=0 |
| multi_scale_train | bool | True | Whether multi-scale training is enabled | True, False |
| multi_scale_infer | bool | False | Whether multi-scale inference is enabled | True, False |
| num_classes | int | 2 | Number of classes in the dataset. | >=2 |
| img_size | int | 256 | Size of the input images after resizing. | |
| image_folder_name | str | A | Name of the folder containing input images. | |
| change_image_folder_name | str | B | Name of the folder containing the changed images | |
| list_folder_name | str | list | Name of the folder containing dataset split lists’ csv files. | |
| annotation_folder_name | str | label | Name of the folder containing annotation masks | |
| train_split | str | train | Dataset split used for training, should indicate the name of csv file in list_folder_name. | |
| validation_split | str | val | Dataset split used for validation, should indicate the name of csv file in list_folder_name. | |
| test_split | str | test | Dataset split used for evaluation, should indicate the name of csv file in list_folder_name. | |
| predict_split | str | predict | Dataset split used for inference, should indicate the name of csv file in list_folder_name. | |
| label_suffix | str | .png | Suffix of the label image files. | |
| augmentation | Dict | None | Dictionary containing various data augmentation settings, which is detailed in the augmentation section. | |
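Before training on a custom dataset, it can help to confirm that the folders and split lists named in the segment config exist. The sketch below is one way to do that. It is not part of TAO: the function name missing_dataset_paths is hypothetical, and it assumes each split list is stored as <list_folder_name>/<split>.csv, which is an interpretation of the table above rather than a confirmed layout. See the Data Annotation Format page for the authoritative structure.

```python
from pathlib import Path
from typing import Dict, List


def missing_dataset_paths(root_dir: str, cfg: Dict[str, str], splits: List[str]) -> List[Path]:
    """Return the expected folders and split lists that do not exist.

    `splits` holds config keys such as "train_split"; each split list is
    assumed to live at <list_folder_name>/<split value>.csv.
    """
    root = Path(root_dir)
    expected = [
        root / cfg["image_folder_name"],
        root / cfg["change_image_folder_name"],
        root / cfg["annotation_folder_name"],
        root / cfg["list_folder_name"],
    ]
    expected += [root / cfg["list_folder_name"] / f"{cfg[key]}.csv" for key in splits]
    return [path for path in expected if not path.exists()]


if __name__ == "__main__":
    cfg = {
        "image_folder_name": "A",
        "change_image_folder_name": "B",
        "annotation_folder_name": "label",
        "list_folder_name": "list",
        "train_split": "train",
        "validation_split": "val",
    }
    for path in missing_dataset_paths("/path/to/root/dataset/dir", cfg, ["train_split", "validation_split"]):
        print("missing:", path)
```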
augmentation
| Parameter | Datatype | Default | Description | Supported Values |
| random_flip | Dict | None | Random vertical and horizontal flipping augmentation settings. | >=0.0 |
| random_rotate | Dict | None | Randomly rotate images with specified probability and angles | >=0.0 |
| random_color | Dict | None | Apply random color augmentation to images. | >=0.0 |
| with_scale_random_crop | Dict | None | Apply random scaling and cropping augmentation. | True, False |
| with_random_crop | bool | True | Apply random crop augmentation. | True, False |
| with_random_blur | bool | True | Apply random blurring augmentation. | True, False |
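For change detection, geometric augmentations such as flips and rotations only make sense if the same transform is applied to the pre-change image, the post-change image, and the mask. The sketch below illustrates that idea with plain nested lists standing in for images. It is an assumption-labeled illustration and not TAO's implementation: all function names are hypothetical, and the clockwise rotation direction is chosen arbitrarily.

```python
import random
from typing import List, Sequence, Tuple

Image = List[List[int]]  # tiny stand-in for an image: rows of pixel values


def hflip(img: Image) -> Image:
    """Mirror left to right."""
    return [row[::-1] for row in img]


def vflip(img: Image) -> Image:
    """Mirror top to bottom."""
    return img[::-1]


def rotate90(img: Image) -> Image:
    """Rotate by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment_triplet(
    pre: Image,
    post: Image,
    mask: Image,
    rng: random.Random,
    hflip_probability: float = 0.5,
    vflip_probability: float = 0.5,
    rotate_probability: float = 0.5,
    angle_list: Sequence[int] = (90, 180, 270),
) -> Tuple[Image, Image, Image]:
    """Apply one shared random flip/rotation to both images and the mask."""
    items = [pre, post, mask]
    if rng.random() < hflip_probability:
        items = [hflip(x) for x in items]
    if rng.random() < vflip_probability:
        items = [vflip(x) for x in items]
    if rng.random() < rotate_probability:
        turns = rng.choice(list(angle_list)) // 90
        for _ in range(turns):
            items = [rotate90(x) for x in items]
    return items[0], items[1], items[2]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    m = [[0, 1], [0, 0]]
    print(augment_triplet(a, b, m, random.Random(0)))
```

Drawing each random decision once and reusing it for all three inputs keeps the mask aligned with both images.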
Example spec file for ViT backbones
The following spec file is only relevant for TAO Toolkit versions 5.3 and later.
encryption_key: tlt_encode
task: segment
train:
  pretrained_model_path: /path/to/pretrained/model.pth
  resume_training_checkpoint_path: null
  segment:
    loss: "ce"
    weights: [0.5, 0.5, 0.5, 0.8, 1.0]
  num_epochs: 350
  num_nodes: 1
  val_interval: 1
  checkpoint_interval: 1
  optim:
    lr: 0.00002
    optim: "adamw"
    policy: "linear"
    momentum: 0.9
    weight_decay: 0.01
    betas: [0.9, 0.999]
results_dir: /path/to/experiment_results
model:
  backbone:
    type: "vit_large_nvdinov2"
    pretrained_backbone_path: /path/to/pretrained/backbone.pth
    freeze_backbone: False
  decode_head:
    feature_strides: [4, 8, 16, 32]
dataset:
  segment:
    dataset: "CNDataset"
    root_dir: /path/to/root/dataset/dir/
    data_name: "LEVIR-CD"
    label_transform: "norm"
    batch_size: 16
    workers: 2
    multi_scale_train: True
    multi_scale_infer: False
    num_classes: 2
    img_size: 256
    image_folder_name: "A"
    change_image_folder_name: "B"
    list_folder_name: 'list'
    annotation_folder_name: "label"
    train_split: "train"
    validation_split: "val"
    label_suffix: .png
    augmentation:
      random_flip:
        vflip_probability: 0.5
        hflip_probability: 0.5
        enable: True
      random_rotate:
        rotate_probability: 0.5
        angle_list: [90, 180, 270]
        enable: True
      random_color:
        brightness: 0.3
        contrast: 0.3
        saturation: 0.3
        hue: 0.3
        enable: True
      with_scale_random_crop:
        enable: True
      with_random_crop: True
      with_random_blur: True
evaluate:
  checkpoint: "???"
  vis_after_n_batches: 10
inference:
  checkpoint: "???"
  vis_after_n_batches: 1
export:
  gpu_id: 0
  checkpoint: "???"
  onnx_file: "???"
  input_width: 256
  input_height: 256
Use the following command to run VisualChangeNet-Segmentation training:
tao model visual_changenet train -e <experiment_spec_file> \
                                 -r <results_dir> \
                                 --gpus <num_gpus> \
                                 task=segment
Required Arguments
-e, --experiment_spec_file: The path to the experiment spec file.
-r, --results_dir: The path to a folder where the experiment outputs should be written.
task: The task (‘segment’ or ‘classify’) for the visual_changenet training. Default: segment.
Optional Arguments
--gpus: The number of GPUs to use for training. The default value is 1.
Here’s an example of using the VisualChangeNet training command:
tao model visual_changenet train -e $DEFAULT_SPEC -r $RESULTS_DIR --gpus $NUM_GPUs
Here is an example spec file for testing evaluation and inference of a trained VisualChangeNet-Segmentation model:
results_dir: /path/to/experiment_results
task: segment
model:
  backbone:
    type: "fan_small_12_p4_hybrid"
dataset:
  segment:
    dataset: "CNDataset"
    root_dir: /path/to/root/dataset/dir/
    data_name: "LEVIR-CD"
    label_transform: "norm"
    batch_size: 16
    workers: 2
    multi_scale_train: True
    multi_scale_infer: False
    num_classes: 2
    img_size: 256
    image_folder_name: "A"
    change_image_folder_name: "B"
    list_folder_name: 'list'
    annotation_folder_name: "label"
    test_split: "test"
    predict_split: 'predict'
    label_suffix: .png
evaluate:
  checkpoint: /path/to/checkpoint
  vis_after_n_batches: 1
inference:
  checkpoint: /path/to/checkpoint
  vis_after_n_batches: 1
| Parameter | Datatype | Default | Description | Supported Values |
| checkpoint | string | | The path to the PyTorch model to evaluate or run inference on | |
| vis_after_n_batches | int | | The number of batches between successive visualization outputs | |
| trt_engine | string | | The path to the TensorRT engine for inference. Use only with TAO Deploy. | |
| num_gpus | unsigned int | 1 | The number of GPUs to use | >0 |
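To make the effect of vis_after_n_batches concrete, the sketch below lists which batches would produce a visualization for a given interval. It assumes a save is triggered whenever the zero-based batch index is a multiple of the interval. That trigger rule is an assumption, since this page does not specify it, and the function name is hypothetical.

```python
from typing import List


def batches_to_visualize(num_batches: int, vis_after_n_batches: int) -> List[int]:
    """Return the 0-based batch indices at which a visualization is saved.

    Assumed rule: save when batch_idx % vis_after_n_batches == 0.
    """
    if vis_after_n_batches <= 0:
        raise ValueError("vis_after_n_batches must be positive")
    return [i for i in range(num_batches) if i % vis_after_n_batches == 0]


if __name__ == "__main__":
    print(batches_to_visualize(25, 10))
    print(batches_to_visualize(3, 1))
```

Under this assumption, an interval of 1 saves a visualization for every batch, while larger values reduce disk usage during evaluation.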
Use the following command to run a VisualChangeNet-Segmentation evaluation:
tao model visual_changenet evaluate -e <experiment_spec> \
                                    -r <results_dir> \
                                    task=segment
Required Arguments
-e, --experiment_spec_file: The experiment spec file to set up the evaluation experiment.
-r, --results_dir: The path to a folder where the experiment outputs should be written.
Here’s an example of using the VisualChangeNet evaluation command:
tao model visual_changenet evaluate -e $DEFAULT_SPEC -r $RESULTS_DIR
Use the following command to run inference on VisualChangeNet-Segmentation with the .tlt model:
tao model visual_changenet inference -e <experiment_spec> \
                                     -r <results_dir> \
                                     task=segment
Required Arguments
-e, --experiment_spec_file: The spec file to use to set up the inference experiment.
-r, --results_dir: The path to a folder where the experiment outputs should be written.
Here’s an example of using the VisualChangeNet inference command:
tao model visual_changenet inference -e $DEFAULT_SPEC -r $RESULTS_DIR
Here is an example spec file for exporting the trained VisualChangeNet model:
export:
  checkpoint: /path/to/model.pth
  onnx_file: /path/to/model.onnx
  opset_version: 12
  input_channel: 3
  input_width: 256
  input_height: 256
  batch_size: -1
| Parameter | Datatype | Default | Description | Supported Values |
| checkpoint | string | | The path to the PyTorch model to export | |
| onnx_file | string | | The path to the .onnx file | |
| opset_version | unsigned int | 12 | The opset version of the exported ONNX | >0 |
| input_channel | unsigned int | 3 | The input channel size. Only the value 3 is supported. | 3 |
| input_width | unsigned int | 256 | The input width | >0 |
| input_height | unsigned int | 256 | The input height | >0 |
| batch_size | unsigned int | -1 | The batch size of the ONNX model. If this value is set to -1, the export uses dynamic batch size. | >=-1 |
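The way these export parameters combine into an input shape can be sketched as follows. This is an illustration under stated assumptions, not the TAO exporter: it assumes an NCHW layout for each exported image input and uses the placeholder string "batch" to mark a dynamic batch axis. The function name onnx_input_shape is hypothetical.

```python
from typing import Tuple, Union

Dim = Union[int, str]


def onnx_input_shape(
    batch_size: int = -1,
    input_channel: int = 3,
    input_height: int = 256,
    input_width: int = 256,
) -> Tuple[Dim, int, int, int]:
    """Return an assumed NCHW shape; "batch" marks a dynamic batch axis."""
    if input_channel != 3:
        raise ValueError("only input_channel=3 is supported")
    if batch_size < -1:
        raise ValueError("batch_size must be >= -1")
    if input_height <= 0 or input_width <= 0:
        raise ValueError("input_height and input_width must be positive")
    batch: Dim = "batch" if batch_size == -1 else batch_size
    return (batch, input_channel, input_height, input_width)


if __name__ == "__main__":
    print(onnx_input_shape())              # dynamic batch size
    print(onnx_input_shape(batch_size=4))  # fixed batch size
```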
Use the following command to export the model:
tao model visual_changenet export [-h] -e <experiment spec file> \
                                  -r <results_dir> \
                                  task=segment
Required Arguments
-e, --experiment_spec_file: The spec file to use to set up the export experiment.
-r, --results_dir: The path to a folder where the experiment outputs should be written.
Sample Usage
The following is an example export command:
tao model visual_changenet export -e /path/to/spec.yaml -r $RESULTS_DIR
For deployment, refer to the TAO Deploy Documentation for VisualChangeNet-Segmentation.