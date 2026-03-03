SegFormer is an NVIDIA-developed semantic-segmentation model that is included in TAO. SegFormer supports the following tasks: train, evaluate, inference, and export.

These tasks can be invoked from the TAO Launcher using the following convention on the command-line:

TAO Client (v2 API)

    SPECS=$(tao-client segformer get-spec --action <sub_task> --job_type experiment --id $EXPERIMENT_ID)
    JOB_ID=$(tao-client segformer experiment-run-action --action <sub_task> --id $EXPERIMENT_ID --specs "$SPECS")

Required Arguments

    --id: The unique identifier of the experiment from which to train the model

See also: For information on how to create an experiment using the FTMS client, refer to the Creating an experiment section in the Remote Client documentation.

TAO Launcher

    tao model segformer <sub_task> <args_per_subtask>

Where args_per_subtask are the command-line arguments required for a given subtask. Each subtask is explained in detail in the following sections.
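The launcher convention above can be sketched as a small Python helper that assembles the argument list. This is only an illustration: the helper name `build_launcher_command` and the example paths are hypothetical, and the `-e` flag and `key=value` override style are taken from the subtask sections later in this page.

```python
import shlex

# Subtasks described in this page.
SUBTASKS = ("train", "evaluate", "inference", "export")


def build_launcher_command(sub_task, spec_file, overrides=None):
    """Return the argv list for `tao model segformer <sub_task> ...`."""
    if sub_task not in SUBTASKS:
        raise ValueError(f"unknown sub_task: {sub_task!r}")
    argv = ["tao", "model", "segformer", sub_task, "-e", spec_file]
    # Overrides follow the spec file as key=value pairs.
    for key, value in (overrides or {}).items():
        argv.append(f"{key}={value}")
    return argv


# Hypothetical paths, for illustration only.
cmd = build_launcher_command(
    "train",
    "/specs/train.yaml",
    {"results_dir": "/results", "train.num_gpus": 2},
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting mistakes if it is later passed to `subprocess.run`.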

Data Input for SegFormer#

SegFormer requires the data to be provided as image and mask folders. See the Data Annotation Format page for more information about the input data format for SegFormer.

Creating Training Experiment Spec File#

Configuration for Custom Dataset#

This documentation shows an example configuration and commands for training on a multi-class dataset. For more details, refer to the example notebook in TAO Computer Vision samples.

Here is an example spec file for training a SegFormer model with an NVDINOv2 backbone. Note that the spec file is for reference only; create your own spec file based on your own dataset.

TAO Client (v2 API)

First, set the base_experiment:

    FILTER_PARAMS='{"network_arch": "segformer"}'
    BASE_EXPERIMENTS=$(tao-client segformer list-base-experiments --filter_params "$FILTER_PARAMS")

Retrieve the PTM_ID for the NVDINOv2 backbone from $BASE_EXPERIMENTS before setting base_experiment:

    PTM_INFORMATION="{\"base_experiment\": [$PTM_ID]}"
    tao-client segformer patch-artifact-metadata --id $EXPERIMENT_ID --job_type experiment --update_info "$PTM_INFORMATION"

Then retrieve the specifications:

    BASE_EXPERIMENT_ID=$(tao segformer list-base-experiments | jq -r '.[0].id')
    TRAIN_SPECS=$(tao segformer get-job-schema --action train --base-experiment-id $BASE_EXPERIMENT_ID | jq -r '.default')

Get the specifications from $TRAIN_SPECS. You can override values as needed.
TAO Launcher

    encryption_key: tlt_encode
    results_dir: <path_to_output_dir>
    train:
      resume_training_checkpoint_path: null
      segment:
        loss: "ce"
      num_epochs: 50
      num_nodes: 1
      validation_interval: 1
      checkpoint_interval: 50
      optim:
        lr: 0.0001
        optim: "adamw"
        policy: "linear"
        weight_decay: 0.0005
    evaluate:
      checkpoint: ${results_dir}/train/segformer_model_latest.pth
      vis_after_n_batches: 1
    inference:
      checkpoint: ${results_dir}/train/segformer_model_latest.pth
      vis_after_n_batches: 1
    export:
      results_dir: "${results_dir}/export"
      gpu_id: 0
      checkpoint: ${results_dir}/train/segformer_model_latest.pth
      onnx_file: "${export.results_dir}/segformer.onnx"
      input_width: 224
      input_height: 224
      batch_size: -1
    model:
      backbone:
        type: "vit_large_nvdinov2"
        pretrained_backbone_path: <path_to_pretrained_weight>
        freeze_backbone: False
      decode_head:
        feature_strides: [4, 8, 16, 32]
    dataset:
      segment:
        dataset: "SFDataset"
        root_dir: <dataset_root>
        batch_size: 32
        workers: 8
        num_classes: 6
        img_size: 224
        train_split: "train"
        validation_split: "val"
        test_split: "val"
        predict_split: "val"
        augmentation:
          random_flip:
            vflip_probability: 0.5
            hflip_probability: 0.5
            enable: True
          random_rotate:
            rotate_probability: 0.5
            angle_list: [90, 180, 270]
            enable: True
          random_color:
            brightness: 0.3
            contrast: 0.3
            saturation: 0.3
            hue: 0.3
            enable: False
          with_scale_random_crop:
            enable: True
          with_random_crop: True
          with_random_blur: False
        label_transform: None
        palette:
          - seg_class: urban
            rgb:
              - 0
              - 255
              - 255
            label_id: 0
            mapping_class: urban
          - seg_class: agriculture
            rgb:
              - 255
              - 255
              - 0
            label_id: 1
            mapping_class: agriculture
          - seg_class: rangeland
            rgb:
              - 255
              - 0
              - 255
            label_id: 2
            mapping_class: rangeland
          - seg_class: forest
            rgb:
              - 0
              - 255
              - 0
            label_id: 3
            mapping_class: forest
          - seg_class: water
            rgb:
              - 0
              - 0
              - 255
            label_id: 4
            mapping_class: water
          - seg_class: barren
            rgb:
              - 255
              - 255
              - 255
            label_id: 5
            mapping_class: barren
          - seg_class: unknown
            rgb:
              - 0
              - 0
              - 0
            label_id: 255
            mapping_class: unknown

The experiment specification consists of several main components: train, evaluate, inference, export, model, and dataset.
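The example spec refers to other keys with `${...}` placeholders, for instance `${results_dir}` and `${export.results_dir}`. TAO's own configuration loader performs the real resolution. The pure-Python sketch below only imitates that behaviour to show how a nested reference expands; the `resolve` and `lookup` helpers and the `/results/segformer` path are hypothetical.

```python
import re

# Matches ${some.dotted.key} placeholders.
_REF = re.compile(r"\$\{([^}]+)\}")


def lookup(config, dotted_key):
    """Walk a nested dict using a dotted key such as 'export.results_dir'."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(config, value, depth=0):
    """Expand every ${key} placeholder in `value`, following nested references."""
    if depth > 10:
        raise ValueError("interpolation nested too deeply")

    def substitute(match):
        target = lookup(config, match.group(1).strip())
        return resolve(config, str(target), depth + 1)

    return _REF.sub(substitute, value)


# A fragment of the example spec, with a hypothetical output directory.
spec = {
    "results_dir": "/results/segformer",
    "export": {
        "results_dir": "${results_dir}/export",
        "onnx_file": "${export.results_dir}/segformer.onnx",
    },
}
print(resolve(spec, spec["export"]["onnx_file"]))
# -> /results/segformer/export/segformer.onnx
```

Because `export.onnx_file` is defined in terms of `export.results_dir`, changing the top-level `results_dir` moves every derived path along with it.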

train#

The train config contains the parameters related to training. They are described as follows:

Note: For FTMS Client, these parameters are set in JSON format.

    train:
      resume_training_checkpoint_path: null
      segment:
        loss: "ce"
      num_epochs: 50
      num_nodes: 1
      validation_interval: 1
      checkpoint_interval: 50
      optim:
        lr: 0.0001
        optim: "adamw"
        policy: "linear"
        weight_decay: 0.0005

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| optim | dict config | – | Optimizer config. | – |
| pretrained_model_path | str | None | Pretrained model path. | – |
| segment | dict config | – | Segmentation loss Config. | – |
| num_gpus | int | 1 | The number of GPUs to run the train job. | – |
| gpu_ids | List[int] | [0] | List of GPU IDs to run the training on. | – |
| num_nodes | int | 1 | Number of nodes to run the training on. | – |
| seed | int | 1234 | The seed for the initializer in PyTorch. | – |
| num_epochs | int | 10 | Number of epochs to run the training. | – |
| checkpoint_interval | int | 1 | Checkpoint interval. | – |
| validation_interval | int | 1 | Validation interval. | – |
| resume_training_checkpoint_path | str | None | Path to the checkpoint to resume training. | – |
| results_dir | str | None | Path to where all the assets are stored. | – |

optim#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| monitor_name | str | val_loss | Monitor Name | – |
| optim | str | adamw | Optimizer | adamw,adam,sgd |
| lr | float | 0.00006 | Optimizer learning rate | – |
| policy | str | linear | Optimizer policy | linear,step |
| momentum | float | 0.9 | The momentum for the AdamW optimizer. | – |
| weight_decay | float | 0.01 | The weight decay coefficient. | – |

segment#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| loss | str | ce | Segment loss | ce |
| weights | List[float] | [0.5, 0.5, 0.5, 0.8, 1.0] | Multi-scale Segment loss weight | – |

tensorboard#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| enabled | bool | False | Flag to enable tensorboard | – |
| infrequent_logging_frequency | int | 2 | infrequent_logging_frequency | – |
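To illustrate how `num_epochs` interacts with `checkpoint_interval` and `validation_interval`, the sketch below lists the epochs at which each event would fire. It assumes, without confirmation from this page, that an event fires whenever the 1-based epoch number is divisible by the interval. The `scheduled_epochs` helper is hypothetical.

```python
def scheduled_epochs(num_epochs, interval):
    """Return the 1-based epochs at which an interval-driven event fires.

    Assumption: the event fires when the epoch number is a multiple of
    the interval. The actual trainer may count differently.
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return [epoch for epoch in range(1, num_epochs + 1) if epoch % interval == 0]


# Values from the example spec: num_epochs 50, checkpoint_interval 50,
# validation_interval 1.
checkpoints = scheduled_epochs(num_epochs=50, interval=50)
validations = scheduled_epochs(num_epochs=50, interval=1)
print(checkpoints)       # [50]
print(len(validations))  # 50
```

Under that assumption, the example spec validates after every epoch but writes a checkpoint only once, at the end of training. A smaller `checkpoint_interval` would leave more points to resume from via `resume_training_checkpoint_path`.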

evaluate#

The evaluate config contains the parameters related to evaluation. They are described as follows:

Note: For FTMS Client, these parameters are set in JSON format and the evaluate checkpoint is deduced from the previous train job ID as specified with the --parent_job_id argument. For TAO Launcher, you must set the path in the evaluate specification:

    evaluate:
      checkpoint: ${results_dir}/train/segformer_model_latest.pth
      vis_after_n_batches: 1

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| vis_after_n_batches | int | 1 | Visualize evaluation segmentation results after n batches. | – |
| batch_size | int | 8 | Batch Size. | – |
| checkpoint | str | – | Path to the checkpoint used for evaluation. | – |
| num_gpus | int | 1 | The number of GPUs to run the evaluate job. | – |
| gpu_ids | List[int] | [0] | List of GPU IDs to run the evaluate on. | – |
| num_nodes | int | 1 | Number of nodes to run the evaluate on. | – |
| trt_engine | Optional[str] | None | Path to the TensorRT engine to be used for evaluation. | – |
| results_dir | Optional[str] | None | Path to where all the assets are stored. | – |

inference#

The inference config contains the parameters related to inference. They are described as follows:

Note: For FTMS Client, these parameters are set in JSON format and the inference checkpoint is deduced from the previous train job ID as specified with the --parent_job_id argument. For TAO Launcher, you must set the path in the inference specification:

    inference:
      checkpoint: ${results_dir}/train/segformer_model_latest.pth
      vis_after_n_batches: 1

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| vis_after_n_batches | int | 1 | Visualize inference segmentation results after n batches. | – |
| batch_size | int | 8 | Batch Size. | – |
| checkpoint | str | – | Path to the checkpoint used for inference. | – |
| num_gpus | int | 1 | The number of GPUs to run the inference job. | – |
| gpu_ids | List[int] | [0] | List of GPU IDs to run the inference on. | – |
| num_nodes | int | 1 | Number of nodes to run the inference on. | – |
| trt_engine | Optional[str] | None | Path to the TensorRT engine to be used for inference. | – |
| results_dir | Optional[str] | None | Path to where all the assets are stored. | – |

export#

The export config contains the parameters related to export. They are described as follows:

Note: For FTMS Client, these parameters are set in JSON format and the export checkpoint is deduced from the previous train job ID as specified with the --parent_job_id argument. For TAO Launcher, you must set the path in the export specification:

    export:
      results_dir: "${results_dir}/export"
      gpu_id: 0
      checkpoint: ${results_dir}/train/segformer_model_latest.pth
      onnx_file: "${export.results_dir}/segformer.onnx"
      input_width: 224
      input_height: 224
      batch_size: -1

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| results_dir | Optional[str] | None | Path to where all the assets are stored. | – |
| gpu_ids | int | 0 | The index of the GPU to build the TensorRT engine. | – |
| checkpoint | str | – | Path to the checkpoint file to run export. | – |
| onnx_file | str | – | Path to the ONNX model file. | – |
| on_cpu | bool | False | Flag to export CPU compatible model. | True,False |
| input_channel | int | 3 | Number of channels in the input Tensor. | 1,3 |
| input_width | int | 960 | Width of the input image tensor. | – |
| input_height | int | 544 | Height of the input image tensor. | – |
| opset_version | int | 17 | Operator set version. | – |
| batch_size | int | -1 | The batch size of the input Tensor for the engine. | – |

model#

The following example model config provides options to define the SegFormer backbone and decoder head.

Note: For FTMS Client, these parameters are set in JSON format.

    model:
      backbone:
        type: "vit_large_nvdinov2"
        pretrained_backbone_path: <path_to_pretrained_weight>
        freeze_backbone: False
      decode_head:
        feature_strides: [4, 8, 16, 32]

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| backbone | dict config | – | The configuration of the backbone. | – |
| decode_head | dict config | – | The configuration of the decoder head. | – |

backbone#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| type | str | fan_small_12_p4_hybrid | The name of the backbone to be used | mit_b0, mit_b1, mit_b2, mit_b3, mit_b4, mit_b5, fan_tiny_8_p4_hybrid, fan_large_16_p4_hybrid, fan_small_12_p4_hybrid, fan_base_16_p4_hybrid, vit_large_nvdinov2, vit_giant_nvdinov2, vit_base_nvclip_16_siglip, vit_huge_nvclip_14_siglip, c_radio_v2_vit_base_patch16_224, c_radio_v2_vit_large_patch16_224, c_radio_v2_vit_huge_patch16_224 |
| pretrained_backbone_path | str | – | Path to the pretrained model | – |
| freeze_backbone | bool | False | Flag to freeze backbone | True,False |

decode_head#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| feature_strides | List[int] | [4, 8, 16, 32] | Feature strides for the head. | – |
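Because `model.backbone.type` accepts only the names listed above, a mistyped value is easy to catch before launching a job. The following is a minimal sketch of such a pre-flight check. The `check_backbone_type` helper is hypothetical and not part of TAO; the list of names is copied from the Supported Values column.

```python
import difflib

# Backbone names copied from the "Supported Values" column above.
SUPPORTED_BACKBONES = (
    "mit_b0", "mit_b1", "mit_b2", "mit_b3", "mit_b4", "mit_b5",
    "fan_tiny_8_p4_hybrid", "fan_large_16_p4_hybrid",
    "fan_small_12_p4_hybrid", "fan_base_16_p4_hybrid",
    "vit_large_nvdinov2", "vit_giant_nvdinov2",
    "vit_base_nvclip_16_siglip", "vit_huge_nvclip_14_siglip",
    "c_radio_v2_vit_base_patch16_224",
    "c_radio_v2_vit_large_patch16_224",
    "c_radio_v2_vit_huge_patch16_224",
)


def check_backbone_type(name):
    """Return `name` if it is a supported backbone, else raise ValueError."""
    if name in SUPPORTED_BACKBONES:
        return name
    # Suggest near matches to help spot typos.
    hints = difflib.get_close_matches(name, SUPPORTED_BACKBONES, n=3)
    suffix = f" Did you mean: {', '.join(hints)}?" if hints else ""
    raise ValueError(f"unsupported backbone type {name!r}.{suffix}")


print(check_backbone_type("vit_large_nvdinov2"))
```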

dataset#

The dataset parameter defines the dataset source, training batch size, and augmentation. An example dataset config is provided below.

Note: For FTMS Client, these parameters are set in JSON format.

    dataset:
      segment:
        dataset: "SFDataset"
        root_dir: <dataset_root>
        batch_size: 32
        workers: 8
        num_classes: 6
        img_size: 224
        train_split: "train"
        validation_split: "val"
        test_split: "val"
        predict_split: "val"
        augmentation:
          random_flip:
            vflip_probability: 0.5
            hflip_probability: 0.5
            enable: True
          random_rotate:
            rotate_probability: 0.5
            angle_list: [90, 180, 270]
            enable: True
          random_color:
            brightness: 0.3
            contrast: 0.3
            saturation: 0.3
            hue: 0.3
            enable: False
          with_scale_random_crop:
            enable: True
          with_random_crop: True
          with_random_blur: False
        label_transform: None
        palette:
          - seg_class: urban
            rgb:
              - 0
              - 255
              - 255
            label_id: 0
            mapping_class: urban
          - seg_class: agriculture
            rgb:
              - 255
              - 255
              - 0
            label_id: 1
            mapping_class: agriculture
          - seg_class: rangeland
            rgb:
              - 255
              - 0
              - 255
            label_id: 2
            mapping_class: rangeland
          - seg_class: forest
            rgb:
              - 0
              - 255
              - 0
            label_id: 3
            mapping_class: forest
          - seg_class: water
            rgb:
              - 0
              - 0
              - 255
            label_id: 4
            mapping_class: water
          - seg_class: barren
            rgb:
              - 255
              - 255
              - 255
            label_id: 5
            mapping_class: barren
          - seg_class: unknown
            rgb:
              - 0
              - 0
              - 0
            label_id: 255
            mapping_class: unknown

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| segment | dict config | – | Segmentation Dataset Config. | – |

segment#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| root_dir | str | – | Path to root directory for dataset. | – |
| dataset | str | SFDataset | Dataset class. | SFDataset |
| num_classes | int | 2 | The number of classes in the training data. | – |
| img_size | int | 256 | The input image size. | – |
| batch_size | int | 8 | Batch size. | – |
| workers | int | 1 | Workers. | – |
| shuffle | bool | True | Shuffle dataloader. | True,False |
| train_split | str | train | Train split folder name. | – |
| validation_split | str | val | Validation split folder name. | – |
| test_split | str | val | Test split folder name. | – |
| predict_split | str | test | Predict split folder name. | – |
| augmentation | dict config | – | Augmentation. | – |
| label_transform | str | norm | Label transform. | norm,None |
| palette | List[Dict] | [{"label_id": 0, "mapping_class": "foreground", "rgb": [0, 0, 0], "seg_class": "foreground"}, {"label_id": 1, "mapping_class": "background", "rgb": [1, 1, 1], "seg_class": "background"}] | Palette. The RGB values depend on label_transform: if norm, RGB values range from 0 to 1; otherwise they range from 0 to 255. | – |

augmentation#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| random_flip | dict config | – | RandomFlip augmentation config. | – |
| random_rotate | dict config | – | RandomRotation augmentation config. | – |
| random_color | dict config | – | RandomColor augmentation config. | – |
| with_scale_random_crop | dict config | – | RandomCropWithScale augmentation config. | – |
| with_random_blur | bool | – | Flag to enable with_random_blur. | – |
| with_random_crop | bool | – | Flag to enable with_random_crop. | – |
| mean | List[float] | – | Mean for the augmentation. | – |
| std | List[float] | – | Standard deviation for the augmentation. | – |

RandomFlip#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| vflip_probability | float | 0.5 | Vertical Flip probability. | – |
| hflip_probability | float | 0.5 | Horizontal Flip probability. | – |
| enable | bool | True | Flag to enable augmentation. | True,False |

RandomRotation#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| rotate_probability | float | 0.5 | Random Rotate probability. | – |
| angle_list | List[float] | [90, 180, 270] | Random rotate angle. | – |
| enable | bool | True | Flag to enable augmentation. | True,False |

RandomColor#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| brightness | float | 0.3 | Random Color Brightness. | – |
| contrast | float | 0.3 | Random Color Contrast. | – |
| saturation | float | 0.3 | Random Color Saturation. | – |
| hue | float | 0.3 | Random Color Hue. | – |
| enable | bool | True | Flag to enable Random Color. | True,False |
| color_probability | float | 0.5 | Random Color Probability. | – |

RandomCropWithScale#

| Parameter | Datatype | Default | Description | Supported Values |
|---|---|---|---|---|
| scale_range | float | [1, 1.2] | Random Scale range. | – |
| enable | bool | True | Flag to enable augmentation. | True,False |
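The palette ties each RGB colour in a mask to a `label_id`. As a rough illustration of that mapping, the sketch below converts a tiny RGB mask into label IDs using three entries from the example palette. It assumes `label_transform` is not `norm`, so RGB values lie in 0 to 255, and it sends unlisted colours to a fallback ID, which is an assumption of this sketch rather than documented behaviour. Both helper functions are hypothetical.

```python
# Three entries from the example palette (label_transform: None, RGB in 0-255).
PALETTE = [
    {"seg_class": "urban", "rgb": [0, 255, 255], "label_id": 0},
    {"seg_class": "water", "rgb": [0, 0, 255], "label_id": 4},
    {"seg_class": "unknown", "rgb": [0, 0, 0], "label_id": 255},
]


def build_lookup(palette):
    """Map each RGB triple to its label_id."""
    return {tuple(entry["rgb"]): entry["label_id"] for entry in palette}


def mask_to_label_ids(mask_rgb, lookup, fallback_id=255):
    """Convert rows of RGB pixels into rows of label IDs.

    Colours missing from the palette get `fallback_id` (an assumption).
    """
    return [
        [lookup.get(tuple(pixel), fallback_id) for pixel in row]
        for row in mask_rgb
    ]


mask = [
    [(0, 255, 255), (0, 0, 255)],
    [(0, 0, 0), (9, 9, 9)],
]
print(mask_to_label_ids(mask, build_lookup(PALETTE)))
# -> [[0, 4], [255, 255]]
```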

Training the Model#

Use the following command to run SegFormer training:

TAO Client (v2 API)

    TRAIN_JOB_ID=$(tao segformer create-job \
      --kind experiment \
      --name "segformer_train" \
      --action train \
      --workspace-id $WORKSPACE_ID \
      --specs "$TRAIN_SPECS" \
      --train-datasets '["'$DATASET_ID'"]' \
      --eval-dataset "$DATASET_ID" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model segformer train [-h] -e <experiment_spec_file>
                              [results_dir=<global_results_dir>]
                              [model.<model_option>=<model_option_value>]
                              [dataset.<dataset_option>=<dataset_option_value>]
                              [train.<train_option>=<train_option_value>]
                              [train.gpu_ids=<gpu indices>]
                              [train.num_gpus=<number of gpus>]

Required Arguments

The only required argument is the path to the experiment spec:

    -e, --experiment_spec: The experiment specification file to set up the training experiment

Optional Arguments

You can set optional arguments to override the option values in the experiment spec file.

    -h, --help: Show this help message and exit.
    model.<model_option>: The model options.
    dataset.<dataset_option>: The dataset options.
    train.<train_option>: The train options.

Note: For training, evaluation, and inference, we expose two variables for each task: num_gpus and gpu_ids, which default to 1 and [0], respectively. If both are passed but are inconsistent, for example num_gpus = 1 and gpu_ids = [0, 1], they are modified to follow the setting that implies more GPUs; in the same example, num_gpus is modified from 1 to 2.

In some cases, multi-GPU training may result in a segmentation fault. You can circumvent this by setting the environment variable OMP_NUM_THREADS to 1. Depending on your mode of execution, you may use the following methods to set this variable:

CLI Launcher: You may set the environment variable by adding the following fields to the Envs field of your ~/.tao_mounts.json file, as mentioned in bullet 3 of the Running the launcher section.

    {
        "Envs": [
            {
                "variable": "OMP_NUM_THREADS",
                "value": "1"
            }
        ]
    }

Docker: You may set environment variables in Docker by setting the -e flag in the Docker command line.

    docker run -it --rm --gpus all \
      -e OMP_NUM_THREADS=1 \
      -v /path/to/local/mount:/path/to/docker/mount nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt <model> train -e
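The reconciliation rule for num_gpus and gpu_ids described in the note above can be sketched as follows. The first branch follows the documented example. The second branch, which fills in consecutive GPU IDs when num_gpus is the larger setting, is an assumption, since this page does not say which IDs are chosen in that case. The `reconcile_gpus` helper is hypothetical.

```python
def reconcile_gpus(num_gpus=1, gpu_ids=None):
    """Make num_gpus and gpu_ids agree, following the larger setting."""
    gpu_ids = [0] if gpu_ids is None else list(gpu_ids)
    if len(gpu_ids) > num_gpus:
        # Documented case: gpu_ids implies more GPUs, so num_gpus grows.
        num_gpus = len(gpu_ids)
    elif num_gpus > len(gpu_ids):
        # Assumption: use GPU indices 0..num_gpus-1.
        gpu_ids = list(range(num_gpus))
    return num_gpus, gpu_ids


print(reconcile_gpus(num_gpus=1, gpu_ids=[0, 1]))  # (2, [0, 1])
```

Setting both values consistently in the spec file or on the command line avoids relying on this adjustment at all.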

Evaluating the Model#

The evaluation metric for SegFormer is the mean IoU (meanIOU). For more details on this metric, refer to meanIOU.

Use the following command to run SegFormer evaluation:

TAO Client (v2 API)

    EVAL_JOB_ID=$(tao segformer create-job \
      --kind experiment \
      --name "segformer_evaluate" \
      --action evaluate \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --eval-dataset "$DATASET_ID" \
      --specs "$EVALUATE_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model segformer evaluate -e <experiment_spec>
                                 evaluate.checkpoint=<evaluation model>
                                 results_dir=<path to output evaluation results>
                                 [evaluate.gpu_ids=<gpu indices>]
                                 [evaluate.num_gpus=<number of gpus>]

Required Arguments

The following arguments are required.

    -e, --experiment_spec_file: The experiment spec file to set up the evaluation experiment.
    evaluate.checkpoint: The .pth model.

Here is sample output from the SegFormer evaluation command:

Note: For FTMS Client, the job output will be in your experiment's cloud workspace.

    +------------+-------+-------+
    | Class      | IoU   | Acc   |
    +------------+-------+-------+
    | foreground | 37.81 | 44.56 |
    | background | 83.81 | 95.51 |
    +------------+-------+-------+
    Summary:
    +--------+-------+-------+-------+
    | Scope  | mIoU  | mAcc  | aAcc  |
    +--------+-------+-------+-------+
    | global | 60.81 | 70.03 | 85.26 |
    +--------+-------+-------+-------+
    ...
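In the sample output, the global mIoU equals the plain average of the per-class IoU values: (37.81 + 83.81) / 2 = 60.81. The sketch below shows the standard way to compute per-class IoU and their mean from a confusion matrix. It illustrates the metric itself and is not TAO's implementation; the confusion matrix values are made up.

```python
def per_class_iou(confusion):
    """IoU per class from a square confusion matrix.

    confusion[t][p] counts pixels whose true class is t and predicted class is p.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(confusion):
    """Average the per-class IoU values, skipping classes with no pixels."""
    ious = [v for v in per_class_iou(confusion) if v == v]  # drop NaN
    return sum(ious) / len(ious)


# Made-up two-class confusion matrix (rows: truth, columns: prediction).
confusion = [
    [40, 10],
    [10, 140],
]
print(per_class_iou(confusion))
print(mean_iou(confusion))
```

Because every class contributes equally to the mean, a poorly segmented minority class lowers mIoU substantially even when overall pixel accuracy (aAcc) is high, as in the sample output.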

Running Inference on the Model#

Use the following command to run inference on SegFormer with the .pth model.

TAO Client (v2 API)

    INFER_JOB_ID=$(tao segformer create-job \
      --kind experiment \
      --name "segformer_inference" \
      --action inference \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --inference-dataset "$DATASET_ID" \
      --specs "$INFERENCE_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model segformer inference -e <experiment_spec>
                                  inference.checkpoint=<inference model>
                                  results_dir=<path to output directory for inference>
                                  [inference.gpu_ids=<gpu indices>]
                                  [inference.num_gpus=<number of gpus>]

Required Arguments

The following arguments are required.

    -e, --experiment_spec: The experiment spec file to set up inference
    inference.checkpoint: The .pth model to perform inference with
    results_dir: The path to save the inference masks and mask overlaid images to. Inference creates two directories.

Note: For FTMS Client, the job output will be in your experiment's cloud workspace.

The output mask PNG images with class IDs are saved in vis_tao. The overlaid mask images are saved in mask_tao.

Exporting the Model#

Use the following command to export the model.

TAO Client (v2 API)

    EXPORT_JOB_ID=$(tao segformer create-job \
      --kind experiment \
      --name "segformer_export" \
      --action export \
      --workspace-id $WORKSPACE_ID \
      --parent-job-id $TRAIN_JOB_ID \
      --specs "$EXPORT_SPECS" \
      --base-experiment-ids '["'$BASE_EXPERIMENT_ID'"]' \
      --encryption-key "nvidia_tlt" | jq -r '.id')

TAO Launcher

    tao model segformer export [-h] -e <experiment spec file>
                               results_dir=<path to results dir>
                               export.checkpoint=<trained pth model to be exported>
                               export.onnx_file=<onnx path>

Required Arguments

The following arguments are required to run the command.

    -e, --experiment_spec: The path to an experiment spec file
    results_dir: The path where the logs for export will be saved
    export.checkpoint: The .pth model to be exported
    export.onnx_file: The .onnx file to be stored

TensorRT engine generation, validation, and int8 calibration#

For deployment, refer to the TAO Deploy documentation.