Date: 2026-03-03
Segformer with TAO Deploy#
The Segformer ONNX file generated by tao export is taken as input to tao-deploy to generate an optimized TensorRT engine. INT8 precision is not supported for Segformer.
Note
Throughout this documentation are references to $EXPERIMENT_ID and $DATASET_ID in the FTMS Client sections.
For instructions on creating a dataset using the remote client, refer to the Creating a dataset section in the Remote Client documentation.
For instructions on creating an experiment using the remote client, refer to the Creating an experiment section in the Remote Client documentation.
The spec format is YAML for TAO Launcher, and JSON for FTMS Client.
File-related parameters, such as dataset paths or pretrained model paths, are required only for TAO Launcher, not for FTMS Client.
Converting .onnx File into TensorRT Engine#
You can use the same spec file as for the tao model segformer export command.
trt_config#
The gen_trt_engine parameter defines TensorRT engine generation.
Use the following command to get an experiment spec file for Segformer:
SPECS=$(tao-client segformer get-spec --action train --job_type experiment --id $EXPERIMENT_ID)
gen_trt_engine:
  onnx_file: /path/to/onnx_file
  trt_engine: /path/to/trt_engine
  input_width: 512
  input_height: 512
  tensorrt:
    data_type: FP32
    workspace_size: 1024
    min_batch_size: 1
    opt_batch_size: 1
    max_batch_size: 1
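Because the spec is YAML for TAO Launcher and JSON for FTMS Client, it can help to hold the settings in one structure and serialize them as needed. The following is a minimal sketch using only the Python standard library. The helper name build_gen_trt_engine_spec is hypothetical, the values are the placeholders from the sample spec above, and it assumes the FTMS Client accepts the same keys as the YAML spec.

```python
import json


def build_gen_trt_engine_spec(onnx_file=None, trt_engine=None,
                              width=512, height=512, data_type="FP32"):
    """Return the sample gen_trt_engine spec as a plain dict (hypothetical helper).

    File paths are optional because file-related parameters are required
    only for TAO Launcher, not for FTMS Client.
    """
    section = {}
    if onnx_file is not None:
        section["onnx_file"] = onnx_file
    if trt_engine is not None:
        section["trt_engine"] = trt_engine
    section.update({
        "input_width": width,
        "input_height": height,
        "tensorrt": {
            "data_type": data_type,
            "workspace_size": 1024,
            "min_batch_size": 1,
            "opt_batch_size": 1,
            "max_batch_size": 1,
        },
    })
    return {"gen_trt_engine": section}


# Launcher-style spec with file paths, serialized as JSON text.
spec = build_gen_trt_engine_spec("/path/to/onnx_file", "/path/to/trt_engine")
spec_json = json.dumps(spec, indent=2)
print(spec_json)

# FTMS-style spec: same keys, file paths omitted.
ftms_spec = build_gen_trt_engine_spec()
```

Keeping a single dict as the source of the settings avoids the two formats drifting apart when a value such as data_type changes.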
| Parameter | Datatype | Default | Description | Supported Values |
|  | string |  | The precision to be used for the TensorRT engine |  |
|  | string |  | The maximum workspace size for the TensorRT engine |  |
|  | unsigned int | 3 | The input channel size. Only the value 3 is supported. | 3 |
| input_width | unsigned int | 960 | The input width | >0 |
| input_height | unsigned int | 544 | The input height | >0 |
|  | unsigned int | -1 | The batch size of the ONNX model | >=-1 |
tensorrt#
The tensorrt parameter defines TensorRT engine generation.
| Parameter | Datatype | Default | Description | Supported Values |
| data_type | string | fp32 | The precision to be used for the TensorRT engine | fp32/fp16 |
| workspace_size | unsigned int | 1024 | The maximum workspace size for the TensorRT engine | >1024 |
| min_batch_size | unsigned int | 1 | The minimum batch size used for the optimization profile shape | >0 |
| opt_batch_size | unsigned int | 1 | The optimal batch size used for the optimization profile shape | >0 |
| max_batch_size | unsigned int | 1 | The maximum batch size used for the optimization profile shape | >0 |
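The constraints in the table can be checked before an engine build is started. The sketch below is illustrative only: validate_tensorrt_config is a hypothetical helper, not part of TAO Deploy. It assumes data_type is compared case-insensitively (the sample spec writes FP32 while the table lists fp32), and the ordering check min <= opt <= max reflects how TensorRT optimization profiles work in general rather than anything stated in the table.

```python
def validate_tensorrt_config(cfg):
    """Return a list of problems found in a tensorrt config dict (empty if none)."""
    problems = []

    # Only fp32 and fp16 are listed as supported; INT8 is not supported for Segformer.
    data_type = str(cfg.get("data_type", "fp32")).lower()
    if data_type not in ("fp32", "fp16"):
        problems.append(f"unsupported data_type: {cfg.get('data_type')}")

    # Missing batch sizes fall back to the documented default of 1.
    keys = ("min_batch_size", "opt_batch_size", "max_batch_size")
    sizes = [cfg.get(key, 1) for key in keys]
    if any(size <= 0 for size in sizes):
        problems.append("batch sizes must be > 0")

    # Assumption: an optimization profile needs min <= opt <= max.
    if not (sizes[0] <= sizes[1] <= sizes[2]):
        problems.append("expected min_batch_size <= opt_batch_size <= max_batch_size")

    return problems


good = {"data_type": "FP16", "min_batch_size": 1, "opt_batch_size": 2, "max_batch_size": 4}
bad = {"data_type": "int8", "min_batch_size": 4, "opt_batch_size": 2, "max_batch_size": 1}
print(validate_tensorrt_config(good))
print(validate_tensorrt_config(bad))
```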
Use the following command to run Segformer engine generation:
GTE_JOB_ID=$(tao-client segformer experiment-run-action --action gen_trt_engine --id $EXPERIMENT_ID --parent_job_id $EXPORT_JOB_ID --specs "$SPECS")
See also
The Export job ID is the job ID of the tao-client segformer experiment-run-action --action export command.
tao deploy segformer gen_trt_engine -e /path/to/spec.yaml \
    results_dir=/path/to/results \
    gen_trt_engine.onnx_file=/path/to/onnx/file \
    gen_trt_engine.trt_engine=/path/to/engine/file \
    gen_trt_engine.tensorrt.data_type=<data_type>
Required Arguments
-e, --experiment_spec: The experiment spec file to set up TensorRT engine generation
Optional Arguments
results_dir: The directory where the JSON status-log file will be dumped
gen_trt_engine.onnx_file: The .onnx model to be converted
gen_trt_engine.trt_engine: The path where the generated engine will be stored
gen_trt_engine.tensorrt.data_type: The precision to be exported
Sample Usage
Here’s an example of using the gen_trt_engine command to generate an FP16 TensorRT engine:
tao deploy segformer gen_trt_engine -e $DEFAULT_SPEC \
gen_trt_engine.onnx_file=$ONNX_FILE \
gen_trt_engine.trt_engine=$ENGINE_FILE \
gen_trt_engine.tensorrt.data_type=FP16
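When the command is launched from a script, building it as an argument list avoids shell-quoting and line-continuation mistakes. The following is a minimal sketch: gen_trt_engine_command is a hypothetical helper, and the file names spec.yaml, model.onnx, and model.engine are placeholders. Only the arguments documented above are used.

```python
import shlex


def gen_trt_engine_command(spec, onnx_file, engine_file,
                           data_type="FP16", results_dir=None):
    """Build the argument list for `tao deploy segformer gen_trt_engine`."""
    cmd = ["tao", "deploy", "segformer", "gen_trt_engine", "-e", spec]
    # results_dir is optional, so it is added only when given.
    if results_dir is not None:
        cmd.append(f"results_dir={results_dir}")
    cmd += [
        f"gen_trt_engine.onnx_file={onnx_file}",
        f"gen_trt_engine.trt_engine={engine_file}",
        f"gen_trt_engine.tensorrt.data_type={data_type}",
    ]
    return cmd


cmd = gen_trt_engine_command("spec.yaml", "model.onnx", "model.engine")
# Print a copy-pasteable, shell-quoted form of the command.
print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd, check=True) on a machine where the tao launcher is installed.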
Running Evaluation through TensorRT Engine#
Use the same spec file as for TAO evaluation and inference. A sample spec file:
model:
  input_height: 512
  input_width: 512
  backbone:
    type: "mit_b1"
dataset:
  img_norm_cfg:
    mean:
      - 127.5
      - 127.5
      - 127.5
    std:
      - 127.5
      - 127.5
      - 127.5
  test_dataset:
    img_dir: /data/images/val
    ann_dir: /data/masks/val
  input_type: "grayscale"
  data_root: /tlt-pytorch
  palette:
    - seg_class: foreground
      rgb:
        - 0
        - 0
        - 0
      label_id: 0
      mapping_class: foreground
    - seg_class: background
      rgb:
        - 255
        - 255
        - 255
      label_id: 1
      mapping_class: background
  batch_size: 1
  workers_per_gpu: 1
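To make the img_norm_cfg and palette entries concrete, the sketch below works through them in plain Python. It assumes the usual mean/std normalization, (value - mean) / std, which maps 8-bit pixel values to the range [-1, 1] with the sample settings; the spec itself does not state the formula. The helper names normalize_pixel and rgb_to_label are hypothetical.

```python
def normalize_pixel(value, mean=127.5, std=127.5):
    """Apply (value - mean) / std, the usual mean/std normalization (assumed)."""
    return (value - mean) / std


# Palette entries copied from the sample spec above.
palette = [
    {"seg_class": "foreground", "rgb": [0, 0, 0],
     "label_id": 0, "mapping_class": "foreground"},
    {"seg_class": "background", "rgb": [255, 255, 255],
     "label_id": 1, "mapping_class": "background"},
]


def rgb_to_label(entries):
    """Map each palette RGB triple to its label_id, rejecting duplicate colors."""
    lookup = {}
    for entry in entries:
        rgb = tuple(entry["rgb"])
        if rgb in lookup:
            raise ValueError(f"duplicate rgb {rgb} in palette")
        lookup[rgb] = entry["label_id"]
    return lookup


print(normalize_pixel(0), normalize_pixel(255))  # -1.0 1.0
print(rgb_to_label(palette))
```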
Use the following command to run Segformer engine evaluation:
EVAL_JOB_ID=$(tao-client segformer experiment-run-action --action evaluate --id $EXPERIMENT_ID --parent_job_id $GTE_JOB_ID --specs "$SPECS")
tao deploy segformer evaluate -e /path/to/spec.yaml \
    results_dir=/path/to/results \
    evaluate.trt_engine=/path/to/engine/file
Required Arguments
-e, --experiment_spec: The experiment spec file for evaluation. This should be the same as the tao evaluate spec file.
Optional Arguments
results_dir: The directory where the JSON status-log file and evaluation results will be dumped
evaluate.trt_engine: The engine file for evaluation
Sample Usage
Here’s an example of using the evaluate command to run evaluation with a TensorRT engine:
tao deploy segformer evaluate -e $DEFAULT_SPEC \
results_dir=$RESULTS_DIR \
evaluate.trt_engine=$ENGINE_FILE
Running Inference through TensorRT Engine#
Use the following command to run Segformer engine inference:
INFER_JOB_ID=$(tao-client segformer experiment-run-action --action inference --id $EXPERIMENT_ID --parent_job_id $GTE_JOB_ID --specs "$SPECS")
tao deploy segformer inference -e /path/to/spec.yaml \
    results_dir=/path/to/results \
    inference.trt_engine=/path/to/engine/file
Required Arguments
-e, --experiment_spec: The experiment spec file for inference. This should be the same as the tao inference spec file.
Optional Arguments
results_dir: The directory where the JSON status-log file and inference results will be dumped
inference.trt_engine: The engine file for inference
Sample Usage
For inference, you can reuse the spec file shown in the Running Evaluation through TensorRT Engine section.
Here’s an example of using the inference command to run inference with a TensorRT engine:
tao deploy segformer inference -e $DEFAULT_SPEC \
results_dir=$RESULTS_DIR \
inference.trt_engine=$ENGINE_FILE
The mask overlaid visualization will be stored under $RESULTS_DIR/vis_overlay, and raw predictions in mask format will be stored under $RESULTS_DIR/mask_labels.
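After an inference run, a quick check that both output folders were populated can catch a misconfigured results_dir early. The following is a minimal sketch: inference_outputs is a hypothetical helper, only the two folder names come from the text above, and nothing is assumed about the file names or formats inside them.

```python
from pathlib import Path


def inference_outputs(results_dir):
    """Return sorted file lists for the vis_overlay and mask_labels folders."""
    root = Path(results_dir)
    found = {}
    for name in ("vis_overlay", "mask_labels"):
        folder = root / name
        # A missing folder yields an empty list instead of raising an error.
        if folder.is_dir():
            found[name] = sorted(p for p in folder.iterdir() if p.is_file())
        else:
            found[name] = []
    return found


# Placeholder path; point this at the directory passed as results_dir.
outputs = inference_outputs("/path/to/results")
for name, files in outputs.items():
    print(f"{name}: {len(files)} file(s)")
```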