The SegFormer ONNX file generated by tao export is taken as input to tao-deploy to generate an optimized TensorRT engine. We do not support INT8 precision for SegFormer.
The same spec file used for the tao model segformer export command can be used here.
gen_trt_engine
The gen_trt_engine parameter defines TensorRT engine generation.
gen_trt_engine:
  onnx_file: /path/to/onnx_file
  trt_engine: /path/to/trt_engine
  input_width: 512
  input_height: 512
  tensorrt:
    data_type: FP32
    workspace_size: 1024
    min_batch_size: 1
    opt_batch_size: 1
    max_batch_size: 1
| Parameter | Datatype | Default | Description | Supported Values |
| --- | --- | --- | --- | --- |
| onnx_file | string | | The path to the .onnx model to be converted | |
| trt_engine | string | | The path where the generated TensorRT engine will be stored | |
| input_channel | unsigned int | 3 | The input channel size. Only the value 3 is supported. | 3 |
| input_width | unsigned int | 960 | The input width | >0 |
| input_height | unsigned int | 544 | The input height | >0 |
| batch_size | unsigned int | -1 | The batch size of the ONNX model | >=-1 |
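For instance, a gen_trt_engine section that builds an engine at the default 960x544 resolution with a fixed batch size might look like the following sketch (the values are illustrative):
gen_trt_engine:
  onnx_file: /path/to/onnx_file
  trt_engine: /path/to/trt_engine
  input_channel: 3
  input_width: 960
  input_height: 544
  batch_size: 1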
tensorrt
The tensorrt parameter defines the TensorRT engine build settings.
| Parameter | Datatype | Default | Description | Supported Values |
| --- | --- | --- | --- | --- |
| data_type | string | fp32 | The precision to be used for the TensorRT engine | fp32/fp16 |
| workspace_size | unsigned int | 1024 | The maximum workspace size for the TensorRT engine | >1024 |
| min_batch_size | unsigned int | 1 | The minimum batch size used for the optimization profile shape | >0 |
| opt_batch_size | unsigned int | 1 | The optimal batch size used for the optimization profile shape | >0 |
| max_batch_size | unsigned int | 1 | The maximum batch size used for the optimization profile shape | >0 |
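For example, to build an FP16 engine whose optimization profile accepts batch sizes from 1 to 8 (with 4 as the optimum), the tensorrt section might look like the following sketch (the values are illustrative):
tensorrt:
  data_type: fp16
  workspace_size: 2048
  min_batch_size: 1
  opt_batch_size: 4
  max_batch_size: 8
Note that a min/opt/max profile only takes effect when the ONNX model was exported with a dynamic batch dimension (batch_size: -1 in the table above).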
Use the following command to run SegFormer engine generation:
tao deploy segformer gen_trt_engine -e /path/to/spec.yaml \
-r /path/to/results \
gen_trt_engine.onnx_file=/path/to/onnx/file \
gen_trt_engine.trt_engine=/path/to/engine/file \
gen_trt_engine.tensorrt.data_type=<data_type>
Required Arguments
-e, --experiment_spec: The experiment spec file to set up TensorRT engine generation
Optional Arguments
-r, --results_dir: The directory where the JSON status-log file will be dumped
gen_trt_engine.onnx_file: The .onnx model to be converted
gen_trt_engine.trt_engine: The path where the generated engine will be stored
gen_trt_engine.tensorrt.data_type: The precision to be exported
Sample Usage
Here’s an example of using the gen_trt_engine command to generate an FP16 TensorRT engine:
tao deploy segformer gen_trt_engine -e $DEFAULT_SPEC \
gen_trt_engine.onnx_file=$ONNX_FILE \
gen_trt_engine.trt_engine=$ENGINE_FILE \
gen_trt_engine.tensorrt.data_type=FP16
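As a quick sanity check outside of TAO, you can load and benchmark the generated engine with TensorRT’s trtexec utility, assuming it is available in your environment (it ships with TensorRT):
trtexec --loadEngine=$ENGINE_FILE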
Running Evaluation through TensorRT Engine
The same spec file used for TAO evaluation and inference can be used here. Sample spec file:
model:
  input_height: 512
  input_width: 512
  backbone:
    type: "mit_b1"
dataset:
  img_norm_cfg:
    mean:
      - 127.5
      - 127.5
      - 127.5
    std:
      - 127.5
      - 127.5
      - 127.5
  test_dataset:
    img_dir: /data/images/val
    ann_dir: /data/masks/val
  input_type: "grayscale"
  data_root: /tlt-pytorch
  palette:
    - seg_class: foreground
      rgb:
        - 0
        - 0
        - 0
      label_id: 0
      mapping_class: foreground
    - seg_class: background
      rgb:
        - 255
        - 255
        - 255
      label_id: 1
      mapping_class: background
  batch_size: 1
  workers_per_gpu: 1
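Because the command-line overrides below use the same dotted keys as the spec file, the engine path can equivalently be pinned in the spec itself; a minimal sketch, assuming the evaluate section accepts the same trt_engine key as the override:
evaluate:
  trt_engine: /path/to/engine/file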
Use the following command to run SegFormer engine evaluation:
tao deploy segformer evaluate -e /path/to/spec.yaml \
-r /path/to/results \
evaluate.trt_engine=/path/to/engine/file
Required Arguments
-e, --experiment_spec: The experiment spec file for evaluation. This should be the same as the tao evaluate spec file.
Optional Arguments
-r, --results_dir: The directory where the JSON status-log file and evaluation results will be dumped
evaluate.trt_engine: The engine file for evaluation
Sample Usage
Here’s an example of using the evaluate command to run evaluation with a TensorRT engine:
tao deploy segformer evaluate -e $DEFAULT_SPEC \
-r $RESULTS_DIR \
evaluate.trt_engine=$ENGINE_FILE
Running Inference through TensorRT Engine
Use the following command to run SegFormer engine inference:
tao deploy segformer inference -e /path/to/spec.yaml \
-r /path/to/results \
inference.trt_engine=/path/to/engine/file
Required Arguments
-e, --experiment_spec: The experiment spec file for inference. This should be the same as the tao inference spec file.
Optional Arguments
-r, --results_dir: The directory where the JSON status-log file and inference results will be dumped
inference.trt_engine: The engine file for inference
Sample Usage
For inference, you can reuse the spec file from Running Evaluation through TensorRT Engine above.
Here’s an example of using the inference command to run inference with a TensorRT engine:
tao deploy segformer inference -e $DEFAULT_SPEC \
-r $RESULTS_DIR \
inference.trt_engine=$ENGINE_FILE
The mask-overlaid visualizations are stored under $RESULTS_DIR/vis_overlay, and the raw predictions in mask format are stored under $RESULTS_DIR/mask_labels.
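For example, you can inspect the outputs of a finished run as follows (directory names as above):
ls $RESULTS_DIR/vis_overlay   # overlaid segmentation visualizations
ls $RESULTS_DIR/mask_labels   # raw predicted masks, one per input image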