# Copyright (c) 2019-2020 NVIDIA CORPORATION. All rights reserved.

@page dwx_tensorRT_tool TensorRT Optimizer Tool

@section dwx_tensorRT_tool_description Description

The NVIDIA<sup>®</sup> DriveWorks TensorRT Optimizer Tool optimizes a given Caffe, UFF, or ONNX model using TensorRT.

For specific examples, please refer to the following:

- @ref dwx_tensorRT_tool_examples_uff
- @ref dwx_tensorRT_tool_examples_caffe
- @ref dwx_tensorRT_tool_examples_onnx
@section dwx_tensorRT_tool_prerequisites Prerequisites

This tool is available on NVIDIA DRIVE<sup>™</sup> OS Linux.

This tool creates output files that are placed into the current working directory by default. For convenience, please ensure the following:

- Write permissions are enabled for the current working directory.
- The tools folder is included in the system's binary search path.
- The tool is executed from your home directory.
@section dwx_tensorRT_tool_usage Running the Tool

The TensorRT Optimizer Tool accepts the following parameters. Several of them are required depending on the model type. \n
For more information, please refer to the @ref dwx_tensorRT_tool_examples.

Run the tool by executing:

./tensorRT_optimization --modelType=[uff|caffe|onnx]
--outputBlobs=[<output_blob1>,<output_blob2>,...]
--prototxt=[path to file]
--caffemodel=[path to file]
--uffFile=[path to file]
--inputBlobs=[<input_blob1>,<input_blob2>,...]
--inputDims=[<NxNxN>,<NxNxN>,...]
--onnxFile=[path to file]
[--iterations=[int]]
[--batchSize=[int]]
[--out=[path to file]]
[--calib=[calibration file name]]
[--cudaDevice=[CUDA GPU index]]
[--useDLA=[int]]
[--dlaLayerConfig=[path to json layer config]]
[--pluginConfig=[path to plugin config file]]
[--precisionConfig=[path to precision config file]]
[--testFile=[path to binary file]]
[--workspaceSize=[int]]
[--explicitBatch=[int]]
@subsection dwx_tensorRT_tool_params Parameters

--modelType=[uff|caffe|onnx]
Description: The type of model to convert to a TensorRT network.
Warning: The uff and caffe model types are deprecated and will be removed in the next major release.
--outputBlobs=[<output_blob1>,<output_blob2>,...]
Description: Names of output blobs, separated by commas.
Example: --outputBlobs=bboxes,coverage

--prototxt=[path to file]
Description: Path to the deploy file that describes the Caffe network.
Example: --prototxt=deploy.prototxt

--caffemodel=[path to file]
Description: Path to the Caffe model file containing the weights.
Example: --caffemodel=weights.caffemodel
--uffFile=[path to file]
Description: Path to a UFF file.
Example: --uffFile=~/myNetwork.uff

--inputBlobs=[<input_blob1>,<input_blob2>,...]
Description: Names of input blobs, separated by commas. Ignored if the model is ONNX or Caffe.
Example: --inputBlobs=data0,data1

--inputDims=[<NxNxN>,<NxNxN>,...]
Description: Input dimensions for each input blob, separated by commas and given in the same order as the input blobs. The dimensions of each blob are separated by `x` and given in CHW format.
Example: --inputDims=3x480x960,1x1x10
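As an illustration of the `--inputDims` format, the following sketch (a hypothetical helper, not part of the tool) parses such a value into per-blob CHW dimensions and flattened element counts:

```python
# Hypothetical helper: parses an --inputDims value such as
# "3x480x960,1x1x10" into (C, H, W) tuples plus flattened element counts.
def parse_input_dims(input_dims: str):
    blobs = []
    for blob in input_dims.split(","):
        c, h, w = (int(n) for n in blob.split("x"))
        blobs.append(((c, h, w), c * h * w))
    return blobs

dims = parse_input_dims("3x480x960,1x1x10")
print(dims)  # [((3, 480, 960), 1382400), ((1, 1, 10), 10)]
```

The element counts are handy, for example, when sizing the flattened buffers that a `--testFile` must contain.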
--onnxFile=[path to file]
Description: Path to an ONNX file.
Example: --onnxFile=~/myNetwork.onnx
--iterations=[int]
Description: Number of iterations to run when measuring speed.
This parameter is optional.
Example: --iterations=100
--batchSize=[int]
Description: Batch size of the model to be generated.
This parameter is optional.
Example: --batchSize=2
Description: Runs the network in paired fp16 mode. Requires a platform that supports native fp16.
This parameter is optional.
--out=[path to file]
Description: Name of the optimized model file.
This parameter is optional.
Example: --out=model.bin
Default value: optimized.bin
Description: If specified, runs in INT8 mode.
This parameter is optional.
--calib=[calibration file name]
Description: INT8 calibration file name.
This parameter is optional.
Example: --calib=calib.cache

--cudaDevice=[CUDA GPU index]
Description: Index of a CUDA-capable GPU device.
This parameter is optional.
Example: --cudaDevice=1
Description: Enables TensorRT verbose logging.
This parameter is optional.
--useDLA=[int]
Description: If specified, generates a model to be executed on DLA. This argument is only valid on platforms with DLA hardware.
This parameter is optional.
Description: If specified, generates a model to be executed on DLA in safe mode.
Safe mode requires that all layers be executable on DLA, that the input/output of the DNN module
be provided in the corresponding precision and format, and that the input/output tensors be provided
as NvMediaTensor for best performance.
The `dwDNN` module is capable of streaming NvMediaTensors from/to CUDA and of
converting precisions and formats. For more information, please refer to the `dwDNN` module's documentation.
--dlaLayerConfig=[path to json layer config]
Description: If specified, the layers to be forced to the GPU are read from this JSON file. Layers to run on the GPU can be specified by layer type or by layer number; both can be obtained from the logs by running with the default template. This argument is valid only if --useDLA=1.
This parameter is optional.
Example: --dlaLayerConfig=./template_dlaconfig.json
--pluginConfig=[path to plugin config file]
Description: Path to a plugin configuration file. See template_plugin.json for an example.
This parameter is optional.
Example: --pluginConfig=template_plugin.json
--precisionConfig=[path to precision config file]
Description: Path to a precision configuration file for generating models with mixed precision.
For layers not included in the configuration file, the builder mode determines the precision, and TensorRT may choose any precision for better performance.
If 'output_types' is not provided for a layer, the data type of the output tensors is set to the precision of the layer.
For layers with precision set to INT8, scaling factors for the input/output tensors should be provided.
This file can also be used to set the scaling factors for each tensor by name; values provided in this file override the scaling factors specified in the calibration file (if provided).
See 'template_precision.json' for an example.
This parameter is optional.
Example: --precisionConfig=template_precision.json
--testFile=[path to binary file]
Description: Name of a binary file for model input/output validation. This file should contain
flattened pairs of inputs and expected outputs, in the same order as the TensorRT model expects them. The file is assumed to hold 32-bit floats. The number of test pairs is detected automatically.
This parameter is optional.
Example: Data with two inputs and two outputs would be laid out in the file as follows:
> \[input 1\]\[input 2\]\[output 1\]\[output 2\]\[input 1\]\[input 2\]\[output 1\]\[output 2\]...
Description: If specified, executes the optimized network through a CUDA graph. This helps verify that the optimized network
works with CUDA graph acceleration.
This parameter is optional.
--workspaceSize=[int]
Description: Maximum workspace size in megabytes. This limits the amount of scratch memory that any layer in the network
can use. If insufficient scratch is provided, TensorRT may not be able to find an implementation for a given layer.
This parameter is optional.
--explicitBatch=[int]
Description: Determines whether explicit batch is enabled.
For TensorRT versions 6.3 and higher, this flag is automatically set to 1 if an ONNX model is provided as input.
This parameter is optional.
@section dwx_tensorRT_tool_examples Examples

@subsection dwx_tensorRT_tool_examples_uff Optimizing UFF Models

./tensorRT_optimization --modelType=uff
--outputBlobs=bboxes,coverage
--uffFile=~/myNetwork.uff
--inputBlobs=data0,data1
--inputDims=3x480x960,1x1x10

@subsection dwx_tensorRT_tool_examples_caffe Optimizing Caffe Models

./tensorRT_optimization --modelType=caffe
--outputBlobs=bboxes,coverage
--prototxt=deploy.prototxt
--caffemodel=weights.caffemodel
@note The `--inputBlobs` and `--inputDims` parameters are ignored if you select the Caffe model type. <br>All input blobs are marked as input automatically.
@subsection dwx_tensorRT_tool_examples_onnx Optimizing ONNX Models

./tensorRT_optimization --modelType=onnx
--onnxFile=~/myNetwork.onnx
@note The `--inputBlobs`, `--inputDims`, and `--outputBlobs` parameters are ignored if you select the ONNX model type.<br>All input and output blobs are marked as input or output automatically, respectively.