DriveWorks SDK Reference
3.0.4260 Release
For Test and Development only

# Copyright (c) 2019-2020 NVIDIA CORPORATION. All rights reserved.

@page dwx_tensorRT_tool TensorRT Optimizer Tool
@tableofcontents

@section dwx_tensorRT_tool_description Description

The NVIDIA<sup>&reg;</sup> DriveWorks TensorRT Optimizer Tool optimizes a given Caffe, UFF, or ONNX model using TensorRT.

For specific examples, please refer to the following:
- @ref dwx_tensorRT_tool_examples_uff.
- @ref dwx_tensorRT_tool_examples_caffe.
- @ref dwx_tensorRT_tool_examples_onnx.

@note SW Release Applicability: This tool is available in both <b>NVIDIA DriveWorks</b> and <b>NVIDIA DRIVE Software</b> releases.

@section dwx_tensorRT_tool_prerequisites Prerequisites

This tool is available on NVIDIA DRIVE<sup>&trade;</sup> OS Linux.

By default, this tool writes its output files to the current working directory. For your convenience, ensure that:
- Write permissions are enabled for the current working directory.
- The tools folder is included in the system's binary search path.
- The tool is executed from your home directory.

@section dwx_tensorRT_tool_usage Running the Tool

The TensorRT Optimizer Tool accepts the following parameters. Several of them are required, depending on the model type. \n
For more information, please refer to the @ref dwx_tensorRT_tool_examples.

Run the tool by executing:

    ./tensorRT_optimization --modelType=[uff|caffe|onnx]
                            --outputBlobs=[<output_blob1>,<output_blob2>,...]
                            --prototxt=[path to file]
                            --caffemodel=[path to file]
                            --uffFile=[path to file]
                            --inputBlobs=[<input_blob1>,<input_blob2>,...]
                            --inputDims=[<NxNxN>,<NxNxN>,...]
                            --onnxFile=[path to file]
                            [--iterations=[int]]
                            [--batchSize=[int]]
                            [--half2=[int]]
                            [--out=[path to file]]
                            [--int8]
                            [--calib=[calibration file name]]
                            [--cudaDevice=[CUDA GPU index]]
                            [--verbose=[int]]
                            [--useDLA]
                            [--dlaLayerConfig=[path to json layer config]]
                            [--pluginConfig=[path to plugin config file]]
                            [--precisionConfig=[path to precision config file]]
                            [--testFile=[path to binary file]]
                            [--useGraph]
                            [--workspaceSize=[int]]
                            [--refitting]
                            [--zeroWeight]

@subsection dwx_tensorRT_tool_params Parameters

    --modelType=[uff|caffe|onnx]
    Description: The type of model to be converted to a TensorRT network.

    --outputBlobs=[<output_blob1>,<output_blob2>,...]
    Description: Names of the output blobs, separated by commas.
    Example: --outputBlobs=bboxes,coverage

    --prototxt=[path to file]
    Description: Path to the deploy file that describes the Caffe network.
    Example: --prototxt=deploy.prototxt

    --caffemodel=[path to file]
    Description: Caffe model file containing the weights.
    Example: --caffemodel=weights.caffemodel
74 
    --uffFile=[path to file]
    Description: Path to a UFF file.
    Example: --uffFile=~/myNetwork.uff

    --inputBlobs=[<input_blob1>,<input_blob2>,...]
    Description: Names of the input blobs, separated by commas.
    Example: --inputBlobs=data0,data1

    --inputDims=[<NxNxN>,<NxNxN>,...]
    Description: Input dimensions for each input blob, separated by commas and given in the
    same order as the input blobs. Dimensions within a blob are separated by `x` and given
    in CHW format.
    Example: --inputDims=3x480x960,1x1x10
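For illustration, the `--inputDims` value is a comma-separated list of `x`-separated CHW triples. The following minimal Python sketch shows how such a string decomposes; the `parse_input_dims` helper is hypothetical and not part of the tool.

```python
def parse_input_dims(arg: str):
    """Parse a --inputDims value such as "3x480x960,1x1x10" into
    (channels, height, width) tuples, one per input blob."""
    dims = []
    for blob_dims in arg.split(","):                      # one entry per input blob
        c, h, w = (int(n) for n in blob_dims.split("x"))  # CHW order
        dims.append((c, h, w))
    return dims

# Each tuple lines up with the blob at the same position in --inputBlobs.
print(parse_input_dims("3x480x960,1x1x10"))  # [(3, 480, 960), (1, 1, 10)]
```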

    --onnxFile=[path to file]
    Description: Path to an ONNX file.
    Example: --onnxFile=~/myNetwork.onnx

    --iterations=[int]
    Description: Number of iterations to run to measure speed.
    This parameter is optional.
    Example: --iterations=100
    Default value: 10

    --batchSize=[int]
    Description: Batch size of the model to be generated.
    This parameter is optional.
    Example: --batchSize=2
    Default value: 1

    --half2=[int]
    Description: Runs the network in paired fp16 mode. Requires a platform with native fp16 support.
    This parameter is optional.
    Example: --half2=1
    Default value: 0

    --out=[path to file]
    Description: Name of the optimized model file.
    This parameter is optional.
    Example: --out=model.bin
    Default value: optimized.bin

    --int8
    Description: If specified, runs in INT8 mode.
    This parameter is optional.

    --calib=[calibration file name]
    Description: INT8 calibration file name.
    This parameter is optional.
    Example: --calib=calib.cache

    --cudaDevice=[CUDA GPU index]
    Description: Index of a CUDA-capable GPU device.
    This parameter is optional.
    Example: --cudaDevice=1
    Default value: 0

    --verbose=[int]
    Description: Enables TensorRT verbose logging.
    This parameter is optional.
    Default value: 0

    --useDLA
    Description: If specified, generates a model to be executed on DLA. This argument is only
    valid on platforms with DLA hardware.
    This parameter is optional.

    --dlaLayerConfig=[path to json layer config]
    Description: If specified, the specific layers to be forced to the GPU are read from this
    JSON file. Layers to be run on the GPU can be specified by layer type or layer number; both
    can be obtained from the logs by running with the default template. This argument is only
    valid if --useDLA is specified.
    This parameter is optional.
    Example: --dlaLayerConfig=./template_dlaconfig.json

    --pluginConfig=[path to plugin config file]
    Description: Path to a plugin configuration file. See template_plugin.json for an example.
    This parameter is optional.
    Example: --pluginConfig=template_plugin.json

    --precisionConfig=[path to precision config file]
    Description: Path to a precision configuration file for generating models with mixed
    precision. For layers not included in the configuration file, the builder mode determines
    the precision, and TensorRT may choose any precision for better performance. If
    'output_types' is not provided for a layer, the data type of its output tensors is set to
    the precision of the layer. For layers with precision set to INT8, scaling factors for the
    input/output tensors should be provided. This file can also be used to set the scaling
    factors for each tensor by name; values provided here override the scaling factors
    specified in the calibration file (if provided). See 'template_precision.json' for an
    example.
    This parameter is optional.
    Example: --precisionConfig=template_precision.json

    --testFile=[path to binary file]
    Description: Name of a binary file for model input/output validation. This file should
    contain flattened pairs of inputs and expected outputs, in the same order as the TensorRT
    model expects. The file is assumed to hold 32-bit floats. The number of test pairs is
    detected automatically.
    This parameter is optional.
    Example: Data with two inputs and two outputs would have the following layout in the file:
    > \[input 1\]\[input 2\]\[output 1\]\[output 2\]\[input 1\]\[input 2\]\[output 1\]\[output 2\]...

    --useGraph
    Description: If specified, executes the optimized network through a CUDA graph. This helps
    verify that the optimized network works with CUDA graph acceleration.
    This parameter is optional.

    --workspaceSize=[int]
    Description: Maximum workspace size in megabytes, limiting the scratch memory that any
    layer in the network can use. If insufficient scratch is provided, TensorRT may not be
    able to find an implementation for a given layer.
    This parameter is optional.
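For sizing intuition, the megabyte value corresponds to a byte budget for the builder's per-layer scratch memory. A minimal sketch, assuming 1 MB here means 2^20 bytes (the convention TensorRT samples typically use; the helper name is illustrative):

```python
def workspace_bytes(megabytes: int) -> int:
    """Convert a --workspaceSize value (MB) to bytes, assuming 1 MB = 2**20 bytes."""
    return megabytes << 20

# e.g. --workspaceSize=16 would allow each layer up to 16 MiB of scratch memory.
print(workspace_bytes(16))  # 16777216
```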
@section dwx_tensorRT_tool_examples Examples

@subsection dwx_tensorRT_tool_examples_uff Optimizing UFF Models

    ./tensorRT_optimization --modelType=uff
                            --outputBlobs=bboxes,coverage
                            --uffFile=~/myNetwork.uff
                            --inputBlobs=data0,data1
                            --inputDims=3x480x960,1x1x10

@subsection dwx_tensorRT_tool_examples_caffe Optimizing Caffe Models

    ./tensorRT_optimization --modelType=caffe
                            --outputBlobs=bboxes,coverage
                            --prototxt=deploy.prototxt
                            --caffemodel=weights.caffemodel

@note The `--inputBlobs` and `--inputDims` parameters are ignored if you select the Caffe model type. <br>All input blobs are automatically marked as input.

@subsection dwx_tensorRT_tool_examples_onnx Optimizing ONNX Models

    ./tensorRT_optimization --modelType=onnx
                            --onnxFile=~/myNetwork.onnx

@note The `--inputBlobs`, `--inputDims`, and `--outputBlobs` parameters are ignored if you select the ONNX model type.<br>All input and output blobs are automatically marked as input or output, respectively.