.. _exporting_the_model:

Exporting the Model
===================

The Transfer Learning Toolkit includes the :code:`tlt-export` command to export and prepare TLT
models for :ref:`Deploying to DeepStream`. The :code:`tlt-export` command optionally generates
the calibration cache for TensorRT INT8 engine calibration.

Exporting the model decouples the training process from inference and allows conversion to
TensorRT engines outside the TLT environment. TensorRT engines are specific to each hardware
configuration and should be generated for each unique inference environment; the engine is
interchangeably referred to as the :code:`.trt` or :code:`.engine` file. The exported TLT model,
referred to as the :code:`.etlt` (encrypted TLT) file, may be used universally across training
and deployment hardware. During model export, the TLT model is encrypted with a private key.
This key is required when you deploy the model for inference.

INT8 Mode Overview
^^^^^^^^^^^^^^^^^^

TensorRT engines can be generated in INT8 mode to improve performance, but require a calibration
cache at engine creation time. The calibration cache is generated using a calibration tensorfile
when :code:`tlt-export` is run with the :code:`--data_type` flag set to :code:`int8`.
Pre-generating the calibration information and caching it removes the need for calibrating the
model on the inference machine. Moving the calibration cache is usually much more convenient than
moving the calibration tensorfile, since it is a much smaller file and can be moved with the
exported model. Using the calibration cache also speeds up engine creation, as building the cache
can take several minutes depending on the size of the tensorfile and the model itself.

The export tool can generate the INT8 calibration cache by ingesting training data using one of
these options:

* **Option 1**: Provide a calibration tensorfile generated using the
  :code:`tlt-int8-tensorfile` command. For image classification, and for detection using
  DetectNet_v2, SSD, and DSSD, this option is recommended because the
  :code:`tlt-int8-tensorfile` command uses the data generators to produce the training data,
  which makes it easy to generate a representative subsample of the training dataset.

* **Option 2**: Point the tool to a directory of images to use for calibrating the model. For
  this option, make sure to create a sub-sampled directory of random images that best represents
  your training dataset. For the FasterRCNN, YOLOv3, and RetinaNet detection architectures, only
  Option 2 is supported.

* **Option 3**: Use the training data loader to load the training images for INT8 calibration.
  This option is supported for DetectNet_v2 and FasterRCNN, and is now the recommended approach
  because it supports multiple image directories by leveraging the training dataset loader. It
  also ensures two important aspects of the data during calibration:

  * Data pre-processing in the INT8 calibration step is the same as in the training process.
  * The data batches are sampled randomly across the entire training dataset, thereby improving
    the accuracy of the INT8 model.

NVIDIA plans to eventually deprecate Option 1 and support only Options 2 and 3.

.. image:: ../content/tlt_export_int8_cali.png

FP16/FP32 Model
^^^^^^^^^^^^^^^

The :code:`calibration.bin` file is only required if you need to run inference at INT8 precision.
For FP16/FP32-based inference, the export step is much simpler: all that is required is to
provide a model from the :code:`tlt-train` step to :code:`tlt-export` to convert it into an
encrypted TLT model.

.. image:: ../content/tlt_export_int8_cali2.png

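As a minimal sketch of an FP16 export, assuming a pruned and retrained DetectNet_v2 model (the
model path, key, and output file names below are placeholders and should be adapted to your own
experiment), the command might look like this:

.. code::

   tlt-export detectnet_v2 \
              -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
              -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector_fp16.etlt \
              -k $KEY \
              --data_type fp16
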
Generating an INT8 tensorfile Using the tlt-int8-tensorfile Command
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The INT8 tensorfile is a binary file that contains the preprocessed training samples, which may
be used to calibrate the model. In this release, TLT only supports calibration tensorfile
generation for SSD, DSSD, DetectNet_v2, and classification models.

The sample usage for the :code:`tlt-int8-tensorfile` command to generate a calibration
tensorfile is shown below:

.. code::

   tlt-int8-tensorfile {classification, detectnet_v2} [-h]
                        -e <path to experiment spec file>
                        -o <path to output tensorfile>
                        -m <number of batches to serialize>
                        [--use_validation_set]

Positional Arguments
********************

* :code:`classification`
* :code:`detectnet_v2`
* :code:`ssd`
* :code:`dssd`

Required Arguments
******************

* :code:`-e, --experiment_spec_file`: Path to the experiment spec file (only required for SSD and FasterRCNN)
* :code:`-o, --output_path`: Path to the output tensorfile that will be created
* :code:`-m, --max_batches`: Number of batches of input data to be serialized

Optional Argument
*****************

* :code:`--use_validation_set`: Flag to use the validation dataset instead of the training set.

Here's a sample command to invoke the :code:`tlt-int8-tensorfile` command for a classification
model:

.. code::

   tlt-int8-tensorfile classification -e $SPECS_DIR/classification_retrain_spec.cfg \
                                      -m 10 \
                                      -o $USER_EXPERIMENT_DIR/export/calibration.tensor

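Similarly, a sketch of the same step for a DetectNet_v2 model is shown below; the spec file name
and output path are placeholders for your own experiment files:

.. code::

   tlt-int8-tensorfile detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_spec.txt \
                                    -m 10 \
                                    -o $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor
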
Exporting the Model Using tlt-export
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here's an example of the command line arguments of the :code:`tlt-export` command:

.. code::

   tlt-export [-h] {classification, detectnet_v2, ssd, dssd, faster_rcnn, yolo, retinanet}
               -m <path to the .tlt model file>
               -k <encryption key>
               [-o <path to output .etlt file>]
               [--cal_data_file <path to calibration tensorfile>]
               [--cal_image_dir <path to directory of calibration images>]
               [--data_type <fp32, fp16 or int8>]
               [--batches <number of batches for calibration>]
               [--max_batch_size <maximum TensorRT engine batch size>]
               [--max_workspace_size <maximum TensorRT workspace size>]
               [--experiment_spec <path to experiment spec file>]
               [--engine_file <path to TensorRT engine file>]
               [--verbose Verbosity of the logger]
               [--force_ptq Flag to force PTQ]

Required Arguments
******************

* :code:`export_module`: Which model to export; can be :code:`classification`,
  :code:`detectnet_v2`, :code:`faster_rcnn`, :code:`ssd`, :code:`dssd`, :code:`yolo`, or
  :code:`retinanet`. This is a positional argument.
* :code:`-m, --model`: Path to the :code:`.tlt` model file to be exported using :code:`tlt-export`.
* :code:`-k, --key`: Key used to save the :code:`.tlt` model file.
* :code:`-e, --experiment_spec`: Path to the spec file. This argument is required for
  faster_rcnn, ssd, dssd, yolo, and retinanet.

Optional Arguments
******************

* :code:`-o, --output_file`: Path to save the exported model to. The default is
  :code:`./<input_file>.etlt`.
* :code:`--data_type`: Desired engine data type. Generates a calibration cache if in INT8 mode.
  The options are {fp32, fp16, int8}. The default value is fp32. If using int8, the INT8
  arguments listed below are required.
* :code:`-s, --strict_type_constraints`: A Boolean flag to indicate whether or not to apply the
  TensorRT strict_type_constraints when building the TensorRT engine. Note this is only for
  applying the strict type of INT8 mode.

INT8 Export Mode Required Arguments
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* :code:`--cal_data_file`: Tensorfile generated by :code:`tlt-int8-tensorfile` for calibrating
  the engine. This can also be an output file if used with :code:`--cal_image_dir`.
* :code:`--cal_image_dir`: Directory of images to use for calibration.

.. Note:: The :code:`--cal_image_dir` parameter points to a directory of images, and
   :code:`tlt-export` applies the necessary preprocessing to generate a tensorfile at the path
   mentioned in the :code:`--cal_data_file` parameter, which is in turn used for calibration.
   The number of batches in the generated tensorfile is obtained from the value set to the
   :code:`--batches` parameter, and the :code:`batch_size` is obtained from the value set to the
   :code:`--batch_size` parameter. Be sure that the directory mentioned in
   :code:`--cal_image_dir` contains at least :code:`batch_size * batches` images. The valid
   image extensions are .jpg, .jpeg, and .png. In this case, the :code:`input_dimensions` of the
   calibration tensors are derived from the input layer of the :code:`.tlt` model.

INT8 Export Optional Arguments
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* :code:`--cal_cache_file`: Path to save the calibration cache file. The default value is :code:`./cal.bin`.
* :code:`--batches`: Number of batches to use for calibration and inference testing. The default value is 10.
* :code:`--batch_size`: Batch size to use for calibration. The default value is 8.
* :code:`--max_batch_size`: Maximum batch size of the TensorRT engine. The default value is 16.
* :code:`--max_workspace_size`: Maximum workspace size of the TensorRT engine. The default value
  is 1073741824 (1 << 30).
* :code:`--experiment_spec`: The experiment_spec for training/inference/evaluation. This is used
  to generate the graphsurgeon config script for FasterRCNN from the experiment_spec (only
  useful for FasterRCNN). When used with DetectNet_v2 and FasterRCNN, this also sets up the
  dataloader-based calibrator to leverage the training dataloader to calibrate the model.
* :code:`--engine_file`: Path to the serialized TensorRT engine file. Note that this file is
  hardware specific and cannot be generalized across GPUs. It is useful for quickly testing your
  model accuracy using TensorRT on the host. Because the TensorRT engine file is hardware
  specific, you cannot use it for deployment unless the deployment GPU is identical to the
  training GPU.
* :code:`--force_ptq`: A Boolean flag to force post-training quantization on the exported
  :code:`.etlt` model.

.. Note:: When exporting a model trained with QAT enabled, the tensor scale factors to calibrate
   the activations are peeled out of the model and serialized to a TensorRT readable cache file
   defined by the :code:`cal_cache_file` argument. However, note that the current version of QAT
   doesn't natively support INT8 deployment on the Jetson DLA. In order to deploy this model on
   a Jetson with DLA :code:`int8`, use the :code:`--force_ptq` flag so that TensorRT
   post-training quantization is used to generate the calibration cache file.

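As a hedged sketch of that workflow (the model name, calibration image directory, and output
paths below are placeholders, and the command assumes a DetectNet_v2 model trained with QAT
enabled), forcing post-training quantization during export might look like this:

.. code::

   tlt-export detectnet_v2 \
              -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned_qat.tlt \
              -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector_qat.etlt \
              -k $KEY \
              --data_type int8 \
              --cal_image_dir $USER_EXPERIMENT_DIR/data/training/image_2 \
              --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration_qat.tensor \
              --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration_qat.bin \
              --batch_size 8 \
              --batches 10 \
              --force_ptq
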
Exporting a Model
^^^^^^^^^^^^^^^^^

Here's a sample command to export a DetectNet_v2 model in INT8 mode. This command uses Option 1:
the :code:`--cal_data_file` option with the :code:`calibration.tensor` file generated using the
:code:`tlt-int8-tensorfile` command.

.. code::

   tlt-export detectnet_v2 \
              -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
              -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
              -k $KEY \
              --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor \
              --data_type int8 \
              --batches 10 \
              --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
              --engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet_18.engine

Here's an example log of a successful export:

.. code::

   Using TensorFlow backend.
   2018-11-02 18:59:43,347 [INFO] iva.common.tlt-export: Loading model from resnet10_kitti_multiclass_v1.tlt
   2018-11-02 18:59:47,572 [INFO] tensorflow: Restoring parameters from /tmp/tmp8crUBp.ckpt
   INFO:tensorflow:Froze 82 variables.
   2018-11-02 18:59:47,701 [INFO] tensorflow: Froze 82 variables.
   Converted 82 variables to const ops.
   2018-11-02 18:59:48,123 [INFO] iva.common.tlt-export: Converted model was saved into resnet10_kitti_multiclass_v1.etlt
   2018-11-02 18:59:48,123 [INFO] iva.common.tlt-export: Input node: input_1
   2018-11-02 18:59:48,124 [INFO] iva.common.tlt-export: Output node(s): ['output_bbox/BiasAdd', 'output_cov/Sigmoid']

Here's a sample command using the :code:`--cal_image_dir` option (Option 2) for a FasterRCNN
model:

.. code::

   tlt-export faster_rcnn \
              -m $USER_EXPERIMENT_DIR/data/faster_rcnn/frcnn_kitti_retrain.epoch12.tlt \
              -o $USER_EXPERIMENT_DIR/data/faster_rcnn/frcnn_kitti_retrain.int8.etlt \
              -e $SPECS_DIR/frcnn_kitti_retrain_spec.txt \
              --key $KEY \
              --cal_image_dir $USER_EXPERIMENT_DIR/data/KITTI/val/image_2 \
              --data_type int8 \
              --batch_size 8 \
              --batches 10 \
              --cal_data_file $USER_EXPERIMENT_DIR/data/faster_rcnn/cal.tensorfile \
              --cal_cache_file $USER_EXPERIMENT_DIR/data/faster_rcnn/cal.bin \
              --engine_file $USER_EXPERIMENT_DIR/data/faster_rcnn/detection.trt

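Finally, for Option 3 (dataloader-based calibration, described in the INT8 Mode Overview above),
the following is a hedged sketch for a DetectNet_v2 model. It assumes that passing
:code:`--experiment_spec` without a calibration tensorfile or image directory lets the exporter
pull calibration batches from the training dataloader; the model path, spec file, and output
locations are placeholders:

.. code::

   tlt-export detectnet_v2 \
              -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
              -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
              -k $KEY \
              -e $SPECS_DIR/detectnet_v2_retrain_spec.txt \
              --data_type int8 \
              --batch_size 8 \
              --batches 10 \
              --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin
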