Evaluating the Model
====================

.. _evaluating_the_model:

Once the model has been trained, using the experiment config file, and by following the steps to
train a model, the next step would be to evaluate this model on a test set to measure the
accuracy of the model. The TLT toolkit includes the :code:`tlt-evaluate` command to do this.

The classification app computes evaluation loss, Top-k accuracy, precision and recall as metrics.
Meanwhile, :code:`tlt-evaluate` for DetectNet_v2, FasterRCNN, Retinanet, DSSD, YOLOV3, and SSD
computes the Average Precision per class and the mean Average Precision metrics as defined
in the Pascal VOC challenge. Both sample and integrate modes are supported to calculate
average precision. The former was used in VOC challenges before 2010 while the latter was used
from 2010 onwards. The SAMPLE mode uses an 11-point method to compute the AP, while the INTEGRATE
mode uses a more fine-grained integration method and gets a more accurate number of AP. MaskRCNN
reports COCO’s `detection evaluation metrics`_. AP50 in COCO metrics is comparable to mAP in
Pascal VOC metrics.

.. _detection evaluation metrics: https://cocodataset.org/#detection-eval

When training is complete, the model is stored in the output directory of your choice in
$OUTPUT_DIR. Evaluate a model using the :code:`tlt-evaluate` command:

.. code::

    tlt-evaluate {classification,detectnet_v2,faster_rcnn,ssd,dssd,retinanet,yolo, mask_rcnn} [-h] [<arguments for classification/detectnet_v2/faster_rcnn/ssd/dssd/retinanet/yolo, mask_rcnn>]

**Required Arguments**

.. code::

   {classification, detectnet_v2, faster_rcnn, ssd, dssd, retinanet, yolo,
         mask_rcnn}

Choose whether you are evaluating a :code:`classification`, :code:`detectnet_v2`, :code:`ssd`,
:code:`dssd`, :code:`yolo`, :code:`retinanet`, :code:`faster_rcnn`, or :code:`mask_rcnn model`.

**Optional Arguments**

These arguments vary depending upon Classification, DetectNet_v2, SSD, DSSD, RetinaNet, YOLOv3,
FasterRCNN and MaskRCNN models.

Evaluating a Classification Model
---------------------------------

Execute :code:`tlt-evaluate` on a classification model.

.. code::

    tlt-evaluate classification [-h] -e <experiment_spec_file> -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: Path to the experiment spec file.
* :code:`-k, –key`: Provide the encryption key to decrypt the model.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Show this help message and exit.

If you followed the example in Training a classification model, you can run the evaluation:

.. code::

    tlt-evaluate classification -e classification_spec.cfg -k $YOUR_KEY

TLT evaluate for classification produces the following metrics:

* Loss
* Top-K accuracy
* Precision (P): TP / (TP + FP)
* Recall (R): TP / (TP + FN)
* Confusion Matrix

Evaluating a DetectNet_v2 Model
-------------------------------

Execute :code:`tlt-evaluate` on a DetectNet_v2 model.

.. code::

        tlt-evaluate detectnet_v2 [-h] -e <experiment_spec 
                                       -m <model_file>
                                       -k <key>
                                       [--use_training_set]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: Experiment spec file to set up the evaluation experiment.
  This should be the same as training spec file.
* :code:`-m, --model`: Path to the model file to use for evaluation. This could be a
  :code:`.tlt` model file or a tensorrt engine generated using the tlt-export tool.
* :code:`-k, -–key`: Provide the encryption key to decrypt the model. This is a required argument
  only with a :code:`.tlt` model file.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: show this help message and exit.
* :code:`-f, --framework`: the framework to use when running evaluation
  (choices: “tlt”, “tensorrt”). By default the framework is set to TensorRT.
* :code:`--use_training_set`: Set this flag to run evaluation on training + validation dataset.

If you have followed the example in :ref:`Training a Detection Model
<training_a_detectnet_v2_model>`, you may now evaluate the model using the following command:

.. code::

        tlt-evaluate detectnet_v2 -e <path to training spec file>
                                  -m <path to the model> 
                                  -k <key to load the model>

.. Note:: This command runs evaluation on the same validation set that was used during training.

Use these steps to evaluate on a test set with ground truth labeled:

1. Create tfrecords for this training set by following the steps listed in the data input section.
2. Update the dataloader configuration part of the training spec file to include the newly
   generated tfrecords. For more information on the dataset config, please refer to Create
   an experiment spec file. You may create the tfrecords with any partition mode
   (sequence/random). The evaluate tool iterates through all the folds in the tfrecords patterns
   mentioned in the validation_data_source.

.. code::

        dataset_config {
          data_sources: {
            tfrecords_path: "<path to training tfrecords root>/<tfrecords_name*>"
            image_directory_path: "<path to training data root>"
          }
          image_extension: "jpg"
          target_class_mapping {
              key: "car"
              value: "car"
          }
          target_class_mapping {
              key: "automobile"
              value: "car"
          }
          ..
          ..
          ..
          target_class_mapping {
              key: "person"
              value: "pedestrian"
          }
          target_class_mapping {
              key: "rider"
              value: "cyclist"
          }
          validation_data_source: {
            tfrecords_path: "<path to testing tfrecords root>/<tfrecords_name*>"
            image_directory_path: "<path to testing data root>"
          }
        }

The rest of the experiment spec file remains the same as the training spec file.

Evaluating a FasterRCNN Model
-----------------------------

To run evaluation for a faster_rcnn model use this command:

.. code::

        tlt-evaluate faster_rcnn [-h] -e <experiment_spec> 
                                      [-k <enc_key>]

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: Experiment spec file to set up the evaluation experiment.
  This should be the same as a training spec file.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: show this help message and exit.
* :code:`-k, --enc_key`：The encoding key, can override the one in the spec file.

Evaluation Metrics
^^^^^^^^^^^^^^^^^^

For FasterRCNN, the evaluation will print out 4 metrics for the evaluated model:
AP(average precision), precision, recall and RPN_recall for each class in the evaluation dataset.
inally, it will also print the mAP(mean average precision) as a single metric number. Two
modes are supported for computing the AP, i.e., the PASCAL VOC 2007 and 2012 metrics. This
can be configured in the spec file's :code:`evaluation_config.use_voc_11_point_metric` parameter.
If this parameter is set to True, then AP calculation will use VOC 2007 method, otherwise it will
use VOC 2012 method.

The RPN_recall metric indicates the recall capability of the RPN of the
FasterRCNN model. The higher the RPN_recall metric, it means RPN can better detect an object as
foreground(but it doesn't say anything on which class this object belongs to since that is
delegated to RCNN). The RPN_recall metric is mainly used for debugging on the accuracy issue
of a FasterRCNN model.

Two Modes for tlt-evaluate
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :code:`tlt-evaluate` command line for FasterRCNN has two modes. It can run with either
TLT backend or TensorRT backend. This behavior is also controlled via the spec file. The
:code:`evaluation_config` in the spec file can have an optional :code:`trt_evaluation` sub-field
that specifies which backend the :code:`tlt-evaluate` will run with.

By default (if the :code:`trt_evaluation` sub-field is not present in :code:`evaluation_config)`,
:code:`tlt-evaluate` will use TLT as the backend. If the :code:`trt_evaluation` sub-field
is present, it can specify :code:`tlt-evaluate` to run at TensorRT backend. In that case,
the model to do inference can be either the .etlt model from :code:`tlt-export` or the
TensorRT engine file from :code:`tlt-export` or :code:`tlt-converter`.

To use a TensorRT engine file for TensorRT backend based :code:`tlt-evaluate`, the
:code:`trt_evaluation` sub-field should look like this:

.. code::

        trt_evaluation {
        trt_engine: '/workspace/tlt-experiments/data/faster_rcnn/trt.int8.engine'
        max_workspace_size_MB: 2000
        }

To use a :code:`.etlt` model for TensorRT backend based :code:`tlt-evaluate`, the
:code:`trt_evaluation` sub-field should look like this:

.. code::

        trt_evaluation {
        etlt_model {
        model: '/workspace/tlt-experiments/data/faster_rcnn/resnet18.epoch12.etlt'
        calibration_cache: '/workspace/tlt-experiments/data/faster_rcnn/cal.bin'
        }
        trt_data_type: 'int8'
        max_workspace_size_MB: 2000
        }

If the TensorRT inference data type is not INT8, the :code:`calibration_cache` sub-field that
provides the path to the INT8 calibration cache is not needed. In INT8 case, the calibration
cache should be generated via the tlt-export command line in INT8 mode. See also the
documentation of FasterRCNN spec file for the details of the :code:`trt_evaluation` message
structure.

Evaluating an SSD Model
-----------------------

To run evaluation for an SSD model use this command:

.. code::

    tlt-evaluate ssd [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_fil`: Experiment spec file to set up the evaluation experiment.
  This should be the same as the training specification file.
* :code:`-m, --model`: Path to the model file to use for evaluation.
* :code:`-k, --key`: Provide the key to load the model.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: show this help message and exit.

Evaluating a DSSD model
-----------------------

To run evaluation for an DSSD model use this command:

.. code::

    tlt-evaluate ssd [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: Experiment spec file to set up the evaluation experiment. This should be the same as training spec file.
* :code:`-m, --model`: Path to the model file to use for evaluation.
* :code:`-k, --key`: Provide the key to load the model.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Show this help message and exit.

Evaluating a YOLOv3 Model
-------------------------

To run evaluation for a YOLOv3 model use this command:

.. code::

    tlt-evaluate yolo [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: Experiment spec file to set up the evaluation experiment. This should be the same as the training specification file.
* :code:`-m, --model`: Path to the model file to use for evaluation.
* :code:`-k, --key`: Provide the key to load the model.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: show this help message and exit.

Evaluating a RetinaNet Model
----------------------------

To run evaluation for a RetinaNet model use this command:

.. code::

    tlt-evaluate retinanet [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: Experiment spec file to set up the evaluation experiment. This should be the same as the training specification file.
* :code:`-m, --model`: Path to the model file to use for evaluation.
* :code:`-k, --key`: Provide the key to load the model.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: Show this help message and exit.

Evaluating a MaskRCNN Model
---------------------------

To run evaluation for a MaskRCNN model use this command:

.. code::

    tlt-evaluate mask_rcnn [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-e, --experiment_spec_file`: Experiment spec file to set up the evaluation experiment.
  This should be the same as the training spec file.
* :code:`-m, --model`: Path to the model file to use for evaluation.
* :code:`-k, --key`: Provide the key to load the model. This argument is not required if :code:`-m`
  is followed by a TensorRT engine.

Optional Arguments
^^^^^^^^^^^^^^^^^^

* :code:`-h, --help`: show this help message and exit.