Evaluating the Model

Once the model has been trained using the experiment config file and the steps described in the training section, the next step is to evaluate the model on a test set to measure its accuracy. The TLT toolkit includes the tlt-evaluate command for this purpose.

The classification app computes evaluation loss, Top-k accuracy, precision, and recall as metrics. For DetectNet_v2, FasterRCNN, RetinaNet, DSSD, YOLOv3, and SSD, tlt-evaluate computes the Average Precision per class and the mean Average Precision (mAP) as defined in the Pascal VOC challenge. Both sample and integrate modes are supported for calculating average precision: the former was used in VOC challenges before 2010, while the latter has been used from 2010 onwards. The SAMPLE mode computes AP with an 11-point method, while the INTEGRATE mode uses a finer-grained integration and yields a more accurate AP value. MaskRCNN reports COCO's detection evaluation metrics; AP50 in the COCO metrics is comparable to mAP in the Pascal VOC metrics.
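
As a quick illustration of the two AP modes, the following Python sketch (not the TLT implementation; it just applies the standard VOC-style formulas to a toy precision/recall curve) computes AP with both the 11-point SAMPLE method and the INTEGRATE method:

import numpy as np

def ap_sample(recall, precision):
    # SAMPLE (VOC 2007): average the best precision at 11 fixed recall points
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        ap += (precision[mask].max() if mask.any() else 0.0) / 11.0
    return ap

def ap_integrate(recall, precision):
    # INTEGRATE (VOC 2010+): area under the monotonically decreasing precision envelope
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

# Toy precision/recall points sorted by increasing recall
recall = np.array([0.1, 0.4, 0.7, 0.9])
precision = np.array([1.0, 0.8, 0.6, 0.5])
print(ap_sample(recall, precision), ap_integrate(recall, precision))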

When training is complete, the model is stored in the output directory of your choice, $OUTPUT_DIR. Evaluate a model using the tlt-evaluate command:

tlt-evaluate {classification,detectnet_v2,faster_rcnn,ssd,dssd,retinanet,yolo,mask_rcnn} [-h] [<arguments for classification/detectnet_v2/faster_rcnn/ssd/dssd/retinanet/yolo/mask_rcnn>]

Required Arguments

{classification, detectnet_v2, faster_rcnn, ssd, dssd, retinanet, yolo, mask_rcnn}

Choose whether you are evaluating a classification, detectnet_v2, ssd, dssd, yolo, retinanet, faster_rcnn, or mask_rcnn model.

Optional Arguments

These arguments vary depending on whether you are evaluating a Classification, DetectNet_v2, SSD, DSSD, RetinaNet, YOLOv3, FasterRCNN, or MaskRCNN model.

Execute tlt-evaluate on a classification model.

tlt-evaluate classification [-h] -e <experiment_spec_file> -k <key>

Required Arguments

  • -e, --experiment_spec_file: Path to the experiment spec file.

  • -k, --key: Provide the encryption key to decrypt the model.

Optional Arguments

  • -h, --help: Show this help message and exit.

If you followed the example in Training a classification model, you can run the evaluation:

tlt-evaluate classification -e classification_spec.cfg -k $YOUR_KEY

tlt-evaluate for classification produces the following metrics (a short sketch after the list shows how the precision and recall values follow from the confusion matrix):

  • Loss

  • Top-K accuracy

  • Precision (P): TP / (TP + FP)

  • Recall (R): TP / (TP + FN)

  • Confusion Matrix
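
As a minimal sketch of how these metrics relate to each other (this is not the TLT code path), the per-class precision and recall can be derived directly from the confusion matrix:

import numpy as np

def per_class_metrics(confusion):
    # confusion[i, j] = number of samples with true class i predicted as class j
    tp = np.diag(confusion).astype(float)
    fp = confusion.sum(axis=0) - tp  # predicted as this class but actually another
    fn = confusion.sum(axis=1) - tp  # actually this class but predicted as another
    precision = tp / np.maximum(tp + fp, 1e-12)
    recall = tp / np.maximum(tp + fn, 1e-12)
    return precision, recall

# Toy 3-class confusion matrix
cm = np.array([[50, 2, 3],
               [4, 45, 1],
               [2, 5, 48]])
precision, recall = per_class_metrics(cm)
print("precision per class:", precision)
print("recall per class:", recall)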

Execute tlt-evaluate on a DetectNet_v2 model.

tlt-evaluate detectnet_v2 [-h] -e <experiment_spec> -m <model_file> -k <key> [--use_training_set]

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as the training spec file.

  • -m, --model: Path to the model file to use for evaluation. This can be a .tlt model file or a TensorRT engine generated using the tlt-export tool.

  • -k, --key: Provide the encryption key to decrypt the model. This argument is required only with a .tlt model file.

Optional Arguments

  • -h, --help: Show this help message and exit.

  • -f, --framework: The framework to use when running evaluation (choices: “tlt”, “tensorrt”). By default, the framework is set to TensorRT.

  • --use_training_set: Set this flag to run evaluation on the training + validation dataset.

If you have followed the example in Training a Detection Model, you may now evaluate the model using the following command:

tlt-evaluate detectnet_v2 -e <path to training spec file> -m <path to the model> -k <key to load the model>

Note

This command runs evaluation on the same validation set that was used during training.

Use these steps to evaluate on a test set with labeled ground truth:

  1. Create tfrecords for this test set by following the steps listed in the data input section.

  2. Update the dataloader configuration part of the training spec file to include the newly generated tfrecords. For more information on the dataset config, please refer to Create an experiment spec file. You may create the tfrecords with any partition mode (sequence/random). The evaluate tool iterates through all the folds in the tfrecords patterns mentioned in the validation_data_source.

dataset_config {
  data_sources: {
    tfrecords_path: "<path to training tfrecords root>/<tfrecords_name*>"
    image_directory_path: "<path to training data root>"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "automobile"
    value: "car"
  }
  ..
  ..
  ..
  target_class_mapping {
    key: "person"
    value: "pedestrian"
  }
  target_class_mapping {
    key: "rider"
    value: "cyclist"
  }
  validation_data_source: {
    tfrecords_path: "<path to testing tfrecords root>/<tfrecords_name*>"
    image_directory_path: "<path to testing data root>"
  }
}

The rest of the experiment spec file remains the same as the training spec file.

To run evaluation for a faster_rcnn model, use this command:

tlt-evaluate faster_rcnn [-h] -e <experiment_spec> [-k <enc_key>]

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as the training spec file.

Optional Arguments

  • -h, --help: Show this help message and exit.

  • -k, --enc_key: The encoding key; this can override the one in the spec file.

Evaluation Metrics

For FasterRCNN, the evaluation prints out four metrics for the evaluated model: AP (average precision), precision, recall, and RPN_recall for each class in the evaluation dataset. Finally, it also prints the mAP (mean average precision) as a single metric number. Two modes are supported for computing the AP, i.e., the PASCAL VOC 2007 and 2012 metrics. This is configured with the spec file’s evaluation_config.use_voc_11_point_metric parameter. If this parameter is set to True, the AP calculation uses the VOC 2007 method; otherwise it uses the VOC 2012 method.

The RPN_recall metric indicates the recall capability of the RPN of the FasterRCNN model. A higher RPN_recall means the RPN is better at detecting an object as foreground (it says nothing about which class the object belongs to, since that is delegated to the RCNN). The RPN_recall metric is mainly used for debugging accuracy issues of a FasterRCNN model.
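
As a rough illustration of what a proposal-level recall measures, the sketch below computes the fraction of ground truth boxes covered by at least one class-agnostic proposal above an IoU threshold. This is an assumption-based sketch of the general idea, not the metric implementation used by TLT:

def iou(box_a, box_b):
    # Boxes as [x1, y1, x2, y2]
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def proposal_recall(proposals, gt_boxes, iou_thresh=0.5):
    # Fraction of ground truth boxes matched by at least one proposal
    covered = sum(1 for gt in gt_boxes
                  if any(iou(p, gt) >= iou_thresh for p in proposals))
    return covered / max(len(gt_boxes), 1)

# Toy example: two ground truth boxes, three class-agnostic proposals
gt = [[10, 10, 50, 50], [100, 100, 150, 160]]
props = [[12, 8, 48, 52], [300, 300, 320, 320], [90, 95, 155, 165]]
print(proposal_recall(props, gt))  # 1.0 -> both objects are covered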

Two Modes for tlt-evaluate

The tlt-evaluate command for FasterRCNN has two modes: it can run with either the TLT backend or the TensorRT backend. This behavior is controlled via the spec file. The evaluation_config in the spec file can have an optional trt_evaluation sub-field that specifies which backend tlt-evaluate will run with.

By default (if the trt_evaluation sub-field is not present in evaluation_config), tlt-evaluate uses the TLT backend. If the trt_evaluation sub-field is present, it can specify that tlt-evaluate runs with the TensorRT backend. In that case, the model used for inference can be either the .etlt model from tlt-export or the TensorRT engine file from tlt-export or tlt-converter.

To use a TensorRT engine file for TensorRT backend based tlt-evaluate, the trt_evaluation sub-field should look like this:

trt_evaluation {
  trt_engine: '/workspace/tlt-experiments/data/faster_rcnn/trt.int8.engine'
  max_workspace_size_MB: 2000
}

To use a .etlt model for TensorRT backend based tlt-evaluate, the trt_evaluation sub-field should look like this:

trt_evaluation {
  etlt_model {
    model: '/workspace/tlt-experiments/data/faster_rcnn/resnet18.epoch12.etlt'
    calibration_cache: '/workspace/tlt-experiments/data/faster_rcnn/cal.bin'
  }
  trt_data_type: 'int8'
  max_workspace_size_MB: 2000
}

If the TensorRT inference data type is not INT8, the calibration_cache sub-field, which provides the path to the INT8 calibration cache, is not needed. In the INT8 case, the calibration cache should be generated via the tlt-export command line in INT8 mode. See the FasterRCNN spec file documentation for details of the trt_evaluation message structure.

To run evaluation for an SSD model, use this command:

tlt-evaluate ssd [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as the training specification file.

  • -m, --model: Path to the model file to use for evaluation.

  • -k, --key: Provide the key to load the model.

Optional Arguments

  • -h, --help: Show this help message and exit.

To run evaluation for a DSSD model, use this command:

tlt-evaluate dssd [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as the training spec file.

  • -m, --model: Path to the model file to use for evaluation.

  • -k, --key: Provide the key to load the model.

Optional Arguments

  • -h, --help: Show this help message and exit.

To run evaluation for a YOLOv3 model, use this command:

tlt-evaluate yolo [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as the training specification file.

  • -m, --model: Path to the model file to use for evaluation.

  • -k, --key: Provide the key to load the model.

Optional Arguments

  • -h, --help: Show this help message and exit.

To run evaluation for a RetinaNet model, use this command:

tlt-evaluate retinanet [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as the training specification file.

  • -m, --model: Path to the model file to use for evaluation.

  • -k, --key: Provide the key to load the model.

Optional Arguments

  • -h, --help: Show this help message and exit.

To run evaluation for a MaskRCNN model, use this command:

tlt-evaluate mask_rcnn [-h] -e <experiment_spec_file> -m <model_file> -k <key>

Required Arguments

  • -e, --experiment_spec_file: Experiment spec file to set up the evaluation experiment. This should be the same as the training spec file.

  • -m, --model: Path to the model file to use for evaluation.

  • -k, --key: Provide the key to load the model. This argument is not required if -m is followed by a TensorRT engine.

Optional Arguments

  • -h, --help: Show this help message and exit.
