Using Inference on a Model

The tlt-infer command runs inference on a specified set of input images. In classification mode, tlt-infer prints the class label on the command line for a single image, or writes a CSV file containing the image paths and the corresponding labels for multiple images. In DetectNet_v2, SSD, RetinaNet, DSSD, YOLOv3, or FasterRCNN mode, tlt-infer produces output images with bounding boxes rendered on them after inference. Optionally, you can also serialize the output metadata in KITTI format. In MaskRCNN mode, tlt-infer produces annotated images with bounding boxes and masks rendered on them after inference. TensorRT Python inference can also be enabled.

Running Inference on a Classification Model

Execute tlt-infer on a classification model trained with the Transfer Learning Toolkit.

tlt-infer classification [-h]
                          -m <model>
                          -i <image>
                          -d <image dir>
                         [-b <batch size>]
                          -k <key>
                          -cm <classmap>

Here are the parameters of the tlt-infer tool. Use -i for single-image mode or -d for directory mode:

Required Arguments

  • -m, --model: Path to the pretrained model (TLT model).

  • -i, --image: A single image file for inference.

  • -d, --image_dir: The directory of input images for inference.

  • -k, --key: Key to load model.

  • -cm, --class_map: The json file that specifies the class index and label mapping.

Optional Arguments

  • -b, --batch_size: Inference batch size. Default: 1.

  • -h, --help: Show this help message and exit.
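
As an illustration, the synopsis above can be turned into a concrete argument list. The following is a minimal Python sketch and not part of TLT: the helper name build_classification_cmd, the paths, and the key are hypothetical placeholders. It uses only the flags listed above and assumes that -i and -d are alternatives, so exactly one of them is passed.

```python
import shlex


def build_classification_cmd(model, key, class_map,
                             image=None, image_dir=None, batch_size=None):
    """Return the argv list for a tlt-infer classification run.

    Hypothetical helper: it only assembles flags documented above.
    """
    # Assumption: single-image mode (-i) and directory mode (-d) are alternatives.
    if (image is None) == (image_dir is None):
        raise ValueError("pass exactly one of image or image_dir")
    cmd = ["tlt-infer", "classification", "-m", model, "-k", key, "-cm", class_map]
    cmd += ["-i", image] if image is not None else ["-d", image_dir]
    if batch_size is not None:
        cmd += ["-b", str(batch_size)]
    return cmd


# Placeholder paths and key, for illustration only.
cmd = build_classification_cmd(
    model="/workspace/model.tlt",
    key="my_key",
    class_map="/workspace/classmap.json",
    image_dir="/workspace/test_images",
    batch_size=16,
)
# Print a copy-pasteable, shell-quoted command line.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a machine where tlt-infer is installed.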


The inference tool requires a cluster_params.json file to configure the post-processing block. When executing with -d (directory mode), a result.csv file is created and stored in the directory you specify using -d. The result.csv file has the image file path in the first column and the predicted label in the second.
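
To make the result.csv layout concrete, here is a minimal Python sketch that tallies how many images received each predicted label. The function name count_predictions is hypothetical, and the sketch assumes the file is comma-separated with no header row; if your file has a header, skip the first row.

```python
import csv
from collections import Counter


def count_predictions(csv_path):
    """Count how many images were assigned to each predicted label."""
    counts = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            # Column 0 is the image path, column 1 the predicted label.
            # Blank or short rows are skipped.
            if len(row) < 2:
                continue
            counts[row[1]] += 1
    return counts
```

For example, count_predictions("<image dir>/result.csv").most_common() lists the labels from most to least frequent.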


In both single-image and directory modes, a class map (-cm) is required, which should be a byproduct (-classmap.json) of your training process.

Running Inference on a DetectNet_v2 Model

The tlt-infer tool for object detection networks can be used to visualize bboxes or to generate frame-by-frame KITTI format labels on a single image or a directory of images. An example of the command for this tool is shown here:

tlt-infer detectnet_v2 [-h] -e </path/to/inference/spec/file> \
          -i </path/to/inference/input> \
          -o </path/to/inference/output> \
          -k <model key>

Required Parameters

  • -e, --inference_spec: Path to an inference spec file.

  • -i, --inference_input: The directory of input images or a single image for inference.

  • -o, --inference_output: The directory to the output images and labels. The annotated images are in inference_output/images_annotated and labels are in inference_output/labels.

  • -k, --enc_key: Key to load model

The tool automatically generates bbox-rendered images in output_path/images_annotated, where output_path is the directory passed with -o. To get the bbox labels in KITTI format, set the kitti_dump parameter of bbox_handler_config in the inference spec file. This generates the labels in output_path/labels.
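
The KITTI label files written to output_path/labels can be read back with a few lines of Python. The sketch below is illustrative only: the names KittiBox and parse_kitti_labels are hypothetical, and it assumes the standard KITTI object layout (class name, truncation, occlusion, alpha, four bbox coordinates, seven 3D fields) with an optional trailing confidence score and class names that contain no spaces.

```python
from typing import List, NamedTuple, Optional


class KittiBox(NamedTuple):
    label: str
    x1: float
    y1: float
    x2: float
    y2: float
    score: Optional[float]


def parse_kitti_labels(text: str) -> List[KittiBox]:
    """Parse the contents of one KITTI label file into bounding boxes."""
    boxes = []
    for line in text.splitlines():
        fields = line.split()
        # A standard KITTI row has 15 fields; skip anything shorter.
        if len(fields) < 15:
            continue
        # Fields 4..7 hold the 2D box as left, top, right, bottom pixels.
        x1, y1, x2, y2 = (float(v) for v in fields[4:8])
        # Assumption: a 16th field, when present, is the confidence score.
        score = float(fields[15]) if len(fields) > 15 else None
        boxes.append(KittiBox(fields[0], x1, y1, x2, y2, score))
    return boxes


sample = "car 0.00 0 0.00 10.0 20.0 110.0 220.0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.87\n"
for box in parse_kitti_labels(sample):
    print(box.label, box.x1, box.y1, box.x2, box.y2, box.score)
```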

Running Inference on a FasterRCNN Model

The tlt-infer tool for FasterRCNN networks can be used to visualize bboxes, or generate frame by frame KITTI format labels on a directory of images. You can execute this tool from the command line as shown here:

tlt-infer faster_rcnn [-h] -e <experiment_spec> [-k <enc_key>]

Required Arguments

  • -e, --experiment_spec_file: Path to the experiment specification file for FasterRCNN training.

Optional Arguments

  • -h, --help: Print help log and exit.

  • -k, --enc_key: The encoding key. If given, it overrides the one in the spec file.

Two Modes for tlt-infer

The tlt-infer command line for FasterRCNN has two modes: it can run with either the TLT backend or the TensorRT backend. This behavior is controlled via the spec file. The inference_config in the spec file can have an optional trt_inference sub-field that specifies which backend tlt-infer runs with. By default (if the trt_inference sub-field is not present in inference_config), tlt-infer uses TLT as the backend. If the trt_inference sub-field is present, tlt-infer runs with the TensorRT backend. In that case, the model used for inference can be either the .etlt model from tlt-export or the TensorRT engine file from tlt-export or tlt-converter.

To use a TensorRT engine file for TensorRT backend based tlt-infer, the trt_inference sub-field should look like this:

trt_inference {
  trt_engine: '/workspace/tlt-experiments/data/faster_rcnn/trt.int8.engine'
}

To use a .etlt model for TensorRT backend based tlt-infer, the trt_inference sub-field should look like this:

trt_inference {
  etlt_model {
    model: '/workspace/tlt-experiments/data/faster_rcnn/resnet18.epoch12.etlt'
    calibration_cache: '/workspace/tlt-experiments/data/faster_rcnn/cal.bin'
  }
  trt_data_type: 'int8'
}

If the TensorRT inference data type is not INT8, the calibration_cache sub-field, which provides the path to the INT8 calibration cache, is not needed. In the INT8 case, the calibration cache should be generated via the tlt-export command line in INT8 mode. See also the FasterRCNN spec file documentation for details of the trt_inference message structure.
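
The rule that calibration_cache only matters for INT8 can be sketched as a small generator for the .etlt variant of the block. This is an illustration, not a TLT utility: the function name render_trt_inference is hypothetical, the nesting of the fields is an assumption based on the snippet above, and 'fp16' is used only as an assumed example of a non-INT8 data type. Check the FasterRCNN spec file documentation before relying on the exact layout.

```python
def render_trt_inference(etlt_model, data_type, calibration_cache=None):
    """Render a trt_inference block for an .etlt model as spec-file text."""
    data_type = data_type.lower()
    if data_type == "int8" and calibration_cache is None:
        raise ValueError("INT8 inference needs a calibration cache")
    lines = [
        "trt_inference {",
        "  etlt_model {",
        f"    model: '{etlt_model}'",
    ]
    # The calibration cache is only emitted for INT8.
    if data_type == "int8":
        lines.append(f"    calibration_cache: '{calibration_cache}'")
    # Assumption: trt_data_type sits next to etlt_model inside trt_inference.
    lines += ["  }", f"  trt_data_type: '{data_type}'", "}"]
    return "\n".join(lines)


# Placeholder paths, for illustration only.
print(render_trt_inference("/workspace/model.etlt", "int8", "/workspace/cal.bin"))
print(render_trt_inference("/workspace/model.etlt", "fp16"))
```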

Running Inference on an SSD Model

The tlt-infer tool for SSD networks can be used to visualize bboxes, or generate frame by frame KITTI format labels on a directory of images. Here’s an example of using this tool:

tlt-infer ssd  -i <input directory>
               -o <output annotated image directory>
               -e <experiment spec file>
               -m <model file>
               [-l <output label directory>]
               [-t <visualization threshold>]
               -k <key>

Required Arguments

  • -m, --model: Path to the pretrained model (TLT model).

  • -i, --in_image_dir: The directory of input images for inference.

  • -o, --out_image_dir: The directory path to output annotated images.

  • -k, --key: Key to load model.

  • -e, --config_path: Path to an experiment spec file for training.

Optional Arguments

  • -t, --draw_conf_thres: Threshold for drawing a bbox. default: 0.3

  • -h, --help: Show this help message and exit

  • -l, --out_label_dir: The directory to output KITTI labels.
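
For illustration, the following Python sketch assembles the command described above, adding the optional -l and -t flags only when they are requested. The helper name build_detector_cmd and all paths are hypothetical. Because the DSSD, YOLOv3, and RetinaNet (TLT model mode) synopses on this page list the same flags, the sketch takes the subcommand name as a parameter.

```python
# Subcommands on this page whose synopses share the -i/-o/-e/-m/-k/-l/-t flags.
SUPPORTED = ("ssd", "dssd", "yolo", "retinanet")


def build_detector_cmd(network, in_dir, out_dir, spec, model, key,
                       label_dir=None, conf_thres=None):
    """Return the argv list for a tlt-infer run (hypothetical helper)."""
    if network not in SUPPORTED:
        raise ValueError(f"unsupported network: {network}")
    cmd = ["tlt-infer", network,
           "-i", in_dir, "-o", out_dir,
           "-e", spec, "-m", model, "-k", key]
    # Optional: also write KITTI labels to a directory.
    if label_dir is not None:
        cmd += ["-l", label_dir]
    # Optional: override the bbox drawing threshold (documented default: 0.3).
    if conf_thres is not None:
        cmd += ["-t", str(conf_thres)]
    return cmd


# Placeholder paths and key, for illustration only.
cmd = build_detector_cmd("ssd", "/data/images", "/data/annotated",
                         "/specs/ssd.txt", "/models/ssd.tlt", "my_key",
                         label_dir="/data/labels", conf_thres=0.5)
print(cmd)
```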

Running Inference on a DSSD Model

The tlt-infer tool for DSSD networks can be used to visualize bboxes, or generate frame by frame KITTI format labels on a directory of images. Here’s an example of using this tool:

tlt-infer dssd  -i <input directory>
               -o <output annotated image directory>
               -e <experiment spec file>
               -m <model file>
               [-l <output label directory>]
               [-t <visualization threshold>]
               -k <key>

Required Arguments

  • -m, --model: Path to the pretrained model (TLT model).

  • -i, --in_image_dir: The directory of input images for inference.

  • -o, --out_image_dir: The directory path to output annotated images.

  • -k, --key: Key to load model.

  • -e, --config_path: Path to an experiment spec file for training.

Optional Arguments

  • -t, --draw_conf_thres: Threshold for drawing a bbox. default: 0.3

  • -h, --help: Show this help message and exit

  • -l, --out_label_dir: The directory to output KITTI labels.

Running Inference on a YOLOv3 Model

The tlt-infer tool for YOLOv3 networks can be used to visualize bboxes, or generate frame by frame KITTI format labels on a directory of images. Here’s an example of using this tool:

tlt-infer yolo -i <input directory>
               -o <output annotated image directory>
               -e <experiment spec file>
               -m <model file>
               [-l <output label directory>]
               [-t <visualization threshold>]
               -k <key>

Required Arguments

  • -m, --model: Path to the pretrained model (TLT model).

  • -i, --in_image_dir: The directory of input images for inference.

  • -o, --out_image_dir: The directory path to output annotated images.

  • -k, --key: Key to load model.

  • -e, --config_path: Path to an experiment spec file for training.

Optional Arguments

  • -t, --draw_conf_thres: Threshold for drawing a bbox. default: 0.3

  • -h, --help: Show this help message and exit

  • -l, --out_label_dir: The directory to output KITTI labels.

Running Inference on a RetinaNet Model

The tlt-infer tool for RetinaNet networks can be used to visualize bboxes, or generate frame by frame KITTI format labels on a directory of images. Two modes are supported: TLT model mode and TensorRT engine mode. You can execute the TLT model mode using the following command:

tlt-infer retinanet -i <input directory>
               -o <output annotated image directory>
               -e <experiment spec file>
               -m <model file>
               [-l <output label directory>]
               [-t <visualization threshold>]
               -k <key>

Required Arguments

  • -m, --model: Path to the pretrained model (TLT model).

  • -i, --in_image_dir: The directory of input images for inference.

  • -o, --out_image_dir: The directory path to output annotated images.

  • -k, --key: Key to load model.

  • -e, --config_path: Path to an experiment spec file for training.

Optional Arguments

  • -t, --draw_conf_thres: Threshold for drawing a bbox. default: 0.3

  • -h, --help: Show this help message and exit

  • -l, --out_label_dir: The directory to output KITTI labels.

Alternatively, you can execute the TensorRT engine mode as follows:

tlt-infer retinanet -i <input directory>
               -o <output annotated image directory>
               -e <experiment spec file>
               -p <engine path>
               [-t <visualization threshold>]
               -k <key>

Required Arguments

  • -p, --engine_path: Path to the TensorRT engine (TLT exported).

  • -i, --in_image_dir: The directory of input images for inference.

  • -o, --out_image_dir: The directory path to output annotated images.

  • -k, --key: Key to load model.

  • -e, --config_path: Path to an experiment spec file for training.

Optional Arguments

  • -t, --draw_conf_thres: Threshold for drawing a bbox. default: 0.3

  • -h, --help: Show this help message and exit

  • -l, --out_label_dir: The directory to output KITTI labels.

Running Inference on a MaskRCNN Model

The tlt-infer tool for MaskRCNN networks can be used to visualize bboxes, or generate frame by frame COCO format labels on a directory of images. Here’s an example of using this tool:

tlt-infer mask_rcnn -i <input directory>
               -o <output annotated image directory>
               -e <experiment spec file>
               -m <model file>
               [-l <label file>]
               [-b <batch size>]
               [-t <visualization threshold>]
               -k <key>

Required Arguments

  • -m, --model: Path to the trained model (TLT model).

  • -i, --input_dir: The directory of input images for inference.

  • -k, --key: Key to load model.

  • -e, --config_path: Path to an experiment spec file for training.

  • -o, --out_dir: The directory path to output annotated images.

Optional Arguments

  • -t, --threshold: Threshold for drawing a bbox. default: 0.3.

  • -h, --help: Show this help message and exit.

  • -l, --label_file: The label txt file containing groundtruth class labels.

  • --include_mask: Whether to draw masks on the annotated output.

When calling tlt-infer with --trt, the command expects a TensorRT engine as input:

tlt-infer mask_rcnn --trt
               -i <input image>
               -o <output annotated image>
               -e <experiment spec file>
               -m <TensorRT engine file>
               [-l <output label file>]
               [-c <class label file>]
               [-t <visualization threshold>]
               [-mt <mask_threshold>]

Required Arguments

  • -m, --model: Path to the TensorRT engine file.

  • -i, --in_image_path: A directory of input images or a single image file for inference.

  • -k, --key: Key to load model.

  • -e, --config_path: Path to an experiment spec file for training.

Optional Arguments

  • -t, --threshold: Confidence threshold for drawing a bbox. Default: 0.6.

  • -mt, --mask_threshold: Confidence threshold for drawing a mask. Default: 0.4.

  • -o, --out_image_path: The output directory of annotated images or a single annotated image file.

  • -c, --class_label: The path to groundtruth label file. If used, the annotated image will display label names.

  • -l, --out_label_file: The output directory of predicted labels in json format or a single json file.

  • --include_mask: Whether to draw masks on the annotated output.