Accelerating TensorFlow 1.14.0 With TensorRT 5.1.5 Using The 19.07 or 19.08 Container

These release notes are for accelerating TensorFlow 1.14.0 with TensorRT version 5.1.5 using the TensorFlow 19.07 NGC container or the TensorFlow 19.08 NGC container. For specific details about TensorRT, see the TensorRT 5.1.5 Release Notes.

Key Features And Enhancements

This release includes the following key features and enhancements.
  • Migrated TensorRT conversion sources from the contrib directory to the compiler directory in preparation for TensorFlow 2.0. The Python code can be found at //tensorflow/python/compiler/tensorrt.
  • Added a user-friendly TrtGraphConverter API for TensorRT conversion; a short usage sketch follows this list.
  • Expanded support for TensorFlow operators in TensorRT conversion (for example, Gather, Slice, Pack, Unpack, ArgMin, ArgMax, DepthSpaceShuffle). Refer to the TF-TRT User Guide for a complete list of supported operators.
  • Added support for the TensorFlow operator CombinedNonMaxSuppression in TensorRT conversion, which significantly accelerates SSD object detection models.
  • Integrated TensorRT 5.1.5 into TensorFlow. See the TensorRT 5.1.5 Release Notes for a full list of new features.
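
The new TrtGraphConverter API can be exercised in a few lines. The following is a minimal sketch, assuming a SavedModel exported to a hypothetical ./my_saved_model directory; the output directory name is also a placeholder.

    # Minimal TrtGraphConverter sketch (TF 1.14); paths are placeholders.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    converter = trt.TrtGraphConverter(
        input_saved_model_dir='./my_saved_model',  # hypothetical input SavedModel
        precision_mode='FP16',                     # uppercase string in TF 1.14
        max_workspace_size_bytes=1 << 30)          # 1 GB TensorRT workspace
    converter.convert()                            # run the TF-TRT conversion
    converter.save('./my_saved_model_trt')         # write the converted SavedModel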

Compatibility

Limitations Of Accelerating TensorFlow With TensorRT

  • TF-TRT is not supported in the TensorRT containers.

Deprecated Features

  • The old API of TF-TRT is deprecated. It still works in TensorFlow 1.14; however, it may be removed in TensorFlow 2.0. The old API is a Python function named create_inference_graph, which is now replaced by the Python class TrtGraphConverter with a number of methods; a short migration sketch follows this list. Refer to the TF-TRT User Guide for more information about the API and how to use it.
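
As a rough migration sketch (not an exhaustive mapping of arguments), a call to the deprecated create_inference_graph on a frozen GraphDef corresponds to constructing a TrtGraphConverter and calling its convert method. In the snippet below, graph_def and the output node name 'logits' are placeholders.

    # Hedged migration sketch from the deprecated function to the new class.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    # Old, deprecated function-based API:
    # trt_graph_def = trt.create_inference_graph(
    #     input_graph_def=graph_def, outputs=['logits'], precision_mode='FP16')

    # New class-based API (TF 1.14):
    converter = trt.TrtGraphConverter(
        input_graph_def=graph_def,      # a frozen tf.GraphDef (placeholder)
        nodes_blacklist=['logits'],     # output nodes kept in TensorFlow
        precision_mode='FP16')
    trt_graph_def = converter.convert()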

Known Issues

  • Precision mode in the TF-TRT API is a string with one of the following values: FP32, FP16, or INT8. In TensorFlow 1.13, these strings were supported in lowercase; however, in TensorFlow 1.14 only uppercase is supported.
  • INT8 calibration (see the TF-TRT User Guide for more information about how to use INT8, and the calibration sketch after this list) is a very slow process that can take an hour, depending on the model. We are working on optimizing this algorithm in TensorRT.
  • The pip package of TensorFlow 1.14 released by Google is missing TensorRT support. This will be fixed in the next release of TensorFlow by Google. In the meantime, you can use the NVIDIA container for TensorFlow.
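
For reference, the following is a hedged sketch of the INT8 calibration flow with TrtGraphConverter in TensorFlow 1.14; the SavedModel path, the tensor names, and the data-feeding helper get_next_calibration_batch are hypothetical placeholders, not part of the TF-TRT API.

    # INT8 calibration sketch (TF 1.14); paths, tensor names and the data
    # helper are placeholders.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    converter = trt.TrtGraphConverter(
        input_saved_model_dir='./my_saved_model',  # hypothetical path
        precision_mode='INT8',                     # must be uppercase in TF 1.14
        use_calibration=True)
    converter.convert()

    def feed_dict_fn():
        # Return a feed_dict of representative calibration data
        # (get_next_calibration_batch is a hypothetical helper).
        return {'input:0': get_next_calibration_batch()}

    # Collect calibration statistics; this is the slow step noted above.
    converter.calibrate(fetch_names=['logits:0'], num_runs=10,
                        feed_dict_fn=feed_dict_fn)
    converter.save('./my_saved_model_trt_int8')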