Accelerating TensorFlow 1.12.0-rc2 With TensorRT 5.0.2 Using The 18.11 Container

These release notes are for accelerating TensorFlow 1.12.0-rc2 with TensorRT version 5.0.2 using the TensorFlow 18.11 container. For specific details about TensorRT, see the TensorRT 5.0.2 Release Notes.

Key Features and Enhancements

This release includes the following key features and enhancements.
  • Added support for dilated convolution.

  • Fixed a bug in the Identity op.

  • Fixed a bug in the Relu6 op.

  • Added support for empty const tensors.

  • Added object detection example to nvidia-examples/inference.

Compatibility

Deprecated Features

  • Support for accelerating TensorFlow with TensorRT 3.x will be removed in a future release (likely TensorFlow 1.13). Generated plan files are not portable across platforms or TensorRT versions. A plan is also specific to the exact GPU model it was built on, so it must be rebuilt for the target GPU if you want to run it on a different one. As a result, models that were accelerated using TensorRT 3.x will no longer run once that support is removed. If you have a production model that was accelerated with TensorRT 3.x, you will need to convert it again with TensorRT 4.x or later.

    For more information, see the Note in Serializing A Model In C++ or Serializing A Model In Python.

Known Issues

  • In the TF-TRT API, the default value of the minimum_segment_size argument is 3. The image classification examples under nvidia-examples/inference define their own command-line argument for minimum_segment_size, which has a separate default value: 7 in 18.10, changed to 2 in 18.11. A smaller value causes more TensorFlow nodes to be converted to TensorRT, which typically improves performance; however, we have observed cases where performance gets worse. In particular, ResNet-50 with smaller batch sizes is slower with minimum_segment_size=2 than with minimum_segment_size=7.
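
The following is a minimal sketch of how a script might expose minimum_segment_size as a command-line argument and pass it to the TF-TRT conversion call, so that both values can be tried on a given model and batch size. It is not the code from nvidia-examples/inference. The helper names (build_parser, conversion_kwargs, convert) and the --precision and --batch_size flags are assumptions made for illustration. Only the parser default of 2 and the create_inference_graph call in tensorflow.contrib.tensorrt reflect what is described above.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line parser for a TF-TRT conversion script."""
    parser = argparse.ArgumentParser(description="TF-TRT conversion sketch")
    # The script-level default (2, as in 18.11) is independent of the
    # TF-TRT API default (3).
    parser.add_argument("--minimum_segment_size", type=int, default=2,
                        help="Minimum number of nodes in a TensorRT segment")
    parser.add_argument("--precision", default="FP32",
                        choices=["FP32", "FP16", "INT8"])
    parser.add_argument("--batch_size", type=int, default=8)
    return parser


def conversion_kwargs(args):
    """Map parsed arguments onto create_inference_graph keyword arguments."""
    return {
        "max_batch_size": args.batch_size,
        "precision_mode": args.precision,
        "minimum_segment_size": args.minimum_segment_size,
    }


def convert(frozen_graph_def, output_names, args):
    """Convert a frozen GraphDef with TF-TRT (requires TensorFlow 1.x with contrib)."""
    # Imported lazily so the parser can be used without TensorFlow installed.
    import tensorflow.contrib.tensorrt as trt
    return trt.create_inference_graph(
        input_graph_def=frozen_graph_def,
        outputs=output_names,
        **conversion_kwargs(args))
```

With a parser like this, passing --minimum_segment_size 7 restores the 18.10 behavior of the examples. Benchmarking both settings at the batch size you plan to deploy is one way to check whether the smaller value helps or hurts for your model.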