Abstract

TensorRT optimizes the largest subgraphs possible in the TensorFlow graph. The more compute in the subgraph, the greater benefit obtained from TensorRT. You want most of the graph optimized and replaced with the fewest number of TensorRT nodes for best performance. Based on the operations in your graph, it’s possible that the final graph might have more than one TensorRT node. TensorFlow integration with TensorRT (TF-TRT) optimizes and executes compatible subgraphs, allowing TensorFlow to execute the remaining graph. This document describes the key features, software enhancements and improvements, and known issues when integrating TensorRT.