tensorrt.plugin.autotune#

tensorrt.plugin.autotune(plugin_id: str) → Callable#

Wraps a function to define autotune logic for a plugin already registered through trt.plugin.register.

Autotuning is the process by which TensorRT executes and times the plugin over the I/O type/format combinations, and any custom tactics, that the plugin advertises as supported. The (type, format, tactic) combination with the lowest latency is used to execute the plugin once the engine is built.

Note

An autotune function is optional. If not specified, TensorRT will assume the plugin only supports the input types specified at network creation, the output types specified through trt.plugin.register, and linear formats for all I/O.

This API is only intended to be used as a decorator. The decorated function is not required to have type hints for input arguments or return value; however, any type hints specified will be validated against the trt.plugin.register signature for consistency.

The schema for the function is as follows:

(inp0: TensorDesc, inp1: TensorDesc, ..., attr0: SupportedAttrType, attr1: SupportedAttrType, outputs: Tuple[TensorDesc]) -> List[AutoTuneCombination]
  • Input tensors are passed first, each described by a TensorDesc.

  • Plugin attributes are declared next. Not all attributes included in trt.plugin.register must be specified here; a subset is sufficient.

  • The function should return a list of AutoTuneCombinations.

Parameters

plugin_id – The ID for the plugin in the form “{namespace}::{name}”, which must match the ID used during trt.plugin.register.

An elementwise add plugin that supports both FP32 and FP16 linear I/O and is tuned over two custom tactics.#
import tensorrt.plugin as trtp
from typing import Tuple, List

@trtp.register("my::add_plugin")
def add_plugin_desc(inp0: trtp.TensorDesc, block_size: int) -> Tuple[trtp.TensorDesc]:
    return inp0.like()

@trtp.autotune("my::add_plugin")
def add_plugin_autotune(inp0: trtp.TensorDesc, block_size: int, outputs: Tuple[trtp.TensorDesc]) -> List[trtp.AutoTuneCombination]:
    # Advertise FP32 and FP16 in LINEAR format for all I/O, tuned over tactics 1 and 2
    return [trtp.AutoTuneCombination("FP32|FP16, FP32|FP16", "LINEAR", [1, 2])]
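In the string form, each comma-separated position describes one I/O index in order (index 0 is the input, index 1 is the output), the single "LINEAR" entry applies to all I/O, and [1, 2] lists the custom tactic IDs to tune over.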
Same as the above example, but using index-by-index construction of an AutoTuneCombination#
import tensorrt.plugin as trtp
from typing import Tuple, List

@trtp.register("my::add_plugin")
def add_plugin_desc(inp0: trtp.TensorDesc, block_size: int) -> Tuple[trtp.TensorDesc]:
    return inp0.like()

@trtp.autotune("my::add_plugin")
def add_plugin_autotune(inp0: trtp.TensorDesc, block_size: int, outputs: Tuple[trtp.TensorDesc]) -> List[trtp.AutoTuneCombination]:
    c = trtp.AutoTuneCombination()
    c.pos(0, "FP32|FP16", "LINEAR")
    c.pos(1, "FP32|FP16")  # index 1 is the output; omitting the format is the same as declaring it LINEAR
    c.tactics([1, 2])
    return [c]
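For context, the tactic selected during autotuning is surfaced to the plugin's runtime implementation. The following is a minimal sketch, assuming the companion trt.plugin.impl decorator accepts an optional tactic argument that receives the winning tactic ID; the dispatch branches are illustrative placeholders.

import tensorrt.plugin as trtp
from typing import Tuple

@trtp.impl("my::add_plugin")
def add_plugin_impl(inp0: trtp.Tensor, block_size: int, outputs: Tuple[trtp.Tensor], stream: int, tactic: int) -> None:
    # TensorRT passes the tactic ID of the winning (type, format, tactic) combination
    if tactic == 1:
        ...  # launch kernel variant for tactic 1 (placeholder)
    elif tactic == 2:
        ...  # launch kernel variant for tactic 2 (placeholder)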

See also: