The TensorRT API version 1 namespace.

Namespaces
namespace	anonymous_namespace{NvInfer.h}

namespace	anonymous_namespace{NvInferRuntime.h}

namespace	apiv

namespace	consistency

namespace	impl

namespace	plugin

namespace	safe
	The safety subset of TensorRT's API version 1 namespace.

namespace	serialize

namespace	v_1_0
	Forward declare IErrorRecorder for use in other interfaces.

Classes
class	Dims2
	Descriptor for two-dimensional data. More...

class	Dims3
	Descriptor for three-dimensional data. More...

class	Dims4
	Descriptor for four-dimensional data. More...

class	Dims64

class	DimsExprs
	Analog of class Dims with expressions instead of constants for the dimensions. More...

class	DimsHW
	Descriptor for two-dimensional spatial data. More...

struct	DynamicPluginTensorDesc
	Summarizes tensors that a plugin might see for an input or output. More...

class	IActivationLayer
	An Activation layer in a network definition. More...

class	IAlgorithm
	Describes a variation of execution of a layer. An algorithm is represented by IAlgorithmVariant and the IAlgorithmIOInfo for each of its inputs and outputs. An algorithm can be selected or reproduced using IAlgorithmSelector::selectAlgorithms(). More...

class	IAlgorithmContext
	Describes the context and requirements that could be fulfilled by one or more instances of IAlgorithm. More...

class	IAlgorithmIOInfo
	Carries information about an input or output of the algorithm. The IAlgorithmIOInfo for all inputs and outputs, along with IAlgorithmVariant, denotes the variation of the algorithm and can be used to select or reproduce an algorithm using IAlgorithmSelector::selectAlgorithms(). More...

class	IAlgorithmVariant
	Provides a unique 128-bit identifier which, along with the input and output information, denotes the variation of the algorithm and can be used to select or reproduce an algorithm using IAlgorithmSelector::selectAlgorithms(). More...

class	IAssertionLayer
	An assertion layer in a network. More...

class	IBuilder
	Builds an engine from a network definition. More...

class	IBuilderConfig
	Holds properties for configuring a builder to produce an engine. More...

class	ICastLayer
	A cast layer in a network. More...

class	IConcatenationLayer
	A concatenation layer in a network definition. More...

class	IConditionLayer
	This layer represents a condition input to an IIfConditional. More...

class	IConstantLayer
	Layer that represents a constant value. More...

class	IConvolutionLayer
	A convolution layer in a network definition. More...

class	ICudaEngine
	An engine for executing inference on a built network, with functionally unsafe features. More...

class	IDeconvolutionLayer
	A deconvolution layer in a network definition. More...

class	IDequantizeLayer
	A Dequantize layer in a network definition. More...

class	IDimensionExpr
	An IDimensionExpr represents an integer expression constructed from constants, input dimensions, and binary operations. These expressions can be used in overrides of IPluginV2DynamicExt::getOutputDimensions() or IPluginV3OneBuild::getOutputShapes() to define output dimensions in terms of input dimensions. More...

class	IEinsumLayer
	An Einsum layer in a network. More...

class	IElementWiseLayer
	An elementwise layer in a network definition. More...

class	IEngineInspector
	An engine inspector which prints out the layer information of an engine or an execution context. More...

class	IExecutionContext
	Context for executing inference using an engine, with functionally unsafe features. More...

class	IExprBuilder
	Object for constructing IDimensionExpr. More...

class	IFillLayer
	Generate a tensor according to a specified mode. More...

class	IGatherLayer
	A Gather layer in a network definition. Supports several kinds of gathering. More...

class	IGridSampleLayer
	A GridSample layer in a network definition. More...

class	IHostMemory
	Class to handle library allocated memory that is accessible to the user. More...

class	IIdentityLayer
	A layer that represents the identity function. More...

class	IIfConditional
	Helper for constructing conditionally-executed subgraphs. More...

class	IIfConditionalBoundaryLayer
	This is a base class for Conditional boundary layers. More...

class	IIfConditionalInputLayer
	This layer represents an input to an IIfConditional. More...

class	IIfConditionalOutputLayer
	This layer represents an output of an IIfConditional. More...

class	IInt8Calibrator
	Application-implemented interface for calibration. More...

class	IIteratorLayer
	A layer to do iterations. More...

class	ILayer
	Base class for all layer classes in a network definition. More...

class	ILogger
	Application-implemented logging interface for the builder, refitter and runtime. More...

class	ILoggerFinder
	A virtual base class to find a logger. Allows a plugin to find an instance of a logger if it needs to emit a log message. A pointer to an instance of this class is passed to a plugin shared library on initialization when that plugin is serialized as part of a version-compatible plan. See the plugin chapter in the developer guide for details. More...

class	ILoop
	Helper for creating a recurrent subgraph. More...

class	ILoopBoundaryLayer
	This is a base class for Loop boundary layers. More...

class	ILoopOutputLayer
	An ILoopOutputLayer is the sole way to get output from a loop. More...

class	ILRNLayer
	An LRN layer in a network definition. More...

class	IMatrixMultiplyLayer
	Layer that represents a Matrix Multiplication. More...

class	INetworkDefinition
	A network definition for input to the builder. More...

class	INMSLayer
	A non-maximum suppression layer in a network definition. More...

class	INoCopy
	Forward declaration of IEngineInspector for use by other interfaces. More...

class	INonZeroLayer

class	INormalizationLayer
	A normalization layer in a network definition. More...

class	InterfaceInfo
	Version information associated with a TRT interface. More...

class	IOneHotLayer
	A OneHot layer in a network definition. More...

class	IOptimizationProfile
	Optimization profile for dynamic input dimensions and shape tensors. More...

class	IPaddingLayer
	Layer that represents a padding operation. More...

class	IParametricReLULayer
	Layer that represents a parametric ReLU operation. More...

class	IPluginRegistry
	Single registration point for all plugins in an application. It is used to find plugin implementations during engine deserialization. Internally, the plugin registry is considered to be a singleton, so all plugins in an application are part of the same global registry. Note that the plugin registry is only supported for plugins of type IPluginV2, and each such plugin should also have a corresponding IPluginCreator implementation. More...

class	IPluginResourceContext
	Interface for plugins to access per context resources provided by TensorRT. More...

class	IPluginV2
	Plugin class for user-implemented layers. More...

class	IPluginV2DynamicExt
	Similar to IPluginV2Ext, but with support for dynamic shapes. More...

class	IPluginV2Ext
	Plugin class for user-implemented layers. More...

class	IPluginV2IOExt
	Plugin class for user-implemented layers. More...

class	IPluginV2Layer
	Layer type for pluginV2. More...

class	IPluginV3Layer
	Layer type for V3 plugins. More...

class	IPoolingLayer
	A Pooling layer in a network definition. More...

class	IQuantizeLayer
	A Quantize layer in a network definition. More...

class	IRaggedSoftMaxLayer
	A RaggedSoftmax layer in a network definition. More...

class	IRecurrenceLayer
	A recurrence layer in a network definition. More...

class	IReduceLayer
	Layer that represents a reduction across a non-bool tensor. More...

class	IRefitter
	Updates weights in an engine. More...

class	IResizeLayer
	A resize layer in a network definition. More...

class	IReverseSequenceLayer
	A ReverseSequence layer in a network definition. More...

class	IRuntime
	Allows a serialized functionally unsafe engine to be deserialized. More...

class	IScaleLayer
	A Scale layer in a network definition. More...

class	IScatterLayer
	A scatter layer in a network definition. Supports several kinds of scattering. More...

class	ISelectLayer
	Select elements from two data tensors based on a condition tensor. More...

class	ISerializationConfig
	Holds properties for configuring an engine to serialize the binary. More...

class	IShapeLayer
	Layer type for getting shape of a tensor. More...

class	IShuffleLayer
	Layer type for shuffling data. More...

class	ISliceLayer
	Slices an input tensor into an output tensor based on the offset and strides. More...

class	ISoftMaxLayer
	A Softmax layer in a network definition. More...

class	ITensor
	A tensor in a network definition. More...

class	ITimingCache
	Class to handle tactic timing info collected from builder. More...

class	ITopKLayer
	Layer that represents a TopK reduction. More...

class	ITripLimitLayer
	A layer that represents a trip-count limiter. More...

class	IUnaryLayer
	Layer that represents a unary operation. More...

class	IVersionedInterface
	An Interface class for version control. More...

struct	Permutation
	Represents a permutation of dimensions. More...

class	PluginField
	Structure containing plugin attribute field names and associated data. This information can be parsed to decode necessary plugin metadata. More...

struct	PluginFieldCollection
	Plugin field collection struct. More...

class	PluginRegistrar
	Registers the plugin creator with the registry. The static registry object will be instantiated when the plugin library is loaded. This static object will register all creators available in the library with the registry. More...

struct	PluginTensorDesc
	Fields that a plugin might see for an input or output. More...

class	Weights
	An array of weights used as a layer parameter. More...

Typedefs
using	TensorFormats = uint32_t
	Represents one or more TensorFormat values using binary OR operations, e.g., 1U << TensorFormat::kCHW4 | 1U << TensorFormat::kCHW32. More...

using	IInt8EntropyCalibrator = v_1_0::IInt8EntropyCalibrator

using	IInt8EntropyCalibrator2 = v_1_0::IInt8EntropyCalibrator2

using	IInt8MinMaxCalibrator = v_1_0::IInt8MinMaxCalibrator

using	IInt8LegacyCalibrator = v_1_0::IInt8LegacyCalibrator

using	IAlgorithmSelector = v_1_0::IAlgorithmSelector

using	QuantizationFlags = uint32_t
	Represents one or more QuantizationFlag values using binary OR operations. More...

using	BuilderFlags = uint32_t
	Represents one or more BuilderFlag values using binary OR operations, e.g., 1U << BuilderFlag::kFP16 | 1U << BuilderFlag::kDEBUG. More...

using	IProgressMonitor = v_1_0::IProgressMonitor

using	NetworkDefinitionCreationFlags = uint32_t
	Represents one or more NetworkDefinitionCreationFlag values using binary OR operations, e.g., 1U << NetworkDefinitionCreationFlag::kSTRONGLY_TYPED. More...

using	IPluginCapability = v_1_0::IPluginCapability

using	IPluginV3 = v_1_0::IPluginV3

using	IPluginV3OneCore = v_1_0::IPluginV3OneCore

using	IPluginV3OneBuild = v_1_0::IPluginV3OneBuild

using	IPluginV3OneRuntime = v_1_0::IPluginV3OneRuntime

using	IPluginCreatorV3One = v_1_0::IPluginCreatorV3One

using	IProfiler = v_1_0::IProfiler

using	TempfileControlFlags = uint32_t
	Represents a collection of one or more TempfileControlFlag values combined using bitwise-OR operations. More...

using	TacticSources = uint32_t
	Represents a collection of one or more TacticSource values combined using bitwise-OR operations. More...

using	SerializationFlags = uint32_t
	Represents one or more SerializationFlag values using binary OR operations, e.g., 1U << SerializationFlag::kEXCLUDE_LEAN_RUNTIME. More...

using	IOutputAllocator = v_1_0::IOutputAllocator

using	IDebugListener = v_1_0::IDebugListener

using	IGpuAsyncAllocator = v_1_0::IGpuAsyncAllocator

using	char_t = char
	char_t is the type used by TensorRT to represent all valid characters. More...

using	AsciiChar = char_t

using	IErrorRecorder = v_1_0::IErrorRecorder

using	Dims = Dims64

using	InterfaceKind = char const *

using	AllocatorFlags = uint32_t

using	IGpuAllocator = v_1_0::IGpuAllocator

using	IStreamReader = v_1_0::IStreamReader

using	IPluginResource = v_1_0::IPluginResource

using	PluginFormat = TensorFormat
	PluginFormat is reserved for backward compatibility. More...

using	IPluginCreatorInterface = v_1_0::IPluginCreatorInterface

using	IPluginCreator = v_1_0::IPluginCreator

Enumerations
enum class	LayerType : int32_t { kCONVOLUTION = 0 , kCAST = 1 , kACTIVATION = 2 , kPOOLING = 3 , kLRN = 4 , kSCALE = 5 , kSOFTMAX = 6 , kDECONVOLUTION = 7 , kCONCATENATION = 8 , kELEMENTWISE = 9 , kPLUGIN = 10 , kUNARY = 11 , kPADDING = 12 , kSHUFFLE = 13 , kREDUCE = 14 , kTOPK = 15 , kGATHER = 16 , kMATRIX_MULTIPLY = 17 , kRAGGED_SOFTMAX = 18 , kCONSTANT = 19 , kIDENTITY = 20 , kPLUGIN_V2 = 21 , kSLICE = 22 , kSHAPE = 23 , kPARAMETRIC_RELU = 24 , kRESIZE = 25 , kTRIP_LIMIT = 26 , kRECURRENCE = 27 , kITERATOR = 28 , kLOOP_OUTPUT = 29 , kSELECT = 30 , kFILL = 31 , kQUANTIZE = 32 , kDEQUANTIZE = 33 , kCONDITION = 34 , kCONDITIONAL_INPUT = 35 , kCONDITIONAL_OUTPUT = 36 , kSCATTER = 37 , kEINSUM = 38 , kASSERTION = 39 , kONE_HOT = 40 , kNON_ZERO = 41 , kGRID_SAMPLE = 42 , kNMS = 43 , kREVERSE_SEQUENCE = 44 , kNORMALIZATION = 45 , kPLUGIN_V3 = 46 }
	The type values of layer classes. More...

enum class	ActivationType : int32_t { kRELU = 0 , kSIGMOID = 1 , kTANH = 2 , kLEAKY_RELU = 3 , kELU = 4 , kSELU = 5 , kSOFTSIGN = 6 , kSOFTPLUS = 7 , kCLIP = 8 , kHARD_SIGMOID = 9 , kSCALED_TANH = 10 , kTHRESHOLDED_RELU = 11 , kGELU_ERF = 12 , kGELU_TANH = 13 }
	Enumerates the types of activation to perform in an activation layer. More...

enum class	PaddingMode : int32_t { kEXPLICIT_ROUND_DOWN = 0 , kEXPLICIT_ROUND_UP = 1 , kSAME_UPPER = 2 , kSAME_LOWER = 3 }
	Enumerates the modes of padding to perform in convolution, deconvolution and pooling layers. The padding mode takes precedence if setPaddingMode() and setPrePadding() are both used. More...

enum class	PoolingType : int32_t { kMAX = 0 , kAVERAGE = 1 , kMAX_AVERAGE_BLEND = 2 }
	The type of pooling to perform in a pooling layer. More...

enum class	ScaleMode : int32_t { kUNIFORM = 0 , kCHANNEL = 1 , kELEMENTWISE = 2 }
	Controls how shift, scale and power are applied in a Scale layer. More...

enum class	ElementWiseOperation : int32_t { kSUM = 0 , kPROD = 1 , kMAX = 2 , kMIN = 3 , kSUB = 4 , kDIV = 5 , kPOW = 6 , kFLOOR_DIV = 7 , kAND = 8 , kOR = 9 , kXOR = 10 , kEQUAL = 11 , kGREATER = 12 , kLESS = 13 }
	Enumerates the binary operations that may be performed by an ElementWise layer. More...

enum class	GatherMode : int32_t { kDEFAULT = 0 , kELEMENT = 1 , kND = 2 }
	Control form of IGatherLayer. More...

enum class	UnaryOperation : int32_t { kEXP = 0 , kLOG = 1 , kSQRT = 2 , kRECIP = 3 , kABS = 4 , kNEG = 5 , kSIN = 6 , kCOS = 7 , kTAN = 8 , kSINH = 9 , kCOSH = 10 , kASIN = 11 , kACOS = 12 , kATAN = 13 , kASINH = 14 , kACOSH = 15 , kATANH = 16 , kCEIL = 17 , kFLOOR = 18 , kERF = 19 , kNOT = 20 , kSIGN = 21 , kROUND = 22 , kISINF = 23 , kISNAN = 24 }
	Enumerates the unary operations that may be performed by a Unary layer. More...

enum class	ReduceOperation : int32_t { kSUM = 0 , kPROD = 1 , kMAX = 2 , kMIN = 3 , kAVG = 4 }
	Enumerates the reduce operations that may be performed by a Reduce layer. More...

enum class	SampleMode : int32_t { kSTRICT_BOUNDS = 0 , kWRAP = 1 , kCLAMP = 2 , kFILL = 3 , kREFLECT = 4 }
	Controls how ISliceLayer and IGridSampleLayer handle out-of-bounds coordinates. More...

enum class	TopKOperation : int32_t { kMAX = 0 , kMIN = 1 }
	Enumerates the operations that may be performed by a TopK layer. More...

enum class	MatrixOperation : int32_t { kNONE = 0 , kTRANSPOSE = 1 , kVECTOR = 2 }
	Enumerates the operations that may be performed on a tensor by IMatrixMultiplyLayer before multiplication. More...

enum class	InterpolationMode : int32_t { kNEAREST = 0 , kLINEAR = 1 , kCUBIC = 2 }
	Enumerates various modes of interpolation. More...

enum class	ResizeCoordinateTransformation : int32_t { kALIGN_CORNERS = 0 , kASYMMETRIC = 1 , kHALF_PIXEL = 2 }
	The resize coordinate transformation function. More...

enum class	ResizeSelector : int32_t { kFORMULA = 0 , kUPPER = 1 }
	The coordinate selector used when resizing to a single-pixel output. More...

enum class	ResizeRoundMode : int32_t { kHALF_UP = 0 , kHALF_DOWN = 1 , kFLOOR = 2 , kCEIL = 3 }
	The rounding mode for nearest neighbor resize. More...

enum class	LoopOutput : int32_t { kLAST_VALUE = 0 , kCONCATENATE = 1 , kREVERSE = 2 }

enum class	TripLimit : int32_t { kCOUNT = 0 , kWHILE = 1 }

enum class	FillOperation : int32_t { kLINSPACE = 0 , kRANDOM_UNIFORM = 1 , kRANDOM_NORMAL = 2 }
	Enumerates the tensor fill operations that may be performed by a fill layer. More...

enum class	ScatterMode : int32_t { kELEMENT = 0 , kND = 1 }
	Control form of IScatterLayer. More...

enum class	BoundingBoxFormat : int32_t { kCORNER_PAIRS = 0 , kCENTER_SIZES = 1 }
	Representation of bounding box data used for the Boxes input tensor in INMSLayer. More...

enum class	CalibrationAlgoType : int32_t { kLEGACY_CALIBRATION = 0 , kENTROPY_CALIBRATION = 1 , kENTROPY_CALIBRATION_2 = 2 , kMINMAX_CALIBRATION = 3 }
	Version of calibration algorithm to use. More...

enum class	QuantizationFlag : int32_t { kCALIBRATE_BEFORE_FUSION = 0 }
	List of valid flags for quantizing the network to int8. More...

enum class	BuilderFlag : int32_t { kFP16 = 0 , kINT8 = 1 , kDEBUG = 2 , kGPU_FALLBACK = 3 , kREFIT = 4 , kDISABLE_TIMING_CACHE = 5 , kTF32 = 6 , kSPARSE_WEIGHTS = 7 , kSAFETY_SCOPE = 8 , kOBEY_PRECISION_CONSTRAINTS = 9 , kPREFER_PRECISION_CONSTRAINTS = 10 , kDIRECT_IO = 11 , kREJECT_EMPTY_ALGORITHMS = 12 , kVERSION_COMPATIBLE = 13 , kEXCLUDE_LEAN_RUNTIME = 14 , kFP8 = 15 , kERROR_ON_TIMING_CACHE_MISS = 16 , kBF16 = 17 , kDISABLE_COMPILATION_CACHE = 18 , kSTRIP_PLAN = 19 , kWEIGHTLESS = kSTRIP_PLAN , kREFIT_IDENTICAL = 20 , kWEIGHT_STREAMING = 21 , kINT4 = 22 }
	List of valid modes that the builder can enable when creating an engine from a network definition. More...

enum class	MemoryPoolType : int32_t { kWORKSPACE = 0 , kDLA_MANAGED_SRAM = 1 , kDLA_LOCAL_DRAM = 2 , kDLA_GLOBAL_DRAM = 3 , kTACTIC_DRAM = 4 , kTACTIC_SHARED_MEMORY = 5 }
	The type for memory pools used by TensorRT. More...

enum class	PreviewFeature : int32_t { kPROFILE_SHARING_0806 = 0 }
	Define preview features. More...

enum class	HardwareCompatibilityLevel : int32_t { kNONE = 0 , kAMPERE_PLUS = 1 }
	Describes requirements of compatibility with GPU architectures other than that of the GPU on which the engine was built. More...

enum class	NetworkDefinitionCreationFlag : int32_t { kEXPLICIT_BATCH = 0 , kSTRONGLY_TYPED = 1 }
	List of immutable network properties expressed at network creation time. NetworkDefinitionCreationFlag is used with createNetworkV2() to specify immutable properties of the network. More...

enum class	EngineCapability : int32_t { kSTANDARD = 0 , kSAFETY = 1 , kDLA_STANDALONE = 2 }
	List of supported engine capability flows. More...

enum class	DimensionOperation : int32_t { kSUM = 0 , kPROD = 1 , kMAX = 2 , kMIN = 3 , kSUB = 4 , kEQUAL = 5 , kLESS = 6 , kFLOOR_DIV = 7 , kCEIL_DIV = 8 }
	An operation on two IDimensionExpr, which represent integer expressions used in dimension computations. More...

enum class	TensorLocation : int32_t { kDEVICE = 0 , kHOST = 1 }
	The location for tensor data storage, device or host. More...

enum class	WeightsRole : int32_t { kKERNEL = 0 , kBIAS = 1 , kSHIFT = 2 , kSCALE = 3 , kCONSTANT = 4 , kANY = 5 }
	How a layer uses particular Weights. More...

enum class	DeviceType : int32_t { kGPU = 0 , kDLA = 1 }
	The device that this layer/network will execute on. More...

enum class	TempfileControlFlag : int32_t { kALLOW_IN_MEMORY_FILES = 0 , kALLOW_TEMPORARY_FILES = 1 }
	Flags used to control TensorRT's behavior when creating executable temporary files. More...

enum class	OptProfileSelector : int32_t { kMIN = 0 , kOPT = 1 , kMAX = 2 }
	When setting or querying optimization profile parameters (such as shape tensor inputs or dynamic dimensions), select whether we are interested in the minimum, optimum, or maximum values for these parameters. The minimum and maximum specify the permitted range that is supported at runtime, while the optimum value is used for the kernel selection. This should be the "typical" value that is expected to occur at runtime. More...

enum class	TacticSource : int32_t { kCUBLAS = 0 , kCUBLAS_LT = 1 , kCUDNN = 2 , kEDGE_MASK_CONVOLUTIONS = 3 , kJIT_CONVOLUTIONS = 4 }
	List of tactic sources for TensorRT. More...

enum class	ProfilingVerbosity : int32_t { kLAYER_NAMES_ONLY = 0 , kNONE = 1 , kDETAILED = 2 }
	List of verbosity levels of layer information exposed in NVTX annotations and in IEngineInspector. More...

enum class	SerializationFlag : int32_t { kEXCLUDE_WEIGHTS = 0 , kEXCLUDE_LEAN_RUNTIME = 1 }
	List of valid flags that the engine can enable when serializing the bytes. More...

enum class	ExecutionContextAllocationStrategy : int32_t { kSTATIC = 0 , kON_PROFILE_CHANGE = 1 , kUSER_MANAGED = 2 }
	Different memory allocation behaviors for IExecutionContext. More...

enum class	LayerInformationFormat : int32_t { kONELINE = 0 , kJSON = 1 }
	The format in which the IEngineInspector prints the layer information. More...

enum class	DataType : int32_t { kFLOAT = 0 , kHALF = 1 , kINT8 = 2 , kINT32 = 3 , kBOOL = 4 , kUINT8 = 5 , kFP8 = 6 , kBF16 = 7 , kINT64 = 8 , kINT4 = 9 }
	The type of weights and tensors. More...

enum class	TensorFormat : int32_t { kLINEAR = 0 , kCHW2 = 1 , kHWC8 = 2 , kCHW4 = 3 , kCHW16 = 4 , kCHW32 = 5 , kDHWC8 = 6 , kCDHW32 = 7 , kHWC = 8 , kDLA_LINEAR = 9 , kDLA_HWC4 = 10 , kHWC16 = 11 , kDHWC = 12 }
	Format of the input/output tensors. More...

enum class	APILanguage : int32_t { kCPP = 0 , kPYTHON = 1 }
	Programming language used in the implementation of a TRT interface. More...

enum class	AllocatorFlag : int32_t { kRESIZABLE = 0 }
	Allowed type of memory allocation. More...

enum class	ErrorCode : int32_t { kSUCCESS = 0 , kUNSPECIFIED_ERROR = 1 , kINTERNAL_ERROR = 2 , kINVALID_ARGUMENT = 3 , kINVALID_CONFIG = 4 , kFAILED_ALLOCATION = 5 , kFAILED_INITIALIZATION = 6 , kFAILED_EXECUTION = 7 , kFAILED_COMPUTATION = 8 , kINVALID_STATE = 9 , kUNSUPPORTED_STATE = 10 }
	Error codes that can be returned by TensorRT during execution. More...

enum class	TensorIOMode : int32_t { kNONE = 0 , kINPUT = 1 , kOUTPUT = 2 }
	Definition of tensor IO Mode. More...

enum class	PluginVersion : uint8_t { kV2 = 0 , kV2_EXT = 1 , kV2_IOEXT = 2 , kV2_DYNAMICEXT = 3 , kV2_DYNAMICEXT_PYTHON = kPLUGIN_VERSION_PYTHON_BIT | 3 }

enum class	PluginCreatorVersion : int32_t { kV1 = 0 , kV1_PYTHON = kPLUGIN_VERSION_PYTHON_BIT }
	Enum to identify version of the plugin creator. More...

enum class	PluginFieldType : int32_t { kFLOAT16 = 0 , kFLOAT32 = 1 , kFLOAT64 = 2 , kINT8 = 3 , kINT16 = 4 , kINT32 = 5 , kCHAR = 6 , kDIMS = 7 , kUNKNOWN = 8 , kBF16 = 9 , kINT64 = 10 , kFP8 = 11 , kINT4 = 12 }
	The possible field types for custom layer. More...

enum class	PluginCapabilityType : int32_t { kCORE = 0 , kBUILD = 1 , kRUNTIME = 2 }
	Enumerates the different capability types an IPluginV3 object may have. More...

enum class	TensorRTPhase : int32_t { kBUILD = 0 , kRUNTIME = 1 }
	Indicates a phase of operation of TensorRT. More...

Functions
template<>
constexpr int32_t	EnumMax< LayerType > () noexcept

template<>
constexpr int32_t	EnumMax< ScaleMode > () noexcept

template<>
constexpr int32_t	EnumMax< GatherMode > () noexcept

template<>
constexpr int32_t	EnumMax< UnaryOperation > () noexcept

template<>
constexpr int32_t	EnumMax< ReduceOperation > () noexcept

template<>
constexpr int32_t	EnumMax< SampleMode > () noexcept

template<>
constexpr int32_t	EnumMax< TopKOperation > () noexcept

template<>
constexpr int32_t	EnumMax< MatrixOperation > () noexcept

template<>
constexpr int32_t	EnumMax< LoopOutput > () noexcept

template<>
constexpr int32_t	EnumMax< TripLimit > () noexcept

template<>
constexpr int32_t	EnumMax< FillOperation > () noexcept

template<>
constexpr int32_t	EnumMax< ScatterMode > () noexcept

template<>
constexpr int32_t	EnumMax< BoundingBoxFormat > () noexcept

template<>
constexpr int32_t	EnumMax< CalibrationAlgoType > () noexcept

template<>
constexpr int32_t	EnumMax< QuantizationFlag > () noexcept

template<>
constexpr int32_t	EnumMax< BuilderFlag > () noexcept

template<>
constexpr int32_t	EnumMax< MemoryPoolType > () noexcept

template<>
constexpr int32_t	EnumMax< NetworkDefinitionCreationFlag > () noexcept

nvinfer1::IPluginRegistry *	getBuilderPluginRegistry (nvinfer1::EngineCapability capability) noexcept
	Return the plugin registry for building a Standard engine, or nullptr if no registry exists. More...

nvinfer1::safe::IPluginRegistry *	getBuilderSafePluginRegistry (nvinfer1::EngineCapability capability) noexcept
	Return the plugin registry for building a Safety engine, or nullptr if no registry exists. More...

template<>
constexpr int32_t	EnumMax< DimensionOperation > () noexcept
	Maximum number of elements in DimensionOperation enum. More...

template<>
constexpr int32_t	EnumMax< WeightsRole > () noexcept
	Maximum number of elements in WeightsRole enum. More...

template<>
constexpr int32_t	EnumMax< DeviceType > () noexcept
	Maximum number of elements in DeviceType enum. More...

template<>
constexpr int32_t	EnumMax< TempfileControlFlag > () noexcept
	Maximum number of elements in TempfileControlFlag enum. More...

template<>
constexpr int32_t	EnumMax< OptProfileSelector > () noexcept
	Number of different values of OptProfileSelector enum. More...

template<>
constexpr int32_t	EnumMax< TacticSource > () noexcept
	Maximum number of tactic sources in TacticSource enum. More...

template<>
constexpr int32_t	EnumMax< ProfilingVerbosity > () noexcept
	Maximum number of profile verbosity levels in ProfilingVerbosity enum. More...

template<>
constexpr int32_t	EnumMax< SerializationFlag > () noexcept
	Maximum number of serialization flags in SerializationFlag enum. More...

template<>
constexpr int32_t	EnumMax< ExecutionContextAllocationStrategy > () noexcept
	Maximum number of memory allocation strategies in ExecutionContextAllocationStrategy enum. More...

template<>
constexpr int32_t	EnumMax< LayerInformationFormat > () noexcept

template<typename T >
constexpr int32_t	EnumMax () noexcept
	Maximum number of elements in an enumeration type. More...
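
The EnumMax entries listed above follow a common C++ pattern: one function template plus an explicit specialization per enumeration that returns the number of enumerators. The following is a minimal sketch of that pattern and not TensorRT's actual implementation. The `sketch` namespace, the stand-in enum copies and the `isValidEnumValue` helper are assumptions made for illustration, and the returned values are inferred from the enumerator counts in the summary above; only the `EnumMax` signature comes from the listing.

```cpp
#include <cstdint>

namespace sketch
{
// Stand-in copies of two enumerations from the summary above (not the real headers).
enum class TripLimit : int32_t { kCOUNT = 0, kWHILE = 1 };
enum class OptProfileSelector : int32_t { kMIN = 0, kOPT = 1, kMAX = 2 };

// Primary template: declared but not defined, so it is unusable for
// types that lack an explicit specialization.
template <typename T>
constexpr int32_t EnumMax() noexcept;

// One explicit specialization per enumeration, returning its enumerator count.
template <>
constexpr int32_t EnumMax<TripLimit>() noexcept
{
    return 2;
}

template <>
constexpr int32_t EnumMax<OptProfileSelector>() noexcept
{
    return 3;
}

// Hypothetical helper: range-check a raw integer before casting it to T.
// Only meaningful for enumerations whose values are contiguous from 0.
template <typename T>
constexpr bool isValidEnumValue(int32_t value) noexcept
{
    return value >= 0 && value < EnumMax<T>();
}
} // namespace sketch

static_assert(sketch::EnumMax<sketch::TripLimit>() == 2, "TripLimit has two enumerators");
static_assert(sketch::isValidEnumValue<sketch::OptProfileSelector>(2), "kMAX is in range");
```

Because the result is a constant expression, it can size arrays or bound loops that iterate over every enumerator.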

Detailed Description

The TensorRT API version 1 namespace.

Typedef Documentation

◆ AllocatorFlags

using nvinfer1::AllocatorFlags = typedef uint32_t

◆ AsciiChar

using nvinfer1::AsciiChar = typedef char_t

AsciiChar is the type used by TensorRT to represent valid ASCII characters. This type is widely used in automotive safety contexts.

◆ BuilderFlags

using nvinfer1::BuilderFlags = typedef uint32_t

Represents one or more BuilderFlag values using binary OR operations, e.g., 1U << BuilderFlag::kFP16 | 1U << BuilderFlag::kDEBUG.

See also: IBuilderConfig::setFlags(), IBuilderConfig::getFlags()
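
Because BuilderFlag is a scoped enumeration (`enum class`), the shorthand `1U << BuilderFlag::kFP16` needs an explicit cast to compile as C++. The sketch below shows one way to build and test such a mask. It uses a stand-in copy of three enumerators instead of the TensorRT headers, and the `toMask` and `hasFlag` helpers are hypothetical names introduced here, not part of the API.

```cpp
#include <cstdint>

namespace sketch
{
// Stand-in for three nvinfer1::BuilderFlag enumerators (values from the summary above).
enum class BuilderFlag : int32_t { kFP16 = 0, kINT8 = 1, kDEBUG = 2 };

using BuilderFlags = uint32_t;

// Hypothetical helper: convert one enumerator into its single-bit mask.
constexpr BuilderFlags toMask(BuilderFlag flag) noexcept
{
    return 1U << static_cast<uint32_t>(flag);
}

// Hypothetical helper: test whether the bit for a flag is set in a mask.
constexpr bool hasFlag(BuilderFlags flags, BuilderFlag flag) noexcept
{
    return (flags & toMask(flag)) != 0U;
}

// Equivalent of "1U << BuilderFlag::kFP16 | 1U << BuilderFlag::kDEBUG".
constexpr BuilderFlags kExampleFlags = toMask(BuilderFlag::kFP16) | toMask(BuilderFlag::kDEBUG);
} // namespace sketch

static_assert(sketch::kExampleFlags == 5U, "bit 0 (kFP16) and bit 2 (kDEBUG) are set");
static_assert(!sketch::hasFlag(sketch::kExampleFlags, sketch::BuilderFlag::kINT8), "kINT8 bit is clear");
```

The same construction applies to the other `uint32_t` flag typedefs on this page, since each enumerator value is a bit position and not a mask.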

◆ char_t

using nvinfer1::char_t = typedef char

char_t is the type used by TensorRT to represent all valid characters.

◆ Dims

using nvinfer1::Dims = typedef Dims64

Alias for Dims64.

◆ IAlgorithmSelector

using nvinfer1::IAlgorithmSelector = typedef v_1_0::IAlgorithmSelector

◆ IDebugListener

using nvinfer1::IDebugListener = typedef v_1_0::IDebugListener

◆ IErrorRecorder

typedef v_1_0::IErrorRecorder nvinfer1::IErrorRecorder

◆ IGpuAllocator

using nvinfer1::IGpuAllocator = typedef v_1_0::IGpuAllocator

◆ IGpuAsyncAllocator

using nvinfer1::IGpuAsyncAllocator = typedef v_1_0::IGpuAsyncAllocator

◆ IInt8EntropyCalibrator

using nvinfer1::IInt8EntropyCalibrator = typedef v_1_0::IInt8EntropyCalibrator

◆ IInt8EntropyCalibrator2

using nvinfer1::IInt8EntropyCalibrator2 = typedef v_1_0::IInt8EntropyCalibrator2

◆ IInt8LegacyCalibrator

using nvinfer1::IInt8LegacyCalibrator = typedef v_1_0::IInt8LegacyCalibrator

◆ IInt8MinMaxCalibrator

using nvinfer1::IInt8MinMaxCalibrator = typedef v_1_0::IInt8MinMaxCalibrator

◆ InterfaceKind

using nvinfer1::InterfaceKind = typedef char const*

◆ IOutputAllocator

using nvinfer1::IOutputAllocator = typedef v_1_0::IOutputAllocator

◆ IPluginCapability

using nvinfer1::IPluginCapability = typedef v_1_0::IPluginCapability

◆ IPluginCreator

using nvinfer1::IPluginCreator = typedef v_1_0::IPluginCreator

◆ IPluginCreatorInterface

using nvinfer1::IPluginCreatorInterface = typedef v_1_0::IPluginCreatorInterface

◆ IPluginCreatorV3One

using nvinfer1::IPluginCreatorV3One = typedef v_1_0::IPluginCreatorV3One

◆ IPluginResource

using nvinfer1::IPluginResource = typedef v_1_0::IPluginResource

◆ IPluginV3

using nvinfer1::IPluginV3 = typedef v_1_0::IPluginV3

◆ IPluginV3OneBuild

using nvinfer1::IPluginV3OneBuild = typedef v_1_0::IPluginV3OneBuild

◆ IPluginV3OneCore

using nvinfer1::IPluginV3OneCore = typedef v_1_0::IPluginV3OneCore

◆ IPluginV3OneRuntime

using nvinfer1::IPluginV3OneRuntime = typedef v_1_0::IPluginV3OneRuntime

◆ IProfiler

using nvinfer1::IProfiler = typedef v_1_0::IProfiler

◆ IProgressMonitor

using nvinfer1::IProgressMonitor = typedef v_1_0::IProgressMonitor

◆ IStreamReader

using nvinfer1::IStreamReader = typedef v_1_0::IStreamReader

◆ NetworkDefinitionCreationFlags

using nvinfer1::NetworkDefinitionCreationFlags = typedef uint32_t

Represents one or more NetworkDefinitionCreationFlag values using binary OR operations, e.g., 1U << NetworkDefinitionCreationFlag::kSTRONGLY_TYPED.

See also: IBuilder::createNetworkV2

◆ PluginFormat

using nvinfer1::PluginFormat = typedef TensorFormat

PluginFormat is reserved for backward compatibility.

See also: IPluginV2::supportsFormat()

◆ QuantizationFlags

using nvinfer1::QuantizationFlags = typedef uint32_t

Represents one or more QuantizationFlag values using binary OR operations.

See also: IBuilderConfig::getQuantizationFlags(), IBuilderConfig::setQuantizationFlags()

◆ SerializationFlags

using nvinfer1::SerializationFlags = typedef uint32_t

Represents one or more SerializationFlag values using binary OR operations, e.g., 1U << SerializationFlag::kEXCLUDE_LEAN_RUNTIME.

See also: ISerializationConfig::setFlags(), ISerializationConfig::getFlags()

◆ TacticSources

using nvinfer1::TacticSources = typedef uint32_t

Represents a collection of one or more TacticSource values combined using bitwise-OR operations.

See also: IBuilderConfig::setTacticSources(), IBuilderConfig::getTacticSources()

◆ TempfileControlFlags

using nvinfer1::TempfileControlFlags = typedef uint32_t

Represents a collection of one or more TempfileControlFlag values combined using bitwise-OR operations.

See also: TempfileControlFlag, IRuntime::setTempfileControlFlags(), IRuntime::getTempfileControlFlags()

◆ TensorFormats

using nvinfer1::TensorFormats = typedef uint32_t

Represents one or more TensorFormat values using binary OR operations, e.g., 1U << TensorFormat::kCHW4 | 1U << TensorFormat::kCHW32.

See also: ITensor::getAllowedFormats(), ITensor::setAllowedFormats()

Enumeration Type Documentation

◆ ActivationType

enum class nvinfer1::ActivationType : int32_t

Enumerates the types of activation to perform in an activation layer.

Enumerator
kRELU	Rectified linear activation.
kSIGMOID	Sigmoid activation.
kTANH	TanH activation.
kLEAKY_RELU	LeakyRelu activation: x>=0 ? x : alpha * x.
kELU	Elu activation: x>=0 ? x : alpha * (exp(x) - 1).
kSELU	Selu activation: x>0 ? beta * x : beta * (alpha*exp(x) - alpha)
kSOFTSIGN	Softsign activation: x / (1+|x|)
kSOFTPLUS	Parametric softplus activation: alpha*log(exp(beta*x)+1)
kCLIP	Clip activation: max(alpha, min(beta, x))
kHARD_SIGMOID	Hard sigmoid activation: max(0, min(1, alpha*x+beta))
kSCALED_TANH	Scaled tanh activation: alpha*tanh(beta*x)
kTHRESHOLDED_RELU	Thresholded ReLU activation: x>alpha ? x : 0.
kGELU_ERF	GELU erf activation: 0.5 * x * (1 + erf(sqrt(0.5) * x))
kGELU_TANH	GELU tanh activation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (0.044715F * pow(x, 3) + x)))
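
Several of the formulas in the table above translate directly into scalar code. The sketch below is a plain reference evaluation of a few of them for experimentation; it is not how TensorRT computes activations, and every function name in it is hypothetical. The last function checks that the two GELU forms stay close to each other on a small range.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

namespace sketch
{
// kLEAKY_RELU: x>=0 ? x : alpha * x
inline float leakyRelu(float x, float alpha) { return x >= 0.0F ? x : alpha * x; }

// kCLIP: max(alpha, min(beta, x))
inline float clip(float x, float alpha, float beta) { return std::max(alpha, std::min(beta, x)); }

// kHARD_SIGMOID: max(0, min(1, alpha*x+beta))
inline float hardSigmoid(float x, float alpha, float beta)
{
    return std::max(0.0F, std::min(1.0F, alpha * x + beta));
}

// kTHRESHOLDED_RELU: x>alpha ? x : 0
inline float thresholdedRelu(float x, float alpha) { return x > alpha ? x : 0.0F; }

// kGELU_ERF: 0.5 * x * (1 + erf(sqrt(0.5) * x))
inline float geluErf(float x) { return 0.5F * x * (1.0F + std::erf(std::sqrt(0.5F) * x)); }

// kGELU_TANH: 0.5 * x * (1 + tanh(sqrt(2/pi) * (0.044715F * pow(x, 3) + x)))
inline float geluTanh(float x)
{
    float const sqrtTwoOverPi = std::sqrt(2.0F / 3.14159265F);
    return 0.5F * x * (1.0F + std::tanh(sqrtTwoOverPi * (0.044715F * x * x * x + x)));
}

// Sanity check: the tanh form approximates the erf form on [-3, 3].
inline void checkGeluFormsAgree()
{
    for (int i = -6; i <= 6; ++i)
    {
        float const x = 0.5F * static_cast<float>(i);
        assert(std::fabs(geluErf(x) - geluTanh(x)) < 1e-2F);
    }
}
} // namespace sketch
```

The alpha and beta arguments stand for the parameters named in the formulas; how they are configured on a layer is outside the scope of this sketch.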

◆ AllocatorFlag

enum class nvinfer1::AllocatorFlag : int32_t

strong

Allowed type of memory allocation.

Enumerator
kRESIZABLE	TensorRT may call realloc() on this allocation.

◆ APILanguage

enum class nvinfer1::APILanguage : int32_t

strong

Programming language used in the implementation of a TRT interface.

Enumerator
kCPP
kPYTHON

◆ BoundingBoxFormat

enum class nvinfer1::BoundingBoxFormat : int32_t

strong

Representation of bounding box data used for the Boxes input tensor in INMSLayer.

See also: INMSLayer

Enumerator
kCORNER_PAIRS	(x1, y1, x2, y2) where (x1, y1) and (x2, y2) are any pair of diagonal corners
kCENTER_SIZES	(x_center, y_center, width, height) where (x_center, y_center) is the center point of the box
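The two layouts carry the same information, so one can be converted into the other. The sketch below shows the conversion in both directions; CornerBox, CenterBox, toCorners and toCenter are hypothetical names used only for illustration. Because kCORNER_PAIRS allows any pair of diagonal corners, toCenter orders the coordinates first.

```cpp
#include <algorithm>
#include <cassert>

struct CornerBox { float x1, y1, x2, y2; };                  // kCORNER_PAIRS layout
struct CenterBox { float xCenter, yCenter, width, height; }; // kCENTER_SIZES layout

// Center/size -> one particular pair of diagonal corners (min corner, max corner).
inline CornerBox toCorners(CenterBox const& b)
{
    assert(b.width >= 0.0F && b.height >= 0.0F);
    return {b.xCenter - b.width / 2, b.yCenter - b.height / 2,
            b.xCenter + b.width / 2, b.yCenter + b.height / 2};
}

// Any pair of diagonal corners -> center/size.
inline CenterBox toCenter(CornerBox const& b)
{
    float const xMin = std::min(b.x1, b.x2);
    float const xMax = std::max(b.x1, b.x2);
    float const yMin = std::min(b.y1, b.y2);
    float const yMax = std::max(b.y1, b.y2);
    return {(xMin + xMax) / 2, (yMin + yMax) / 2, xMax - xMin, yMax - yMin};
}
```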

◆ BuilderFlag

enum class nvinfer1::BuilderFlag : int32_t

strong

List of valid modes that the builder can enable when creating an engine from a network definition.

See also: IBuilderConfig::setFlags(), IBuilderConfig::getFlags()

Enumerator
kFP16	Enable FP16 layer selection, with FP32 fallback.
kINT8	Enable INT8 layer selection, with FP32 fallback, and with FP16 fallback if kFP16 is also specified.
kDEBUG	Enable debugging of layers via synchronizing after every layer.
kGPU_FALLBACK	Enable layers marked to execute on GPU if layer cannot execute on DLA.
kREFIT	Enable building a refittable engine.
kDISABLE_TIMING_CACHE	Disable reuse of timing information across identical layers.
kTF32	Allow (but not require) computations on tensors of type DataType::kFLOAT to use TF32. TF32 computes inner products by rounding the inputs to 10-bit mantissas before multiplying, but accumulates the sum using 23-bit mantissas. Enabled by default.
kSPARSE_WEIGHTS	Allow the builder to examine weights and use optimized functions when weights have suitable sparsity.
kSAFETY_SCOPE	Change the allowed parameters in the EngineCapability::kSTANDARD flow to match the restrictions that EngineCapability::kSAFETY checks against for DeviceType::kGPU and that EngineCapability::kDLA_STANDALONE checks against for DeviceType::kDLA. This flag is forced to true at build time if EngineCapability::kSAFETY is selected and the flag is unset. This flag is only supported in NVIDIA Drive(R) products.
kOBEY_PRECISION_CONSTRAINTS	Require that layers execute in specified precisions. Build fails otherwise.
kPREFER_PRECISION_CONSTRAINTS	Prefer that layers execute in specified precisions. Fall back (with warning) to another precision if build would otherwise fail.
kDIRECT_IO	Require that no reformats be inserted between a layer and a network I/O tensor for which ITensor::setAllowedFormats was called. Build fails if a reformat is required for functional correctness.
kREJECT_EMPTY_ALGORITHMS	Fail if IAlgorithmSelector::selectAlgorithms returns an empty set of algorithms.
kVERSION_COMPATIBLE	Restrict to lean runtime operators to provide version forward compatibility for the plan. This flag is only supported by NVIDIA Volta and later GPUs. This flag is not supported in NVIDIA Drive(R) products.
kEXCLUDE_LEAN_RUNTIME	Exclude lean runtime from the plan when version forward compatibility is enabled. By default, this flag is unset, so the lean runtime will be included in the plan. If BuilderFlag::kVERSION_COMPATIBLE is not set then the value of this flag will be ignored.
kFP8	Enable plugins with FP8 input/output. This flag is not supported with hardware-compatibility mode. See also: HardwareCompatibilityLevel
kERROR_ON_TIMING_CACHE_MISS	Emit error when a tactic being timed is not present in the timing cache. This flag has an effect only when IBuilderConfig has an associated ITimingCache.
kBF16	Enable DataType::kBF16 layer selection, with FP32 fallback. This flag is only supported by NVIDIA Ampere and later GPUs.
kDISABLE_COMPILATION_CACHE	Disable caching of JIT-compilation results during engine build. By default, JIT-compiled code will be serialized as part of the timing cache, which may significantly increase the cache size. Setting this flag prevents the code from being serialized. This flag has an effect only when BuilderFlag::kDISABLE_TIMING_CACHE is not set.
kSTRIP_PLAN	Strip the refittable weights from the engine plan file.
kWEIGHTLESS	Deprecated: Deprecated in TensorRT 10.0. Superseded by kSTRIP_PLAN.
kREFIT_IDENTICAL	Create a refittable engine under the assumption that the refit weights will be identical to those provided at build time. The resulting engine will have the same performance as a non-refittable one. All refittable weights can be refitted through the refit API, but if the refit weights are not identical to the build-time weights, behavior is undefined. When used alongside 'kSTRIP_PLAN', this flag will result in a small plan file for which weights are later supplied via refitting. This enables use of a single set of weights with different inference backends, or with TensorRT plans for multiple GPU architectures.
kWEIGHT_STREAMING	Enable weight streaming for the current engine. Weight streaming from the host enables execution of models that do not fit in GPU memory by allowing TensorRT to intelligently stream network weights from the CPU DRAM. Please see ICudaEngine::getMinimumWeightStreamingBudget for the default memory budget when this flag is enabled. Enabling this feature changes the behavior of IRuntime::deserializeCudaEngine to allocate the entire network’s weights on the CPU DRAM instead of GPU memory. Then, ICudaEngine::createExecutionContext will determine the optimal split of weights between the CPU and GPU and place weights accordingly. Future TensorRT versions may enable this flag by default. Warning: Enabling this flag may marginally increase build time. Enabling this feature will significantly increase the latency of ICudaEngine::createExecutionContext. See also: IRuntime::deserializeCudaEngine, ICudaEngine::getMinimumWeightStreamingBudget, ICudaEngine::setWeightStreamingBudget
kINT4	Enable plugins with INT4 input/output.

◆ CalibrationAlgoType

enum class nvinfer1::CalibrationAlgoType : int32_t

strong

Version of calibration algorithm to use.

Deprecated: Deprecated in TensorRT 10.1. Superseded by explicit quantization.

Enumerator
kLEGACY_CALIBRATION	Legacy calibration.
kENTROPY_CALIBRATION	Legacy entropy calibration.
kENTROPY_CALIBRATION_2	Entropy calibration.
kMINMAX_CALIBRATION	Minmax calibration.

◆ DataType

enum class nvinfer1::DataType : int32_t

strong

The type of weights and tensors.

Enumerator
kFLOAT	32-bit floating point format.
kHALF	IEEE 16-bit floating-point format – has a 5 bit exponent and 11 bit significand.
kINT8	Signed 8-bit integer representing a quantized floating-point value.
kINT32	Signed 32-bit integer format.
kBOOL	8-bit boolean. 0 = false, 1 = true, other values undefined.
kUINT8	Unsigned 8-bit integer format. Cannot be used to represent quantized floating-point values. Use the IdentityLayer to convert kUINT8 network-level inputs to {kFLOAT, kHALF} prior to use with other TensorRT layers, or to convert intermediate output before kUINT8 network-level outputs from {kFLOAT, kHALF} to kUINT8. kUINT8 conversions are only supported for {kFLOAT, kHALF}. kUINT8 to {kFLOAT, kHALF} conversion will convert the integer values to equivalent floating point values. {kFLOAT, kHALF} to kUINT8 conversion will convert the floating point values to integer values by truncating towards zero. This conversion has undefined behavior for floating point values outside the range [0.0F, 256.0F) after truncation. kUINT8 conversions are not supported for {kINT8, kINT32, kBOOL}.
kFP8	Signed 8-bit floating point with 1 sign bit, 4 exponent bits, 3 mantissa bits, and exponent-bias 7.
kBF16	Brain float – has an 8 bit exponent and 8 bit significand.
kINT64	Signed 64-bit integer type.
kINT4	Signed 4-bit integer type.
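The kUINT8 conversion rules above (truncation toward zero, undefined outside [0.0F, 256.0F) after truncation) can be modeled on a single value as follows. This is only a scalar sketch of the documented semantics; isConvertibleToUint8 and floatToUint8 are hypothetical helper names, and the undefined case is modeled as a failed precondition.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// True if the value, after truncation toward zero, lies in [0, 256).
// NaN fails both comparisons and is therefore rejected.
inline bool isConvertibleToUint8(float x)
{
    float const t = std::trunc(x);
    return t >= 0.0F && t < 256.0F;
}

// Float -> uint8 by truncating toward zero; out-of-range input is a precondition violation.
inline uint8_t floatToUint8(float x)
{
    assert(isConvertibleToUint8(x));
    return static_cast<uint8_t>(std::trunc(x));
}

// uint8 -> float yields the equivalent floating-point value.
inline float uint8ToFloat(uint8_t v)
{
    return static_cast<float>(v);
}
```

Note that a value such as -0.5F is accepted here because it truncates to zero, while 256.0F and -1.0F are rejected.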

◆ DeviceType

enum class nvinfer1::DeviceType : int32_t

strong

The device that this layer/network will execute on.

Enumerator
kGPU	GPU Device.
kDLA	DLA Core.

◆ DimensionOperation

enum class nvinfer1::DimensionOperation : int32_t

strong

An operation on two IDimensionExpr, which represent integer expressions used in dimension computations.

For example, given two IDimensionExpr x and y and an IExprBuilder& eb, eb.operation(DimensionOperation::kSUM, x, y) creates a representation of x+y.

See also: IDimensionExpr, IExprBuilder

Enumerator
kSUM	Sum of the two operands.
kPROD	Product of the two operands.
kMAX	Maximum of the two operands.
kMIN	Minimum of the two operands.
kSUB	Subtract the second element from the first.
kEQUAL	1 if operands are equal, 0 otherwise.
kLESS	1 if first operand is less than second operand, 0 otherwise.
kFLOOR_DIV	Floor division of the first element by the second.
kCEIL_DIV	Division rounding up.
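To make the meaning of each operation concrete, the sketch below evaluates them on plain integers rather than on IDimensionExpr objects. DimOp and evalDimOp are hypothetical names; the division cases assume a non-negative dividend and a positive divisor, which is an assumption of this sketch rather than something stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical local mirror of the enumerators above.
enum class DimOp { kSUM, kPROD, kMAX, kMIN, kSUB, kEQUAL, kLESS, kFLOOR_DIV, kCEIL_DIV };

// Evaluate one operation on two concrete integer operands.
inline int64_t evalDimOp(DimOp op, int64_t a, int64_t b)
{
    switch (op)
    {
    case DimOp::kSUM: return a + b;
    case DimOp::kPROD: return a * b;
    case DimOp::kMAX: return std::max(a, b);
    case DimOp::kMIN: return std::min(a, b);
    case DimOp::kSUB: return a - b;
    case DimOp::kEQUAL: return a == b ? 1 : 0;
    case DimOp::kLESS: return a < b ? 1 : 0;
    case DimOp::kFLOOR_DIV:
        assert(a >= 0 && b > 0); // assumed operand range for this sketch
        return a / b;
    case DimOp::kCEIL_DIV:
        assert(a >= 0 && b > 0); // assumed operand range for this sketch
        return (a + b - 1) / b;
    }
    assert(false && "unhandled operation");
    return 0;
}
```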

◆ ElementWiseOperation

enum class nvinfer1::ElementWiseOperation : int32_t

strong

Enumerates the binary operations that may be performed by an ElementWise layer.

Operations kAND, kOR, and kXOR must have inputs of DataType::kBOOL.

Operation kPOW must have inputs of floating-point type or DataType::kINT8.

All other operations must have inputs of floating-point type, DataType::kINT8, DataType::kINT32, or DataType::kINT64.

See also: IElementWiseLayer

Enumerator
kSUM	Sum of the two elements.
kPROD	Product of the two elements.
kMAX	Maximum of the two elements.
kMIN	Minimum of the two elements.
kSUB	Subtract the second element from the first.
kDIV	Divide the first element by the second.
kPOW	The first element to the power of the second element.
kFLOOR_DIV	Floor division of the first element by the second.
kAND	Logical AND of two elements.
kOR	Logical OR of two elements.
kXOR	Logical XOR of two elements.
kEQUAL	Check if two elements are equal.
kGREATER	Check if element in first tensor is greater than corresponding element in second tensor.
kLESS	Check if element in first tensor is less than corresponding element in second tensor.

◆ EngineCapability

enum class nvinfer1::EngineCapability : int32_t

strong

List of supported engine capability flows.

The EngineCapability determines the restrictions of a network during build time and what runtime it targets. When BuilderFlag::kSAFETY_SCOPE is not set (by default), EngineCapability::kSTANDARD does not provide any restrictions on functionality and the resulting serialized engine can be executed with TensorRT's standard runtime APIs in the nvinfer1 namespace. EngineCapability::kSAFETY provides a restricted subset of network operations that are safety certified and the resulting serialized engine can be executed with TensorRT's safe runtime APIs in the nvinfer1::safe namespace. EngineCapability::kDLA_STANDALONE provides a restricted subset of network operations that are DLA compatible and the resulting serialized engine can be executed using standalone DLA runtime APIs. See sampleCudla for an example of integrating cuDLA APIs with TensorRT APIs.

Enumerator

kSTANDARD

Standard: TensorRT flow without targeting the safety runtime. This flow supports both DeviceType::kGPU and DeviceType::kDLA.

kSAFETY

Safety: TensorRT flow with restrictions targeting the safety runtime. See safety documentation for list of supported layers and formats. This flow supports only DeviceType::kGPU.

This flag is only supported in NVIDIA Drive(R) products.

kDLA_STANDALONE

DLA Standalone: TensorRT flow with restrictions targeting DLA runtimes external to TensorRT. See DLA documentation for list of supported layers and formats. This flow supports only DeviceType::kDLA.

◆ ErrorCode

enum class nvinfer1::ErrorCode : int32_t

strong

Error codes that can be returned by TensorRT during execution.

Enumerator
kSUCCESS	Execution completed successfully.
kUNSPECIFIED_ERROR	An error that does not fall into any other category. This error is included for forward compatibility.
kINTERNAL_ERROR	A non-recoverable TensorRT error occurred. TensorRT is in an invalid internal state when this error is emitted and any further calls to TensorRT will result in undefined behavior.
kINVALID_ARGUMENT	An argument passed to the function is invalid in isolation. This is a violation of the API contract.
kINVALID_CONFIG	An error occurred when comparing the state of an argument relative to other arguments. For example, the dimensions for concat differ between two tensors outside of the channel dimension. This error is triggered when an argument is correct in isolation, but not relative to other arguments. This is to help to distinguish from the simple errors from the more complex errors. This is a violation of the API contract.
kFAILED_ALLOCATION	An error occurred when performing an allocation of memory on the host or the device. A memory allocation error is normally fatal, but in the case where the application provided its own memory allocation routine, it is possible to increase the pool of available memory and resume execution.
kFAILED_INITIALIZATION	One, or more, of the components that TensorRT relies on did not initialize correctly. This is a system setup issue.
kFAILED_EXECUTION	An error occurred during execution that caused TensorRT to end prematurely, either an asynchronous error, user cancellation, or other execution errors reported by CUDA/DLA. In a dynamic system, the data can be thrown away and the next frame can be processed or execution can be retried. This is either an execution error or a memory error.
kFAILED_COMPUTATION	An error occurred during execution that caused the data to become corrupted, but execution finished. Examples of this error are NaN squashing or integer overflow. In a dynamic system, the data can be thrown away and the next frame can be processed or execution can be retried. This is either a data corruption error, an input error, or a range error. This is not used in safety but may be used in standard.
kINVALID_STATE	TensorRT was put into a bad state by incorrect sequence of function calls. An example of an invalid state is specifying a layer to be DLA only without GPU fallback, and that layer is not supported by DLA. This can occur in situations where a service is optimistically executing networks for multiple different configurations without checking proper error configurations, and instead throwing away bad configurations caught by TensorRT. This is a violation of the API contract, but can be recoverable. Example of a recovery: GPU fallback is disabled and conv layer with large filter(63x63) is specified to run on DLA. This will fail due to DLA not supporting the large kernel size. This can be recovered by either turning on GPU fallback or setting the layer to run on the GPU.
kUNSUPPORTED_STATE	An error occurred due to the network not being supported on the device due to constraints of the hardware or system. An example is running an unsafe layer in a safety certified context, or a resource requirement for the current network is greater than the capabilities of the target device. The network is otherwise correct, but the network and hardware combination is problematic. This can be recoverable. Examples: Scratch space requests larger than available device memory and can be recovered by increasing allowed workspace size. Tensor size exceeds the maximum element count and can be recovered by reducing the maximum batch size.

◆ ExecutionContextAllocationStrategy

enum class nvinfer1::ExecutionContextAllocationStrategy : int32_t

strong

Different memory allocation behaviors for IExecutionContext.

IExecutionContext requires a block of device memory for internal activation tensors during inference. The user can either let the execution context manage the memory in various ways or allocate the memory themselves.

See also: ICudaEngine::createExecutionContext(), IExecutionContext::setDeviceMemory()

Enumerator
kSTATIC	Default static allocation with the maximum size across all profiles.
kON_PROFILE_CHANGE	Reallocate for a profile when it's selected.
kUSER_MANAGED	The user supplies custom allocation to the execution context.

◆ FillOperation

enum class nvinfer1::FillOperation : int32_t

strong

Enumerates the tensor fill operations that may be performed by a fill layer.

See also: IFillLayer

Enumerator

kLINSPACE

Compute each value via an affine function of its indices. For example, suppose the parameters for the IFillLayer are:

Dimensions = [3,4]
Alpha = 1
Beta = [100,10]

Element [i,j] of the output is Alpha + Beta[0]*i + Beta[1]*j. Thus the output matrix is:

  1  11  21  31
101 111 121 131
201 211 221 231

A static beta b is implicitly a 1D tensor, i.e. Beta = [b].

kRANDOM_UNIFORM

Randomly draw values from a uniform distribution.

kRANDOM_NORMAL

Randomly draw values from a normal distribution.
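The kLINSPACE rule, Alpha + Beta[0]*i + Beta[1]*j, can be checked with a few lines of code. The sketch below handles only the two-dimensional case and uses a hypothetical function name, linspace2d; it is a reference computation, not the IFillLayer API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill a rows x cols matrix with alpha + beta[0]*i + beta[1]*j (2-D case only).
inline std::vector<std::vector<double>> linspace2d(
    std::size_t rows, std::size_t cols, double alpha, std::vector<double> const& beta)
{
    assert(beta.size() == 2); // one slope per output dimension
    std::vector<std::vector<double>> out(rows, std::vector<double>(cols));
    for (std::size_t i = 0; i < rows; ++i)
    {
        for (std::size_t j = 0; j < cols; ++j)
        {
            out[i][j] = alpha + beta[0] * static_cast<double>(i) + beta[1] * static_cast<double>(j);
        }
    }
    return out;
}
```

Calling linspace2d(3, 4, 1.0, {100.0, 10.0}) uses the parameters from the kLINSPACE description (Dimensions = [3,4], Alpha = 1, Beta = [100,10]).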

◆ GatherMode

enum class nvinfer1::GatherMode : int32_t

strong

Control form of IGatherLayer.

See also: IGatherLayer

Enumerator
kDEFAULT	Similar to ONNX Gather.
kELEMENT	Similar to ONNX GatherElements.
kND	Similar to ONNX GatherND.

◆ HardwareCompatibilityLevel

enum class nvinfer1::HardwareCompatibilityLevel : int32_t

strong

Describes requirements of compatibility with GPU architectures other than that of the GPU on which the engine was built.

Levels except kNONE are only supported for engines built on NVIDIA Ampere and later GPUs.

Warning: Note that compatibility with future hardware depends on CUDA forward compatibility support.

Enumerator

kNONE

Do not require hardware compatibility with GPU architectures other than that of the GPU on which the engine was built.

kAMPERE_PLUS

Require that the engine is compatible with Ampere and newer GPUs. This will limit the combined usage of driver reserved and backend kernel max shared memory to 48KiB, may reduce the number of available tactics for each layer, and may prevent some fusions from occurring. Thus this can decrease the performance, especially for tf32 models. This option will disable cuDNN, cuBLAS, and cuBLAS LT as tactic sources.

The driver reserved shared memory can be queried from cuDeviceGetAttribute(&reservedShmem, CU_DEVICE_ATTRIBUTE_RESERVED_SHARED_MEMORY_PER_BLOCK).

◆ InterpolationMode

enum class nvinfer1::InterpolationMode : int32_t

strong

Enumerates various modes of interpolation.

Enumerator
kNEAREST	ND (0 < N <= 8) nearest neighbor resizing.
kLINEAR	Supports linear (1D), bilinear (2D), and trilinear (3D) interpolation.
kCUBIC	Supports bicubic (2D) interpolation.

◆ LayerInformationFormat

enum class nvinfer1::LayerInformationFormat : int32_t

strong

The format in which the IEngineInspector prints the layer information.

See also: IEngineInspector::getLayerInformation(), IEngineInspector::getEngineInformation()

Enumerator
kONELINE	Print layer information in one line per layer.
kJSON	Print layer information in JSON format.

◆ LayerType

enum class nvinfer1::LayerType : int32_t

strong

The type values of layer classes.

See also: ILayer::getType()

Enumerator
kCONVOLUTION	Convolution layer.
kCAST	Cast layer.
kACTIVATION	Activation layer.
kPOOLING	Pooling layer.
kLRN	LRN layer.
kSCALE	Scale layer.
kSOFTMAX	SoftMax layer.
kDECONVOLUTION	Deconvolution layer.
kCONCATENATION	Concatenation layer.
kELEMENTWISE	Elementwise layer.
kPLUGIN	Plugin layer.
kUNARY	UnaryOp operation Layer.
kPADDING	Padding layer.
kSHUFFLE	Shuffle layer.
kREDUCE	Reduce layer.
kTOPK	TopK layer.
kGATHER	Gather layer.
kMATRIX_MULTIPLY	Matrix multiply layer.
kRAGGED_SOFTMAX	Ragged softmax layer.
kCONSTANT	Constant layer.
kIDENTITY	Identity layer.
kPLUGIN_V2	PluginV2 layer.
kSLICE	Slice layer.
kSHAPE	Shape layer.
kPARAMETRIC_RELU	Parametric ReLU layer.
kRESIZE	Resize Layer.
kTRIP_LIMIT	Loop Trip limit layer.
kRECURRENCE	Loop Recurrence layer.
kITERATOR	Loop Iterator layer.
kLOOP_OUTPUT	Loop output layer.
kSELECT	Select layer.
kFILL	Fill layer.
kQUANTIZE	Quantize layer.
kDEQUANTIZE	Dequantize layer.
kCONDITION	Condition layer.
kCONDITIONAL_INPUT	Conditional Input layer.
kCONDITIONAL_OUTPUT	Conditional Output layer.
kSCATTER	Scatter layer.
kEINSUM	Einsum layer.
kASSERTION	Assertion layer.
kONE_HOT	OneHot layer.
kNON_ZERO	NonZero layer.
kGRID_SAMPLE	Grid sample layer.
kNMS	NMS layer.
kREVERSE_SEQUENCE	Reverse sequence layer.
kNORMALIZATION	Normalization layer.
kPLUGIN_V3	PluginV3 layer.

◆ LoopOutput

enum class nvinfer1::LoopOutput : int32_t

strong

Enumerator
kLAST_VALUE	Output value is value of tensor for last iteration.
kCONCATENATE	Output value is concatenation of values of tensor for each iteration, in forward order.
kREVERSE	Output value is concatenation of values of tensor for each iteration, in reverse order.
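The three output kinds differ only in how the per-iteration values are combined. The sketch below models a loop whose tensor is a single integer per iteration; LoopOut and loopOutput are hypothetical names, and the sketch assumes the loop ran at least once.

```cpp
#include <cassert>
#include <vector>

// Hypothetical local mirror of the enumerators above.
enum class LoopOut { kLAST_VALUE, kCONCATENATE, kREVERSE };

// perIteration[i] is the value of the tensor at iteration i.
inline std::vector<int> loopOutput(LoopOut kind, std::vector<int> const& perIteration)
{
    assert(!perIteration.empty()); // this sketch only covers loops with at least one iteration
    switch (kind)
    {
    case LoopOut::kLAST_VALUE: return std::vector<int>(1, perIteration.back());
    case LoopOut::kCONCATENATE: return perIteration;
    case LoopOut::kREVERSE: return std::vector<int>(perIteration.rbegin(), perIteration.rend());
    }
    assert(false && "unhandled loop output kind");
    return {};
}
```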

◆ MatrixOperation

enum class nvinfer1::MatrixOperation : int32_t

strong

Enumerates the operations that may be performed on a tensor by IMatrixMultiplyLayer before multiplication.

Enumerator

kNONE

Treat x as a matrix if it has two dimensions, or as a collection of matrices if x has more than two dimensions, where the last two dimensions are the matrix dimensions. x must have at least two dimensions.

kTRANSPOSE

Like kNONE, but transpose the matrix dimensions.

kVECTOR

Treat x as a vector if it has one dimension, or as a collection of vectors if x has more than one dimension. x must have at least one dimension.

The first input tensor with dimensions [M,K] used with MatrixOperation::kVECTOR is equivalent to a tensor with dimensions [M, 1, K] with MatrixOperation::kNONE, i.e. is treated as M row vectors of length K, or dimensions [M, K, 1] with MatrixOperation::kTRANSPOSE.

The second input tensor with dimensions [M,K] used with MatrixOperation::kVECTOR is equivalent to a tensor with dimensions [M, K, 1] with MatrixOperation::kNONE, i.e. is treated as M column vectors of length K, or dimensions [M, 1, K] with MatrixOperation::kTRANSPOSE.
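The shape equivalences described for kVECTOR and kTRANSPOSE can be expressed as a small shape-rewriting function. In the sketch below, MatOp and effectiveShape are hypothetical names; the function returns the shape that would be multiplied under MatrixOperation::kNONE.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical local mirror of the enumerators above.
enum class MatOp { kNONE, kTRANSPOSE, kVECTOR };

// Rewrite `dims` into the equivalent shape under kNONE.
inline std::vector<int64_t> effectiveShape(std::vector<int64_t> dims, MatOp op, bool isFirstInput)
{
    switch (op)
    {
    case MatOp::kNONE:
        assert(dims.size() >= 2);
        break;
    case MatOp::kTRANSPOSE:
        assert(dims.size() >= 2);
        std::swap(dims[dims.size() - 1], dims[dims.size() - 2]); // swap the matrix dimensions
        break;
    case MatOp::kVECTOR:
        assert(dims.size() >= 1);
        if (isFirstInput)
        {
            dims.insert(dims.end() - 1, 1); // row vectors: [..., K] -> [..., 1, K]
        }
        else
        {
            dims.push_back(1); // column vectors: [..., K] -> [..., K, 1]
        }
        break;
    }
    return dims;
}
```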

◆ MemoryPoolType

enum class nvinfer1::MemoryPoolType : int32_t

strong

The type for memory pools used by TensorRT.

See also: IBuilderConfig::setMemoryPoolLimit, IBuilderConfig::getMemoryPoolLimit

Enumerator
kWORKSPACE	kWORKSPACE is used by TensorRT to store intermediate buffers within an operation. This defaults to max device memory. Set to a smaller value to restrict tactics that use over the threshold en masse. For more targeted removal of tactics use the IAlgorithmSelector interface.
kDLA_MANAGED_SRAM	kDLA_MANAGED_SRAM is a fast software managed RAM used by DLA to communicate within a layer. The size of this pool must be at least 4 KiB and must be a power of 2. This defaults to 1 MiB. Orin has capacity of 1 MiB per core.
kDLA_LOCAL_DRAM	kDLA_LOCAL_DRAM is host RAM used by DLA to share intermediate tensor data across operations. The size of this pool must be at least 4 KiB and must be a power of 2. This defaults to 1 GiB.
kDLA_GLOBAL_DRAM	kDLA_GLOBAL_DRAM is host RAM used by DLA to store weights and metadata for execution. The size of this pool must be at least 4 KiB and must be a power of 2. This defaults to 512 MiB.
kTACTIC_DRAM	kTACTIC_DRAM is the device DRAM used by the optimizer to run tactics. On embedded devices, where host and device memory are unified, this includes all host memory required by TensorRT to build the network up to the point of each memory allocation. This defaults to 75% of totalGlobalMem as reported by cudaGetDeviceProperties when cudaGetDeviceProperties.embedded is true, and 100% otherwise.
kTACTIC_SHARED_MEMORY	kTACTIC_SHARED_MEMORY defines the maximum sum of shared memory reserved by the driver and used for executing CUDA kernels. Adjust this value to restrict tactics that exceed the specified threshold en masse. The default value is device max capability. This value must be less than 1GiB. The driver reserved shared memory can be queried from cuDeviceGetAttribute(&reservedShmem, CU_DEVICE_ATTRIBUTE_RESERVED_SHARED_MEMORY_PER_BLOCK). Updating this flag will override the shared memory limit set by HardwareCompatibilityLevel, which defaults to 48KiB - reservedShmem.
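The three DLA pools share the same size rule: at least 4 KiB and a power of 2. A candidate limit can be validated before it is passed to IBuilderConfig::setMemoryPoolLimit, as in the sketch below. Both helper names are hypothetical, and rounding an invalid request up to the next valid size is an illustrative policy, not documented TensorRT behavior.

```cpp
#include <cassert>
#include <cstdint>

// True if `bytes` satisfies the documented DLA pool rule: >= 4 KiB and a power of 2.
inline bool isValidDlaPoolSize(uint64_t bytes)
{
    constexpr uint64_t kMinBytes = 4ULL * 1024ULL;
    bool const isPowerOfTwo = bytes != 0 && (bytes & (bytes - 1)) == 0;
    return bytes >= kMinBytes && isPowerOfTwo;
}

// Smallest valid pool size that is >= `bytes` (illustrative policy).
inline uint64_t nextValidDlaPoolSize(uint64_t bytes)
{
    assert(bytes <= (1ULL << 63)); // larger requests cannot be rounded up without overflow
    uint64_t size = 4ULL * 1024ULL;
    while (size < bytes)
    {
        size <<= 1;
    }
    return size;
}
```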

◆ NetworkDefinitionCreationFlag

enum class nvinfer1::NetworkDefinitionCreationFlag : int32_t

strong

List of immutable network properties expressed at network creation time. NetworkDefinitionCreationFlag is used with createNetworkV2() to specify immutable properties of the network.

See also: IBuilder::createNetworkV2

Enumerator

kEXPLICIT_BATCH

Ignored because networks are always "explicit batch" in TensorRT 10.0.

Deprecated: Deprecated in TensorRT 10.0.

kSTRONGLY_TYPED

Mark the network to be strongly typed. Every tensor in the network has a data type defined in the network following only type inference rules and the inputs/operator annotations. Setting layer precision and layer output types is not allowed, and the network output types will be inferred based on the input types and the type inference rules.

◆ OptProfileSelector

enum class nvinfer1::OptProfileSelector : int32_t

strong

When setting or querying optimization profile parameters (such as shape tensor inputs or dynamic dimensions), select whether we are interested in the minimum, optimum, or maximum values for these parameters. The minimum and maximum specify the permitted range that is supported at runtime, while the optimum value is used for the kernel selection. This should be the "typical" value that is expected to occur at runtime.

See also: IOptimizationProfile::setDimensions(), IOptimizationProfile::setShapeValues()

Enumerator
kMIN	This is used to set or get the minimum permitted value for dynamic dimensions etc.
kOPT	This is used to set or get the value that is used in the optimization (kernel selection).
kMAX	This is used to set or get the maximum permitted value for dynamic dimensions etc.
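Since kMIN and kMAX bound the shapes permitted at runtime, a runtime shape can be checked against them per dimension. The sketch below models this with a hypothetical DemoProfile struct holding the three shapes; it is not the IOptimizationProfile interface, and kOPT plays no part in the range check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the three shapes selected by kMIN, kOPT and kMAX.
struct DemoProfile
{
    std::vector<int64_t> min;
    std::vector<int64_t> opt;
    std::vector<int64_t> max;
};

// True if every dimension of `shape` lies within [min, max].
inline bool shapeInRange(DemoProfile const& p, std::vector<int64_t> const& shape)
{
    assert(p.min.size() == p.max.size());
    if (shape.size() != p.min.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < shape.size(); ++i)
    {
        if (shape[i] < p.min[i] || shape[i] > p.max[i])
        {
            return false;
        }
    }
    return true;
}
```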

◆ PaddingMode

enum class nvinfer1::PaddingMode : int32_t

strong

Enumerates the modes of padding to perform in convolution, deconvolution and pooling layers. The padding mode takes precedence if setPaddingMode() and setPrePadding() are both used.

There are two padding styles, EXPLICIT and SAME, with each style having two variants. The EXPLICIT style determines whether the final sampling location is used or not. The SAME style determines whether the asymmetry in the padding is on the pre or post padding.

Shorthand:
    I = dimensions of input image.
    B = prePadding, before the image data. For deconvolution, prePadding is set before output.
    A = postPadding, after the image data. For deconvolution, postPadding is set after output.
    P = delta between input and output
    S = stride
    F = filter
    O = output
    D = dilation
    M = I + B + A ; The image data plus any padding
    DK = 1 + D * (F - 1)

Formulas for Convolution:

EXPLICIT_ROUND_DOWN:
O = floor((M - DK) / S) + 1
EXPLICIT_ROUND_UP:
O = ceil((M - DK) / S) + 1
SAME_UPPER:
O = ceil(I / S)

P = floor((I - 1) / S) * S + DK - I;

B = floor(P / 2)

A = P - B
SAME_LOWER:
O = ceil(I / S)

P = floor((I - 1) / S) * S + DK - I;

A = floor(P / 2)

B = P - A

Formulas for Deconvolution:

EXPLICIT_ROUND_DOWN:
EXPLICIT_ROUND_UP:
O = (I - 1) * S + DK - (B + A)
SAME_UPPER:
O = min(I * S, (I - 1) * S + DK)

P = max(DK - S, 0)

B = floor(P / 2)

A = P - B
SAME_LOWER:
O = min(I * S, (I - 1) * S + DK)

P = max(DK - S, 0)

A = floor(P / 2)

B = P - A

Formulas for Pooling:

EXPLICIT_ROUND_DOWN:
O = floor((M - F) / S) + 1
EXPLICIT_ROUND_UP:
O = ceil((M - F) / S) + 1
SAME_UPPER:
O = ceil(I / S)

P = floor((I - 1) / S) * S + F - I;

B = floor(P / 2)

A = P - B
SAME_LOWER:
O = ceil(I / S)

P = floor((I - 1) / S) * S + F - I;

A = floor(P / 2)

B = P - A

Pooling Example 1:

Given I = {6, 6}, B = {3, 3}, A = {2, 2}, S = {2, 2}, F = {3, 3}. What is O?

(B, A can be calculated for SAME_UPPER and SAME_LOWER mode)

EXPLICIT_ROUND_DOWN:
Computation:

M = {6, 6} + {3, 3} + {2, 2} ==> {11, 11}

O ==> floor((M - F) / S) + 1

==> floor(({11, 11} - {3, 3}) / {2, 2}) + {1, 1}

==> floor({8, 8} / {2, 2}) + {1, 1}

==> {5, 5}
EXPLICIT_ROUND_UP:
Computation:

M = {6, 6} + {3, 3} + {2, 2} ==> {11, 11}

O ==> ceil((M - F) / S) + 1

==> ceil(({11, 11} - {3, 3}) / {2, 2}) + {1, 1}

==> ceil({8, 8} / {2, 2}) + {1, 1}

==> {5, 5}

The sample points are {0, 2, 4, 6, 8} in each dimension.
SAME_UPPER:
Computation:

I = {6, 6}

S = {2, 2}

O = ceil(I / S) = {3, 3}

P = floor((I - 1) / S) * S + F - I

==> floor(({6, 6} - {1, 1}) / {2, 2}) * {2, 2} + {3, 3} - {6, 6}

==> {4, 4} + {3, 3} - {6, 6}

==> {1, 1}

B = floor({1, 1} / {2, 2})

==> {0, 0}

A = {1, 1} - {0, 0}

==> {1, 1}
SAME_LOWER:
Computation:

I = {6, 6}

S = {2, 2}

O = ceil(I / S) = {3, 3}

P = floor((I - 1) / S) * S + F - I

==> {1, 1}

A = floor({1, 1} / {2, 2})

==> {0, 0}

B = {1, 1} - {0, 0}

==> {1, 1}

The sample points are {0, 2, 4} in each dimension. SAME_UPPER has {O0, O1, O2, pad} in output in each dimension. SAME_LOWER has {pad, O0, O1, O2} in output in each dimension.

Pooling Example 2:

Given I = {6, 6}, B = {3, 3}, A = {3, 3}, S = {2, 2}, F = {3, 3}. What is O?

Enumerator
kEXPLICIT_ROUND_DOWN	Use explicit padding, rounding output size down.
kEXPLICIT_ROUND_UP	Use explicit padding, rounding output size up.
kSAME_UPPER	Use SAME padding, with prePadding <= postPadding.
kSAME_LOWER	Use SAME padding, with prePadding >= postPadding.
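The pooling formulas above can be applied mechanically, one dimension at a time. The sketch below does so using the shorthand symbols I, F, S, B and A from the text; PadMode, PoolGeometry and poolGeometry are hypothetical names, and the function covers pooling only (no dilation).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical local mirror of the enumerators above.
enum class PadMode { kEXPLICIT_ROUND_DOWN, kEXPLICIT_ROUND_UP, kSAME_UPPER, kSAME_LOWER };

// Output size O plus the pre/post padding (B, A) for one dimension.
struct PoolGeometry
{
    int64_t output;
    int64_t prePadding;
    int64_t postPadding;
};

// floor(a / b) and ceil(a / b) for b > 0, also correct for negative a.
inline int64_t floorDiv(int64_t a, int64_t b)
{
    int64_t const q = a / b;
    return (a % b != 0 && a < 0) ? q - 1 : q;
}
inline int64_t ceilDiv(int64_t a, int64_t b)
{
    return -floorDiv(-a, b);
}

// Apply the pooling formulas; B and A are only read in the EXPLICIT modes.
inline PoolGeometry poolGeometry(PadMode mode, int64_t I, int64_t F, int64_t S, int64_t B = 0, int64_t A = 0)
{
    assert(I > 0 && F > 0 && S > 0);
    if (mode == PadMode::kEXPLICIT_ROUND_DOWN || mode == PadMode::kEXPLICIT_ROUND_UP)
    {
        int64_t const M = I + B + A; // image data plus padding
        int64_t const steps = mode == PadMode::kEXPLICIT_ROUND_DOWN ? floorDiv(M - F, S) : ceilDiv(M - F, S);
        return {steps + 1, B, A};
    }
    int64_t const O = ceilDiv(I, S);
    int64_t const P = floorDiv(I - 1, S) * S + F - I; // total padding
    int64_t const half = floorDiv(P, 2);
    // SAME_UPPER puts the smaller half before the data, SAME_LOWER after it.
    return mode == PadMode::kSAME_UPPER ? PoolGeometry{O, half, P - half} : PoolGeometry{O, P - half, half};
}
```

With the inputs of Pooling Example 1 (I = 6, F = 3, S = 2, and B = 3, A = 2 for the EXPLICIT modes), this function can be used to retrace each step of the computation shown there.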

◆ PluginCapabilityType

enum class nvinfer1::PluginCapabilityType : int32_t

strong

Enumerates the different capability types an IPluginV3 object may have.

Enumerator
kCORE	Core capability. Every IPluginV3 object must have this.
kBUILD	Build capability. IPluginV3 objects provided to TensorRT build phase must have this.
kRUNTIME	Runtime capability. IPluginV3 objects provided to TensorRT build and execution phases must have this.

◆ PluginCreatorVersion

enum class nvinfer1::PluginCreatorVersion : int32_t

strong

Enum to identify version of the plugin creator.

Enumerator
kV1	IPluginCreator.
kV1_PYTHON	IPluginCreator-based Python plugin creators.

◆ PluginFieldType

enum class nvinfer1::PluginFieldType : int32_t

strong

The possible field types for custom layer.

Enumerator
kFLOAT16	FP16 field type.
kFLOAT32	FP32 field type.
kFLOAT64	FP64 field type.
kINT8	INT8 field type.
kINT16	INT16 field type.
kINT32	INT32 field type.
kCHAR	char field type.
kDIMS	nvinfer1::Dims field type.
kUNKNOWN	Unknown field type.
kBF16	BF16 field type.
kINT64	INT64 field type.
kFP8	FP8 field type.
kINT4	INT4 field type.

◆ PluginVersion

enum class nvinfer1::PluginVersion : uint8_t

strong

Enumerator
kV2	IPluginV2.
kV2_EXT	IPluginV2Ext.
kV2_IOEXT	IPluginV2IOExt.
kV2_DYNAMICEXT	IPluginV2DynamicExt.
kV2_DYNAMICEXT_PYTHON	IPluginV2DynamicExt-based Python plugins.

◆ PoolingType

enum class nvinfer1::PoolingType : int32_t

strong

The type of pooling to perform in a pooling layer.

Enumerator
kMAX	Maximum over elements.
kAVERAGE	Average over elements. If the tensor is padded, the count includes the padding.
kMAX_AVERAGE_BLEND	Blending between max and average pooling: (1-blendFactor)*maxPool + blendFactor*avgPool.
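The blend used by kMAX_AVERAGE_BLEND is a linear interpolation between the max-pooled and average-pooled value of the same window. The sketch below computes it for a single window given as a flat list of elements; blendPool is a hypothetical helper name, and padding handling is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// (1 - blendFactor) * maxPool + blendFactor * avgPool over one pooling window.
inline float blendPool(std::vector<float> const& window, float blendFactor)
{
    assert(!window.empty());
    float const maxPool = *std::max_element(window.begin(), window.end());
    float const avgPool
        = std::accumulate(window.begin(), window.end(), 0.0F) / static_cast<float>(window.size());
    return (1.0F - blendFactor) * maxPool + blendFactor * avgPool;
}
```

A blendFactor of 0 reduces to max pooling and a blendFactor of 1 reduces to average pooling.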

◆ PreviewFeature

enum class nvinfer1::PreviewFeature : int32_t

strong

Define preview features.

Preview Features have been fully tested but are not yet as stable as other features in TensorRT. They are provided as opt-in features for at least one release.

Enumerator

kPROFILE_SHARING_0806

Allows optimization profiles to be shared across execution contexts.

Deprecated: Deprecated in TensorRT 10.0. The default value for this flag is on and cannot be changed.

◆ ProfilingVerbosity

enum class nvinfer1::ProfilingVerbosity : int32_t

strong

List of verbosity levels of layer information exposed in NVTX annotations and in IEngineInspector.

See also: IBuilderConfig::setProfilingVerbosity(), IBuilderConfig::getProfilingVerbosity(), IEngineInspector

Enumerator
kLAYER_NAMES_ONLY	Print only the layer names. This is the default setting.
kNONE	Do not print any layer information.
kDETAILED	Print detailed layer information including layer names and layer parameters.

◆ QuantizationFlag

enum class nvinfer1::QuantizationFlag : int32_t

strong

List of valid flags for quantizing the network to int8.

See also: IBuilderConfig::setQuantizationFlag(), IBuilderConfig::getQuantizationFlag()

Deprecated: Deprecated in TensorRT 10.1. Superseded by explicit quantization.

Enumerator
kCALIBRATE_BEFORE_FUSION	Run int8 calibration pass before layer fusion. Only valid for IInt8LegacyCalibrator and IInt8EntropyCalibrator. The builder always runs the int8 calibration pass before layer fusion for IInt8MinMaxCalibrator and IInt8EntropyCalibrator2. Disabled by default.

◆ ReduceOperation

enum class nvinfer1::ReduceOperation : int32_t

strong

Enumerates the reduce operations that may be performed by a Reduce layer.

The table shows the result of reducing across an empty volume of a given type.

Operation	kFLOAT and kHALF	kINT32	kINT8
kSUM	0	0	0
kPROD	1	1	1
kMAX	negative infinity	INT_MIN	-128
kMIN	positive infinity	INT_MAX	127
kAVG	NaN	0	-128

The current version of TensorRT usually performs reduction for kINT8 via kFLOAT or kHALF. The kINT8 values show the quantized representations of the floating-point values.
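
The floating-point column of the table follows from starting each reduction at its identity element. The sketch below is a standalone illustration under that assumption, not TensorRT code; `ReduceOp` and `reduceFloat` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Local stand-in for nvinfer1::ReduceOperation; not the TensorRT type.
enum class ReduceOp { kSUM, kPROD, kMAX, kMIN, kAVG };

// Reduce a flat list of floats. An empty list yields the identity element
// of the operation, which matches the kFLOAT column of the table.
inline float reduceFloat(std::vector<float> const& values, ReduceOp op)
{
    float const inf = std::numeric_limits<float>::infinity();
    float sum = 0.0F;
    float prod = 1.0F;
    float maxValue = -inf;
    float minValue = inf;
    for (float v : values)
    {
        sum += v;
        prod *= v;
        maxValue = std::fmax(maxValue, v);
        minValue = std::fmin(minValue, v);
    }
    switch (op)
    {
    case ReduceOp::kSUM: return sum;
    case ReduceOp::kPROD: return prod;
    case ReduceOp::kMAX: return maxValue;
    case ReduceOp::kMIN: return minValue;
    // For an empty list this is 0/0, which is NaN in IEEE arithmetic.
    case ReduceOp::kAVG: return sum / static_cast<float>(values.size());
    }
    return std::numeric_limits<float>::quiet_NaN();
}
```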

Enumerator
kSUM
kPROD
kMAX
kMIN
kAVG

◆ ResizeCoordinateTransformation

enum class nvinfer1::ResizeCoordinateTransformation : int32_t

strong

The resize coordinate transformation function.

See also: IResizeLayer::setCoordinateTransformation()

Enumerator

kALIGN_CORNERS

Think of each value in the tensor as a unit volume, with the coordinate as a point inside this volume. The coordinate point is drawn as a star (*) in the diagram below, and a range of multiple values has a length. Define x_origin as the coordinate of axis x in the input tensor, x_resized as the coordinate of axis x in the output tensor, length_origin as the length of the input tensor in axis x, and length_resize as the length of the output tensor in axis x.

|<--------------length---------->|
|    0     |    1     |    2     |    3     |
*          *          *          *

x_origin = x_resized * (length_origin - 1) / (length_resize - 1)

kASYMMETRIC

|<--------------length--------------------->|
|    0     |    1     |    2     |    3     |
*          *          *          *

x_origin = x_resized * (length_origin / length_resize)

kHALF_PIXEL

|<--------------length--------------------->|
|    0     |    1     |    2     |    3     |
     *          *          *          *

x_origin = (x_resized + 0.5) * (length_origin / length_resize) - 0.5
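
The three formulas can be compared side by side in a standalone sketch. It is not TensorRT code; `Transform` and `toOrigin` are hypothetical names that mirror the enumerators only for illustration.

```cpp
#include <cassert>

// Local stand-in for nvinfer1::ResizeCoordinateTransformation.
enum class Transform { kALIGN_CORNERS, kASYMMETRIC, kHALF_PIXEL };

// Map an output coordinate back to an input coordinate along one axis,
// using the formula documented for each enumerator.
inline double toOrigin(Transform t, double xResized, double lengthOrigin, double lengthResize)
{
    switch (t)
    {
    case Transform::kALIGN_CORNERS:
        assert(lengthResize > 1.0); // the formula divides by (length_resize - 1)
        return xResized * (lengthOrigin - 1.0) / (lengthResize - 1.0);
    case Transform::kASYMMETRIC:
        return xResized * (lengthOrigin / lengthResize);
    case Transform::kHALF_PIXEL:
        return (xResized + 0.5) * (lengthOrigin / lengthResize) - 0.5;
    }
    return 0.0;
}
```

For example, when upsampling from length 4 to length 8, the last output coordinate (7) maps to 3.0 with kALIGN_CORNERS, 3.5 with kASYMMETRIC, and 3.25 with kHALF_PIXEL.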

◆ ResizeRoundMode

enum class nvinfer1::ResizeRoundMode : int32_t

strong

The rounding mode for nearest neighbor resize.

See also: IResizeLayer::setNearestRounding()

Enumerator
kHALF_UP	Round half up.
kHALF_DOWN	Round half down.
kFLOOR	Round to floor.
kCEIL	Round to ceil.
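
As a rough illustration, the four modes can be written with standard rounding functions. This sketch assumes that "half up" sends ties toward positive infinity and "half down" sends ties toward negative infinity; `RoundMode` and `roundCoordinate` are hypothetical names, not TensorRT API.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Local stand-in for nvinfer1::ResizeRoundMode.
enum class RoundMode { kHALF_UP, kHALF_DOWN, kFLOOR, kCEIL };

// Round a fractional source coordinate to an integer index.
// Assumption: ties go toward +infinity for kHALF_UP and toward -infinity for kHALF_DOWN.
inline int32_t roundCoordinate(double x, RoundMode mode)
{
    switch (mode)
    {
    case RoundMode::kHALF_UP: return static_cast<int32_t>(std::floor(x + 0.5));
    case RoundMode::kHALF_DOWN: return static_cast<int32_t>(std::ceil(x - 0.5));
    case RoundMode::kFLOOR: return static_cast<int32_t>(std::floor(x));
    case RoundMode::kCEIL: return static_cast<int32_t>(std::ceil(x));
    }
    assert(false && "unknown rounding mode");
    return 0;
}
```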

◆ ResizeSelector

enum class nvinfer1::ResizeSelector : int32_t

strong

The coordinate selector used when resizing to a single-pixel output.

See also: IResizeLayer::setSelectorForSinglePixel()

Enumerator
kFORMULA	Use formula to map the original index.
kUPPER	Select the upper left pixel.

◆ SampleMode

enum class nvinfer1::SampleMode : int32_t

strong

Controls how ISliceLayer and IGridSampleLayer handle out-of-bounds coordinates.

See also: ISliceLayer and IGridSampleLayer

Enumerator
kSTRICT_BOUNDS	Fail with error when the coordinates are out of bounds.
kWRAP	Coordinates wrap around periodically.
kCLAMP	Out of bounds indices are clamped to bounds.
kFILL	Use fill input value when coordinates are out of bounds.
kREFLECT	Coordinates reflect. The axis of reflection is the middle of the perimeter pixel, and the reflections are repeated indefinitely within the padded regions. Repeats values for a single pixel and throws an error for zero pixels.
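
The index-remapping modes can be sketched along one axis as follows. This is a standalone illustration covering only kWRAP, kCLAMP and kREFLECT, with reflection about the centers of the first and last elements as an interpretation of the description above; `Remap` and `mapIndex` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Subset of nvinfer1::SampleMode that remaps an index instead of failing or filling.
enum class Remap { kWRAP, kCLAMP, kREFLECT };

// Map a possibly out-of-bounds index i onto [0, n) along one axis.
inline int64_t mapIndex(Remap mode, int64_t i, int64_t n)
{
    assert(n > 0); // zero elements is an error case in the description
    switch (mode)
    {
    case Remap::kWRAP:
        return ((i % n) + n) % n; // periodic, also for negative i
    case Remap::kCLAMP:
        return i < 0 ? 0 : (i >= n ? n - 1 : i);
    case Remap::kREFLECT:
    {
        if (n == 1)
        {
            return 0; // a single element is simply repeated
        }
        // Reflect about the centers of elements 0 and n-1; the pattern has period 2*(n-1).
        int64_t const period = 2 * (n - 1);
        int64_t const r = ((i % period) + period) % period;
        return r < n ? r : period - r;
    }
    }
    return 0;
}
```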

◆ ScaleMode

enum class nvinfer1::ScaleMode : int32_t

strong

Controls how shift, scale and power are applied in a Scale layer.

See also: IScaleLayer

Enumerator
kUNIFORM	Identical coefficients across all elements of the tensor.
kCHANNEL	Per-channel coefficients.
kELEMENTWISE	Elementwise coefficients.
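
One way to picture the three modes is as different ways of indexing the coefficient arrays. The sketch below assumes a [C][H][W] tensor and the usual Scale layer computation, output = (input * scale + shift)^power; `CoeffMode`, `coeffIndex`, and `applyScale` are hypothetical names, not TensorRT API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Local stand-in for nvinfer1::ScaleMode.
enum class CoeffMode { kUNIFORM, kCHANNEL, kELEMENTWISE };

// Index of the coefficient used for element (c, h, w) of a [C][H][W] tensor.
inline std::size_t coeffIndex(
    CoeffMode mode, std::size_t c, std::size_t h, std::size_t w, std::size_t H, std::size_t W)
{
    assert(h < H && w < W);
    switch (mode)
    {
    case CoeffMode::kUNIFORM: return 0;                      // one coefficient for everything
    case CoeffMode::kCHANNEL: return c;                      // one coefficient per channel
    case CoeffMode::kELEMENTWISE: return (c * H + h) * W + w; // one coefficient per element
    }
    return 0;
}

// Assumed Scale layer computation for a single element.
inline float applyScale(float x, float scale, float shift, float power)
{
    return std::pow(x * scale + shift, power);
}
```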

◆ ScatterMode

enum class nvinfer1::ScatterMode : int32_t

strong

Controls the form of IScatterLayer.

See also: IScatterLayer

Enumerator
kELEMENT	Similar to ONNX ScatterElements.
kND	Similar to ONNX ScatterND.

◆ SerializationFlag

enum class nvinfer1::SerializationFlag : int32_t

strong

List of valid flags that the engine can enable when serializing the bytes.

See also: ISerializationConfig::setFlags(), ISerializationConfig::getFlags()

Enumerator
kEXCLUDE_WEIGHTS	Exclude the weights that can be refitted.
kEXCLUDE_LEAN_RUNTIME	Exclude the lean runtime.

◆ TacticSource

enum class nvinfer1::TacticSource : int32_t

strong

List of tactic sources for TensorRT.

See also: TacticSources, IBuilderConfig::setTacticSources(), IBuilderConfig::getTacticSources()

Enumerator
kCUBLAS	cuBLAS tactics. Disabled by default. Note: Disabling kCUBLAS will cause the cuBLAS handle passed to plugins in attachToContext to be null. Deprecated: Deprecated in TensorRT 10.0.
kCUBLAS_LT	cuBLAS LT tactics. Disabled by default. Deprecated: Deprecated in TensorRT 9.0.
kCUDNN	cuDNN tactics. Disabled by default. Note: Disabling kCUDNN will cause the cuDNN handle passed to plugins in attachToContext to be null. Deprecated: Deprecated in TensorRT 10.0.
kEDGE_MASK_CONVOLUTIONS	Enables convolution tactics implemented with edge mask tables. These tactics trade off memory for performance by consuming additional memory space proportional to the input size. Enabled by default.
kJIT_CONVOLUTIONS	Enables convolution tactics implemented with source-code JIT fusion. The engine building time may increase when this is enabled. Enabled by default.
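
Tactic sources are combined into a TacticSources bitmask, with one bit per enumerator. The sketch below shows the bit arithmetic only. `TacticSourceSketch` is a local stand-in, and its numeric values are an assumption based on the order in which the enumerators are listed above.

```cpp
#include <cstdint>

// Local stand-in for nvinfer1::TacticSource. The numeric values are assumed
// from the listing order and are not taken from the TensorRT headers.
enum class TacticSourceSketch : int32_t
{
    kCUBLAS = 0,
    kCUBLAS_LT = 1,
    kCUDNN = 2,
    kEDGE_MASK_CONVOLUTIONS = 3,
    kJIT_CONVOLUTIONS = 4
};

using TacticSourcesSketch = uint32_t; // bitmask, one bit per source

constexpr TacticSourcesSketch bitOf(TacticSourceSketch s)
{
    return 1U << static_cast<uint32_t>(s);
}

constexpr TacticSourcesSketch enable(TacticSourcesSketch mask, TacticSourceSketch s)
{
    return mask | bitOf(s);
}

constexpr TacticSourcesSketch disable(TacticSourcesSketch mask, TacticSourceSketch s)
{
    return mask & ~bitOf(s);
}

constexpr bool isEnabled(TacticSourcesSketch mask, TacticSourceSketch s)
{
    return (mask & bitOf(s)) != 0U;
}
```

In a real build, a mask assembled this way from the actual nvinfer1::TacticSource values would be passed to IBuilderConfig::setTacticSources().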

◆ TempfileControlFlag

enum class nvinfer1::TempfileControlFlag : int32_t

strong

Flags used to control TensorRT's behavior when creating executable temporary files.

On some platforms the TensorRT runtime may need to create files in a temporary directory or use platform-specific APIs to create files in-memory to load temporary DLLs that implement runtime code. These flags allow the application to explicitly control TensorRT's use of these files. Disallowing these files will preclude the use of certain TensorRT APIs for deserializing and loading lean runtimes.

Enumerator

kALLOW_IN_MEMORY_FILES

Allow creating and loading files in-memory (or unnamed files).

kALLOW_TEMPORARY_FILES

Allow creating and loading named files in a temporary directory on the filesystem.

See also: IRuntime::setTemporaryDirectory()

◆ TensorFormat

enum class nvinfer1::TensorFormat : int32_t

strong

Format of the input/output tensors.

This enum is used by both plugins and network I/O tensors.

See also: IPluginV2::supportsFormat(), safe::ICudaEngine::getBindingFormat()

Many of the formats are vector-major or vector-minor. These formats specify a vector dimension and scalars per vector. For example, suppose that the tensor has dimensions [M,N,C,H,W], the vector dimension is C, and there are V scalars per vector.

A vector-major format splits the vectorized dimension into two axes in the memory layout. The vectorized dimension is replaced by an axis of length ceil(C/V) and a new dimension of length V is appended. For the example tensor, the memory layout is equivalent to an array with dimensions [M][N][ceil(C/V)][H][W][V]. Tensor coordinate (m,n,c,h,w) maps to array location [m][n][c/V][h][w][c%V].
A vector-minor format moves the vectorized dimension to become the last axis in the memory layout. For the example tensor, the memory layout is equivalent to an array with dimensions [M][N][H][W][ceil(C/V)*V]. Tensor coordinate (m,n,c,h,w) maps to array location [m][n][h][w][c].

In interfaces that refer to "components per element", that's the value of V above.
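
The two coordinate mappings described above can be written out as linear offset computations. This is a standalone sketch for the [M,N,C,H,W] example, not TensorRT code; `Shape5`, `vectorMajorOffset`, and `vectorMinorOffset` are hypothetical names.

```cpp
#include <cstdint>

// Dimensions of the example tensor [M,N,C,H,W].
struct Shape5
{
    int64_t M, N, C, H, W;
};

constexpr int64_t ceilDiv(int64_t a, int64_t b)
{
    return (a + b - 1) / b;
}

// Vector-major: layout [M][N][ceil(C/V)][H][W][V],
// coordinate (m,n,c,h,w) -> [m][n][c/V][h][w][c%V].
constexpr int64_t vectorMajorOffset(
    Shape5 s, int64_t V, int64_t m, int64_t n, int64_t c, int64_t h, int64_t w)
{
    return ((((m * s.N + n) * ceilDiv(s.C, V) + c / V) * s.H + h) * s.W + w) * V + c % V;
}

// Vector-minor: layout [M][N][H][W][ceil(C/V)*V],
// coordinate (m,n,c,h,w) -> [m][n][h][w][c].
constexpr int64_t vectorMinorOffset(
    Shape5 s, int64_t V, int64_t m, int64_t n, int64_t c, int64_t h, int64_t w)
{
    return (((m * s.N + n) * s.H + h) * s.W + w) * (ceilDiv(s.C, V) * V) + c;
}
```

For a tensor with C = 5 and V = 4, both layouts pad the channel extent to 8 scalars, so the padded elements occupy memory even though they carry no data.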

For more information about data formats, see the topic "Data Format Description" located in the TensorRT Developer Guide. https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#data-format-desc

Enumerator
kLINEAR	Memory layout is similar to an array in C or C++. The stride of each dimension is the product of the dimensions after it. The last dimension has unit stride. For DLA usage, the tensor sizes are limited to C,H,W in the range [1,8192].
kCHW2	Vector-major format with two scalars per vector. Vector dimension is third to last. This format requires FP16 or BF16 and at least three dimensions.
kHWC8	Vector-minor format with eight scalars per vector. Vector dimension is third to last. This format requires FP16 or BF16 and at least three dimensions.
kCHW4	Vector-major format with four scalars per vector. Vector dimension is third to last. This format requires INT8 or FP16 and at least three dimensions. For INT8, the length of the vector dimension must be a build-time constant. Deprecated usage: If running on the DLA, this format can be used for acceleration with the caveat that C must be less than or equal to 4. If used as DLA input and the build option kGPU_FALLBACK is not specified, it must meet the line stride requirement of the DLA format. Column stride in bytes must be a multiple of 64 on Orin.
kCHW16	Vector-major format with 16 scalars per vector. Vector dimension is third to last. This format requires INT8 or FP16 and at least three dimensions. For DLA usage, this format maps to the native feature format for FP16, and the tensor sizes are limited to C,H,W in the range [1,8192].
kCHW32	Vector-major format with 32 scalars per vector. Vector dimension is third to last. This format requires at least three dimensions. For DLA usage, this format maps to the native feature format for INT8, and the tensor sizes are limited to C,H,W in the range [1,8192].
kDHWC8	Vector-minor format with eight scalars per vector. Vector dimension is fourth to last. This format requires FP16 or BF16 and at least four dimensions.
kCDHW32	Vector-major format with 32 scalars per vector. Vector dimension is fourth to last. This format requires FP16 or INT8 and at least four dimensions.
kHWC	Vector-minor format where channel dimension is third to last and unpadded. This format requires either FP32 or UINT8 and at least three dimensions.
kDLA_LINEAR	DLA planar format. For a tensor with dimensions {N, C, H, W}, the W axis always has unit stride. The stride for stepping along the H axis is rounded up to 64 bytes. The memory layout is equivalent to a C array with dimensions [N][C][H][roundUp(W, 64/elementSize)], where elementSize is 2 for FP16 and 1 for INT8, with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c][h][w].
kDLA_HWC4	DLA image format. For a tensor with dimensions {N, C, H, W}, the C axis always has unit stride. The stride for stepping along the H axis is rounded up to 64 bytes on Orin. C can only be 1, 3, or 4. If C == 1, it maps to the grayscale format. If C == 3 or C == 4, it maps to the color image format. If C == 3, the stride for stepping along the W axis must be padded to 4 elements. When C is {1, 3, 4}, C' is {1, 4, 4} respectively, and the memory layout is equivalent to a C array with dimensions [N][H][roundUp(W, 64/C'/elementSize)][C'] on Orin, where elementSize is 2 for FP16 and 1 for INT8. The tensor coordinates (n, c, h, w) map to array subscript [n][h][w][c].
kHWC16	Vector-minor format with 16 scalars per vector. Vector dimension is third to last. This requires FP16 and at least three dimensions.
kDHWC	Vector-minor format with one scalar per vector. Vector dimension is fourth to last. This format requires FP32 and at least four dimensions.
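
The padded extents given for kDLA_LINEAR and kDLA_HWC4 can be checked with a few lines of integer arithmetic. This is a standalone sketch of the roundUp expressions quoted above, using the 64-byte alignment stated for Orin; the function names are hypothetical.

```cpp
#include <cstdint>

// Round x up to the next multiple of `multiple`.
constexpr int64_t roundUp(int64_t x, int64_t multiple)
{
    return ((x + multiple - 1) / multiple) * multiple;
}

// kDLA_LINEAR: padded W extent, roundUp(W, 64/elementSize).
constexpr int64_t dlaLinearPaddedW(int64_t W, int64_t elementSize)
{
    return roundUp(W, 64 / elementSize);
}

// kDLA_HWC4: C in {1, 3, 4} maps to C' in {1, 4, 4}.
constexpr int64_t dlaHwc4PaddedC(int64_t C)
{
    return C == 1 ? 1 : 4;
}

// kDLA_HWC4: padded W extent, roundUp(W, 64/C'/elementSize).
constexpr int64_t dlaHwc4PaddedW(int64_t W, int64_t C, int64_t elementSize)
{
    return roundUp(W, 64 / dlaHwc4PaddedC(C) / elementSize);
}
```

For example, an FP16 image with C = 3 and W = 100 is stored with C' = 4 and a padded width of 104, so each row occupies 104 * 4 * 2 = 832 bytes, which is a multiple of 64.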

◆ TensorIOMode

enum class nvinfer1::TensorIOMode : int32_t

strong

Definition of tensor IO Mode.

Enumerator
kNONE	Tensor is not an input or output.
kINPUT	Tensor is input to the engine.
kOUTPUT	Tensor is output by the engine.

◆ TensorLocation

enum class nvinfer1::TensorLocation : int32_t

strong

The location for tensor data storage, device or host.

Enumerator
kDEVICE	Data stored on device.
kHOST	Data stored on host.

◆ TensorRTPhase

enum class nvinfer1::TensorRTPhase : int32_t

strong

Indicates a phase of operation of TensorRT.

Enumerator
kBUILD	Build phase of TensorRT.
kRUNTIME	Execution phase of TensorRT.

◆ TopKOperation

enum class nvinfer1::TopKOperation : int32_t

strong

Enumerates the operations that may be performed by a TopK layer.

Enumerator
kMAX	Maximum of the elements.
kMIN	Minimum of the elements.

◆ TripLimit

enum class nvinfer1::TripLimit : int32_t

strong

Enumerator
kCOUNT	Tensor is a scalar of type kINT32 or kINT64 that contains the trip count.
kWHILE	Tensor is a scalar of type kBOOL. Loop terminates when value is false.

◆ UnaryOperation

enum class nvinfer1::UnaryOperation : int32_t

strong

Enumerates the unary operations that may be performed by a Unary layer.

Operation kNOT must have inputs of DataType::kBOOL.

Operations kSIGN and kABS must have inputs of floating-point type, DataType::kINT8, DataType::kINT32 or DataType::kINT64.

Operation kISINF must have inputs of floating-point type.

All other operations must have inputs of floating-point type.

See also: IUnaryLayer

Enumerator
kEXP	Exponentiation.
kLOG	Log (base e).
kSQRT	Square root.
kRECIP	Reciprocal.
kABS	Absolute value.
kNEG	Negation.
kSIN	Sine.
kCOS	Cosine.
kTAN	Tangent.
kSINH	Hyperbolic sine.
kCOSH	Hyperbolic cosine.
kASIN	Inverse sine.
kACOS	Inverse cosine.
kATAN	Inverse tangent.
kASINH	Inverse hyperbolic sine.
kACOSH	Inverse hyperbolic cosine.
kATANH	Inverse hyperbolic tangent.
kCEIL	Ceiling.
kFLOOR	Floor.
kERF	Gauss error function.
kNOT	Logical NOT.
kSIGN	Sign. If input > 0, output 1; if input < 0, output -1; if input == 0, output 0.
kROUND	Round to nearest even for floating-point data type.
kISINF	Return true if input value equals +/- infinity for floating-point data type.
kISNAN	Return true if input value is a NaN for floating-point data type.
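
The kSIGN and kROUND descriptions can be made concrete with scalar equivalents from the C++ standard library. This sketch is only an illustration of the documented semantics, not the TensorRT implementation; `signOf` and `roundHalfEven` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// kSIGN as described: 1 for positive input, -1 for negative input, 0 for zero.
inline float signOf(float x)
{
    return static_cast<float>((x > 0.0F) - (x < 0.0F));
}

// kROUND as described: round to nearest, with ties going to the even value.
// std::nearbyint honors the current rounding mode, which defaults to FE_TONEAREST.
inline float roundHalfEven(float x)
{
    return std::nearbyint(x);
}
```

Under round-to-nearest-even, 2.5 rounds to 2 while 3.5 rounds to 4, which differs from the "round half away from zero" behavior of std::round.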

◆ WeightsRole

enum class nvinfer1::WeightsRole : int32_t

strong

How a layer uses particular Weights.

The power weights of an IScaleLayer are omitted. Refitting those is not supported.

Enumerator
kKERNEL	Kernel for IConvolutionLayer or IDeconvolutionLayer.
kBIAS	Bias for IConvolutionLayer or IDeconvolutionLayer.
kSHIFT	Shift part of IScaleLayer.
kSCALE	Scale part of IScaleLayer.
kCONSTANT	Weights for IConstantLayer.
kANY	Any other weights role.

Function Documentation

◆ EnumMax()

template<typename T >

constexpr int32_t nvinfer1::EnumMax ( )

constexpr noexcept

Maximum number of elements in an enumeration type.
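
The pattern of a primary template plus one specialization per enum can be sketched as follows. The `Color` enum, the `sketch` namespace, and `colorName` are hypothetical and exist only for this illustration; the real specializations are the ones listed in this section.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

namespace sketch
{
// Hypothetical enum used only for illustration.
enum class Color : int32_t
{
    kRED = 0,
    kGREEN = 1,
    kBLUE = 2
};

// Primary template: declared only, so using an enum without a specialization fails to build.
template <typename T>
constexpr int32_t EnumMax() noexcept;

// Specialization: the number of enumerators in Color.
template <>
constexpr int32_t EnumMax<Color>() noexcept
{
    return 3;
}

// A typical use: size a lookup table by the enumerator count.
constexpr std::array<char const*, EnumMax<Color>()> kCOLOR_NAMES{{"red", "green", "blue"}};

constexpr char const* colorName(Color c) noexcept
{
    return kCOLOR_NAMES[static_cast<std::size_t>(c)];
}
} // namespace sketch
```

Because the count is a constant expression, it can also be used in static_assert checks or as a loop bound when iterating over all enumerators.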

◆ EnumMax< BoundingBoxFormat >()

template<>

constexpr int32_t nvinfer1::EnumMax< BoundingBoxFormat > ( )

inline constexpr noexcept

Maximum number of elements in BoundingBoxFormat enum.

See also: BoundingBoxFormat

◆ EnumMax< BuilderFlag >()

template<>

constexpr int32_t nvinfer1::EnumMax< BuilderFlag > ( )

inline constexpr noexcept

Maximum number of builder flags in BuilderFlag enum.

See also: BuilderFlag

◆ EnumMax< CalibrationAlgoType >()

template<>

constexpr int32_t nvinfer1::EnumMax< CalibrationAlgoType > ( )

inline constexpr noexcept

Maximum number of elements in CalibrationAlgoType enum.

See also: CalibrationAlgoType

◆ EnumMax< DeviceType >()

template<>

constexpr int32_t nvinfer1::EnumMax< DeviceType > ( )

inline constexpr noexcept

Maximum number of elements in DeviceType enum.

See also: DeviceType

◆ EnumMax< DimensionOperation >()

template<>

constexpr int32_t nvinfer1::EnumMax< DimensionOperation > ( )

inline constexpr noexcept

Maximum number of elements in DimensionOperation enum.

See also: DimensionOperation

◆ EnumMax< ExecutionContextAllocationStrategy >()

template<>

constexpr int32_t nvinfer1::EnumMax< ExecutionContextAllocationStrategy > ( )

inline constexpr noexcept

Maximum number of memory allocation strategies in ExecutionContextAllocationStrategy enum.

See also: ExecutionContextAllocationStrategy

◆ EnumMax< FillOperation >()

template<>

constexpr int32_t nvinfer1::EnumMax< FillOperation > ( )

inline constexpr noexcept

Maximum number of elements in FillOperation enum.

See also: FillOperation

◆ EnumMax< GatherMode >()

template<>

constexpr int32_t nvinfer1::EnumMax< GatherMode > ( )

inline constexpr noexcept

Maximum number of elements in GatherMode enum.

See also: GatherMode

◆ EnumMax< LayerInformationFormat >()

template<>

constexpr int32_t nvinfer1::EnumMax< LayerInformationFormat > ( )

inline constexpr noexcept

Maximum number of layer information formats in LayerInformationFormat enum.

See also: LayerInformationFormat

◆ EnumMax< LayerType >()

template<>

constexpr int32_t nvinfer1::EnumMax< LayerType > ( )

inline constexpr noexcept

Maximum number of elements in LayerType enum.

See also: LayerType

◆ EnumMax< LoopOutput >()

template<>

constexpr int32_t nvinfer1::EnumMax< LoopOutput > ( )

inline constexpr noexcept

Maximum number of elements in LoopOutput enum.

See also: LoopOutput

◆ EnumMax< MatrixOperation >()

template<>

constexpr int32_t nvinfer1::EnumMax< MatrixOperation > ( )

inline constexpr noexcept

Maximum number of elements in MatrixOperation enum.

See also: MatrixOperation

◆ EnumMax< MemoryPoolType >()

template<>

constexpr int32_t nvinfer1::EnumMax< MemoryPoolType > ( )

inline constexpr noexcept

Maximum number of memory pool types in the MemoryPoolType enum.

See also: MemoryPoolType

◆ EnumMax< NetworkDefinitionCreationFlag >()

template<>

constexpr int32_t nvinfer1::EnumMax< NetworkDefinitionCreationFlag > ( )

inline constexpr noexcept

Maximum number of elements in NetworkDefinitionCreationFlag enum.

See also: NetworkDefinitionCreationFlag

◆ EnumMax< OptProfileSelector >()

template<>

constexpr int32_t nvinfer1::EnumMax< OptProfileSelector > ( )

inline constexpr noexcept

Number of different values of OptProfileSelector enum.

See also: OptProfileSelector

◆ EnumMax< ProfilingVerbosity >()

template<>

constexpr int32_t nvinfer1::EnumMax< ProfilingVerbosity > ( )

inline constexpr noexcept

Maximum number of profile verbosity levels in ProfilingVerbosity enum.

See also: ProfilingVerbosity

◆ EnumMax< QuantizationFlag >()

template<>

constexpr int32_t nvinfer1::EnumMax< QuantizationFlag > ( )

inline constexpr noexcept

Maximum number of quantization flags in QuantizationFlag enum.

See also: QuantizationFlag

◆ EnumMax< ReduceOperation >()

template<>

constexpr int32_t nvinfer1::EnumMax< ReduceOperation > ( )

inline constexpr noexcept

Maximum number of elements in ReduceOperation enum.

See also: ReduceOperation

◆ EnumMax< SampleMode >()

template<>

constexpr int32_t nvinfer1::EnumMax< SampleMode > ( )

inline constexpr noexcept

Maximum number of elements in SampleMode enum.

See also: SampleMode

◆ EnumMax< ScaleMode >()

template<>

constexpr int32_t nvinfer1::EnumMax< ScaleMode > ( )

inline constexpr noexcept

Maximum number of elements in ScaleMode enum.

See also: ScaleMode

◆ EnumMax< ScatterMode >()

template<>

constexpr int32_t nvinfer1::EnumMax< ScatterMode > ( )

inline constexpr noexcept

Maximum number of elements in ScatterMode enum.

See also: ScatterMode

◆ EnumMax< SerializationFlag >()

template<>

constexpr int32_t nvinfer1::EnumMax< SerializationFlag > ( )

inline constexpr noexcept

Maximum number of serialization flags in SerializationFlag enum.

See also: SerializationFlag

◆ EnumMax< TacticSource >()

template<>

constexpr int32_t nvinfer1::EnumMax< TacticSource > ( )

inline constexpr noexcept

Maximum number of tactic sources in TacticSource enum.

See also: TacticSource

◆ EnumMax< TempfileControlFlag >()

template<>

constexpr int32_t nvinfer1::EnumMax< TempfileControlFlag > ( )

inline constexpr noexcept

Maximum number of elements in TempfileControlFlag enum.

See also: TempfileControlFlag

◆ EnumMax< TopKOperation >()

template<>

constexpr int32_t nvinfer1::EnumMax< TopKOperation > ( )

inline constexpr noexcept

Maximum number of elements in TopKOperation enum.

See also: TopKOperation

◆ EnumMax< TripLimit >()

template<>

constexpr int32_t nvinfer1::EnumMax< TripLimit > ( )

inline constexpr noexcept

Maximum number of elements in TripLimit enum.

See also: TripLimit

◆ EnumMax< UnaryOperation >()

template<>

constexpr int32_t nvinfer1::EnumMax< UnaryOperation > ( )

inline constexpr noexcept

Maximum number of elements in UnaryOperation enum.

See also: UnaryOperation

◆ EnumMax< WeightsRole >()

template<>

constexpr int32_t nvinfer1::EnumMax< WeightsRole > ( )

inline constexpr noexcept

Maximum number of elements in WeightsRole enum.

See also: WeightsRole

◆ getBuilderPluginRegistry()

nvinfer1::IPluginRegistry * nvinfer1::getBuilderPluginRegistry ( nvinfer1::EngineCapability capability )

noexcept

Return the plugin registry for building a Standard engine, or nullptr if no registry exists.

Also return nullptr if the input argument is not EngineCapability::kSTANDARD. Engine capabilities EngineCapability::kSTANDARD and EngineCapability::kSAFETY have distinct plugin registries. When building a Safety engine, use nvinfer1::getBuilderSafePluginRegistry(). Use IPluginRegistry::registerCreator from the registry to register plugins. Plugins registered in a registry associated with a specific engine capability are only available when building engines with that engine capability.

There is no plugin registry for EngineCapability::kDLA_STANDALONE.

◆ getBuilderSafePluginRegistry()

nvinfer1::safe::IPluginRegistry * nvinfer1::getBuilderSafePluginRegistry ( nvinfer1::EngineCapability capability )

noexcept

Return the plugin registry for building a Safety engine, or nullptr if no registry exists.

Also return nullptr if the input argument is not EngineCapability::kSAFETY. When building a Standard engine, use nvinfer1::getBuilderPluginRegistry(). Use safe::IPluginRegistry::registerCreator from the registry to register plugins.
