TensorRT 10.0.0
nvinfer1 Namespace Reference

The TensorRT API version 1 namespace.

Namespaces

namespace  anonymous_namespace{NvInfer.h}
 
namespace  anonymous_namespace{NvInferRuntime.h}
 
namespace  consistency
 
namespace  impl
 
namespace  plugin
 
namespace  safe
 The safety subset of TensorRT's API version 1 namespace.
 
namespace  v_1_0
 Forward declare IErrorRecorder for use in other interfaces.
 

Classes

class  Dims2
 Descriptor for two-dimensional data. More...
 
class  Dims3
 Descriptor for three-dimensional data. More...
 
class  Dims4
 Descriptor for four-dimensional data. More...
 
class  Dims64
 
class  DimsExprs
 Analog of class Dims with expressions instead of constants for the dimensions. More...
 
class  DimsHW
 Descriptor for two-dimensional spatial data. More...
 
struct  DynamicPluginTensorDesc
 Summarizes tensors that a plugin might see for an input or output. More...
 
class  IActivationLayer
 An Activation layer in a network definition. More...
 
class  IAlgorithm
 Describes a variation of execution of a layer. An algorithm is represented by IAlgorithmVariant and the IAlgorithmIOInfo for each of its inputs and outputs. An algorithm can be selected or reproduced using IAlgorithmSelector::selectAlgorithms().
 
class  IAlgorithmContext
 Describes the context and requirements that could be fulfilled by one or more instances of IAlgorithm.
 
class  IAlgorithmIOInfo
 Carries information about an input or output of the algorithm. The IAlgorithmIOInfo for all inputs and outputs, together with IAlgorithmVariant, denotes the variation of the algorithm and can be used to select or reproduce an algorithm using IAlgorithmSelector::selectAlgorithms().
 
class  IAlgorithmVariant
 Provides a unique 128-bit identifier which, along with the input and output information, denotes the variation of the algorithm and can be used to select or reproduce an algorithm using IAlgorithmSelector::selectAlgorithms().
 
class  IAssertionLayer
 An assertion layer in a network. More...
 
class  IBuilder
 Builds an engine from a network definition. More...
 
class  IBuilderConfig
 Holds properties for configuring a builder to produce an engine. More...
 
class  ICastLayer
 A cast layer in a network. More...
 
class  IConcatenationLayer
 A concatenation layer in a network definition. More...
 
class  IConditionLayer
 This layer represents a condition input to an IIfConditional. More...
 
class  IConstantLayer
 Layer that represents a constant value. More...
 
class  IConvolutionLayer
 A convolution layer in a network definition. More...
 
class  ICudaEngine
 An engine for executing inference on a built network, with functionally unsafe features. More...
 
class  IDeconvolutionLayer
 A deconvolution layer in a network definition. More...
 
class  IDequantizeLayer
 A Dequantize layer in a network definition. More...
 
class  IDimensionExpr
 An IDimensionExpr represents an integer expression constructed from constants, input dimensions, and binary operations. These expressions can be used in overrides of IPluginV2DynamicExt::getOutputDimensions() or IPluginV3OneBuild::getOutputShapes() to define output dimensions in terms of input dimensions.
 
class  IEinsumLayer
 An Einsum layer in a network. More...
 
class  IElementWiseLayer
 An elementwise layer in a network definition.
 
class  IEngineInspector
 An engine inspector which prints out the layer information of an engine or an execution context. More...
 
class  IExecutionContext
 Context for executing inference using an engine, with functionally unsafe features. More...
 
class  IExprBuilder
 Object for constructing IDimensionExpr. More...
 
class  IFillLayer
 Generate a tensor according to a specified mode. More...
 
class  IGatherLayer
 A Gather layer in a network definition. Supports several kinds of gathering. More...
 
class  IGridSampleLayer
 A GridSample layer in a network definition. More...
 
class  IHostMemory
 Class to handle library allocated memory that is accessible to the user. More...
 
class  IIdentityLayer
 A layer that represents the identity function. More...
 
class  IIfConditional
 Helper for constructing conditionally-executed subgraphs. More...
 
class  IIfConditionalBoundaryLayer
 This is a base class for Conditional boundary layers. More...
 
class  IIfConditionalInputLayer
 This layer represents an input to an IIfConditional. More...
 
class  IIfConditionalOutputLayer
 This layer represents an output of an IIfConditional. More...
 
class  IInt8Calibrator
 Application-implemented interface for calibration. More...
 
class  IIteratorLayer
 A layer to do iterations. More...
 
class  ILayer
 Base class for all layer classes in a network definition. More...
 
class  ILogger
 Application-implemented logging interface for the builder, refitter and runtime. More...
 
class  ILoggerFinder
 A virtual base class to find a logger. Allows a plugin to find an instance of a logger if it needs to emit a log message. A pointer to an instance of this class is passed to a plugin shared library on initialization when that plugin is serialized as part of a version-compatible plan. See the plugin chapter in the developer guide for details. More...
 
class  ILoop
 Helper for creating a recurrent subgraph. More...
 
class  ILoopBoundaryLayer
 This is a base class for Loop boundary layers. More...
 
class  ILoopOutputLayer
 An ILoopOutputLayer is the sole way to get output from a loop. More...
 
class  ILRNLayer
 An LRN layer in a network definition.
 
class  IMatrixMultiplyLayer
 Layer that represents a Matrix Multiplication. More...
 
class  INetworkDefinition
 A network definition for input to the builder. More...
 
class  INMSLayer
 A non-maximum suppression layer in a network definition. More...
 
class  INoCopy
 Forward declaration of IEngineInspector for use by other interfaces. More...
 
class  INonZeroLayer
 
class  INormalizationLayer
 A normalization layer in a network definition. More...
 
class  InterfaceInfo
 Version information associated with a TRT interface. More...
 
class  IOneHotLayer
 A OneHot layer in a network definition. More...
 
class  IOptimizationProfile
 Optimization profile for dynamic input dimensions and shape tensors. More...
 
class  IPaddingLayer
 Layer that represents a padding operation. More...
 
class  IParametricReLULayer
 Layer that represents a parametric ReLU operation. More...
 
class  IPluginRegistry
 Single registration point for all plugins in an application. It is used to find plugin implementations during engine deserialization. Internally, the plugin registry is considered to be a singleton, so all plugins in an application are part of the same global registry. Note that the plugin registry is only supported for plugins of type IPluginV2, and each such plugin should also have a corresponding IPluginCreator implementation.
 
class  IPluginResourceContext
 Interface for plugins to access per context resources provided by TensorRT. More...
 
class  IPluginV2
 Plugin class for user-implemented layers. More...
 
class  IPluginV2DynamicExt
 Similar to IPluginV2Ext, but with support for dynamic shapes. More...
 
class  IPluginV2Ext
 Plugin class for user-implemented layers. More...
 
class  IPluginV2IOExt
 Plugin class for user-implemented layers. More...
 
class  IPluginV2Layer
 Layer type for pluginV2. More...
 
class  IPluginV3Layer
 Layer type for V3 plugins. More...
 
class  IPoolingLayer
 A Pooling layer in a network definition. More...
 
class  IQuantizeLayer
 A Quantize layer in a network definition. More...
 
class  IRaggedSoftMaxLayer
 A RaggedSoftmax layer in a network definition. More...
 
class  IRecurrenceLayer
 A recurrence layer in a network definition. More...
 
class  IReduceLayer
 Layer that represents a reduction across a non-bool tensor. More...
 
class  IRefitter
 Updates weights in an engine. More...
 
class  IResizeLayer
 A resize layer in a network definition. More...
 
class  IReverseSequenceLayer
 A ReverseSequence layer in a network definition. More...
 
class  IRuntime
 Allows a serialized functionally unsafe engine to be deserialized. More...
 
class  IScaleLayer
 A Scale layer in a network definition. More...
 
class  IScatterLayer
 A scatter layer in a network definition. Supports several kinds of scattering. More...
 
class  ISelectLayer
 A select layer in a network definition. More...
 
class  ISerializationConfig
 Holds properties for configuring an engine to serialize the binary. More...
 
class  IShapeLayer
 Layer type for getting shape of a tensor. More...
 
class  IShuffleLayer
 Layer type for shuffling data. More...
 
class  ISliceLayer
 Slices an input tensor into an output tensor based on the offset and strides. More...
 
class  ISoftMaxLayer
 A Softmax layer in a network definition. More...
 
class  ITensor
 A tensor in a network definition. More...
 
class  ITimingCache
 Class to handle tactic timing info collected from builder. More...
 
class  ITopKLayer
 Layer that represents a TopK reduction. More...
 
class  ITripLimitLayer
 A layer that represents a trip-count limiter. More...
 
class  IUnaryLayer
 Layer that represents a unary operation.
 
class  IVersionedInterface
 An Interface class for version control. More...
 
struct  Permutation
 Represents a permutation of dimensions. More...
 
class  PluginField
 Structure containing plugin attribute field names and associated data. This information can be parsed to decode necessary plugin metadata.
 
struct  PluginFieldCollection
 Plugin field collection struct. More...
 
class  PluginRegistrar
 Registers the plugin creator with the registry. The static registry object will be instantiated when the plugin library is loaded. This static object will register all creators available in the library with the registry.
 
struct  PluginTensorDesc
 Fields that a plugin might see for an input or output. More...
 
class  Weights
 An array of weights used as a layer parameter. More...
 

Typedefs

using TensorFormats = uint32_t
 Represents one or more TensorFormat values combined using binary OR operations, e.g., 1U << TensorFormat::kCHW4 | 1U << TensorFormat::kCHW32.
 
using IInt8EntropyCalibrator = v_1_0::IInt8EntropyCalibrator
 
using IInt8EntropyCalibrator2 = v_1_0::IInt8EntropyCalibrator2
 
using IInt8MinMaxCalibrator = v_1_0::IInt8MinMaxCalibrator
 
using IInt8LegacyCalibrator = v_1_0::IInt8LegacyCalibrator
 
using IAlgorithmSelector = v_1_0::IAlgorithmSelector
 
using QuantizationFlags = uint32_t
 Represents one or more QuantizationFlag values using binary OR operations. More...
 
using BuilderFlags = uint32_t
 Represents one or more BuilderFlag values using binary OR operations, e.g., 1U << BuilderFlag::kFP16 | 1U << BuilderFlag::kDEBUG. More...
 
using IProgressMonitor = v_1_0::IProgressMonitor
 
using NetworkDefinitionCreationFlags = uint32_t
 Represents one or more NetworkDefinitionCreationFlag flags using binary OR operations, e.g., 1U << NetworkDefinitionCreationFlag::kSTRONGLY_TYPED.
 
using IPluginCapability = v_1_0::IPluginCapability
 
using IPluginV3 = v_1_0::IPluginV3
 
using IPluginV3OneCore = v_1_0::IPluginV3OneCore
 
using IPluginV3OneBuild = v_1_0::IPluginV3OneBuild
 
using IPluginV3OneRuntime = v_1_0::IPluginV3OneRuntime
 
using IPluginCreatorV3One = v_1_0::IPluginCreatorV3One
 
using IProfiler = v_1_0::IProfiler
 
using TempfileControlFlags = uint32_t
 Represents a collection of one or more TempfileControlFlag values combined using bitwise-OR operations. More...
 
using TacticSources = uint32_t
 Represents a collection of one or more TacticSource values combined using bitwise-OR operations.
 
using SerializationFlags = uint32_t
 Represents one or more SerializationFlag values using binary OR operations, e.g., 1U << SerializationFlag::kEXCLUDE_LEAN_RUNTIME. More...
 
using IOutputAllocator = v_1_0::IOutputAllocator
 
using IDebugListener = v_1_0::IDebugListener
 
using IGpuAsyncAllocator = v_1_0::IGpuAsyncAllocator
 
using char_t = char
 char_t is the type used by TensorRT to represent all valid characters. More...
 
using AsciiChar = char_t
 
using IErrorRecorder = v_1_0::IErrorRecorder
 
using Dims = Dims64
 
using InterfaceKind = char const *
 
using AllocatorFlags = uint32_t
 
using IGpuAllocator = v_1_0::IGpuAllocator
 
using IStreamReader = v_1_0::IStreamReader
 
using IPluginResource = v_1_0::IPluginResource
 
using PluginFormat = TensorFormat
 PluginFormat is reserved for backward compatibility. More...
 
using IPluginCreatorInterface = v_1_0::IPluginCreatorInterface
 
using IPluginCreator = v_1_0::IPluginCreator
 

Enumerations

enum class  LayerType : int32_t {
  kCONVOLUTION = 0 , kCAST = 1 , kACTIVATION = 2 , kPOOLING = 3 ,
  kLRN = 4 , kSCALE = 5 , kSOFTMAX = 6 , kDECONVOLUTION = 7 ,
  kCONCATENATION = 8 , kELEMENTWISE = 9 , kPLUGIN = 10 , kUNARY = 11 ,
  kPADDING = 12 , kSHUFFLE = 13 , kREDUCE = 14 , kTOPK = 15 ,
  kGATHER = 16 , kMATRIX_MULTIPLY = 17 , kRAGGED_SOFTMAX = 18 , kCONSTANT = 19 ,
  kIDENTITY = 20 , kPLUGIN_V2 = 21 , kSLICE = 22 , kSHAPE = 23 ,
  kPARAMETRIC_RELU = 24 , kRESIZE = 25 , kTRIP_LIMIT = 26 , kRECURRENCE = 27 ,
  kITERATOR = 28 , kLOOP_OUTPUT = 29 , kSELECT = 30 , kFILL = 31 ,
  kQUANTIZE = 32 , kDEQUANTIZE = 33 , kCONDITION = 34 , kCONDITIONAL_INPUT = 35 ,
  kCONDITIONAL_OUTPUT = 36 , kSCATTER = 37 , kEINSUM = 38 , kASSERTION = 39 ,
  kONE_HOT = 40 , kNON_ZERO = 41 , kGRID_SAMPLE = 42 , kNMS = 43 ,
  kREVERSE_SEQUENCE = 44 , kNORMALIZATION = 45 , kPLUGIN_V3 = 46
}
 The type values of layer classes. More...
 
enum class  ActivationType : int32_t {
  kRELU = 0 , kSIGMOID = 1 , kTANH = 2 , kLEAKY_RELU = 3 ,
  kELU = 4 , kSELU = 5 , kSOFTSIGN = 6 , kSOFTPLUS = 7 ,
  kCLIP = 8 , kHARD_SIGMOID = 9 , kSCALED_TANH = 10 , kTHRESHOLDED_RELU = 11 ,
  kGELU_ERF = 12 , kGELU_TANH = 13
}
 Enumerates the types of activation to perform in an activation layer. More...
 
enum class  PaddingMode : int32_t { kEXPLICIT_ROUND_DOWN = 0 , kEXPLICIT_ROUND_UP = 1 , kSAME_UPPER = 2 , kSAME_LOWER = 3 }
 Enumerates the modes of padding to perform in convolution, deconvolution and pooling layers. Padding mode takes precedence if setPaddingMode() and setPrePadding() are also used.
 
enum class  PoolingType : int32_t { kMAX = 0 , kAVERAGE = 1 , kMAX_AVERAGE_BLEND = 2 }
 The type of pooling to perform in a pooling layer. More...
 
enum class  ScaleMode : int32_t { kUNIFORM = 0 , kCHANNEL = 1 , kELEMENTWISE = 2 }
 Controls how shift, scale and power are applied in a Scale layer. More...
 
enum class  ElementWiseOperation : int32_t {
  kSUM = 0 , kPROD = 1 , kMAX = 2 , kMIN = 3 ,
  kSUB = 4 , kDIV = 5 , kPOW = 6 , kFLOOR_DIV = 7 ,
  kAND = 8 , kOR = 9 , kXOR = 10 , kEQUAL = 11 ,
  kGREATER = 12 , kLESS = 13
}
 Enumerates the binary operations that may be performed by an ElementWise layer. More...
 
enum class  GatherMode : int32_t { kDEFAULT = 0 , kELEMENT = 1 , kND = 2 }
 Control form of IGatherLayer. More...
 
enum class  UnaryOperation : int32_t {
  kEXP = 0 , kLOG = 1 , kSQRT = 2 , kRECIP = 3 ,
  kABS = 4 , kNEG = 5 , kSIN = 6 , kCOS = 7 ,
  kTAN = 8 , kSINH = 9 , kCOSH = 10 , kASIN = 11 ,
  kACOS = 12 , kATAN = 13 , kASINH = 14 , kACOSH = 15 ,
  kATANH = 16 , kCEIL = 17 , kFLOOR = 18 , kERF = 19 ,
  kNOT = 20 , kSIGN = 21 , kROUND = 22 , kISINF = 23
}
 Enumerates the unary operations that may be performed by a Unary layer. More...
 
enum class  ReduceOperation : int32_t {
  kSUM = 0 , kPROD = 1 , kMAX = 2 , kMIN = 3 ,
  kAVG = 4
}
 Enumerates the reduce operations that may be performed by a Reduce layer. More...
 
enum class  SampleMode : int32_t {
  kSTRICT_BOUNDS = 0 , kWRAP = 1 , kCLAMP = 2 , kFILL = 3 ,
  kREFLECT = 4
}
 Controls how ISliceLayer and IGridSampleLayer handle out-of-bounds coordinates.
 
enum class  TopKOperation : int32_t { kMAX = 0 , kMIN = 1 }
 Enumerates the operations that may be performed by a TopK layer. More...
 
enum class  MatrixOperation : int32_t { kNONE = 0 , kTRANSPOSE = 1 , kVECTOR = 2 }
 Enumerates the operations that may be performed on a tensor by IMatrixMultiplyLayer before multiplication. More...
 
enum class  InterpolationMode : int32_t { kNEAREST = 0 , kLINEAR = 1 , kCUBIC = 2 }
 Enumerates various modes of interpolation. More...
 
enum class  ResizeCoordinateTransformation : int32_t { kALIGN_CORNERS = 0 , kASYMMETRIC = 1 , kHALF_PIXEL = 2 }
 The resize coordinate transformation function. More...
 
enum class  ResizeSelector : int32_t { kFORMULA = 0 , kUPPER = 1 }
 The coordinate selector used when resizing to a single-pixel output.
 
enum class  ResizeRoundMode : int32_t { kHALF_UP = 0 , kHALF_DOWN = 1 , kFLOOR = 2 , kCEIL = 3 }
 The rounding mode for nearest neighbor resize. More...
 
enum class  LoopOutput : int32_t { kLAST_VALUE = 0 , kCONCATENATE = 1 , kREVERSE = 2 }
 
enum class  TripLimit : int32_t { kCOUNT = 0 , kWHILE = 1 }
 
enum class  FillOperation : int32_t { kLINSPACE = 0 , kRANDOM_UNIFORM = 1 , kRANDOM_NORMAL = 2 }
 Enumerates the tensor fill operations that may be performed by a fill layer.
 
enum class  ScatterMode : int32_t { kELEMENT = 0 , kND = 1 }
 Control form of IScatterLayer. More...
 
enum class  BoundingBoxFormat : int32_t { kCORNER_PAIRS = 0 , kCENTER_SIZES = 1 }
 Representation of bounding box data used for the Boxes input tensor in INMSLayer. More...
 
enum class  CalibrationAlgoType : int32_t { kLEGACY_CALIBRATION = 0 , kENTROPY_CALIBRATION = 1 , kENTROPY_CALIBRATION_2 = 2 , kMINMAX_CALIBRATION = 3 }
 Version of calibration algorithm to use. More...
 
enum class  QuantizationFlag : int32_t { kCALIBRATE_BEFORE_FUSION = 0 }
 List of valid flags for quantizing the network to int8. More...
 
enum class  BuilderFlag : int32_t {
  kFP16 = 0 , kINT8 = 1 , kDEBUG = 2 , kGPU_FALLBACK = 3 ,
  kREFIT = 4 , kDISABLE_TIMING_CACHE = 5 , kTF32 = 6 , kSPARSE_WEIGHTS = 7 ,
  kSAFETY_SCOPE = 8 , kOBEY_PRECISION_CONSTRAINTS = 9 , kPREFER_PRECISION_CONSTRAINTS = 10 , kDIRECT_IO = 11 ,
  kREJECT_EMPTY_ALGORITHMS = 12 , kVERSION_COMPATIBLE = 13 , kEXCLUDE_LEAN_RUNTIME = 14 , kFP8 = 15 ,
  kERROR_ON_TIMING_CACHE_MISS = 16 , kBF16 = 17 , kDISABLE_COMPILATION_CACHE = 18 , kSTRIP_PLAN = 19 ,
  kWEIGHTLESS = kSTRIP_PLAN , kREFIT_IDENTICAL = 20 , kWEIGHT_STREAMING = 21
}
 List of valid modes that the builder can enable when creating an engine from a network definition. More...
 
enum class  MemoryPoolType : int32_t {
  kWORKSPACE = 0 , kDLA_MANAGED_SRAM = 1 , kDLA_LOCAL_DRAM = 2 , kDLA_GLOBAL_DRAM = 3 ,
  kTACTIC_DRAM = 4 , kTACTIC_SHARED_MEMORY = 5
}
 The type for memory pools used by TensorRT. More...
 
enum class  PreviewFeature : int32_t { kPROFILE_SHARING_0806 = 0 }
 Define preview features. More...
 
enum class  HardwareCompatibilityLevel : int32_t { kNONE = 0 , kAMPERE_PLUS = 1 }
 Describes requirements of compatibility with GPU architectures other than that of the GPU on which the engine was built. More...
 
enum class  NetworkDefinitionCreationFlag : int32_t { kEXPLICIT_BATCH = 0 , kSTRONGLY_TYPED = 1 }
 List of immutable network properties expressed at network creation time. NetworkDefinitionCreationFlag is used with createNetworkV2() to specify immutable properties of the network. More...
 
enum class  EngineCapability : int32_t { kSTANDARD = 0 , kSAFETY = 1 , kDLA_STANDALONE = 2 }
 List of supported engine capability flows. More...
 
enum class  DimensionOperation : int32_t {
  kSUM = 0 , kPROD = 1 , kMAX = 2 , kMIN = 3 ,
  kSUB = 4 , kEQUAL = 5 , kLESS = 6 , kFLOOR_DIV = 7 ,
  kCEIL_DIV = 8
}
 An operation on two IDimensionExpr, which represent integer expressions used in dimension computations. More...
 
enum class  TensorLocation : int32_t { kDEVICE = 0 , kHOST = 1 }
 The location for tensor data storage, device or host. More...
 
enum class  WeightsRole : int32_t {
  kKERNEL = 0 , kBIAS = 1 , kSHIFT = 2 , kSCALE = 3 ,
  kCONSTANT = 4 , kANY = 5
}
 How a layer uses particular Weights. More...
 
enum class  DeviceType : int32_t { kGPU = 0 , kDLA = 1 }
 The device that this layer/network will execute on. More...
 
enum class  TempfileControlFlag : int32_t { kALLOW_IN_MEMORY_FILES = 0 , kALLOW_TEMPORARY_FILES = 1 }
 Flags used to control TensorRT's behavior when creating executable temporary files. More...
 
enum class  OptProfileSelector : int32_t { kMIN = 0 , kOPT = 1 , kMAX = 2 }
 When setting or querying optimization profile parameters (such as shape tensor inputs or dynamic dimensions), select whether we are interested in the minimum, optimum, or maximum values for these parameters. The minimum and maximum specify the permitted range that is supported at runtime, while the optimum value is used for the kernel selection. This should be the "typical" value that is expected to occur at runtime. More...
 
enum class  TacticSource : int32_t {
  kCUBLAS = 0 , kCUBLAS_LT = 1 , kCUDNN = 2 , kEDGE_MASK_CONVOLUTIONS = 3 ,
  kJIT_CONVOLUTIONS = 4
}
 List of tactic sources for TensorRT. More...
 
enum class  ProfilingVerbosity : int32_t { kLAYER_NAMES_ONLY = 0 , kNONE = 1 , kDETAILED = 2 }
 List of verbosity levels of layer information exposed in NVTX annotations and in IEngineInspector. More...
 
enum class  SerializationFlag : int32_t { kEXCLUDE_WEIGHTS = 0 , kEXCLUDE_LEAN_RUNTIME = 1 }
 List of valid flags that the engine can enable when serializing the bytes. More...
 
enum class  ExecutionContextAllocationStrategy : int32_t { kSTATIC = 0 , kON_PROFILE_CHANGE = 1 , kUSER_MANAGED = 2 }
 Different memory allocation behaviors for IExecutionContext. More...
 
enum class  LayerInformationFormat : int32_t { kONELINE = 0 , kJSON = 1 }
 The format in which the IEngineInspector prints the layer information. More...
 
enum class  DataType : int32_t {
  kFLOAT = 0 , kHALF = 1 , kINT8 = 2 , kINT32 = 3 ,
  kBOOL = 4 , kUINT8 = 5 , kFP8 = 6 , kBF16 = 7 ,
  kINT64 = 8 , kINT4 = 9
}
 The type of weights and tensors. More...
 
enum class  TensorFormat : int32_t {
  kLINEAR = 0 , kCHW2 = 1 , kHWC8 = 2 , kCHW4 = 3 ,
  kCHW16 = 4 , kCHW32 = 5 , kDHWC8 = 6 , kCDHW32 = 7 ,
  kHWC = 8 , kDLA_LINEAR = 9 , kDLA_HWC4 = 10 , kHWC16 = 11 ,
  kDHWC = 12
}
 Format of the input/output tensors. More...
 
enum class  APILanguage : int32_t { kCPP = 0 , kPYTHON = 1 }
 Programming language used in the implementation of a TRT interface. More...
 
enum class  AllocatorFlag : int32_t { kRESIZABLE = 0 }
 Allowed type of memory allocation. More...
 
enum class  ErrorCode : int32_t {
  kSUCCESS = 0 , kUNSPECIFIED_ERROR = 1 , kINTERNAL_ERROR = 2 , kINVALID_ARGUMENT = 3 ,
  kINVALID_CONFIG = 4 , kFAILED_ALLOCATION = 5 , kFAILED_INITIALIZATION = 6 , kFAILED_EXECUTION = 7 ,
  kFAILED_COMPUTATION = 8 , kINVALID_STATE = 9 , kUNSUPPORTED_STATE = 10
}
 Error codes that can be returned by TensorRT during execution. More...
 
enum class  TensorIOMode : int32_t { kNONE = 0 , kINPUT = 1 , kOUTPUT = 2 }
 Definition of tensor IO Mode. More...
 
enum class  PluginVersion : uint8_t {
  kV2 = 0 , kV2_EXT = 1 , kV2_IOEXT = 2 , kV2_DYNAMICEXT = 3 ,
  kV2_DYNAMICEXT_PYTHON = kPLUGIN_VERSION_PYTHON_BIT | 3
}
 
enum class  PluginCreatorVersion : int32_t { kV1 = 0 , kV1_PYTHON = kPLUGIN_VERSION_PYTHON_BIT }
 Enum to identify version of the plugin creator. More...
 
enum class  PluginFieldType : int32_t {
  kFLOAT16 = 0 , kFLOAT32 = 1 , kFLOAT64 = 2 , kINT8 = 3 ,
  kINT16 = 4 , kINT32 = 5 , kCHAR = 6 , kDIMS = 7 ,
  kUNKNOWN = 8 , kBF16 = 9 , kINT64 = 10 , kFP8 = 11
}
 The possible field types for custom layer. More...
 
enum class  PluginCapabilityType : int32_t { kCORE = 0 , kBUILD = 1 , kRUNTIME = 2 }
 Enumerates the different capability types an IPluginV3 object may have.
 
enum class  TensorRTPhase : int32_t { kBUILD = 0 , kRUNTIME = 1 }
 Indicates a phase of operation of TensorRT. More...
 

Functions

template<>
constexpr int32_t EnumMax< LayerType > () noexcept
 
template<>
constexpr int32_t EnumMax< ScaleMode > () noexcept
 
template<>
constexpr int32_t EnumMax< GatherMode > () noexcept
 
template<>
constexpr int32_t EnumMax< UnaryOperation > () noexcept
 
template<>
constexpr int32_t EnumMax< ReduceOperation > () noexcept
 
template<>
constexpr int32_t EnumMax< SampleMode > () noexcept
 
template<>
constexpr int32_t EnumMax< TopKOperation > () noexcept
 
template<>
constexpr int32_t EnumMax< MatrixOperation > () noexcept
 
template<>
constexpr int32_t EnumMax< LoopOutput > () noexcept
 
template<>
constexpr int32_t EnumMax< TripLimit > () noexcept
 
template<>
constexpr int32_t EnumMax< FillOperation > () noexcept
 
template<>
constexpr int32_t EnumMax< ScatterMode > () noexcept
 
template<>
constexpr int32_t EnumMax< BoundingBoxFormat > () noexcept
 
template<>
constexpr int32_t EnumMax< CalibrationAlgoType > () noexcept
 
template<>
constexpr int32_t EnumMax< QuantizationFlag > () noexcept
 
template<>
constexpr int32_t EnumMax< BuilderFlag > () noexcept
 
template<>
constexpr int32_t EnumMax< MemoryPoolType > () noexcept
 
template<>
constexpr int32_t EnumMax< NetworkDefinitionCreationFlag > () noexcept
 
nvinfer1::IPluginRegistry * getBuilderPluginRegistry (nvinfer1::EngineCapability capability) noexcept
 Return the plugin registry for building a Standard engine, or nullptr if no registry exists. More...
 
nvinfer1::safe::IPluginRegistry * getBuilderSafePluginRegistry (nvinfer1::EngineCapability capability) noexcept
 Return the plugin registry for building a Safety engine, or nullptr if no registry exists. More...
 
template<>
constexpr int32_t EnumMax< DimensionOperation > () noexcept
 Maximum number of elements in DimensionOperation enum. More...
 
template<>
constexpr int32_t EnumMax< WeightsRole > () noexcept
 Maximum number of elements in WeightsRole enum. More...
 
template<>
constexpr int32_t EnumMax< DeviceType > () noexcept
 Maximum number of elements in DeviceType enum. More...
 
template<>
constexpr int32_t EnumMax< TempfileControlFlag > () noexcept
 Maximum number of elements in TempfileControlFlag enum. More...
 
template<>
constexpr int32_t EnumMax< OptProfileSelector > () noexcept
 Number of different values of OptProfileSelector enum. More...
 
template<>
constexpr int32_t EnumMax< TacticSource > () noexcept
 Maximum number of tactic sources in TacticSource enum. More...
 
template<>
constexpr int32_t EnumMax< ProfilingVerbosity > () noexcept
 Maximum number of profile verbosity levels in ProfilingVerbosity enum. More...
 
template<>
constexpr int32_t EnumMax< SerializationFlag > () noexcept
 Maximum number of serialization flags in SerializationFlag enum. More...
 
template<>
constexpr int32_t EnumMax< ExecutionContextAllocationStrategy > () noexcept
 Maximum number of memory allocation strategies in ExecutionContextAllocationStrategy enum. More...
 
template<>
constexpr int32_t EnumMax< LayerInformationFormat > () noexcept
 
template<typename T >
constexpr int32_t EnumMax () noexcept
 Maximum number of elements in an enumeration type. More...
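
The EnumMax entries above follow a "one specialization per enumeration" pattern. The following is a minimal sketch of that pattern, assuming (as the descriptions suggest) that EnumMax returns the number of enumerators. It uses a local stand-in for ReduceOperation with the values listed on this page so that it compiles without the TensorRT headers; isValidReduceOperation is a hypothetical helper, not part of the TensorRT API, and the real headers may implement EnumMax differently.

```cpp
#include <cstdint>

// Local stand-in for nvinfer1::ReduceOperation, with the values listed on this page.
enum class ReduceOperation : int32_t { kSUM = 0, kPROD = 1, kMAX = 2, kMIN = 3, kAVG = 4 };

// Primary template: declared but not defined, so only specialized enums can be used.
template <typename T>
constexpr int32_t EnumMax() noexcept;

// Specialization for the stand-in enum: five enumerators, numbered 0 through 4.
template <>
constexpr int32_t EnumMax<ReduceOperation>() noexcept
{
    return 5;
}

// Hypothetical helper: check that a raw integer maps to a declared enumerator.
constexpr bool isValidReduceOperation(int32_t raw) noexcept
{
    return raw >= 0 && raw < EnumMax<ReduceOperation>();
}

static_assert(isValidReduceOperation(4), "kAVG is the last valid value");
static_assert(!isValidReduceOperation(5), "5 is one past the end");
```

A typical use of such a count is to validate an integer before casting it to the enumeration, or to size an array indexed by enumerator.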
 

Detailed Description

The TensorRT API version 1 namespace.

Typedef Documentation

◆ AllocatorFlags

using nvinfer1::AllocatorFlags = typedef uint32_t

◆ AsciiChar

using nvinfer1::AsciiChar = typedef char_t

AsciiChar is the type used by TensorRT to represent valid ASCII characters. This type is widely used in automotive safety contexts.

◆ BuilderFlags

using nvinfer1::BuilderFlags = typedef uint32_t

Represents one or more BuilderFlag values using binary OR operations, e.g., 1U << BuilderFlag::kFP16 | 1U << BuilderFlag::kDEBUG.

See also
IBuilderConfig::setFlags(), IBuilderConfig::getFlags()
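
The shift expression above is shorthand. BuilderFlag is declared as an enum class, so it presumably needs an explicit integer cast before it can be used as a shift amount. The sketch below shows one way to build and query such a mask. It uses a local stand-in enum with three of the values listed on this page so that it compiles without the TensorRT headers; flagBit, hasFlag and kExampleMask are hypothetical names, not part of the TensorRT API.

```cpp
#include <cstdint>

// Local stand-in mirroring three BuilderFlag values listed on this page.
// With TensorRT installed, nvinfer1::BuilderFlag would be used instead.
enum class BuilderFlag : int32_t { kFP16 = 0, kINT8 = 1, kDEBUG = 2 };
using BuilderFlags = uint32_t;

// Hypothetical helper: map one flag to its bit position in the mask.
constexpr BuilderFlags flagBit(BuilderFlag flag) noexcept
{
    return 1U << static_cast<uint32_t>(flag);
}

// Hypothetical helper: test whether a flag is set in a mask.
constexpr bool hasFlag(BuilderFlags mask, BuilderFlag flag) noexcept
{
    return (mask & flagBit(flag)) != 0U;
}

// Equivalent of "1U << BuilderFlag::kFP16 | 1U << BuilderFlag::kDEBUG".
constexpr BuilderFlags kExampleMask = flagBit(BuilderFlag::kFP16) | flagBit(BuilderFlag::kDEBUG);
static_assert(kExampleMask == 5U, "bit 0 (kFP16) and bit 2 (kDEBUG) are set");
```

With the real headers, a mask built this way would be the kind of value passed to IBuilderConfig::setFlags() and returned by IBuilderConfig::getFlags().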

◆ char_t

using nvinfer1::char_t = typedef char

char_t is the type used by TensorRT to represent all valid characters.

◆ Dims

using nvinfer1::Dims = typedef Dims64

Alias for Dims64.

◆ IAlgorithmSelector

◆ IDebugListener

◆ IErrorRecorder

◆ IGpuAllocator

◆ IGpuAsyncAllocator

◆ IInt8EntropyCalibrator

◆ IInt8EntropyCalibrator2

◆ IInt8LegacyCalibrator

◆ IInt8MinMaxCalibrator

◆ InterfaceKind

using nvinfer1::InterfaceKind = typedef char const*

◆ IOutputAllocator

◆ IPluginCapability

◆ IPluginCreator

◆ IPluginCreatorInterface

◆ IPluginCreatorV3One

◆ IPluginResource

◆ IPluginV3

◆ IPluginV3OneBuild

◆ IPluginV3OneCore

◆ IPluginV3OneRuntime

◆ IProfiler

◆ IProgressMonitor

◆ IStreamReader

◆ NetworkDefinitionCreationFlags

using nvinfer1::NetworkDefinitionCreationFlags = typedef uint32_t

Represents one or more NetworkDefinitionCreationFlag flags using binary OR operations, e.g., 1U << NetworkDefinitionCreationFlag::kSTRONGLY_TYPED.

See also
IBuilder::createNetworkV2

◆ PluginFormat

using nvinfer1::PluginFormat = typedef TensorFormat

PluginFormat is reserved for backward compatibility.

See also
IPluginV2::supportsFormat()

◆ QuantizationFlags

using nvinfer1::QuantizationFlags = typedef uint32_t

Represents one or more QuantizationFlag values using binary OR operations.

See also
IBuilderConfig::getQuantizationFlags(), IBuilderConfig::setQuantizationFlags()

◆ SerializationFlags

using nvinfer1::SerializationFlags = typedef uint32_t

Represents one or more SerializationFlag values using binary OR operations, e.g., 1U << SerializationFlag::kEXCLUDE_LEAN_RUNTIME.

See also
ISerializationConfig::setFlags(), ISerializationConfig::getFlags()

◆ TacticSources

using nvinfer1::TacticSources = typedef uint32_t

Represents a collection of one or more TacticSource values combined using bitwise-OR operations.

See also
IBuilderConfig::setTacticSources(), IBuilderConfig::getTacticSources()

◆ TempfileControlFlags

using nvinfer1::TempfileControlFlags = typedef uint32_t

Represents a collection of one or more TempfileControlFlag values combined using bitwise-OR operations.

See also
TempfileControlFlag, IRuntime::setTempfileControlFlags(), IRuntime::getTempfileControlFlags()

◆ TensorFormats

using nvinfer1::TensorFormats = typedef uint32_t

Represents one or more TensorFormat values combined using binary OR operations, e.g., 1U << TensorFormat::kCHW4 | 1U << TensorFormat::kCHW32.

See also
ITensor::getAllowedFormats(), ITensor::setAllowedFormats()

Enumeration Type Documentation

◆ ActivationType

enum class nvinfer1::ActivationType : int32_t

Enumerates the types of activation to perform in an activation layer.

Enumerator
kRELU 

Rectified linear activation.

kSIGMOID 

Sigmoid activation.

kTANH 

TanH activation.

kLEAKY_RELU 

LeakyRelu activation: x>=0 ? x : alpha * x.

kELU 

Elu activation: x>=0 ? x : alpha * (exp(x) - 1).

kSELU 

Selu activation: x>0 ? beta * x : beta * (alpha*exp(x) - alpha)

kSOFTSIGN 

Softsign activation: x / (1+|x|)

kSOFTPLUS 

Parametric softplus activation: alpha*log(exp(beta*x)+1)

kCLIP 

Clip activation: max(alpha, min(beta, x))

kHARD_SIGMOID 

Hard sigmoid activation: max(0, min(1, alpha*x+beta))

kSCALED_TANH 

Scaled tanh activation: alpha*tanh(beta*x)

kTHRESHOLDED_RELU 

Thresholded ReLU activation: x>alpha ? x : 0.

kGELU_ERF 

GELU erf activation: 0.5 * x * (1 + erf(sqrt(0.5) * x))

kGELU_TANH 

GELU tanh activation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (0.044715F * pow(x, 3) + x)))
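
To make the formulas above concrete, here is a small reference sketch of a few of them in plain C++. These functions are hypothetical illustrations written directly from the formulas on this page; they are not TensorRT code and say nothing about how the library evaluates activations internally. The alpha and beta parameters follow the roles given in each formula.

```cpp
#include <algorithm>
#include <cmath>

// kLEAKY_RELU: x>=0 ? x : alpha * x
inline float leakyRelu(float x, float alpha)
{
    return x >= 0.0F ? x : alpha * x;
}

// kCLIP: max(alpha, min(beta, x)), so alpha is the lower bound and beta the upper bound.
inline float clip(float x, float alpha, float beta)
{
    return std::max(alpha, std::min(beta, x));
}

// kHARD_SIGMOID: max(0, min(1, alpha*x+beta))
inline float hardSigmoid(float x, float alpha, float beta)
{
    return std::max(0.0F, std::min(1.0F, alpha * x + beta));
}

// kGELU_ERF: 0.5 * x * (1 + erf(sqrt(0.5) * x))
inline float geluErf(float x)
{
    return 0.5F * x * (1.0F + std::erf(std::sqrt(0.5F) * x));
}

// kGELU_TANH: 0.5 * x * (1 + tanh(sqrt(2/pi) * (0.044715F * pow(x, 3) + x)))
inline float geluTanh(float x)
{
    constexpr float kPi = 3.14159265358979323846F;
    float const scale = std::sqrt(2.0F / kPi);
    return 0.5F * x * (1.0F + std::tanh(scale * (0.044715F * x * x * x + x)));
}
```

The two GELU variants are close but not identical: the tanh form approximates the erf form, so small numerical differences between them are expected.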

◆ AllocatorFlag

enum class nvinfer1::AllocatorFlag : int32_t

Allowed type of memory allocation.

Enumerator
kRESIZABLE 

TensorRT may call realloc() on this allocation.

◆ APILanguage

enum class nvinfer1::APILanguage : int32_t

Programming language used in the implementation of a TRT interface.

Enumerator
kCPP 
kPYTHON 

◆ BoundingBoxFormat

enum class nvinfer1::BoundingBoxFormat : int32_t

Representation of bounding box data used for the Boxes input tensor in INMSLayer.

See also
INMSLayer
Enumerator
kCORNER_PAIRS 

(x1, y1, x2, y2) where (x1, y1) and (x2, y2) are any pair of diagonal corners

kCENTER_SIZES 

(x_center, y_center, width, height) where (x_center, y_center) is the center point of the box
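
The two box layouts carry the same information, and converting between them is simple arithmetic. The sketch below illustrates this with hypothetical CornerPairs and CenterSizes structs and helper functions; none of these names are part of the TensorRT API, and INMSLayer itself consumes tensors, not structs. Because the corner form accepts any pair of diagonal corners, the conversion takes absolute extents instead of assuming x1 <= x2 and y1 <= y2.

```cpp
#include <cmath>

// Hypothetical plain structs for the two layouts described above.
struct CornerPairs
{
    float x1, y1, x2, y2;
};

struct CenterSizes
{
    float xCenter, yCenter, width, height;
};

// Corner form to center form. Works for any pair of diagonal corners.
inline CenterSizes toCenterSizes(CornerPairs const& box)
{
    return CenterSizes{0.5F * (box.x1 + box.x2), 0.5F * (box.y1 + box.y2),
        std::fabs(box.x2 - box.x1), std::fabs(box.y2 - box.y1)};
}

// Center form to corner form, emitting the (min, min) and (max, max) corners.
inline CornerPairs toCornerPairs(CenterSizes const& box)
{
    float const halfWidth = 0.5F * box.width;
    float const halfHeight = 0.5F * box.height;
    return CornerPairs{box.xCenter - halfWidth, box.yCenter - halfHeight,
        box.xCenter + halfWidth, box.yCenter + halfHeight};
}
```

Note that a round trip through the center form normalizes the corner order, so the result may list different corners than the input while describing the same box.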

◆ BuilderFlag

enum class nvinfer1::BuilderFlag : int32_t

List of valid modes that the builder can enable when creating an engine from a network definition.

See also
IBuilderConfig::setFlags(), IBuilderConfig::getFlags()
Enumerator
kFP16 

Enable FP16 layer selection, with FP32 fallback.

kINT8 

Enable Int8 layer selection, with FP32 fallback, and with FP16 fallback if kFP16 is also specified.

kDEBUG 

Enable debugging of layers via synchronizing after every layer.

kGPU_FALLBACK 

Enable layers to execute on the GPU if they cannot execute on DLA.

kREFIT 

Enable building a refittable engine.

kDISABLE_TIMING_CACHE 

Disable reuse of timing information across identical layers.

kTF32 

Allow (but not require) computations on tensors of type DataType::kFLOAT to use TF32. TF32 computes inner products by rounding the inputs to 10-bit mantissas before multiplying, but accumulates the sum using 23-bit mantissas. Enabled by default.

kSPARSE_WEIGHTS 

Allow the builder to examine weights and use optimized functions when weights have suitable sparsity.

kSAFETY_SCOPE 

Change the allowed parameters in the EngineCapability::kSTANDARD flow to match the restrictions that EngineCapability::kSAFETY checks against for DeviceType::kGPU and that EngineCapability::kDLA_STANDALONE checks against for DeviceType::kDLA. If EngineCapability::kSAFETY is selected and this flag is unset, the flag is forced to true at build time.

This flag is only supported in NVIDIA Drive(R) products.

kOBEY_PRECISION_CONSTRAINTS 

Require that layers execute in specified precisions. Build fails otherwise.

kPREFER_PRECISION_CONSTRAINTS 

Prefer that layers execute in specified precisions. Fall back (with warning) to another precision if build would otherwise fail.

kDIRECT_IO 

Require that no reformats be inserted between a layer and a network I/O tensor for which ITensor::setAllowedFormats was called. Build fails if a reformat is required for functional correctness.

kREJECT_EMPTY_ALGORITHMS 

Fail if IAlgorithmSelector::selectAlgorithms returns an empty set of algorithms.

kVERSION_COMPATIBLE 

Restrict to lean runtime operators to provide version forward compatibility for the plan.

This flag is only supported by NVIDIA Volta and later GPUs. This flag is not supported in NVIDIA Drive(R) products.

kEXCLUDE_LEAN_RUNTIME 

Exclude the lean runtime from the plan when version forward compatibility is enabled. By default, this flag is unset, so the lean runtime will be included in the plan.

If BuilderFlag::kVERSION_COMPATIBLE is not set then the value of this flag will be ignored.

kFP8 

Enable FP8 layer selection, with FP32 fallback.

This flag is not supported with hardware-compatibility mode.

See also
HardwareCompatibilityLevel

kERROR_ON_TIMING_CACHE_MISS 

Emit error when a tactic being timed is not present in the timing cache. This flag has an effect only when IBuilderConfig has an associated ITimingCache.

kBF16 

Enable DataType::kBF16 layer selection, with FP32 fallback. This flag is only supported by NVIDIA Ampere and later GPUs.

kDISABLE_COMPILATION_CACHE 

Disable caching of JIT-compilation results during engine build. By default, JIT-compiled code will be serialized as part of the timing cache, which may significantly increase the cache size. Setting this flag prevents the code from being serialized. This flag has an effect only when BuilderFlag::kDISABLE_TIMING_CACHE is not set.

kSTRIP_PLAN 

Strip the refittable weights from the engine plan file.

kWEIGHTLESS 
Deprecated:
Deprecated in TensorRT 10.0. Superseded by kSTRIP_PLAN.
kREFIT_IDENTICAL 

Create a refittable engine under the assumption that the refit weights will be identical to those provided at build time. The resulting engine will have the same performance as a non-refittable one. All refittable weights can be refitted through the refit API, but if the refit weights are not identical to the build-time weights, behavior is undefined. When used alongside 'kSTRIP_PLAN', this flag will result in a small plan file for which weights are later supplied via refitting. This enables use of a single set of weights with different inference backends, or with TensorRT plans for multiple GPU architectures.

kWEIGHT_STREAMING 

Enable weight streaming for the current engine.

Weight streaming from the host enables execution of models that do not fit in GPU memory by allowing TensorRT to intelligently stream network weights from the CPU DRAM. Please see ICudaEngine::getMinimumWeightStreamingBudget for the default memory budget when this flag is enabled.

Enabling this feature changes the behavior of IRuntime::deserializeCudaEngine to allocate the entire network’s weights on the CPU DRAM instead of GPU memory. Then, ICudaEngine::createExecutionContext will determine the optimal split of weights between the CPU and GPU and place weights accordingly.

Future TensorRT versions may enable this flag by default.

Warning
Enabling this flag may marginally increase build time.
Enabling this feature will significantly increase the latency of ICudaEngine::createExecutionContext.
See also
IRuntime::deserializeCudaEngine, ICudaEngine::getMinimumWeightStreamingBudget, ICudaEngine::setWeightStreamingBudget

◆ CalibrationAlgoType

enum class nvinfer1::CalibrationAlgoType : int32_t
strong

Version of calibration algorithm to use.

Enumerator
kLEGACY_CALIBRATION 

Legacy calibration.

kENTROPY_CALIBRATION 

Legacy entropy calibration.

kENTROPY_CALIBRATION_2 

Entropy calibration.

kMINMAX_CALIBRATION 

Minmax calibration.

◆ DataType

enum class nvinfer1::DataType : int32_t
strong

The type of weights and tensors.

Enumerator
kFLOAT 

32-bit floating point format.

kHALF 

IEEE 16-bit floating-point format – has a 5 bit exponent and 11 bit significand.

kINT8 

Signed 8-bit integer representing a quantized floating-point value.

kINT32 

Signed 32-bit integer format.

kBOOL 

8-bit boolean. 0 = false, 1 = true, other values undefined.

kUINT8 

Unsigned 8-bit integer format. Cannot be used to represent quantized floating-point values. Use the IdentityLayer to convert kUINT8 network-level inputs to {kFLOAT, kHALF} prior to use with other TensorRT layers, or to convert intermediate output before kUINT8 network-level outputs from {kFLOAT, kHALF} to kUINT8. kUINT8 conversions are only supported for {kFLOAT, kHALF}. kUINT8 to {kFLOAT, kHALF} conversion will convert the integer values to equivalent floating point values. {kFLOAT, kHALF} to kUINT8 conversion will convert the floating point values to integer values by truncating towards zero. This conversion has undefined behavior for floating point values outside the range [0.0F, 256.0F) after truncation. kUINT8 conversions are not supported for {kINT8, kINT32, kBOOL}.

kFP8 

Signed 8-bit floating point with 1 sign bit, 4 exponent bits, 3 mantissa bits, and exponent-bias 7.

kBF16 

Brain float – has an 8 bit exponent and 8 bit significand.

kINT64 

Signed 64-bit integer type.

kINT4 

Signed 4-bit integer type.
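
The kUINT8 conversion rules described above (truncation towards zero, with results outside [0.0F, 256.0F) left undefined) can be sketched as follows. The names Uint8Result, floatToUint8, and uint8ToFloat are hypothetical; the sketch reports the undefined case explicitly instead of relying on it.

```cpp
#include <cstdint>

// Hypothetical result type: `defined` is false when the documented
// conversion would have undefined behavior.
struct Uint8Result
{
    bool defined;
    std::uint8_t value;
};

// Float to uint8 by truncating towards zero. The truncated value lies in
// [0, 256) exactly when -1 < v < 256; NaN fails both comparisons.
constexpr Uint8Result floatToUint8(float v)
{
    if (!(v > -1.0F && v < 256.0F))
    {
        return Uint8Result{false, 0};
    }
    return Uint8Result{true, static_cast<std::uint8_t>(static_cast<std::int32_t>(v))};
}

// Uint8 to float yields the equivalent floating-point value.
constexpr float uint8ToFloat(std::uint8_t v)
{
    return static_cast<float>(v);
}
```

Under this reading, 255.9F maps to 255 and -0.5F maps to 0, while 256.0F and -1.0F fall outside the defined range.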

◆ DeviceType

enum class nvinfer1::DeviceType : int32_t
strong

The device that this layer/network will execute on.

Enumerator
kGPU 

GPU Device.

kDLA 

DLA Core.

◆ DimensionOperation

enum class nvinfer1::DimensionOperation : int32_t
strong

An operation on two IDimensionExpr, which represent integer expressions used in dimension computations.

For example, given two IDimensionExpr x and y and an IExprBuilder& eb, eb.operation(DimensionOperation::kSUM, x, y) creates a representation of x+y.

See also
IDimensionExpr, IExprBuilder
Enumerator
kSUM 

Sum of the two operands.

kPROD 

Product of the two operands.

kMAX 

Maximum of the two operands.

kMIN 

Minimum of the two operands.

kSUB 

Subtract the second element from the first.

kEQUAL 

1 if operands are equal, 0 otherwise.

kLESS 

1 if first operand is less than second operand, 0 otherwise.

kFLOOR_DIV 

Floor division of the first element by the second.

kCEIL_DIV 

Division rounding up.
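
To make the semantics of each operation concrete, here is a small sketch that evaluates them on plain 64-bit integers. DimOp and evaluate are hypothetical stand-ins that mirror the enumerator names; real code would build IDimensionExpr objects through IExprBuilder instead. The division cases assume a non-negative first operand and a positive second operand, as is typical for dimensions.

```cpp
#include <cstdint>

// Hypothetical mirror of the enumerators above, for illustration only.
enum class DimOp { kSUM, kPROD, kMAX, kMIN, kSUB, kEQUAL, kLESS, kFLOOR_DIV, kCEIL_DIV };

constexpr std::int64_t evaluate(DimOp op, std::int64_t x, std::int64_t y)
{
    switch (op)
    {
    case DimOp::kSUM: return x + y;
    case DimOp::kPROD: return x * y;
    case DimOp::kMAX: return x > y ? x : y;
    case DimOp::kMIN: return x < y ? x : y;
    case DimOp::kSUB: return x - y;
    case DimOp::kEQUAL: return x == y ? 1 : 0;
    case DimOp::kLESS: return x < y ? 1 : 0;
    case DimOp::kFLOOR_DIV: return x / y;          // assumes x >= 0 and y > 0
    case DimOp::kCEIL_DIV: return (x + y - 1) / y; // assumes x >= 0 and y > 0
    }
    return 0;
}
```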

◆ ElementWiseOperation

enum class nvinfer1::ElementWiseOperation : int32_t
strong

Enumerates the binary operations that may be performed by an ElementWise layer.

Operations kAND, kOR, and kXOR must have inputs of DataType::kBOOL.

Operation kPOW must have inputs of floating-point type or DataType::kINT8.

All other operations must have inputs of floating-point type, DataType::kINT8, DataType::kINT32, or DataType::kINT64.

See also
IElementWiseLayer
Enumerator
kSUM 

Sum of the two elements.

kPROD 

Product of the two elements.

kMAX 

Maximum of the two elements.

kMIN 

Minimum of the two elements.

kSUB 

Subtract the second element from the first.

kDIV 

Divide the first element by the second.

kPOW 

The first element to the power of the second element.

kFLOOR_DIV 

Floor division of the first element by the second.

kAND 

Logical AND of two elements.

kOR 

Logical OR of two elements.

kXOR 

Logical XOR of two elements.

kEQUAL 

Check if two elements are equal.

kGREATER 

Check if element in first tensor is greater than corresponding element in second tensor.

kLESS 

Check if element in first tensor is less than corresponding element in second tensor.

◆ EngineCapability

enum class nvinfer1::EngineCapability : int32_t
strong

List of supported engine capability flows.

The EngineCapability determines the restrictions of a network during build time and what runtime it targets. When BuilderFlag::kSAFETY_SCOPE is not set (by default), EngineCapability::kSTANDARD does not provide any restrictions on functionality and the resulting serialized engine can be executed with TensorRT's standard runtime APIs in the nvinfer1 namespace. EngineCapability::kSAFETY provides a restricted subset of network operations that are safety certified and the resulting serialized engine can be executed with TensorRT's safe runtime APIs in the nvinfer1::safe namespace. EngineCapability::kDLA_STANDALONE provides a restricted subset of network operations that are DLA compatible and the resulting serialized engine can be executed using standalone DLA runtime APIs. See sampleCudla for an example of integrating cuDLA APIs with TensorRT APIs.

Enumerator
kSTANDARD 

Standard: TensorRT flow without targeting the safety runtime. This flow supports both DeviceType::kGPU and DeviceType::kDLA.

kSAFETY 

Safety: TensorRT flow with restrictions targeting the safety runtime. See safety documentation for list of supported layers and formats. This flow supports only DeviceType::kGPU.

This flag is only supported in NVIDIA Drive(R) products.

kDLA_STANDALONE 

DLA Standalone: TensorRT flow with restrictions targeting DLA runtimes that are external to TensorRT. See DLA documentation for list of supported layers and formats. This flow supports only DeviceType::kDLA.

◆ ErrorCode

enum class nvinfer1::ErrorCode : int32_t
strong

Error codes that can be returned by TensorRT during execution.

Enumerator
kSUCCESS 

Execution completed successfully.

kUNSPECIFIED_ERROR 

An error that does not fall into any other category. This error is included for forward compatibility.

kINTERNAL_ERROR 

A non-recoverable TensorRT error occurred. TensorRT is in an invalid internal state when this error is emitted and any further calls to TensorRT will result in undefined behavior.

kINVALID_ARGUMENT 

An argument passed to the function is invalid in isolation. This is a violation of the API contract.

kINVALID_CONFIG 

An error occurred when comparing the state of an argument relative to other arguments. For example, the dimensions for concat differ between two tensors outside of the channel dimension. This error is triggered when an argument is correct in isolation, but not relative to other arguments. This helps distinguish simple errors from more complex errors. This is a violation of the API contract.

kFAILED_ALLOCATION 

An error occurred when performing an allocation of memory on the host or the device. A memory allocation error is normally fatal, but in the case where the application provided its own memory allocation routine, it is possible to increase the pool of available memory and resume execution.

kFAILED_INITIALIZATION 

One, or more, of the components that TensorRT relies on did not initialize correctly. This is a system setup issue.

kFAILED_EXECUTION 

An error occurred during execution that caused TensorRT to end prematurely, either an asynchronous error, user cancellation, or other execution errors reported by CUDA/DLA. In a dynamic system, the data can be thrown away and the next frame can be processed or execution can be retried. This is either an execution error or a memory error.

kFAILED_COMPUTATION 

An error occurred during execution that caused the data to become corrupted, but execution finished. Examples of this error are NaN squashing or integer overflow. In a dynamic system, the data can be thrown away and the next frame can be processed or execution can be retried. This is either a data corruption error, an input error, or a range error. This is not used in safety but may be used in standard.

kINVALID_STATE 

TensorRT was put into a bad state by an incorrect sequence of function calls. An example of an invalid state is specifying a layer to be DLA only without GPU fallback, and that layer is not supported by DLA. This can occur in situations where a service is optimistically executing networks for multiple different configurations without checking proper error configurations, and instead throwing away bad configurations caught by TensorRT. This is a violation of the API contract, but can be recoverable.

Example of a recovery: GPU fallback is disabled and a conv layer with a large filter (63x63) is specified to run on DLA. This will fail due to DLA not supporting the large kernel size. This can be recovered by either turning on GPU fallback or setting the layer to run on the GPU.

kUNSUPPORTED_STATE 

An error occurred due to the network not being supported on the device due to constraints of the hardware or system. An example is running an unsafe layer in a safety certified context, or a resource requirement for the current network is greater than the capabilities of the target device. The network is otherwise correct, but the network and hardware combination is problematic. This can be recoverable. Examples:

  • Scratch space requests larger than available device memory and can be recovered by increasing allowed workspace size.
  • Tensor size exceeds the maximum element count and can be recovered by reducing the maximum batch size.

◆ ExecutionContextAllocationStrategy

enum class nvinfer1::ExecutionContextAllocationStrategy : int32_t
strong

Different memory allocation behaviors for IExecutionContext.

IExecutionContext requires a block of device memory for internal activation tensors during inference. The user can either let the execution context manage the memory in various ways or allocate the memory themselves.

See also
ICudaEngine::createExecutionContext()
IExecutionContext::setDeviceMemory()
Enumerator
kSTATIC 

Default static allocation with the maximum size across all profiles.

kON_PROFILE_CHANGE 

Reallocate for a profile when it's selected.

kUSER_MANAGED 

The user supplies custom allocation to the execution context.

◆ FillOperation

enum class nvinfer1::FillOperation : int32_t
strong

Enumerates the tensor fill operations that may be performed by a fill layer.

See also
IFillLayer
Enumerator
kLINSPACE 

Compute each value via an affine function of its indices. For example, suppose the parameters for the IFillLayer are:

  • Dimensions = [3,4]
  • Alpha = 1
  • Beta = [100,10]

Element [i,j] of the output is Alpha + Beta[0]*i + Beta[1]*j. Thus the output matrix is:

  1  11  21  31
101 111 121 131
201 211 221 231

A static beta b is implicitly a 1D tensor, i.e. Beta = [b].

kRANDOM_UNIFORM 

Randomly draw values from a uniform distribution.

kRANDOM_NORMAL 

Randomly draw values from a normal distribution.
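
The kLINSPACE rule above, where element [i,j] is Alpha + Beta[0]*i + Beta[1]*j, can be sketched for the two-dimensional case as follows. The function names are hypothetical and integer arithmetic is used only to keep the example exact.

```cpp
#include <cstddef>
#include <vector>

// Value of output element [i, j] for a 2D kLINSPACE fill.
constexpr int linspaceElement(int alpha, int beta0, int beta1, int i, int j)
{
    return alpha + beta0 * i + beta1 * j;
}

// Materialize the full rows x cols output as nested vectors.
inline std::vector<std::vector<int>> linspaceFill(int rows, int cols, int alpha, int beta0, int beta1)
{
    std::vector<std::vector<int>> out(static_cast<std::size_t>(rows),
                                      std::vector<int>(static_cast<std::size_t>(cols)));
    for (int i = 0; i < rows; ++i)
    {
        for (int j = 0; j < cols; ++j)
        {
            out[static_cast<std::size_t>(i)][static_cast<std::size_t>(j)]
                = linspaceElement(alpha, beta0, beta1, i, j);
        }
    }
    return out;
}
```

Calling linspaceFill(3, 4, 1, 100, 10) reproduces the example with Dimensions = [3,4], Alpha = 1, and Beta = [100,10].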

◆ GatherMode

enum class nvinfer1::GatherMode : int32_t
strong

Control form of IGatherLayer.

See also
IGatherLayer
Enumerator
kDEFAULT 

Similar to ONNX Gather.

kELEMENT 

Similar to ONNX GatherElements.

kND 

Similar to ONNX GatherND.

◆ HardwareCompatibilityLevel

enum class nvinfer1::HardwareCompatibilityLevel : int32_t
strong

Describes requirements of compatibility with GPU architectures other than that of the GPU on which the engine was built.

Levels except kNONE are only supported for engines built on NVIDIA Ampere and later GPUs.

Warning
Note that compatibility with future hardware depends on CUDA forward compatibility support.
Enumerator
kNONE 

Do not require hardware compatibility with GPU architectures other than that of the GPU on which the engine was built.

kAMPERE_PLUS 

Require that the engine is compatible with Ampere and newer GPUs. This will limit the max shared memory usage to 48KiB, may reduce the number of available tactics for each layer, and may prevent some fusions from occurring. Thus this can decrease the performance, especially for tf32 models. This option will disable cuDNN, cuBLAS, and cuBLAS LT as tactic sources.

◆ InterpolationMode

enum class nvinfer1::InterpolationMode : int32_t
strong

Enumerates various modes of interpolation.

Enumerator
kNEAREST 

ND (0 < N <= 8) nearest neighbor resizing.

kLINEAR 

Supports linear (1D), bilinear (2D), and trilinear (3D) interpolation.

kCUBIC 

Supports bicubic (2D) interpolation.

◆ LayerInformationFormat

enum class nvinfer1::LayerInformationFormat : int32_t
strong

The format in which the IEngineInspector prints the layer information.

See also
IEngineInspector::getLayerInformation(), IEngineInspector::getEngineInformation()
Enumerator
kONELINE 

Print layer information in one line per layer.

kJSON 

Print layer information in JSON format.

◆ LayerType

enum class nvinfer1::LayerType : int32_t
strong

The type values of layer classes.

See also
ILayer::getType()
Enumerator
kCONVOLUTION 

Convolution layer.

kCAST 

Cast layer.

kACTIVATION 

Activation layer.

kPOOLING 

Pooling layer.

kLRN 

LRN layer.

kSCALE 

Scale layer.

kSOFTMAX 

SoftMax layer.

kDECONVOLUTION 

Deconvolution layer.

kCONCATENATION 

Concatenation layer.

kELEMENTWISE 

Elementwise layer.

kPLUGIN 

Plugin layer.

kUNARY 

UnaryOp operation Layer.

kPADDING 

Padding layer.

kSHUFFLE 

Shuffle layer.

kREDUCE 

Reduce layer.

kTOPK 

TopK layer.

kGATHER 

Gather layer.

kMATRIX_MULTIPLY 

Matrix multiply layer.

kRAGGED_SOFTMAX 

Ragged softmax layer.

kCONSTANT 

Constant layer.

kIDENTITY 

Identity layer.

kPLUGIN_V2 

PluginV2 layer.

kSLICE 

Slice layer.

kSHAPE 

Shape layer.

kPARAMETRIC_RELU 

Parametric ReLU layer.

kRESIZE 

Resize Layer.

kTRIP_LIMIT 

Loop Trip limit layer.

kRECURRENCE 

Loop Recurrence layer.

kITERATOR 

Loop Iterator layer.

kLOOP_OUTPUT 

Loop output layer.

kSELECT 

Select layer.

kFILL 

Fill layer.

kQUANTIZE 

Quantize layer.

kDEQUANTIZE 

Dequantize layer.

kCONDITION 

Condition layer.

kCONDITIONAL_INPUT 

Conditional Input layer.

kCONDITIONAL_OUTPUT 

Conditional Output layer.

kSCATTER 

Scatter layer.

kEINSUM 

Einsum layer.

kASSERTION 

Assertion layer.

kONE_HOT 

OneHot layer.

kNON_ZERO 

NonZero layer.

kGRID_SAMPLE 

Grid sample layer.

kNMS 

NMS layer.

kREVERSE_SEQUENCE 

Reverse sequence layer.

kNORMALIZATION 

Normalization layer.

kPLUGIN_V3 

PluginV3 layer.

◆ LoopOutput

enum class nvinfer1::LoopOutput : int32_t
strong
Enumerator
kLAST_VALUE 

Output value is value of tensor for last iteration.

kCONCATENATE 

Output value is concatenation of values of tensor for each iteration, in forward order.

kREVERSE 

Output value is concatenation of values of tensor for each iteration, in reverse order.

◆ MatrixOperation

enum class nvinfer1::MatrixOperation : int32_t
strong

Enumerates the operations that may be performed on a tensor by IMatrixMultiplyLayer before multiplication.

Enumerator
kNONE 

Treat x as a matrix if it has two dimensions, or as a collection of matrices if x has more than two dimensions, where the last two dimensions are the matrix dimensions. x must have at least two dimensions.

kTRANSPOSE 

Like kNONE, but transpose the matrix dimensions.

kVECTOR 

Treat x as a vector if it has one dimension, or as a collection of vectors if x has more than one dimension. x must have at least one dimension.

The first input tensor with dimensions [M,K] used with MatrixOperation::kVECTOR is equivalent to a tensor with dimensions [M, 1, K] with MatrixOperation::kNONE, i.e. is treated as M row vectors of length K, or dimensions [M, K, 1] with MatrixOperation::kTRANSPOSE.

The second input tensor with dimensions [M,K] used with MatrixOperation::kVECTOR is equivalent to a tensor with dimensions [M, K, 1] with MatrixOperation::kNONE, i.e. is treated as M column vectors of length K, or dimensions [M, 1, K] with MatrixOperation::kTRANSPOSE.

◆ MemoryPoolType

enum class nvinfer1::MemoryPoolType : int32_t
strong

The type for memory pools used by TensorRT.

See also
IBuilderConfig::setMemoryPoolLimit, IBuilderConfig::getMemoryPoolLimit
Enumerator
kWORKSPACE 

kWORKSPACE is used by TensorRT to store intermediate buffers within an operation. This defaults to max device memory. Set to a smaller value to restrict tactics that use over the threshold en masse. For more targeted removal of tactics use the IAlgorithmSelector interface.

kDLA_MANAGED_SRAM 

kDLA_MANAGED_SRAM is a fast software managed RAM used by DLA to communicate within a layer. The size of this pool must be at least 4 KiB and must be a power of 2. This defaults to 1 MiB. Orin has capacity of 1 MiB per core.

kDLA_LOCAL_DRAM 

kDLA_LOCAL_DRAM is host RAM used by DLA to share intermediate tensor data across operations. The size of this pool must be at least 4 KiB and must be a power of 2. This defaults to 1 GiB.

kDLA_GLOBAL_DRAM 

kDLA_GLOBAL_DRAM is host RAM used by DLA to store weights and metadata for execution. The size of this pool must be at least 4 KiB and must be a power of 2. This defaults to 512 MiB.

kTACTIC_DRAM 

kTACTIC_DRAM is the device DRAM used by the optimizer to run tactics. On embedded devices, where host and device memory are unified, this includes all host memory required by TensorRT to build the network up to the point of each memory allocation. This defaults to 75% of totalGlobalMem as reported by cudaGetDeviceProperties when cudaGetDeviceProperties.embedded is true, and 100% otherwise.

kTACTIC_SHARED_MEMORY 

kTACTIC_SHARED_MEMORY defines the maximum shared memory size utilized for executing the backend CUDA kernel implementation. Adjust this value to restrict tactics that exceed the specified threshold en masse. The default value is device max capability. This value must be less than 1GiB.

Updating this flag will override the shared memory limit set by HardwareCompatibilityLevel, which defaults to 48KiB.
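
The size constraint stated for the three DLA pools (at least 4 KiB and a power of 2) can be checked with a small helper before the limit is passed on. The helper name is hypothetical and is not part of the TensorRT API.

```cpp
#include <cstdint>

// Returns true when `bytes` satisfies the documented DLA pool constraint:
// at least 4 KiB and a power of two.
constexpr bool isValidDlaPoolSize(std::int64_t bytes)
{
    return bytes >= 4 * 1024 && (bytes & (bytes - 1)) == 0;
}
```

For example, the documented default of 1 MiB for kDLA_MANAGED_SRAM passes the check, while 3 MiB does not because it is not a power of 2.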

◆ NetworkDefinitionCreationFlag

enum class nvinfer1::NetworkDefinitionCreationFlag : int32_t
strong

List of immutable network properties expressed at network creation time. NetworkDefinitionCreationFlag is used with createNetworkV2() to specify immutable properties of the network.

See also
IBuilder::createNetworkV2
Enumerator
kEXPLICIT_BATCH 

Ignored because networks are always "explicit batch" in TensorRT 10.0.

Deprecated:
Deprecated in TensorRT 10.0.
kSTRONGLY_TYPED 

Mark the network to be strongly typed. Every tensor in the network has a data type defined in the network following only type inference rules and the inputs/operator annotations. Setting layer precision and layer output types is not allowed, and the network output types will be inferred based on the input types and the type inference rules.

◆ OptProfileSelector

enum class nvinfer1::OptProfileSelector : int32_t
strong

When setting or querying optimization profile parameters (such as shape tensor inputs or dynamic dimensions), select whether we are interested in the minimum, optimum, or maximum values for these parameters. The minimum and maximum specify the permitted range that is supported at runtime, while the optimum value is used for the kernel selection. This should be the "typical" value that is expected to occur at runtime.

See also
IOptimizationProfile::setDimensions(), IOptimizationProfile::setShapeValues()
Enumerator
kMIN 

This is used to set or get the minimum permitted value for dynamic dimensions etc.

kOPT 

This is used to set or get the value that is used in the optimization (kernel selection).

kMAX 

This is used to set or get the maximum permitted value for dynamic dimensions etc.

◆ PaddingMode

enum class nvinfer1::PaddingMode : int32_t
strong

Enumerates the modes of padding to perform in convolution, deconvolution and pooling layers. The padding mode takes precedence if setPaddingMode() and setPrePadding() are both used.

There are two padding styles, EXPLICIT and SAME, each with two variants. The EXPLICIT style determines whether the final sampling location is used. The SAME style determines whether the asymmetry in the padding is on the pre or post padding.

Shorthand:
I = dimensions of input image.
B = prePadding, before the image data. For deconvolution, prePadding is set before output.
A = postPadding, after the image data. For deconvolution, postPadding is set after output.
P = delta between input and output
S = stride
F = filter
O = output
D = dilation
M = I + B + A ; The image data plus any padding
DK = 1 + D * (F - 1)

Formulas for Convolution:

  • EXPLICIT_ROUND_DOWN:
    O = floor((M - DK) / S) + 1
  • EXPLICIT_ROUND_UP:
    O = ceil((M - DK) / S) + 1
  • SAME_UPPER:
    O = ceil(I / S)
    P = floor((I - 1) / S) * S + DK - I;
    B = floor(P / 2)
    A = P - B
  • SAME_LOWER:
    O = ceil(I / S)
    P = floor((I - 1) / S) * S + DK - I;
    A = floor(P / 2)
    B = P - A

Formulas for Deconvolution:

  • EXPLICIT_ROUND_DOWN:
  • EXPLICIT_ROUND_UP:
    O = (I - 1) * S + DK - (B + A)
  • SAME_UPPER:
    O = min(I * S, (I - 1) * S + DK)
    P = max(DK - S, 0)
    B = floor(P / 2)
    A = P - B
  • SAME_LOWER:
    O = min(I * S, (I - 1) * S + DK)
    P = max(DK - S, 0)
    A = floor(P / 2)
    B = P - A

Formulas for Pooling:

  • EXPLICIT_ROUND_DOWN:
    O = floor((M - F) / S) + 1
  • EXPLICIT_ROUND_UP:
    O = ceil((M - F) / S) + 1
  • SAME_UPPER:
    O = ceil(I / S)
    P = floor((I - 1) / S) * S + F - I;
    B = floor(P / 2)
    A = P - B
  • SAME_LOWER:
    O = ceil(I / S)
    P = floor((I - 1) / S) * S + F - I;
    A = floor(P / 2)
    B = P - A

Pooling Example 1:

Given I = {6, 6}, B = {3, 3}, A = {2, 2}, S = {2, 2}, F = {3, 3}. What is O?
(B, A can be calculated for SAME_UPPER and SAME_LOWER mode)
  • EXPLICIT_ROUND_DOWN:
    Computation:
    M = {6, 6} + {3, 3} + {2, 2} ==> {11, 11}
    O ==> floor((M - F) / S) + 1
    ==> floor(({11, 11} - {3, 3}) / {2, 2}) + {1, 1}
    ==> floor({8, 8} / {2, 2}) + {1, 1}
    ==> {5, 5}
  • EXPLICIT_ROUND_UP:
    Computation:
    M = {6, 6} + {3, 3} + {2, 2} ==> {11, 11}
    O ==> ceil((M - F) / S) + 1
    ==> ceil(({11, 11} - {3, 3}) / {2, 2}) + {1, 1}
    ==> ceil({8, 8} / {2, 2}) + {1, 1}
    ==> {5, 5}
    The sample points are {0, 2, 4, 6, 8} in each dimension.
  • SAME_UPPER:
    Computation:
    I = {6, 6}
    S = {2, 2}
    O = ceil(I / S) = {3, 3}
    P = floor((I - 1) / S) * S + F - I
    ==> floor(({6, 6} - {1, 1}) / {2, 2}) * {2, 2} + {3, 3} - {6, 6}
    ==> {4, 4} + {3, 3} - {6, 6}
    ==> {1, 1}
    B = floor({1, 1} / {2, 2})
    ==> {0, 0}
    A = {1, 1} - {0, 0}
    ==> {1, 1}
  • SAME_LOWER:
    Computation:
    I = {6, 6}
    S = {2, 2}
    O = ceil(I / S) = {3, 3}
    P = floor((I - 1) / S) * S + F - I
    ==> {1, 1}
    A = floor({1, 1} / {2, 2})
    ==> {0, 0}
    B = {1, 1} - {0, 0}
    ==> {1, 1}
    The sample points are {0, 2, 4} in each dimension. SAME_UPPER has {O0, O1, O2, pad} in output in each dimension. SAME_LOWER has {pad, O0, O1, O2} in output in each dimension.

Pooling Example 2:

Given I = {6, 6}, B = {3, 3}, A = {3, 3}, S = {2, 2}, F = {3, 3}. What is O?
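
The pooling formulas above can be evaluated per dimension with a few integer helpers. This is a sketch under stated assumptions: the function and struct names are hypothetical, the single-letter parameters follow the shorthand (I, B, A, S, F, O, P), and the division helpers assume non-negative numerators, so the SAME helper assumes P >= 0.

```cpp
// Integer floor and ceiling division; valid for a >= 0 and b > 0.
constexpr int floorDiv(int a, int b) { return a / b; }
constexpr int ceilDiv(int a, int b) { return (a + b - 1) / b; }

// Output size for EXPLICIT_ROUND_DOWN (roundUp = false) or
// EXPLICIT_ROUND_UP (roundUp = true): O = round((M - F) / S) + 1.
constexpr int poolOutExplicit(int I, int B, int A, int S, int F, bool roundUp)
{
    const int M = I + B + A; // image data plus any padding
    return (roundUp ? ceilDiv(M - F, S) : floorDiv(M - F, S)) + 1;
}

// Hypothetical result type for the SAME modes.
struct SamePadding { int O, P, B, A; };

// SAME_UPPER (upper = true) puts the extra padding after the data,
// SAME_LOWER (upper = false) puts it before.
constexpr SamePadding poolSame(int I, int S, int F, bool upper)
{
    const int O = ceilDiv(I, S);
    const int P = floorDiv(I - 1, S) * S + F - I; // assumed non-negative here
    const int half = floorDiv(P, 2);
    return upper ? SamePadding{O, P, half, P - half} : SamePadding{O, P, P - half, half};
}
```

With the values of Pooling Example 1, poolOutExplicit(6, 3, 2, 2, 3, false) and poolOutExplicit(6, 3, 2, 2, 3, true) both give 5. With the values of Pooling Example 2 (A = 3), M is 12, so the same formulas give 5 for EXPLICIT_ROUND_DOWN and 6 for EXPLICIT_ROUND_UP.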
Enumerator
kEXPLICIT_ROUND_DOWN 

Use explicit padding, rounding output size down.

kEXPLICIT_ROUND_UP 

Use explicit padding, rounding output size up.

kSAME_UPPER 

Use SAME padding, with prePadding <= postPadding.

kSAME_LOWER 

Use SAME padding, with prePadding >= postPadding.

◆ PluginCapabilityType

enum class nvinfer1::PluginCapabilityType : int32_t
strong

Enumerates the different capability types an IPluginV3 object may have.

Enumerator
kCORE 

Core capability. Every IPluginV3 object must have this.

kBUILD 

Build capability. IPluginV3 objects provided to TensorRT build phase must have this.

kRUNTIME 

Runtime capability. IPluginV3 objects provided to TensorRT build and execution phases must have this.

◆ PluginCreatorVersion

enum class nvinfer1::PluginCreatorVersion : int32_t
strong

Enum to identify version of the plugin creator.

Enumerator
kV1 

IPluginCreator.

kV1_PYTHON 

IPluginCreator-based Python plugin creators.

◆ PluginFieldType

enum class nvinfer1::PluginFieldType : int32_t
strong

The possible field types for custom layer.

Enumerator
kFLOAT16 

FP16 field type.

kFLOAT32 

FP32 field type.

kFLOAT64 

FP64 field type.

kINT8 

INT8 field type.

kINT16 

INT16 field type.

kINT32 

INT32 field type.

kCHAR 

char field type.

kDIMS 

nvinfer1::Dims field type.

kUNKNOWN 

Unknown field type.

kBF16 

BF16 field type.

kINT64 

INT64 field type.

kFP8 

FP8 field type.

◆ PluginVersion

enum class nvinfer1::PluginVersion : uint8_t
strong
Enumerator
kV2 

IPluginV2.

kV2_EXT 

IPluginV2Ext.

kV2_IOEXT 

IPluginV2IOExt.

kV2_DYNAMICEXT 

IPluginV2DynamicExt.

kV2_DYNAMICEXT_PYTHON 

IPluginV2DynamicExt-based Python plugins.

◆ PoolingType

enum class nvinfer1::PoolingType : int32_t
strong

The type of pooling to perform in a pooling layer.

Enumerator
kMAX 

Maximum over elements.

kAVERAGE 

Average over elements. If the tensor is padded, the count includes the padding.

kMAX_AVERAGE_BLEND 

Blending between max and average pooling: (1-blendFactor)*maxPool + blendFactor*avgPool.

◆ PreviewFeature

enum class nvinfer1::PreviewFeature : int32_t
strong

Define preview features.

Preview Features have been fully tested but are not yet as stable as other features in TensorRT. They are provided as opt-in features for at least one release.

Enumerator
kPROFILE_SHARING_0806 

Allows optimization profiles to be shared across execution contexts.

Deprecated:
Deprecated in TensorRT 10.0. The default value for this flag is on and cannot be changed.

◆ ProfilingVerbosity

enum class nvinfer1::ProfilingVerbosity : int32_t
strong

List of verbosity levels of layer information exposed in NVTX annotations and in IEngineInspector.

See also
IBuilderConfig::setProfilingVerbosity(), IBuilderConfig::getProfilingVerbosity(), IEngineInspector
Enumerator
kLAYER_NAMES_ONLY 

Print only the layer names. This is the default setting.

kNONE 

Do not print any layer information.

kDETAILED 

Print detailed layer information including layer names and layer parameters.

◆ QuantizationFlag

enum class nvinfer1::QuantizationFlag : int32_t
strong

List of valid flags for quantizing the network to int8.

See also
IBuilderConfig::setQuantizationFlag(), IBuilderConfig::getQuantizationFlag()
Enumerator
kCALIBRATE_BEFORE_FUSION 

Run int8 calibration pass before layer fusion. Only valid for IInt8LegacyCalibrator and IInt8EntropyCalibrator. The builder always runs the int8 calibration pass before layer fusion for IInt8MinMaxCalibrator and IInt8EntropyCalibrator2. Disabled by default.

◆ ReduceOperation

enum class nvinfer1::ReduceOperation : int32_t
strong

Enumerates the reduce operations that may be performed by a Reduce layer.

The table shows the result of reducing across an empty volume of a given type.

Operation   kFLOAT and kHALF    kINT32    kINT8
kSUM        0                   0         0
kPROD       1                   1         1
kMAX        negative infinity   INT_MIN   -128
kMIN        positive infinity   INT_MAX   127
kAVG        NaN                 0         -128

The current version of TensorRT usually performs reduction for kINT8 via kFLOAT or kHALF. The kINT8 values show the quantized representations of the floating-point values.

Enumerator
kSUM 
kPROD 
kMAX 
kMIN 
kAVG 
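
The floating-point column of the table above follows from starting each reduction at its identity value. The sketch below illustrates this for float data only; ReduceOp and reduceFloat are hypothetical names, and the NaN result for an empty kAVG is returned explicitly rather than produced by dividing by zero.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical mirror of the enumerators above, for illustration only.
enum class ReduceOp { kSUM, kPROD, kMAX, kMIN, kAVG };

// Reduce `count` floats starting from the identity of the operation, so an
// empty input yields the values listed in the kFLOAT column of the table.
constexpr float reduceFloat(ReduceOp op, float const* data, std::size_t count)
{
    float acc = 0.0F; // identity for kSUM and kAVG
    if (op == ReduceOp::kPROD) { acc = 1.0F; }
    if (op == ReduceOp::kMAX) { acc = -std::numeric_limits<float>::infinity(); }
    if (op == ReduceOp::kMIN) { acc = std::numeric_limits<float>::infinity(); }

    for (std::size_t i = 0; i < count; ++i)
    {
        const float v = data[i];
        switch (op)
        {
        case ReduceOp::kSUM:
        case ReduceOp::kAVG: acc += v; break;
        case ReduceOp::kPROD: acc *= v; break;
        case ReduceOp::kMAX: acc = v > acc ? v : acc; break;
        case ReduceOp::kMIN: acc = v < acc ? v : acc; break;
        }
    }

    if (op == ReduceOp::kAVG)
    {
        return count == 0 ? std::numeric_limits<float>::quiet_NaN()
                          : acc / static_cast<float>(count);
    }
    return acc;
}
```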

◆ ResizeCoordinateTransformation

enum class nvinfer1::ResizeCoordinateTransformation : int32_t
strong

The resize coordinate transformation function.

See also
IResizeLayer::setCoordinateTransformation()
Enumerator
kALIGN_CORNERS 

Think of each value in the tensor as a unit volume, and the coordinate is a point inside this volume. The coordinate point is drawn as a star (*) in the diagram below, and a range of multiple values has a length. Define x_origin as the coordinate of axis x in the input tensor, x_resized as the coordinate of axis x in the output tensor, length_origin as length of the input tensor in axis x, and length_resize as length of the output tensor in axis x.

|<--------------length---------->|
|    0     |    1     |    2     |    3     |
*          *          *          *

x_origin = x_resized * (length_origin - 1) / (length_resize - 1)
kASYMMETRIC 

|<--------------length--------------------->|
|    0     |    1     |    2     |    3     |
*          *          *          *

x_origin = x_resized * (length_origin / length_resize)

kHALF_PIXEL 

|<--------------length--------------------->|
|    0     |    1     |    2     |    3     |
     *          *          *          *

x_origin = (x_resized + 0.5) * (length_origin / length_resize) - 0.5
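
The three formulas can be compared side by side with a small helper. This is a sketch in plain C++, not TensorRT code: `CoordTransformSketch` and `toOrigin` are hypothetical names, and the formulas are copied from the enumerator descriptions above.

```cpp
#include <cassert>

// Hypothetical stand-in for nvinfer1::ResizeCoordinateTransformation.
enum class CoordTransformSketch { kALIGN_CORNERS, kASYMMETRIC, kHALF_PIXEL };

// Map an output coordinate back to the input axis.
double toOrigin(CoordTransformSketch t, double xResized, double lengthOrigin, double lengthResize)
{
    switch (t)
    {
    case CoordTransformSketch::kALIGN_CORNERS:
        // The formula divides by (length_resize - 1), so a single-pixel output needs separate handling.
        assert(lengthResize > 1.0);
        return xResized * (lengthOrigin - 1.0) / (lengthResize - 1.0);
    case CoordTransformSketch::kASYMMETRIC:
        return xResized * (lengthOrigin / lengthResize);
    case CoordTransformSketch::kHALF_PIXEL:
        return (xResized + 0.5) * (lengthOrigin / lengthResize) - 0.5;
    }
    return 0.0;
}
```

For an axis resized from length 4 to length 8, the last output coordinate (7) maps to 3.0 with kALIGN_CORNERS, 3.5 with kASYMMETRIC, and 3.25 with kHALF_PIXEL.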

◆ ResizeRoundMode

enum class nvinfer1::ResizeRoundMode : int32_t
strong

The rounding mode for nearest neighbor resize.

See also
IResizeLayer::setNearestRounding()
Enumerator
kHALF_UP 

Round half up.

kHALF_DOWN 

Round half down.

kFLOOR 

Round to floor.

kCEIL 

Round to ceil.
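
As a rough illustration of how the four modes differ, the sketch below rounds a fractional source coordinate to an integer index. It assumes "half up" resolves ties toward the larger index and "half down" toward the smaller one; `RoundModeSketch` and `roundCoord` are hypothetical names, not TensorRT API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for nvinfer1::ResizeRoundMode.
enum class RoundModeSketch { kHALF_UP, kHALF_DOWN, kFLOOR, kCEIL };

// Round a fractional coordinate to the nearest-neighbor index.
int roundCoord(RoundModeSketch mode, double x)
{
    switch (mode)
    {
    case RoundModeSketch::kHALF_UP: return static_cast<int>(std::floor(x + 0.5));  // ties go up
    case RoundModeSketch::kHALF_DOWN: return static_cast<int>(std::ceil(x - 0.5)); // ties go down
    case RoundModeSketch::kFLOOR: return static_cast<int>(std::floor(x));
    case RoundModeSketch::kCEIL: return static_cast<int>(std::ceil(x));
    }
    return 0;
}
```

Only ties distinguish the first two modes: a coordinate of 1.5 becomes 2 under kHALF_UP and 1 under kHALF_DOWN, while 1.25 becomes 1 under both.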

◆ ResizeSelector

enum class nvinfer1::ResizeSelector : int32_t
strong

The coordinate selector used when resizing to a single-pixel output.

See also
IResizeLayer::setSelectorForSinglePixel()
Enumerator
kFORMULA 

Use formula to map the original index.

kUPPER 

Select the upper left pixel.

◆ SampleMode

enum class nvinfer1::SampleMode : int32_t
strong

Controls how ISliceLayer and IGridSample handle out-of-bounds coordinates.

See also
ISliceLayer and IGridSample
Enumerator
kSTRICT_BOUNDS 

Fail with error when the coordinates are out of bounds.

kWRAP 

Coordinates wrap around periodically.

kCLAMP 

Out of bounds indices are clamped to bounds.

kFILL 

Use fill input value when coordinates are out of bounds.

kREFLECT 

Coordinates reflect. The axis of reflection is the middle of the perimeter pixel and the reflections are repeated indefinitely within the padded regions. Repeats values for a single pixel and throws error for zero pixels.
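
The index-remapping modes can be sketched for a single axis of length n. This is one reading of the descriptions above, not TensorRT's implementation: kREFLECT is modeled as mirroring about the center of the border pixel (so the border value is not duplicated), and kSTRICT_BOUNDS and kFILL are omitted because they do not remap indices. `SampleModeSketch` and `mapIndex` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the remapping subset of nvinfer1::SampleMode.
enum class SampleModeSketch { kWRAP, kCLAMP, kREFLECT };

// Map a possibly out-of-bounds index i onto the valid range [0, n).
int64_t mapIndex(SampleModeSketch mode, int64_t i, int64_t n)
{
    assert(n > 0); // zero pixels is an error case
    switch (mode)
    {
    case SampleModeSketch::kWRAP:
        return ((i % n) + n) % n; // periodic, also for negative i
    case SampleModeSketch::kCLAMP:
        return i < 0 ? 0 : (i >= n ? n - 1 : i);
    case SampleModeSketch::kREFLECT:
    {
        if (n == 1) { return 0; } // a single pixel is repeated
        int64_t const period = 2 * (n - 1);
        int64_t const r = ((i % period) + period) % period;
        return r < n ? r : period - r; // mirror about the border pixel center
    }
    }
    return 0;
}
```

With n = 4, indices 4, 5, 6 map to 2, 1, 0 under kREFLECT, to 0, 1, 2 under kWRAP, and all to 3 under kCLAMP.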

◆ ScaleMode

enum class nvinfer1::ScaleMode : int32_t
strong

Controls how shift, scale and power are applied in a Scale layer.

See also
IScaleLayer
Enumerator
kUNIFORM 

Identical coefficients across all elements of the tensor.

kCHANNEL 

Per-channel coefficients.

kELEMENTWISE 

Elementwise coefficients.
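
The mode only decides which coefficient each element uses. The sketch below assumes the Scale layer computes (input * scale + shift)^power, as described for IScaleLayer, on a tensor stored as [channels][spatial]; `ScaleModeSketch` and `applyScale` are hypothetical names, not TensorRT API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for nvinfer1::ScaleMode.
enum class ScaleModeSketch { kUNIFORM, kCHANNEL, kELEMENTWISE };

// Apply (x * scale + shift)^power; coefficient arrays hold 1, C, or C*spatial values.
std::vector<float> applyScale(ScaleModeSketch mode, std::vector<float> const& input, std::size_t spatial,
    std::vector<float> const& scale, std::vector<float> const& shift, std::vector<float> const& power)
{
    assert(spatial > 0 && input.size() % spatial == 0);
    std::vector<float> output(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        std::size_t k = 0; // kUNIFORM: one coefficient for the whole tensor
        if (mode == ScaleModeSketch::kCHANNEL) { k = i / spatial; } // one per channel
        if (mode == ScaleModeSketch::kELEMENTWISE) { k = i; }        // one per element
        output[i] = std::pow(input[i] * scale[k] + shift[k], power[k]);
    }
    return output;
}
```

For a two-channel input {1, 2, 3, 4} with spatial size 2, kCHANNEL coefficients scale = {2, 3}, shift = {0, 1}, power = {1, 1} scale the first channel by 2 and map the second channel through 3x + 1.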

◆ ScatterMode

enum class nvinfer1::ScatterMode : int32_t
strong

Controls the form of IScatterLayer.

See also
IScatterLayer
Enumerator
kELEMENT 

Similar to ONNX ScatterElements.

kND 

Similar to ONNX ScatterND.

◆ SerializationFlag

enum class nvinfer1::SerializationFlag : int32_t
strong

List of valid flags that the engine can enable when serializing the bytes.

See also
ISerializationConfig::setFlags(), ISerializationConfig::getFlags()
Enumerator
kEXCLUDE_WEIGHTS 

Exclude the weights that can be refitted.

kEXCLUDE_LEAN_RUNTIME 

Exclude the lean runtime.

◆ TacticSource

enum class nvinfer1::TacticSource : int32_t
strong

List of tactic sources for TensorRT.

See also
TacticSources, IBuilderConfig::setTacticSources(), IBuilderConfig::getTacticSources()
Enumerator
kCUBLAS 

cuBLAS tactics. Disabled by default.

Note
Disabling kCUBLAS will cause the cuBLAS handle passed to plugins in attachToContext to be null.
Deprecated:
Deprecated in TensorRT 10.0.
kCUBLAS_LT 

cuBLAS LT tactics. Enabled by default.

Deprecated:
Deprecated in TensorRT 9.0.
kCUDNN 

cuDNN tactics. Disabled by default.

Note
Disabling kCUDNN will cause the cuDNN handle passed to plugins in attachToContext to be null.
Deprecated:
Deprecated in TensorRT 10.0.
kEDGE_MASK_CONVOLUTIONS 

Enables convolution tactics implemented with edge mask tables. These tactics trade memory for performance by consuming additional memory space proportional to the input size. Enabled by default.

kJIT_CONVOLUTIONS 

Enables convolution tactics implemented with source-code JIT fusion. The engine building time may increase when this is enabled. Enabled by default.
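
Tactic sources are combined into a TacticSources bitmask before being passed to IBuilderConfig::setTacticSources(). The sketch below shows the bit manipulation only, under the assumption that bit i of the mask corresponds to the enumerator with value i and that the enumerator values follow the listing order above. All names ending in `Sketch`, and the helper functions, are hypothetical stand-ins rather than TensorRT declarations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of nvinfer1::TacticSource; values assumed to follow the listing order.
enum class TacticSourceSketch : int32_t
{
    kCUBLAS = 0,
    kCUBLAS_LT = 1,
    kCUDNN = 2,
    kEDGE_MASK_CONVOLUTIONS = 3,
    kJIT_CONVOLUTIONS = 4
};

// Assumed to match TacticSources: an unsigned bitmask of tactic sources.
using TacticSourcesSketch = uint32_t;

constexpr TacticSourcesSketch bitOf(TacticSourceSketch s) noexcept
{
    return 1U << static_cast<uint32_t>(s);
}

constexpr TacticSourcesSketch enable(TacticSourcesSketch mask, TacticSourceSketch s) noexcept
{
    return mask | bitOf(s);
}

constexpr TacticSourcesSketch disable(TacticSourcesSketch mask, TacticSourceSketch s) noexcept
{
    return mask & ~bitOf(s);
}

constexpr bool isEnabled(TacticSourcesSketch mask, TacticSourceSketch s) noexcept
{
    return (mask & bitOf(s)) != 0U;
}
```

In a real build, a mask assembled this way would typically start from the value returned by IBuilderConfig::getTacticSources(), so that only the intended bits change.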

◆ TempfileControlFlag

enum class nvinfer1::TempfileControlFlag : int32_t
strong

Flags used to control TensorRT's behavior when creating executable temporary files.

On some platforms the TensorRT runtime may need to create files in a temporary directory or use platform-specific APIs to create files in-memory to load temporary DLLs that implement runtime code. These flags allow the application to explicitly control TensorRT's use of these files. Disallowing these files precludes the use of certain TensorRT APIs for deserializing and loading lean runtimes.

Enumerator
kALLOW_IN_MEMORY_FILES 

Allow creating and loading files in-memory (or unnamed files).

kALLOW_TEMPORARY_FILES 

Allow creating and loading named files in a temporary directory on the filesystem.

See also
IRuntime::setTemporaryDirectory()

◆ TensorFormat

enum class nvinfer1::TensorFormat : int32_t
strong

Format of the input/output tensors.

This enum is used by both plugins and network I/O tensors.

See also
IPluginV2::supportsFormat(), safe::ICudaEngine::getBindingFormat()

Many of the formats are vector-major or vector-minor. These formats specify a vector dimension and scalars per vector. For example, suppose that the tensor has dimensions [M,N,C,H,W], the vector dimension is C and there are V scalars per vector.

  • A vector-major format splits the vectorized dimension into two axes in the memory layout. The vectorized dimension is replaced by an axis of length ceil(C/V) and a new dimension of length V is appended. For the example tensor, the memory layout is equivalent to an array with dimensions [M][N][ceil(C/V)][H][W][V]. Tensor coordinate (m,n,c,h,w) maps to array location [m][n][c/V][h][w][c%V].
  • A vector-minor format moves the vectorized dimension to become the last axis in the memory layout. For the example tensor, the memory layout is equivalent to an array with dimensions [M][N][H][W][ceil(C/V)*V]. Tensor coordinate (m,n,c,h,w) maps to array location [m][n][h][w][c].

In interfaces that refer to "components per element", that's the value of V above.
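
The two layouts described above can be written as linear-offset computations for the example tensor [M,N,C,H,W]. This is a sketch in plain C++ with offsets counted in scalars; `Shape5Sketch`, `vectorMajorOffset`, and `vectorMinorOffset` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Dimensions of the example tensor [M,N,C,H,W] (hypothetical helper type).
struct Shape5Sketch
{
    int64_t M, N, C, H, W;
};

inline int64_t ceilDiv(int64_t a, int64_t b)
{
    return (a + b - 1) / b;
}

// Vector-major: array [M][N][ceil(C/V)][H][W][V], subscript [m][n][c/V][h][w][c%V].
int64_t vectorMajorOffset(Shape5Sketch const& s, int64_t V, int64_t m, int64_t n, int64_t c, int64_t h, int64_t w)
{
    assert(V > 0);
    int64_t const vectors = ceilDiv(s.C, V);
    return ((((m * s.N + n) * vectors + c / V) * s.H + h) * s.W + w) * V + c % V;
}

// Vector-minor: array [M][N][H][W][ceil(C/V)*V], subscript [m][n][h][w][c].
int64_t vectorMinorOffset(Shape5Sketch const& s, int64_t V, int64_t m, int64_t n, int64_t c, int64_t h, int64_t w)
{
    assert(V > 0);
    int64_t const paddedC = ceilDiv(s.C, V) * V;
    return (((m * s.N + n) * s.H + h) * s.W + w) * paddedC + c;
}
```

For a tensor of shape [1,1,3,2,2] with V = 2, coordinate (0,0,1,0,1) lands at scalar offset 3 in the vector-major layout and at offset 5 in the vector-minor layout. Both layouts occupy 16 scalars because C is padded from 3 to 4.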

For more information about data formats, see the topic "Data Format Description" located in the TensorRT Developer Guide. https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#data-format-desc

Enumerator
kLINEAR 

Memory layout is similar to an array in C or C++. The stride of each dimension is the product of the dimensions after it. The last dimension has unit stride.

For DLA usage, the tensor sizes are limited to C,H,W in the range [1,8192].

kCHW2 

Vector-major format with two scalars per vector. Vector dimension is third to last.

This format requires FP16 or BF16 and at least three dimensions.

kHWC8 

Vector-minor format with eight scalars per vector. Vector dimension is third to last. This format requires FP16 or BF16 and at least three dimensions.

kCHW4 

Vector-major format with four scalars per vector. Vector dimension is third to last.

This format requires INT8 or FP16 and at least three dimensions. For INT8, the length of the vector dimension must be a build-time constant.

Deprecated usage:

If running on the DLA, this format can be used for acceleration with the caveat that C must be less than or equal to 4. If used as DLA input and the build option kGPU_FALLBACK is not specified, it must meet the line stride requirement of the DLA format. Column stride in bytes must be a multiple of 64 on Orin.

kCHW16 

Vector-major format with 16 scalars per vector. Vector dimension is third to last.

This format requires INT8 or FP16 and at least three dimensions.

For DLA usage, this format maps to the native feature format for FP16, and the tensor sizes are limited to C,H,W in the range [1,8192].

kCHW32 

Vector-major format with 32 scalars per vector. Vector dimension is third to last.

This format requires at least three dimensions.

For DLA usage, this format maps to the native feature format for INT8, and the tensor sizes are limited to C,H,W in the range [1,8192].

kDHWC8 

Vector-minor format with eight scalars per vector. Vector dimension is fourth to last.

This format requires FP16 or BF16 and at least four dimensions.

kCDHW32 

Vector-major format with 32 scalars per vector. Vector dimension is fourth to last.

This format requires FP16 or INT8 and at least four dimensions.

kHWC 

Vector-minor format where channel dimension is third to last and unpadded.

This format requires either FP32 or UINT8 and at least three dimensions. 
kDLA_LINEAR 

DLA planar format. For a tensor with dimension {N, C, H, W}, the W axis always has unit stride. The stride for stepping along the H axis is rounded up to 64 bytes.

The memory layout is equivalent to a C array with dimensions [N][C][H][roundUp(W, 64/elementSize)] where elementSize is 2 for FP16 and 1 for Int8, with the tensor coordinates (n, c, h, w) mapping to array subscript [n][c][h][w].

kDLA_HWC4 

DLA image format. For a tensor with dimension {N, C, H, W} the C axis always has unit stride. The stride for stepping along the H axis is rounded up to 64 bytes on Orin. C can only be 1, 3 or 4. If C == 1, it will map to grayscale format. If C == 3 or C == 4, it will map to color image format. And if C == 3, the stride for stepping along the W axis needs to be padded to 4 in elements.

When C is {1, 3, 4}, C' is {1, 4, 4} respectively, and the memory layout is equivalent to a C array with dimensions [N][H][roundUp(W, 64/C'/elementSize)][C'] on Orin, where elementSize is 2 for FP16 and 1 for Int8. The tensor coordinates (n, c, h, w) map to array subscript [n][h][w][c].
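
The padded layouts of kDLA_LINEAR and kDLA_HWC4 can be sketched as element-offset computations. The helpers below follow the array shapes given in the two descriptions, with the Orin padding rule for kDLA_HWC4; the function names are hypothetical, and offsets are counted in elements rather than bytes.

```cpp
#include <cassert>
#include <cstdint>

inline int64_t roundUp(int64_t x, int64_t multiple)
{
    return (x + multiple - 1) / multiple * multiple;
}

// kDLA_LINEAR: array [N][C][H][roundUp(W, 64/elementSize)], subscript [n][c][h][w].
int64_t dlaLinearOffset(int64_t C, int64_t H, int64_t W, int64_t elementSize,
    int64_t n, int64_t c, int64_t h, int64_t w)
{
    assert(elementSize == 1 || elementSize == 2); // Int8 or FP16
    int64_t const paddedW = roundUp(W, 64 / elementSize);
    return ((n * C + c) * H + h) * paddedW + w;
}

// kDLA_HWC4 (Orin): array [N][H][roundUp(W, 64/C'/elementSize)][C'], subscript [n][h][w][c].
int64_t dlaHwc4Offset(int64_t C, int64_t H, int64_t W, int64_t elementSize,
    int64_t n, int64_t c, int64_t h, int64_t w)
{
    assert(C == 1 || C == 3 || C == 4);
    assert(elementSize == 1 || elementSize == 2);
    int64_t const cPrime = (C == 1) ? 1 : 4; // C == 3 is padded to 4
    int64_t const paddedW = roundUp(W, 64 / cPrime / elementSize);
    return ((n * H + h) * paddedW + w) * cPrime + c;
}
```

For an FP16 tensor with W = 10, kDLA_LINEAR pads each row to 32 elements (64 bytes). For an Int8 tensor with C = 3 and W = 10, kDLA_HWC4 pads each row to 16 pixels of 4 channels, again 64 bytes.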

kHWC16 

Vector-minor format with 16 scalars per vector. Vector dimension is third to last.

This format requires FP16 and at least three dimensions.

kDHWC 

Vector-minor format with one scalar per vector. Vector dimension is fourth to last.

This format requires FP32 and at least four dimensions.

◆ TensorIOMode

enum class nvinfer1::TensorIOMode : int32_t
strong

Definition of tensor IO Mode.

Enumerator
kNONE 

Tensor is not an input or output.

kINPUT 

Tensor is input to the engine.

kOUTPUT 

Tensor is output by the engine.

◆ TensorLocation

enum class nvinfer1::TensorLocation : int32_t
strong

The location for tensor data storage, device or host.

Enumerator
kDEVICE 

Data stored on device.

kHOST 

Data stored on host.

◆ TensorRTPhase

enum class nvinfer1::TensorRTPhase : int32_t
strong

Indicates a phase of operation of TensorRT.

Enumerator
kBUILD 

Build phase of TensorRT.

kRUNTIME 

Execution phase of TensorRT.

◆ TopKOperation

enum class nvinfer1::TopKOperation : int32_t
strong

Enumerates the operations that may be performed by a TopK layer.

Enumerator
kMAX 

Maximum of the elements.

kMIN 

Minimum of the elements.

◆ TripLimit

enum class nvinfer1::TripLimit : int32_t
strong
Enumerator
kCOUNT 

Tensor is a scalar of type kINT32 or kINT64 that contains the trip count.

kWHILE 

Tensor is a scalar of type kBOOL. Loop terminates when value is false.

◆ UnaryOperation

enum class nvinfer1::UnaryOperation : int32_t
strong

Enumerates the unary operations that may be performed by a Unary layer.

Operation kNOT must have inputs of DataType::kBOOL.

Operations kSIGN and kABS must have inputs of floating-point type, DataType::kINT8, DataType::kINT32 or DataType::kINT64.

Operation kISINF must have inputs of floating-point type.

All other operations must have inputs of floating-point type.

See also
IUnaryLayer
Enumerator
kEXP 

Exponentiation.

kLOG 

Log (base e).

kSQRT 

Square root.

kRECIP 

Reciprocal.

kABS 

Absolute value.

kNEG 

Negation.

kSIN 

Sine.

kCOS 

Cosine.

kTAN 

Tangent.

kSINH 

Hyperbolic sine.

kCOSH 

Hyperbolic cosine.

kASIN 

Inverse sine.

kACOS 

Inverse cosine.

kATAN 

Inverse tangent.

kASINH 

Inverse hyperbolic sine.

kACOSH 

Inverse hyperbolic cosine.

kATANH 

Inverse hyperbolic tangent.

kCEIL 

Ceiling.

kFLOOR 

Floor.

kERF 

Gauss error function.

kNOT 

Logical NOT.

kSIGN 

Sign. If input > 0, output 1; if input < 0, output -1; if input == 0, output 0.

kROUND 

Round to nearest even for floating-point data type.

kISINF 

Return true if input value equals +/- infinity for floating-point data type.
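
The less obvious element-wise semantics (kSIGN, kROUND, kISINF) can be mirrored with standard C++ for reference. This is a host-side sketch, not TensorRT code: the function names are hypothetical, and `roundHalfToEven` relies on the default floating-point rounding mode (FE_TONEAREST) being in effect.

```cpp
#include <cassert>
#include <cmath>

// kSIGN as described above: 1 for positive, -1 for negative, 0 for zero.
template <typename T>
T signOf(T x)
{
    return static_cast<T>((x > T(0)) - (x < T(0)));
}

// kROUND: round to nearest, ties to even (assumes the default rounding mode).
inline float roundHalfToEven(float x)
{
    return std::nearbyint(x);
}

// kISINF: true for +infinity or -infinity.
inline bool isInfinite(float x)
{
    return std::isinf(x);
}
```

Ties show the difference from the familiar "round half away from zero": 2.5 rounds to 2 and 3.5 rounds to 4.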

◆ WeightsRole

enum class nvinfer1::WeightsRole : int32_t
strong

How a layer uses particular Weights.

The power weights of an IScaleLayer are omitted. Refitting those is not supported.

Enumerator
kKERNEL 

kernel for IConvolutionLayer or IDeconvolutionLayer

kBIAS 

bias for IConvolutionLayer or IDeconvolutionLayer

kSHIFT 

shift part of IScaleLayer

kSCALE 

scale part of IScaleLayer

kCONSTANT 

weights for IConstantLayer

kANY 

Any other weights role.

Function Documentation

◆ EnumMax()

template<typename T >
constexpr int32_t nvinfer1::EnumMax ( )
constexpr noexcept

Maximum number of elements in an enumeration type.
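
A per-enum count like this is commonly provided through explicit template specialization, which is the shape the specializations listed below take. The sketch uses a made-up enum and a lowercase `enumMax` to avoid suggesting it is the TensorRT definition; `ColorSketch`, `enumMax`, and `colorName` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Made-up enum used only for this illustration.
enum class ColorSketch : int32_t
{
    kRED = 0,
    kGREEN = 1,
    kBLUE = 2
};

// Primary template: declared only, so an enum without a specialization cannot be used in a constant expression.
template <typename T>
constexpr int32_t enumMax() noexcept;

// Explicit specialization supplying the number of enumerators.
template <>
constexpr int32_t enumMax<ColorSketch>() noexcept
{
    return 3;
}

// The count is a constant expression, so it can size a lookup table.
inline char const* colorName(ColorSketch c)
{
    static std::array<char const*, enumMax<ColorSketch>()> const kNames{{"red", "green", "blue"}};
    return kNames[static_cast<std::size_t>(c)];
}
```

Because the value is available at compile time, `static_assert(enumMax<ColorSketch>() == 3, "")` also holds, and loops over all enumerators can use it as their upper bound.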

◆ EnumMax< BoundingBoxFormat >()

template<>
constexpr int32_t nvinfer1::EnumMax< BoundingBoxFormat > ( )
inline constexpr noexcept

Maximum number of elements in BoundingBoxFormat enum.

See also
BoundingBoxFormat

◆ EnumMax< BuilderFlag >()

template<>
constexpr int32_t nvinfer1::EnumMax< BuilderFlag > ( )
inline constexpr noexcept

Maximum number of builder flags in BuilderFlag enum.

See also
BuilderFlag

◆ EnumMax< CalibrationAlgoType >()

template<>
constexpr int32_t nvinfer1::EnumMax< CalibrationAlgoType > ( )
inline constexpr noexcept

Maximum number of elements in CalibrationAlgoType enum.

See also
CalibrationAlgoType

◆ EnumMax< DeviceType >()

template<>
constexpr int32_t nvinfer1::EnumMax< DeviceType > ( )
inline constexpr noexcept

Maximum number of elements in DeviceType enum.

See also
DeviceType

◆ EnumMax< DimensionOperation >()

template<>
constexpr int32_t nvinfer1::EnumMax< DimensionOperation > ( )
inline constexpr noexcept

Maximum number of elements in DimensionOperation enum.

See also
DimensionOperation

◆ EnumMax< ExecutionContextAllocationStrategy >()

template<>
constexpr int32_t nvinfer1::EnumMax< ExecutionContextAllocationStrategy > ( )
inline constexpr noexcept

Maximum number of memory allocation strategies in ExecutionContextAllocationStrategy enum.

See also
ExecutionContextAllocationStrategy

◆ EnumMax< FillOperation >()

template<>
constexpr int32_t nvinfer1::EnumMax< FillOperation > ( )
inline constexpr noexcept

Maximum number of elements in FillOperation enum.

See also
FillOperation

◆ EnumMax< GatherMode >()

template<>
constexpr int32_t nvinfer1::EnumMax< GatherMode > ( )
inline constexpr noexcept

Maximum number of elements in GatherMode enum.

See also
GatherMode

◆ EnumMax< LayerInformationFormat >()

template<>
constexpr int32_t nvinfer1::EnumMax< LayerInformationFormat > ( )
inline constexpr noexcept

Maximum number of layer information formats in LayerInformationFormat enum.

See also
LayerInformationFormat

◆ EnumMax< LayerType >()

template<>
constexpr int32_t nvinfer1::EnumMax< LayerType > ( )
inline constexpr noexcept

Maximum number of elements in LayerType enum.

See also
LayerType

◆ EnumMax< LoopOutput >()

template<>
constexpr int32_t nvinfer1::EnumMax< LoopOutput > ( )
inline constexpr noexcept

Maximum number of elements in LoopOutput enum.

See also
LoopOutput

◆ EnumMax< MatrixOperation >()

template<>
constexpr int32_t nvinfer1::EnumMax< MatrixOperation > ( )
inline constexpr noexcept

Maximum number of elements in MatrixOperation enum.

See also
MatrixOperation

◆ EnumMax< MemoryPoolType >()

template<>
constexpr int32_t nvinfer1::EnumMax< MemoryPoolType > ( )
inline constexpr noexcept

Maximum number of memory pool types in the MemoryPoolType enum.

See also
MemoryPoolType

◆ EnumMax< NetworkDefinitionCreationFlag >()

template<>
constexpr int32_t nvinfer1::EnumMax< NetworkDefinitionCreationFlag > ( )
inline constexpr noexcept

Maximum number of elements in NetworkDefinitionCreationFlag enum.

See also
NetworkDefinitionCreationFlag

◆ EnumMax< OptProfileSelector >()

template<>
constexpr int32_t nvinfer1::EnumMax< OptProfileSelector > ( )
inline constexpr noexcept

Number of different values of OptProfileSelector enum.

See also
OptProfileSelector

◆ EnumMax< ProfilingVerbosity >()

template<>
constexpr int32_t nvinfer1::EnumMax< ProfilingVerbosity > ( )
inline constexpr noexcept

Maximum number of profile verbosity levels in ProfilingVerbosity enum.

See also
ProfilingVerbosity

◆ EnumMax< QuantizationFlag >()

template<>
constexpr int32_t nvinfer1::EnumMax< QuantizationFlag > ( )
inline constexpr noexcept

Maximum number of quantization flags in QuantizationFlag enum.

See also
QuantizationFlag

◆ EnumMax< ReduceOperation >()

template<>
constexpr int32_t nvinfer1::EnumMax< ReduceOperation > ( )
inline constexpr noexcept

Maximum number of elements in ReduceOperation enum.

See also
ReduceOperation

◆ EnumMax< SampleMode >()

template<>
constexpr int32_t nvinfer1::EnumMax< SampleMode > ( )
inline constexpr noexcept

Maximum number of elements in SampleMode enum.

See also
SampleMode

◆ EnumMax< ScaleMode >()

template<>
constexpr int32_t nvinfer1::EnumMax< ScaleMode > ( )
inline constexpr noexcept

Maximum number of elements in ScaleMode enum.

See also
ScaleMode

◆ EnumMax< ScatterMode >()

template<>
constexpr int32_t nvinfer1::EnumMax< ScatterMode > ( )
inline constexpr noexcept

Maximum number of elements in ScatterMode enum.

See also
ScatterMode

◆ EnumMax< SerializationFlag >()

template<>
constexpr int32_t nvinfer1::EnumMax< SerializationFlag > ( )
inline constexpr noexcept

Maximum number of serialization flags in SerializationFlag enum.

See also
SerializationFlag

◆ EnumMax< TacticSource >()

template<>
constexpr int32_t nvinfer1::EnumMax< TacticSource > ( )
inline constexpr noexcept

Maximum number of tactic sources in TacticSource enum.

See also
TacticSource

◆ EnumMax< TempfileControlFlag >()

template<>
constexpr int32_t nvinfer1::EnumMax< TempfileControlFlag > ( )
inline constexpr noexcept

Maximum number of elements in TempfileControlFlag enum.

See also
TempfileControlFlag

◆ EnumMax< TopKOperation >()

template<>
constexpr int32_t nvinfer1::EnumMax< TopKOperation > ( )
inline constexpr noexcept

Maximum number of elements in TopKOperation enum.

See also
TopKOperation

◆ EnumMax< TripLimit >()

template<>
constexpr int32_t nvinfer1::EnumMax< TripLimit > ( )
inline constexpr noexcept

Maximum number of elements in TripLimit enum.

See also
TripLimit

◆ EnumMax< UnaryOperation >()

template<>
constexpr int32_t nvinfer1::EnumMax< UnaryOperation > ( )
inline constexpr noexcept

Maximum number of elements in UnaryOperation enum.

See also
UnaryOperation

◆ EnumMax< WeightsRole >()

template<>
constexpr int32_t nvinfer1::EnumMax< WeightsRole > ( )
inline constexpr noexcept

Maximum number of elements in WeightsRole enum.

See also
WeightsRole

◆ getBuilderPluginRegistry()

nvinfer1::IPluginRegistry * nvinfer1::getBuilderPluginRegistry ( nvinfer1::EngineCapability  capability)
noexcept

Return the plugin registry for building a Standard engine, or nullptr if no registry exists.

Also return nullptr if the input argument is not EngineCapability::kSTANDARD. Engine capabilities EngineCapability::kSTANDARD and EngineCapability::kSAFETY have distinct plugin registries. When building a Safety engine, use nvinfer1::getBuilderSafePluginRegistry(). Use IPluginRegistry::registerCreator from the registry to register plugins. Plugins registered in a registry associated with a specific engine capability are only available when building engines with that engine capability.

There is no plugin registry for EngineCapability::kDLA_STANDALONE.

◆ getBuilderSafePluginRegistry()

nvinfer1::safe::IPluginRegistry * nvinfer1::getBuilderSafePluginRegistry ( nvinfer1::EngineCapability  capability)
noexcept

Return the plugin registry for building a Safety engine, or nullptr if no registry exists.

Also return nullptr if the input argument is not EngineCapability::kSAFETY. When building a Standard engine, use nvinfer1::getBuilderPluginRegistry(). Use safe::IPluginRegistry::registerCreator from the registry to register plugins.