Removed C++ APIs and Replacements

Warning

The APIs listed below have been removed in TensorRT 11.x and will cause compile-time errors if used. Review each entry for its replacement before upgrading.

BuilderFlag::kFP16: Strong typing with ModelOpt AutoCast
BuilderFlag::kINT8: Explicit quantization with Q/DQ nodes
BuilderFlag::kFP8: Explicit quantization with Q/DQ nodes
BuilderFlag::kBF16: Strong typing with ModelOpt AutoCast
BuilderFlag::kINT4: Explicit quantization with Q/DQ nodes
BuilderFlag::kFP4: Explicit quantization with Q/DQ nodes
BuilderFlag::kOBEY_PRECISION_CONSTRAINTS: Strong typing (always enforced)
BuilderFlag::kPREFER_PRECISION_CONSTRAINTS: Strong typing (always enforced)
BuilderFlag::kDIRECT_IO: Removed (not needed in 11.x)
IAlgorithm, IAlgorithmContext, IAlgorithmIOInfo, IAlgorithmSelector, IAlgorithmVariant: Use editable mode in ITimingCache
IBuilderConfig::setInt8Calibrator(IInt8Calibrator*): Explicit quantization with Q/DQ nodes
IBuilder::platformHasFastFp16(): Removed; use strongly typed networks instead of querying platform FP16 support. Third-party code that links against TensorRT 11.x, including the ONNX Runtime TensorRT Execution Provider, must migrate away from this API. ONNX Runtime 1.27+ supports TensorRT 11.x. Refer to the Known Issues in the TensorRT 11.0.0 release notes and the Fixed Issues in the TensorRT 11.1.0 release notes.
IBuilder::platformHasFastInt8(): Removed; use explicit quantization with Q/DQ nodes instead of querying platform INT8 support.
IBuilderConfig::getInt8Calibrator(): Removed
IBuilderConfig::setCalibrationProfile(IOptimizationProfile const*): Removed
IBuilderConfig::getCalibrationProfile(): Removed
IBuilderConfig::setQuantizationFlags(QuantizationFlags): Removed
IBuilderConfig::getQuantizationFlags(): Removed
IBuilderConfig::clearQuantizationFlag(QuantizationFlag): Removed
IBuilderConfig::setQuantizationFlag(QuantizationFlag): Removed
IBuilderConfig::getQuantizationFlag(QuantizationFlag): Removed
ICudaEngine::createExecutionContextWithoutDeviceMemory(): ICudaEngine::createExecutionContext()
ICudaEngine::getDeviceMemorySize(): ICudaEngine::getDeviceMemorySizeV2()
ICudaEngine::getDeviceMemorySizeForProfile(int32_t): ICudaEngine::getDeviceMemorySizeForProfileV2(int32_t)
ICudaEngine::getMinimumWeightStreamingBudget(): Compute from ICudaEngine::getStreamableWeightsSize()
ICudaEngine::getProfileTensorValues(char const*, int32_t, OptProfileSelector): ICudaEngine::getProfileTensorValuesV2()
ICudaEngine::getWeightStreamingBudget(): ICudaEngine::getWeightStreamingBudgetV2()
ICudaEngine::hasImplicitBatchDimension(): Removed (always false)
ICudaEngine::setWeightStreamingBudget(int64_t): ICudaEngine::setWeightStreamingBudgetV2(int64_t)
IExecutionContext::allInputShapesSpecified(): Removed (always true)
IExecutionContext::setDeviceMemory(void*): IExecutionContext::setDeviceMemoryV2(void*, int64_t)
IGpuAllocator::allocate(uint64_t, uint64_t, AllocatorFlags): IGpuAllocator::allocateAsync(uint64_t, uint64_t, AllocatorFlags, cudaStream_t)
IGpuAllocator::deallocate(void*): IGpuAllocator::deallocateAsync(void*, cudaStream_t)
IInt8Calibrator (all subclasses): Explicit quantization with Q/DQ nodes
ILayer::setPrecision(DataType): Strong typing (set types on tensors directly)
ILayer::getPrecision(): Removed
ILayer::precisionIsSet(): Removed
ILayer::resetPrecision(): Removed
ILayer::setOutputType(int32_t, DataType): Strong typing (set types on tensors directly)
ILayer::outputTypeIsSet(int32_t): Removed
ILayer::resetOutputType(int32_t): Removed
INetworkDefinition::addAttention(..., bool): INetworkDefinition::addAttentionV2(..., CausalMaskKind)
INetworkDefinition::addNMS(ITensor&, ITensor&, ITensor&): INetworkDefinition::addNMS(..., DataType) (4-arg version)
INetworkDefinition::addNonZero(ITensor&): INetworkDefinition::addNonZero(ITensor&, DataType)
INetworkDefinition::addNormalization(...): INetworkDefinition::addNormalizationV2(...)
INetworkDefinition::addPluginV2(ITensor* const*, int32_t, IPluginV2&): INetworkDefinition::addPluginV3(...)
INetworkDefinition::addTopK(ITensor&, TopKOperation, int32_t, uint32_t): INetworkDefinition::addTopK(..., DataType) (5-arg version)
INormalizationLayer::setComputePrecision(DataType): Removed (use strong typing)
IOutputAllocator::reallocateOutput(char const*, void*, uint64_t, uint64_t): IOutputAllocator::reallocateOutputAsync(..., cudaStream_t)
IPluginCreator: IPluginCreatorV3One
IPluginRegistry::deregisterCreator(IPluginCreator const&): IPluginRegistry::deregisterCreator(IPluginCreatorInterface const&)
IPluginRegistry::getPluginCreator(...): IPluginRegistry::getCreator(...)
IPluginRegistry::getPluginCreatorList(int32_t*): IPluginRegistry::getAllCreators(int32_t*)
IPluginRegistry::registerCreator(IPluginCreator&, ...): IPluginRegistry::registerCreator(IPluginCreatorInterface&, ...)
IPluginV2DynamicExt: IPluginV3
IPluginV2Ext: IPluginV3
IPluginV2IOExt: IPluginV3
IPluginV2Layer: IPluginV3Layer
IRefitter::setDynamicRange(char const*, float, float): Explicit quantization with Q/DQ nodes
IRefitter::getDynamicRangeMin(char const*): Removed
IRefitter::getDynamicRangeMax(char const*): Removed
IRefitter::getTensorsWithDynamicRange(): Removed
IRuntime::deserializeCudaEngine(IStreamReader&): IRuntime::deserializeCudaEngine(IStreamReaderV2&)
ITensor::setType(DataType): Strong typing (type determined by network construction)
ITensor::setDynamicRange(float, float): Explicit quantization with Q/DQ nodes
ITensor::dynamicRangeIsSet(): Removed
ITensor::resetDynamicRange(): Removed
ITensor::getDynamicRangeMin(): Removed
ITensor::getDynamicRangeMax(): Removed
ITensor::setBroadcastAcrossBatch(bool): Removed (implicit batch not supported)
ITensor::getBroadcastAcrossBatch(): Removed (implicit batch not supported)
TacticSource::kCUBLAS: Removed
TacticSource::kCUBLAS_LT: Removed
TacticSource::kCUDNN: Removed
DetectionOutputParameters: Removed
NMSParameters: Removed
CodeTypeSSD: Removed
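
As an illustration of one migration from the list above, the following is a minimal sketch of moving from ICudaEngine::getDeviceMemorySize() and IExecutionContext::setDeviceMemory(void*) to their V2 replacements. The helper name attachScratchMemory and the FakeEngine and FakeContext structs are hypothetical: the structs mimic only the two methods used here so that the sketch compiles without TensorRT installed. The helper is a template, so the same code should accept nvinfer1::ICudaEngine and nvinfer1::IExecutionContext objects, assuming the V2 signatures match the ones listed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins (not TensorRT types). They expose only the two
// methods the helper below calls, so the sketch builds without TensorRT.
struct FakeEngine
{
    int64_t scratchBytes;
    int64_t getDeviceMemorySizeV2() const noexcept { return scratchBytes; }
};

struct FakeContext
{
    void* memory = nullptr;
    int64_t size = 0;
    void setDeviceMemoryV2(void* m, int64_t s) noexcept
    {
        memory = m;
        size = s;
    }
};

// Attach caller-owned scratch memory to an execution context using the V2 calls.
// Returns false and leaves the context untouched when the buffer is too small.
template <typename Engine, typename Context>
bool attachScratchMemory(Engine const& engine, Context& context, void* buffer, int64_t bufferBytes)
{
    assert(bufferBytes >= 0);

    // Replaces ICudaEngine::getDeviceMemorySize().
    int64_t const required = engine.getDeviceMemorySizeV2();
    if (bufferBytes < required || (required > 0 && buffer == nullptr))
    {
        return false;
    }

    // Replaces IExecutionContext::setDeviceMemory(void*): the buffer size is
    // now passed explicitly alongside the pointer.
    context.setDeviceMemoryV2(buffer, bufferBytes);
    return true;
}
```

In real code the buffer would be device memory that the application allocates and keeps alive for as long as the context uses it. The size check happens before the call so that an undersized buffer is rejected by the caller rather than passed to the runtime.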

Removed C++ Plugins and Replacements

Warning

The plugins listed below have been removed in TensorRT 11.x. Using them will cause compilation or linker errors. Review each entry for its replacement before upgrading.

BatchedNMS_TRT: Use INetworkDefinition::addNMS()
BatchedNMSDynamic_TRT: Use INetworkDefinition::addNMS()
BatchTilePlugin_TRT: Implement with standard TensorRT layers
Clip_TRT: Use INetworkDefinition::addActivation() with kCLIP
CoordConvAC: Implement with standard TensorRT layers (concatenate coordinate channels with IConcatenationLayer, then apply convolution)
CustomGeluPluginDynamic: Use INetworkDefinition::addActivation() with kGELU_ERF or kGELU_TANH
EfficientNMS_ONNX_TRT: Use INetworkDefinition::addNMS()
LReLU_TRT: Use INetworkDefinition::addActivation() with kLEAKY_RELU
NMS_TRT: Use INetworkDefinition::addNMS()
NMSDynamic_TRT: Use INetworkDefinition::addNMS()
Normalize_TRT: Use INetworkDefinition::addNormalizationV2()
Proposal: Implement with standard TensorRT layers
SingleStepLSTMPlugin: Use INetworkDefinition::addLoop() or standard RNN decomposition
SpecialSlice_TRT: Use INetworkDefinition::addSlice()
Split: Use INetworkDefinition::addSlice()
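
When replacing Clip_TRT, LReLU_TRT, or CustomGeluPluginDynamic with INetworkDefinition::addActivation(), it can help to compare the rebuilt engine's outputs against a CPU reference. The sketch below gives reference formulas for kCLIP, kLEAKY_RELU, kGELU_ERF, and kGELU_TANH. The function names are hypothetical, and the formulas are the commonly documented definitions of these activations. Whether a removed plugin matched them exactly (for example, which GELU variant it used) is an assumption to confirm against the ActivationType documentation for your release.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// kCLIP reference: clamp x to the range [alpha, beta].
inline float clipRef(float x, float alpha, float beta)
{
    return std::max(alpha, std::min(beta, x));
}

// kLEAKY_RELU reference: negative inputs are scaled by alpha.
inline float leakyReluRef(float x, float alpha)
{
    return x >= 0.0F ? x : alpha * x;
}

// kGELU_ERF reference: 0.5 * x * (1 + erf(x / sqrt(2))).
inline float geluErfRef(float x)
{
    return 0.5F * x * (1.0F + std::erf(x * std::sqrt(0.5F)));
}

// kGELU_TANH reference: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))).
inline float geluTanhRef(float x)
{
    float const sqrt2OverPi = 0.7978845608F;
    return 0.5F * x * (1.0F + std::tanh(sqrt2OverPi * (x + 0.044715F * x * x * x)));
}

// Largest absolute element-wise difference between two equally sized buffers,
// for example engine output copied to the host versus the CPU reference.
inline float maxAbsDiff(std::vector<float> const& a, std::vector<float> const& b)
{
    assert(a.size() == b.size());
    float worst = 0.0F;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    }
    return worst;
}
```

A typical check would run the same inputs through the migrated engine and through the matching reference function, then require maxAbsDiff to stay below a tolerance suited to the network's data type.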

Deprecated BERT Plugins

The following OSS BERT plugin classes are deprecated in 11.0.0 and scheduled for removal in a future release. Migrate to the listed replacements before upgrading beyond 11.x.

bertQKVToContextPlugin / CustomQKVToContextPluginDynamic: Refer to "Migrate to IAttention" for more information.