Migrating C++ Code from TensorRT 8.x to 10.x#

This page describes how to update C++ code when you migrate from TensorRT 8.x to 10.x: paired examples for enqueueV3 and the name-based tensor API, guidance on 64-bit dimensions, and lists of C++ APIs added and removed in 10.x.

Migrating from `enqueueV2` to `enqueueV3` (C++)#

The examples below show TensorRT 8.x first, then TensorRT 10.x, for the same inference task.

Before (TensorRT 8.x)#

// Create RAII buffer manager object.
samplesCommon::BufferManager buffers(mEngine);

auto context = SampleUniquePtr<nvinfer1::IExecutionContext>(mEngine->createExecutionContext());
if (!context)
{
    return false;
}

// Pick a random digit to try to infer.
srand(time(NULL));
int32_t const digit = rand() % 10;

// Read the input data into the managed buffers.
// There should be just 1 input tensor.
ASSERT(mParams.inputTensorNames.size() == 1);

if (!processInput(buffers, mParams.inputTensorNames[0], digit))
{
    return false;
}
// Create a CUDA stream to execute this inference.
cudaStream_t stream;
CHECK(cudaStreamCreate(&stream));

// Asynchronously copy data from host input buffers to device input buffers.
buffers.copyInputToDeviceAsync(stream);

// Asynchronously enqueue the inference work
if (!context->enqueueV2(buffers.getDeviceBindings().data(), stream, nullptr))
{
    return false;
}
// Asynchronously copy data from device output buffers to host output buffers.
buffers.copyOutputToHostAsync(stream);

// Wait for the work in the stream to complete.
CHECK(cudaStreamSynchronize(stream));

// Release stream.
CHECK(cudaStreamDestroy(stream));

Warning

In TensorRT 10.x, enqueueV3 replaces enqueueV2. You must call setTensorAddress for each I/O tensor (using names from getIOTensorName) before enqueueV3. The bindings-array overload is no longer available.

The After sample shows this pattern.

After (TensorRT 10.x)#

// Create RAII buffer manager object.
samplesCommon::BufferManager buffers(mEngine);

auto context = SampleUniquePtr<nvinfer1::IExecutionContext>(mEngine->createExecutionContext());
if (!context)
{
    return false;
}

for (int32_t i = 0, e = mEngine->getNbIOTensors(); i < e; i++)
{
    auto const name = mEngine->getIOTensorName(i);
    context->setTensorAddress(name, buffers.getDeviceBuffer(name));
}

// Pick a random digit to try to infer.
srand(time(NULL));
int32_t const digit = rand() % 10;

// Read the input data into the managed buffers.
// There should be just 1 input tensor.
ASSERT(mParams.inputTensorNames.size() == 1);

if (!processInput(buffers, mParams.inputTensorNames[0], digit))
{
    return false;
}
// Create a CUDA stream to execute this inference.
cudaStream_t stream;
CHECK(cudaStreamCreate(&stream));

// Asynchronously copy data from host input buffers to device input buffers.
buffers.copyInputToDeviceAsync(stream);

// Asynchronously enqueue the inference work
if (!context->enqueueV3(stream))
{
    return false;
}

// Asynchronously copy data from device output buffers to host output buffers.
buffers.copyOutputToHostAsync(stream);

// Wait for the work in the stream to complete.
CHECK(cudaStreamSynchronize(stream));

// Release stream.
CHECK(cudaStreamDestroy(stream));

Summary of Changes#

Added explicit tensor address setup using setTensorAddress() with tensor names from getIOTensorName()
Changed from enqueueV2() to enqueueV3()
The bindings parameter is no longer passed to enqueueV3(); tensor addresses must be set beforehand using setTensorAddress()
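
The binding loop from the After sample can be factored into a small helper that also checks for failures. The following is a minimal sketch, not part of the TensorRT samples: bindAllTensors, StubEngine, StubContext, and demoBindAllTensors are hypothetical names introduced here, and the helper is templated so that it compiles without TensorRT headers. It assumes only the three calls used above (getNbIOTensors, getIOTensorName, and setTensorAddress, which returns a bool).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Binds every I/O tensor reported by `engine` to a pointer looked up by name.
// Engine and Context are template parameters so the sketch builds without
// TensorRT; with TensorRT 10.x they would be nvinfer1::ICudaEngine and
// nvinfer1::IExecutionContext.
// Returns false if a tensor has no registered pointer or the context rejects it.
template <typename Engine, typename Context>
bool bindAllTensors(Engine const& engine, Context& context,
                    std::unordered_map<std::string, void*> const& deviceBuffers)
{
    for (int32_t i = 0, e = engine.getNbIOTensors(); i < e; ++i)
    {
        char const* name = engine.getIOTensorName(i);
        auto const it = deviceBuffers.find(name);
        if (it == deviceBuffers.end() || it->second == nullptr)
        {
            return false; // No buffer registered under this tensor name.
        }
        if (!context.setTensorAddress(name, it->second))
        {
            return false; // The context refused the address.
        }
    }
    return true;
}

// Hypothetical stand-ins, used only to exercise the helper without TensorRT.
struct StubEngine
{
    std::vector<std::string> names;
    int32_t getNbIOTensors() const { return static_cast<int32_t>(names.size()); }
    char const* getIOTensorName(int32_t i) const
    {
        return names[static_cast<std::size_t>(i)].c_str();
    }
};

struct StubContext
{
    std::unordered_map<std::string, void*> bound;
    bool setTensorAddress(char const* name, void* data)
    {
        bound[name] = data;
        return true;
    }
};

// Usage: bind two tensors, then show that a missing buffer is reported.
void demoBindAllTensors()
{
    StubEngine engine;
    engine.names.push_back("input");
    engine.names.push_back("output");

    // Host arrays stand in for device pointers; nothing is dereferenced.
    float in[4] = {};
    float out[2] = {};
    std::unordered_map<std::string, void*> buffers;
    buffers["input"] = in;
    buffers["output"] = out;

    StubContext context;
    bool const bound = bindAllTensors(engine, context, buffers);
    assert(bound && context.bound.size() == 2);

    buffers.erase("output");
    StubContext other;
    bool const boundWithoutOutput = bindAllTensors(engine, other, buffers);
    assert(!boundWithoutOutput);
}
```

Returning false instead of continuing means a missing or rejected address is caught before enqueueV3 is called, rather than surfacing later as a failed enqueue.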

Migrating `Dims` and `IShapeLayer` to 64-Bit Types#

Warning

This is a breaking ABI change. Code that bitwise copies to or from Dims must be updated for the wider type.

TensorRT 10.x changes the type of the dimensions held by Dims from int32_t to int64_t. However, TensorRT 10.x generally rejects networks that use dimensions exceeding the range of int32_t. The tensor type returned by IShapeLayer is now DataType::kINT64. If 32-bit dimensions are required, use ICastLayer to cast the result to a tensor of type DataType::kINT32.

Inspect code that bitwise copies to and from Dims to ensure it is correct for int64_t dimensions.
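
One way to update code that previously copied Dims into 32-bit storage with a bitwise copy is to convert each extent individually and check its range. The following is a minimal sketch under stated assumptions: DimsLike is a hypothetical stand-in that mirrors a rank plus an array of int64_t extents so the example compiles without TensorRT, and toInt32Extents and demoToInt32Extents are illustrative names, not TensorRT APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumed maximum rank for the stand-in type below.
constexpr int32_t kMaxDims = 8;

// Hypothetical stand-in for nvinfer1::Dims as of TensorRT 10.x: a rank plus
// int64_t extents. Defined here so the sketch compiles without TensorRT.
struct DimsLike
{
    int32_t nbDims;
    int64_t d[kMaxDims];
};

// Converts the extents to int32_t one element at a time. A bitwise copy of
// nbDims * sizeof(int32_t) bytes would read only part of the 64-bit array.
// Returns false, leaving `out` untouched, if the rank is invalid or any
// extent does not fit in int32_t.
bool toInt32Extents(DimsLike const& dims, std::vector<int32_t>& out)
{
    if (dims.nbDims < 0 || dims.nbDims > kMaxDims)
    {
        return false;
    }
    std::vector<int32_t> converted;
    converted.reserve(static_cast<std::size_t>(dims.nbDims));
    for (int32_t i = 0; i < dims.nbDims; ++i)
    {
        int64_t const extent = dims.d[i];
        if (extent < std::numeric_limits<int32_t>::min()
            || extent > std::numeric_limits<int32_t>::max())
        {
            return false;
        }
        converted.push_back(static_cast<int32_t>(extent));
    }
    out.swap(converted);
    return true;
}

// Usage: a typical image shape converts; an oversized extent is rejected.
void demoToInt32Extents()
{
    DimsLike const image{4, {1, 3, 224, 224}};
    std::vector<int32_t> extents;
    bool const ok = toInt32Extents(image, extents);
    assert(ok && extents.size() == 4 && extents[3] == 224);

    DimsLike const huge{1, {int64_t{1} << 40}};
    bool const okHuge = toInt32Extents(huge, extents);
    assert(!okHuge && extents.size() == 4); // Output unchanged on failure.
}
```

Because the range check admits any value representable in int32_t, negative sentinel extents such as -1 pass through unchanged.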

C++ APIs Added in 10.x#

The following C++ APIs have been added in TensorRT 10.x to support new features and improved functionality.

Enums#

ActivationType::kGELU_ERF
ActivationType::kGELU_TANH
BuilderFlag::kREFIT_IDENTICAL
BuilderFlag::kSTRIP_PLAN
BuilderFlag::kWEIGHT_STREAMING
BuilderFlag::kSTRICT_NANS
DataType::kINT4
LayerType::kPLUGIN_V3

Types#

APILanguage
Dims64
ExecutionContextAllocationStrategy
IGpuAsyncAllocator
InterfaceInfo
IPluginResource
IPluginV3
IStreamReader
IVersionedInterface

Methods and Properties#

getInferLibBuildVersion
getInferLibMajorVersion
getInferLibMinorVersion
getInferLibPatchVersion
IBuilderConfig::setMaxNbTactics
IBuilderConfig::getMaxNbTactics
ICudaEngine::createRefitter
ICudaEngine::getMinimumWeightStreamingBudget
ICudaEngine::getStreamableWeightsSize
ICudaEngine::getWeightStreamingBudget
ICudaEngine::isDebugTensor
ICudaEngine::setWeightStreamingBudget
IExecutionContext::getDebugListener
IExecutionContext::getTensorDebugState
IExecutionContext::setAllTensorsDebugState
IExecutionContext::setDebugListener
IExecutionContext::setOutputTensorAddress
IExecutionContext::setTensorDebugState
IExecutionContext::updateDeviceMemorySizeForShapes
IGpuAllocator::allocateAsync
IGpuAllocator::deallocateAsync
INetworkDefinition::addPluginV3
INetworkDefinition::isDebugTensor
INetworkDefinition::markDebug
INetworkDefinition::unmarkDebug
IPluginRegistry::acquirePluginResource
IPluginRegistry::deregisterCreator
IPluginRegistry::getAllCreators
IPluginRegistry::getCreator
IPluginRegistry::registerCreator
IPluginRegistry::releasePluginResource

Removed C++ APIs and Replacements#

Warning

The APIs listed below have been removed in TensorRT 10.x and will cause compilation or linker errors. Review each entry for its replacement before upgrading.

The following C++ APIs have been removed. Each entry shows the removed API and its replacement or migration path.

BuilderFlag::kENABLE_TACTIC_HEURISTIC: Builder optimization level 2
BuilderFlag::kSTRICT_TYPES [1]: Use: kREJECT_EMPTY_ALGORITHMS, kDIRECT_IO, and kPREFER_PRECISION_CONSTRAINTS
EngineCapability::kDEFAULT: EngineCapability::kSTANDARD
EngineCapability::kSAFE_DLA: EngineCapability::kDLA_STANDALONE
EngineCapability::kSAFE_GPU: EngineCapability::kSAFETY
IAlgorithm::getAlgorithmIOInfo(): IAlgorithm::getAlgorithmIOInfoByIndex()
IAlgorithmIOInfo::getTensorFormat(): Strides, data type, and vectorization information are sufficient to identify tensor formats uniquely.
IBuilder::buildEngineWithConfig(): IBuilder::buildSerializedNetwork()
IBuilder::destroy(): delete ObjectName
IBuilder::getMaxBatchSize(): Implicit batch support was removed
IBuilder::setMaxBatchSize(): Implicit batch support was removed
IBuilderConfig::destroy(): delete ObjectName
IBuilderConfig::getMaxWorkspaceSize(): IBuilderConfig::getMemoryPoolLimit() with MemoryPoolType::kWORKSPACE
IBuilderConfig::getMinTimingIterations(): IBuilderConfig::getAvgTimingIterations()
IBuilderConfig::setMaxWorkspaceSize(): IBuilderConfig::setMemoryPoolLimit() with MemoryPoolType::kWORKSPACE
IBuilderConfig::setMinTimingIterations(): IBuilderConfig::setAvgTimingIterations()
IConvolutionLayer::getDilation(): IConvolutionLayer::getDilationNd()
IConvolutionLayer::getKernelSize(): IConvolutionLayer::getKernelSizeNd()
IConvolutionLayer::getPadding(): IConvolutionLayer::getPaddingNd()
IConvolutionLayer::getStride(): IConvolutionLayer::getStrideNd()
IConvolutionLayer::setDilation(): IConvolutionLayer::setDilationNd()
IConvolutionLayer::setKernelSize(): IConvolutionLayer::setKernelSizeNd()
IConvolutionLayer::setPadding(): IConvolutionLayer::setPaddingNd()
IConvolutionLayer::setStride(): IConvolutionLayer::setStrideNd()
ICudaEngine::bindingIsInput(): ICudaEngine::getTensorIOMode()
ICudaEngine::destroy(): delete ObjectName
ICudaEngine::getBindingBytesPerComponent(): ICudaEngine::getTensorBytesPerComponent()
ICudaEngine::getBindingComponentsPerElement(): ICudaEngine::getTensorComponentsPerElement()
ICudaEngine::getBindingDataType(): ICudaEngine::getTensorDataType()
ICudaEngine::getBindingDimensions(): ICudaEngine::getTensorShape()
ICudaEngine::getBindingFormat(): ICudaEngine::getTensorFormat()
ICudaEngine::getBindingFormatDesc(): ICudaEngine::getTensorFormatDesc()
ICudaEngine::getBindingIndex(): Name-based methods
ICudaEngine::getBindingName(): Name-based methods
ICudaEngine::getBindingVectorizedDim(): ICudaEngine::getTensorVectorizedDim()
ICudaEngine::getLocation(): ITensor::getLocation()
ICudaEngine::getMaxBatchSize(): Implicit batch support was removed
ICudaEngine::getNbBindings(): ICudaEngine::getNbIOTensors()
ICudaEngine::getProfileDimensions(): ICudaEngine::getProfileShape()
ICudaEngine::getProfileShapeValues(): ICudaEngine::getShapeValues()
ICudaEngine::hasImplicitBatchDimension(): Implicit batch support was removed
ICudaEngine::isExecutionBinding(): No name-based equivalent replacement
ICudaEngine::isShapeBinding(): ICudaEngine::isShapeInferenceIO()
IDeconvolutionLayer::getKernelSize(): IDeconvolutionLayer::getKernelSizeNd()
IDeconvolutionLayer::getPadding(): IDeconvolutionLayer::getPaddingNd()
IDeconvolutionLayer::getStride(): IDeconvolutionLayer::getStrideNd()
IDeconvolutionLayer::setKernelSize(): IDeconvolutionLayer::setKernelSizeNd()
IDeconvolutionLayer::setPadding(): IDeconvolutionLayer::setPaddingNd()
IDeconvolutionLayer::setStride(): IDeconvolutionLayer::setStrideNd()
IExecutionContext::destroy(): delete ObjectName
IExecutionContext::enqueue(): IExecutionContext::enqueueV3()
IExecutionContext::enqueueV2(): IExecutionContext::enqueueV3()
IExecutionContext::execute(): IExecutionContext::executeV2()
IExecutionContext::getBindingDimensions(): IExecutionContext::getTensorShape()
IExecutionContext::getShapeBinding(): IExecutionContext::getTensorAddress() or getOutputTensorAddress()
IExecutionContext::getStrides(): IExecutionContext::getTensorStrides()
IExecutionContext::setBindingDimensions(): IExecutionContext::setInputShape()
IExecutionContext::setInputShapeBinding(): IExecutionContext::setInputTensorAddress() or setTensorAddress()
IExecutionContext::setOptimizationProfile(): IExecutionContext::setOptimizationProfileAsync()
IFullyConnectedLayer: IMatrixMultiplyLayer
IGpuAllocator::free(): IGpuAllocator::deallocate()
IHostMemory::destroy(): delete ObjectName
INetworkDefinition::addConvolution(): INetworkDefinition::addConvolutionNd()
INetworkDefinition::addDeconvolution(): INetworkDefinition::addDeconvolutionNd()
INetworkDefinition::addFullyConnected(): INetworkDefinition::addMatrixMultiply()
INetworkDefinition::addPadding(): INetworkDefinition::addPaddingNd()
INetworkDefinition::addPooling(): INetworkDefinition::addPoolingNd()
INetworkDefinition::addRNNv2(): INetworkDefinition::addLoop()
INetworkDefinition::destroy(): delete ObjectName
INetworkDefinition::hasExplicitPrecision(): Explicit precision support was removed in 10.0
INetworkDefinition::hasImplicitBatchDimension(): Implicit batch support was removed
IOnnxConfig::destroy(): delete ObjectName
IPaddingLayer::getPostPadding(): IPaddingLayer::getPostPaddingNd()
IPaddingLayer::getPrePadding(): IPaddingLayer::getPrePaddingNd()
IPaddingLayer::setPostPadding(): IPaddingLayer::setPostPaddingNd()
IPaddingLayer::setPrePadding(): IPaddingLayer::setPrePaddingNd()
IPoolingLayer::getPadding(): IPoolingLayer::getPaddingNd()
IPoolingLayer::getStride(): IPoolingLayer::getStrideNd()
IPoolingLayer::getWindowSize(): IPoolingLayer::getWindowSizeNd()
IPoolingLayer::setPadding(): IPoolingLayer::setPaddingNd()
IPoolingLayer::setStride(): IPoolingLayer::setStrideNd()
IPoolingLayer::setWindowSize(): IPoolingLayer::setWindowSizeNd()
IRefitter::destroy(): delete ObjectName
IResizeLayer::getAlignCorners(): IResizeLayer::getAlignCornersNd()
IResizeLayer::setAlignCorners(): IResizeLayer::setAlignCornersNd()
IRuntime::deserializeCudaEngine(void const* blob, std::size_t size, IPluginFactory* pluginFactory): Use deserializeCudaEngine with two parameters
IRuntime::destroy(): delete ObjectName
IRNNv2Layer: ILoop
kNV_TENSORRT_VERSION_IMPL [2]: NV_TENSORRT_VERSION_INT(major, minor, patch) macro: ((major) * 10000L + (minor) * 100L + (patch) * 1L)
NetworkDefinitionCreationFlag::kEXPLICIT_BATCH: Support was removed in 10.0
NetworkDefinitionCreationFlag::kEXPLICIT_PRECISION: Support was removed in 10.0
NV_TENSORRT_SONAME_MAJOR: NV_TENSORRT_MAJOR
NV_TENSORRT_SONAME_MINOR: NV_TENSORRT_MINOR
NV_TENSORRT_SONAME_PATCH: NV_TENSORRT_PATCH
getBuilderSafePluginRegistry(): API will not be implemented
PaddingMode::kCAFFE_ROUND_DOWN: Caffe support was removed
PaddingMode::kCAFFE_ROUND_UP: Caffe support was removed
PreviewFeature::kDISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805: External tactics are always disabled for core code
PreviewFeature::kFASTER_DYNAMIC_SHAPES_080: This flag is on by default
ProfilingVerbosity::kDEFAULT: ProfilingVerbosity::kLAYER_NAMES_ONLY
ProfilingVerbosity::kVERBOSE: ProfilingVerbosity::kDETAILED
ResizeMode: Use InterpolationMode. Alias was removed.
RNNDirection: RNN-related data structures were removed
RNNGateType: RNN-related data structures were removed
RNNInputMode: RNN-related data structures were removed
RNNOperation: RNN-related data structures were removed
SampleMode::kDEFAULT: SampleMode::kSTRICT_BOUNDS
SliceMode: Use SampleMode. Alias was removed.
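
Several entries above replace destroy() with delete ObjectName. In code that manages TensorRT objects through smart pointers, this usually means a custom deleter that called destroy() can give way to the default deleter of std::unique_ptr. The following is a minimal sketch of that ownership pattern: StubInterface is a hypothetical stand-in with a public virtual destructor, and TrtUniquePtr and liveObjectsAfterScope are illustrative names, not TensorRT APIs.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a TensorRT 10.x interface object: it has a public
// virtual destructor and no destroy() method. The counter exists only so the
// example can observe when the object is released.
class StubInterface
{
public:
    explicit StubInterface(int& liveCount)
        : mLiveCount(liveCount)
    {
        ++mLiveCount;
    }
    virtual ~StubInterface()
    {
        --mLiveCount;
    }

private:
    int& mLiveCount;
};

// TensorRT 8.x code often used a custom deleter along these lines:
//   struct Destroyer { template <class T> void operator()(T* p) const { if (p) p->destroy(); } };
//   std::unique_ptr<T, Destroyer>
// With destroy() removed, the default deleter, which calls delete, is enough.
template <typename T>
using TrtUniquePtr = std::unique_ptr<T>;

// Creates an object in an inner scope and reports how many are still alive
// once the owning pointer has gone out of scope.
int liveObjectsAfterScope()
{
    int live = 0;
    {
        TrtUniquePtr<StubInterface> object(new StubInterface(live));
        assert(live == 1); // The object is alive while the pointer owns it.
    }
    return live; // The default deleter has already run `delete` here.
}
```

With real TensorRT types, the same alias would wrap the raw pointers returned by factory functions such as createExecutionContext().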

Removed C++ Plugins and Replacements#

Warning

The following C++ plugins have been removed in TensorRT 10.x. Update your code to use the replacements listed below; most are the createPlugin() method of the corresponding plugin creator.

createAnchorGeneratorPlugin(): GridAnchorPluginCreator::createPlugin()
createBatchedNMSPlugin(): BatchedNMSPluginCreator::createPlugin()
createInstanceNormalizationPlugin(): InstanceNormalizationPluginCreator::createPlugin()
createNMSPlugin(): NMSPluginCreator::createPlugin()
createNormalizePlugin(): NormalizePluginCreator::createPlugin()
createPriorBoxPlugin(): PriorBoxPluginCreator::createPlugin()
createRegionPlugin(): RegionPluginCreator::createPlugin()
createReorgPlugin(): ReorgPluginCreator::createPlugin()
createRPNROIPlugin(): RPROIPluginCreator::createPlugin()
createSplitPlugin(): INetworkDefinition::addSlice()
struct Quadruple: Related plugins were removed
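
For the createSplitPlugin() entry, one possible migration is to express each output of the split as its own slice. The following is a rough sketch that computes only the start, size, and stride values for each slice. It assumes a split along a single axis with static shapes. SliceParams, splitToSlices, and demoSplitToSlices are hypothetical names introduced for illustration, and building the Dims arguments and calling INetworkDefinition::addSlice() is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Arguments for one slice: one entry per dimension of the input tensor.
struct SliceParams
{
    std::vector<int64_t> start;
    std::vector<int64_t> size;
    std::vector<int64_t> stride;
};

// Computes slice arguments that cut `shape` along `axis` into consecutive
// pieces of the given sizes. Returns an empty vector if the axis is out of
// range, a piece size is not positive, or the pieces do not add up to the
// extent of the axis.
std::vector<SliceParams> splitToSlices(std::vector<int64_t> const& shape, std::size_t axis,
                                       std::vector<int64_t> const& pieceSizes)
{
    if (axis >= shape.size())
    {
        return {};
    }
    int64_t total = 0;
    for (int64_t piece : pieceSizes)
    {
        if (piece <= 0)
        {
            return {};
        }
        total += piece;
    }
    if (total != shape[axis])
    {
        return {};
    }

    std::vector<SliceParams> result;
    int64_t offset = 0;
    for (int64_t piece : pieceSizes)
    {
        SliceParams params;
        params.start.assign(shape.size(), 0);  // Begin at the origin...
        params.size = shape;                   // ...cover every full dimension...
        params.stride.assign(shape.size(), 1); // ...with unit stride.
        params.start[axis] = offset;           // Shift along the split axis only.
        params.size[axis] = piece;
        offset += piece;
        result.push_back(std::move(params));
    }
    return result;
}

// Usage: split 6 channels of a 1x6x4x4 tensor into pieces of 2 and 4.
void demoSplitToSlices()
{
    std::vector<SliceParams> const slices = splitToSlices({1, 6, 4, 4}, 1, {2, 4});
    assert(slices.size() == 2);
    assert(slices[0].start[1] == 0 && slices[0].size[1] == 2);
    assert(slices[1].start[1] == 2 && slices[1].size[1] == 4);
}
```

Each SliceParams entry corresponds to one slice layer on the same input tensor. Shapes with dynamic dimensions would need a different approach and are not covered by this sketch.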
