Runtime¶
- tensorrt_rtx.TempfileControlFlag¶
Flags used to control TensorRT-RTX’s behavior when creating executable temporary files.
On some platforms, the TensorRT-RTX runtime may need to create files in a temporary directory, or use platform-specific APIs to create files in memory, in order to load temporary DLLs that implement runtime code. These flags allow the application to explicitly control TensorRT-RTX's use of these files. Disallowing both options will preclude the use of certain TensorRT-RTX APIs for deserializing and loading lean runtimes.
These should be treated as bit offsets, e.g. in order to allow in-memory files for a given IRuntime:
runtime.tempfile_control_flags |= (1 << int(TempfileControlFlag.ALLOW_IN_MEMORY_FILES))
Members:
ALLOW_IN_MEMORY_FILES : Allow creating and loading files in-memory (or unnamed files).
ALLOW_TEMPORARY_FILES : Allow creating and loading named files in a temporary directory on the filesystem.
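As a sketch (the Logger helper is an assumption that mirrors the standard TensorRT Python API), an application running in an environment that forbids writable temporary directories could permit only in-memory files:

import tensorrt_rtx as trtx

logger = trtx.Logger(trtx.Logger.WARNING)  # assumed Logger helper; any ILogger implementation works
runtime = trtx.Runtime(logger)

# Permit only in-memory files: set the ALLOW_IN_MEMORY_FILES bit and leave ALLOW_TEMPORARY_FILES cleared.
runtime.tempfile_control_flags = 1 << int(trtx.TempfileControlFlag.ALLOW_IN_MEMORY_FILES)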
- class tensorrt_rtx.Runtime(self: tensorrt_rtx.tensorrt_rtx.Runtime, logger: tensorrt_rtx.tensorrt_rtx.ILogger)¶
Allows a serialized ICudaEngine to be deserialized.
- Variables:
error_recorder – IErrorRecorder Application-implemented error reporting interface for TensorRT-RTX objects.
gpu_allocator – IGpuAllocator The GPU allocator to be used by the Runtime. All GPU memory acquired will use this allocator. If set to None, the default allocator will be used (Default: cudaMalloc/cudaFree).
DLA_core – int The DLA core that the engine executes on. Must be between 0 and N-1, where N is the number of available DLA cores.
num_DLA_cores – int The number of DLA engines available to this runtime.
logger – ILogger The logger provided when creating the runtime.
max_threads – int The maximum number of threads that can be used by the Runtime.
temporary_directory – str The temporary directory to use when loading executable code for engines. If set to None (the default), TensorRT-RTX will attempt to find a suitable directory using platform-specific heuristics: on UNIX/Linux platforms it first tries the TMPDIR environment variable and then falls back to /tmp; on Windows it tries the TEMP environment variable.
tempfile_control_flags – int Flags which control whether TensorRT-RTX is allowed to create in-memory or temporary files. See TempfileControlFlag for details.
engine_host_code_allowed – bool Whether this runtime is allowed to deserialize engines that contain host executable code (Default: False).
- Parameters:
logger – The logger to use.
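As a minimal sketch of constructing and configuring a Runtime (the Logger helper is assumed to mirror the standard TensorRT Python API, and the directory path is purely illustrative):

import tensorrt_rtx as trtx

logger = trtx.Logger(trtx.Logger.WARNING)   # assumed Logger helper; any ILogger implementation works
runtime = trtx.Runtime(logger)

runtime.max_threads = 4                     # limit the worker threads used by the runtime
runtime.engine_host_code_allowed = True     # allow engines containing host executable code
runtime.temporary_directory = "/var/tmp"    # illustrative path; None uses the platform heuristics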
- __del__(self: tensorrt_rtx.tensorrt_rtx.Runtime) → None¶
- __exit__(exc_type, exc_value, traceback)¶
Context managers are deprecated and have no effect. Objects are automatically freed when the reference count reaches 0.
- __init__(self: tensorrt_rtx.tensorrt_rtx.Runtime, logger: tensorrt_rtx.tensorrt_rtx.ILogger) → None¶
- Parameters:
logger – The logger to use.
- deserialize_cuda_engine(*args, **kwargs)¶
Overloaded function.
deserialize_cuda_engine(self: tensorrt_rtx.tensorrt_rtx.Runtime, serialized_engine: buffer) -> tensorrt_rtx.tensorrt_rtx.ICudaEngine
Deserialize an ICudaEngine from host memory.
- arg serialized_engine:
The buffer that holds the serialized ICudaEngine.
- returns:
The ICudaEngine, or None if it could not be deserialized.
deserialize_cuda_engine(self: tensorrt_rtx.tensorrt_rtx.Runtime, stream_reader_v2: tensorrt_rtx.tensorrt_rtx.IStreamReaderV2) -> tensorrt_rtx.tensorrt_rtx.ICudaEngine
Deserialize an ICudaEngine from a stream reader v2.
- arg stream_reader_v2:
The PyStreamReaderV2 that will read the serialized ICudaEngine. This enables deserialization directly from a file, with possible performance benefits.
- returns:
The ICudaEngine, or None if it could not be deserialized.
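As an illustrative sketch of both overloads (the file name is hypothetical and the Logger helper is assumed), an engine can be deserialized either from a host-memory buffer or through a stream reader:

import tensorrt_rtx as trtx

logger = trtx.Logger(trtx.Logger.WARNING)  # assumed Logger helper
runtime = trtx.Runtime(logger)

# Path 1: read the serialized plan into host memory, then deserialize the buffer.
with open("model.engine", "rb") as f:      # hypothetical plan file
    engine = runtime.deserialize_cuda_engine(f.read())
if engine is None:
    raise RuntimeError("engine could not be deserialized")

# Path 2: deserialize through a stream reader, letting TensorRT-RTX pull the plan
# from disk directly. `reader` is assumed to be an application-provided
# IStreamReaderV2 implementation (not shown here).
# engine = runtime.deserialize_cuda_engine(reader)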
- get_engine_validity(self: tensorrt_rtx.tensorrt_rtx.Runtime, buffer: buffer) → tuple¶
Get engine validity and diagnostics from a serialized engine buffer. Returns a tuple of (EngineValidity, diagnostics).
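A minimal sketch of checking a plan before attempting deserialization (the file name is hypothetical and the Logger helper is assumed; the returned EngineValidity value is simply printed here rather than compared against specific members):

import tensorrt_rtx as trtx

logger = trtx.Logger(trtx.Logger.WARNING)  # assumed Logger helper
runtime = trtx.Runtime(logger)

with open("model.engine", "rb") as f:      # hypothetical plan file
    plan = f.read()

validity, diagnostics = runtime.get_engine_validity(plan)
print(validity, diagnostics)               # inspect the verdict before committing to deserialization
engine = runtime.deserialize_cuda_engine(plan)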
- get_plugin_registry(self: tensorrt_rtx.tensorrt_rtx.Runtime) → tensorrt_rtx.tensorrt_rtx.IPluginRegistry¶
Get the local plugin registry that can be used by the runtime.
- Returns:
The local plugin registry that can be used by the runtime.
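For example, a brief sketch of inspecting the registry (the `all_creators` attribute is an assumption borrowed from the standard TensorRT Python API; `runtime` is constructed as in the earlier sketches):

registry = runtime.get_plugin_registry()
for creator in registry.all_creators:      # assumed attribute name, mirroring standard TensorRT
    print(creator.name)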
- load_runtime(self: tensorrt_rtx.tensorrt_rtx.Runtime, path: str) → tensorrt_rtx.tensorrt_rtx.Runtime¶
Load an IRuntime from a file.
This method loads a runtime library from a shared library file. The runtime can then be used to execute a plan file built with both BuilderFlag.VERSION_COMPATIBLE and BuilderFlag.EXCLUDE_LEAN_RUNTIME set, using the same version of TensorRT-RTX as the loaded runtime library.
- Parameters:
path – Path to the lean runtime library.
- Returns:
The IRuntime, or None if it could not be loaded.
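A hedged sketch of the intended flow (the library path and plan file name are hypothetical, and the Logger helper is assumed):

import tensorrt_rtx as trtx

logger = trtx.Logger(trtx.Logger.WARNING)                 # assumed Logger helper
runtime = trtx.Runtime(logger)

# Load a lean runtime shipped as a shared library (path is illustrative).
lean_runtime = runtime.load_runtime("/opt/tensorrt_rtx/lean/libtensorrt_lean.so")
if lean_runtime is None:
    raise RuntimeError("lean runtime could not be loaded")

# Deserialize a plan built with BuilderFlag.VERSION_COMPATIBLE and
# BuilderFlag.EXCLUDE_LEAN_RUNTIME using the matching TensorRT-RTX version.
with open("model_vc.engine", "rb") as f:                  # hypothetical plan file
    engine = lean_runtime.deserialize_cuda_engine(f.read())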