Example Tensor transmitter operator.

On each tick, it transmits a single tensor on the “out” port.

This operator is intended for use in test cases and example applications.

==Named Outputs==

out : nvidia::gxf::Tensor. A generated 1D (H), 2D (HW), 3D (HWC) or 4D (NHWC) tensor, either initialized with the data provided or left uninitialized. Depending on the parameters set, this tensor can be in system memory, pinned host memory or device memory. Setting batch_size, columns or channels to 0 omits the corresponding dimension. Notation used: N = batch, H = rows, W = columns, C = channels.
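The dimension-omission rule above can be sketched in a few lines of standalone C++. This is an illustration only, not the operator's actual implementation: make_shape is a hypothetical helper name, and the use of std::int64_t for the extents is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Holoscan): builds the tensor shape in
// N, H, W, C order, skipping batch_size, columns or channels when they are 0.
// rows is always kept, because the documentation requires rows >= 1.
inline std::vector<std::int64_t> make_shape(std::int64_t batch_size,
                                            std::int64_t rows,
                                            std::int64_t columns,
                                            std::int64_t channels) {
  assert(rows >= 1 && "rows must be >= 1");
  std::vector<std::int64_t> shape;
  if (batch_size > 0) shape.push_back(batch_size);  // N
  shape.push_back(rows);                            // H
  if (columns > 0) shape.push_back(columns);        // W
  if (channels > 0) shape.push_back(channels);      // C
  return shape;
}
```

With the documented defaults (batch_size = 0, rows = 32, columns = 64, channels = 0) this sketch yields a 2D shape of (32, 64).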



==Parameters==

allocator : The memory allocator to use. When not set, a default UnboundedAllocator is used.

storage_type : A string indicating where the memory should be allocated. Options are “system” (system/CPU memory), “host” (CUDA pinned host memory) or “device” (GPU memory). The allocator takes care of allocating memory of the indicated type. The default is “system”.

batch_size : Size of the batch dimension of the generated tensor. If set to 0, this dimension is omitted. The default is 0.

rows : The number of rows in the generated tensor. This dimension must be >= 1. The default is 32.

columns : The number of columns in the generated tensor. If set to 0, this dimension is omitted. The default is 64.

channels : The number of channels in the generated tensor. If set to 0, this dimension is omitted. The default is 0.

data_type : A string representing the data type for the generated tensor. Must be one of “int8_t”, “int16_t”, “int32_t”, “int64_t”, “uint8_t”, “uint16_t”, “uint32_t”, “uint64_t”, “float”, “double”, “complex<float>”, or “complex<double>”. The default is “uint8_t”.

tensor_name : The name of the generated tensor. The default name is “tensor”.

data : The data to be transmitted. If provided, the tensor will be initialized with this data.
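To make the data type options concrete, the following standalone sketch maps each documented string to the size of one element. It assumes that every string corresponds to the C++ type of the same name (for example, “complex<float>” to std::complex<float>); element_size_bytes here is a hypothetical helper, not a Holoscan function.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of Holoscan): returns the size in bytes of a
// single element for a documented data type string, or std::nullopt when the
// string is not one of the documented options.
inline std::optional<std::size_t> element_size_bytes(std::string_view data_type) {
  if (data_type == "int8_t") return sizeof(std::int8_t);
  if (data_type == "int16_t") return sizeof(std::int16_t);
  if (data_type == "int32_t") return sizeof(std::int32_t);
  if (data_type == "int64_t") return sizeof(std::int64_t);
  if (data_type == "uint8_t") return sizeof(std::uint8_t);
  if (data_type == "uint16_t") return sizeof(std::uint16_t);
  if (data_type == "uint32_t") return sizeof(std::uint32_t);
  if (data_type == "uint64_t") return sizeof(std::uint64_t);
  if (data_type == "float") return sizeof(float);
  if (data_type == "double") return sizeof(double);
  if (data_type == "complex<float>") return sizeof(std::complex<float>);
  if (data_type == "complex<double>") return sizeof(std::complex<double>);
  return std::nullopt;  // unknown or misspelled data type string
}

// Small self-check: the documented default type occupies one byte per element.
inline void check_default_type() {
  assert(element_size_bytes("uint8_t").value() == 1);
}
```

Returning std::optional rather than throwing is a design choice for the sketch; it lets a caller reject a misspelled string such as “complex<float” before any memory is allocated.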

==Device Memory Requirements==

When using this operator with a BlockMemoryPool, the minimum block_size is (batch_size * rows * columns * channels * element_size_bytes), where element_size_bytes is the number of bytes for a single element of the specified data_type. Only a single memory block is used.
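The block size formula can be worked through with a short standalone sketch. One assumption is made loudly here: a dimension that is omitted (set to 0) is treated as a factor of 1, since a literal product would otherwise be zero. min_block_size is a hypothetical helper, not a Holoscan or GXF function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Holoscan or GXF): computes the minimum
// block size in bytes for the generated tensor.
// Assumption: a dimension set to 0 is omitted and contributes a factor of 1.
inline std::size_t min_block_size(std::int64_t batch_size,
                                  std::int64_t rows,
                                  std::int64_t columns,
                                  std::int64_t channels,
                                  std::size_t element_size_bytes) {
  assert(rows >= 1 && "rows must be >= 1");
  // Map an omitted dimension (0) to 1 so it does not zero out the product.
  auto extent = [](std::int64_t d) {
    return d > 0 ? static_cast<std::size_t>(d) : std::size_t{1};
  };
  return extent(batch_size) * extent(rows) * extent(columns) *
         extent(channels) * element_size_bytes;
}
```

Under that assumption, the documented defaults (rows = 32, columns = 64, one byte per uint8_t element) need a block of at least 32 * 64 * 1 = 2048 bytes, and a 2 x 32 x 64 x 3 tensor of 4-byte elements needs 49152 bytes.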

==Notes==

When async_device_allocation is enabled, this operator allocates device memory asynchronously on a CUDA stream. The compute method may return before all GPU work has completed. Downstream operators that receive data from this operator should call op_input.receive_cuda_stream(<port_name>) to synchronize the CUDA stream with the downstream operator’s dedicated internal stream. This ensures proper synchronization before accessing the data. For more details on CUDA stream handling in Holoscan, see: https://docs.nvidia.com/holoscan/sdk-user-guide/holoscan_cuda_stream_handling.html

==Public Functions==

HOLOSCAN_OPERATOR_FORWARD_ARGS(PingTensorTxOp)

PingTensorTxOp() = default

virtual void initialize() override

Initialize the operator. This function is called when the fragment is initialized by Executor::initialize_fragment().

virtual void setup(OperatorSpec& spec) override

Define the operator specification.

Parameters

spec – The reference to the operator specification.

virtual void compute(InputContext& op_input, OutputContext& op_output, ExecutionContext& context) override

Implement the compute method. The runtime calls this method repeatedly until the operator is stopped.

Parameters

op_input – The input context of the operator.

op_output – The output context of the operator.

context – The execution context of the operator.