holoscan::Tensor

Tensor class.

A Tensor is a multi-dimensional array of elements of a single data type.

The Tensor class is a wrapper around the DLManagedTensorContext struct that holds the DLManagedTensor object (https://dmlc.github.io/dlpack/latest/c_api.html#c.DLManagedTensor).

This class provides the primary interface for accessing Tensor data and is interoperable with other frameworks that support DLManagedTensor.

#include <holoscan/tensor.hpp>

Constructors

Tensor

Default
Overload 2
From raw pointer (with dl managed tensor ptr)
From raw pointer (with dl managed tensor ver ptr)
holoscan::Tensor::Tensor() = default

Destructor

~Tensor

virtual holoscan::Tensor::~Tensor() = default

Methods

data

void * holoscan::Tensor::data() const

Get a pointer to the underlying data.

Returns: The pointer to the Tensor’s data.

device

DLDevice holoscan::Tensor::device() const

Get the device information of the Tensor.

Returns: The device information of the Tensor.

dtype

DLDataType holoscan::Tensor::dtype() const

Get the Tensor’s data type information.

For details of the DLDataType struct see the DLPack documentation: https://dmlc.github.io/dlpack/latest/c_api.html#_CPPv410DLDataType

Returns: The DLDataType struct containing DLPack dtype information for the tensor.

shape

std::vector<int64_t> holoscan::Tensor::shape() const

Get the shape of the Tensor data.

Returns: The vector containing the Tensor’s shape.

strides

std::vector<int64_t> holoscan::Tensor::strides() const

Get the strides of the Tensor data.

Note that, unlike DLTensor.strides, the strides this method returns are in number of bytes, not elements (to be consistent with NumPy/CuPy’s strides).

Returns: The vector containing the Tensor’s strides.
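The difference between element strides (the DLTensor convention) and byte strides (the convention described above) can be illustrated without the SDK. The following is a minimal sketch using plain std::vector; the helper names row_major_element_strides and element_to_byte_strides are hypothetical and are not part of the Holoscan API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Holoscan function): element strides of a
// row-major array with the given shape, as DLTensor.strides would express them.
std::vector<int64_t> row_major_element_strides(const std::vector<int64_t>& shape) {
  std::vector<int64_t> strides(shape.size());
  int64_t step = 1;
  // Walk from the innermost (last) dimension outwards.
  for (std::size_t i = shape.size(); i-- > 0;) {
    strides[i] = step;
    step *= shape[i];
  }
  return strides;
}

// Hypothetical helper (not a Holoscan function): convert element strides to
// NumPy/CuPy-style byte strides by scaling with the element size in bytes.
std::vector<int64_t> element_to_byte_strides(const std::vector<int64_t>& element_strides,
                                             int64_t itemsize) {
  assert(itemsize > 0);
  std::vector<int64_t> byte_strides;
  byte_strides.reserve(element_strides.size());
  for (int64_t stride : element_strides) {
    byte_strides.push_back(stride * itemsize);
  }
  return byte_strides;
}
```

For a row-major float32 array of shape (2, 3, 4), the element strides are (12, 4, 1) and the corresponding byte strides are (48, 16, 4).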

is_contiguous

bool holoscan::Tensor::is_contiguous() const

Check if the tensor has a contiguous, row-major memory layout.

Returns: true if the tensor is contiguous, false otherwise.
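One way to test for a contiguous, row-major layout is to compare byte strides against the strides a densely packed array of the same shape would have. The sketch below assumes byte strides as returned by strides(); the function name is_row_major_contiguous is hypothetical, and this is not necessarily how the SDK implements the check. It compares every dimension strictly; whether the SDK relaxes the check for dimensions of extent 1 is not stated in this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Holoscan function): returns true when the byte
// strides match those of a densely packed row-major array of this shape.
bool is_row_major_contiguous(const std::vector<int64_t>& shape,
                             const std::vector<int64_t>& byte_strides,
                             int64_t itemsize) {
  assert(shape.size() == byte_strides.size());
  int64_t expected = itemsize;  // the innermost dimension steps by one element
  for (std::size_t i = shape.size(); i-- > 0;) {
    if (byte_strides[i] != expected) {
      return false;
    }
    expected *= shape[i];  // the next outer dimension skips a full inner block
  }
  return true;
}
```

With 4-byte elements and shape (2, 3), byte strides (12, 4) pass the check, while column-major strides (4, 8) or the strides (24, 8) of a sliced view do not.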

size

int64_t holoscan::Tensor::size() const

Get the size (number of elements) in the Tensor.

The size is defined as the number of elements, not the number of bytes. For the latter, see nbytes().

If the underlying DLDataType contains multiple lanes, all lanes are considered as a single element. For example, a float4 vectorized type is counted as a single element, not four elements.

Returns: The size of the tensor in number of elements.
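The element count described above depends only on the shape, never on the lane count of the data type. A minimal sketch, assuming the count is the product of the shape extents; the helper name element_count is hypothetical and not part of the Holoscan API:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Holoscan function): number of elements implied
// by a shape. Lanes of a vectorized dtype such as float4 do not enter here,
// so each float4 counts as one element.
int64_t element_count(const std::vector<int64_t>& shape) {
  int64_t count = 1;
  for (int64_t extent : shape) {
    assert(extent >= 0);
    count *= extent;
  }
  return count;
}
```

Under this assumption, a 480 x 640 tensor has 307200 elements whether each element is a float or a float4.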

ndim

int32_t holoscan::Tensor::ndim() const

Get the number of dimensions of the Tensor.

Returns: The number of dimensions.

itemsize

uint8_t holoscan::Tensor::itemsize() const

Get the size in bytes of a single Tensor data element.

If the underlying DLDataType contains multiple lanes, itemsize takes this into account. For example, a Tensor containing (vectorized) float4 elements would have itemsize 16, not 4.

Returns: The itemsize of the Tensor’s data.
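The float4 example can be reproduced from the bits and lanes fields of a DLPack data type. The sketch below uses a local stand-in struct, DataTypeInfo, that mirrors the code, bits, and lanes fields of DLDataType so that it compiles without the DLPack header. Both DataTypeInfo and itemsize_bytes are hypothetical names, and the rounding-up formula is an assumption consistent with the example above, not a statement of the SDK's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in mirroring the fields of DLPack's DLDataType (hypothetical
// type, defined here only to keep the example self-contained).
struct DataTypeInfo {
  uint8_t code;    // type code, e.g. 2 for floating point in DLPack
  uint8_t bits;    // bits per lane, e.g. 32 for float
  uint16_t lanes;  // number of lanes, e.g. 4 for float4
};

// Hypothetical helper (not a Holoscan function): bytes per element, counting
// all lanes and rounding up to a whole number of bytes.
int64_t itemsize_bytes(const DataTypeInfo& dtype) {
  assert(dtype.bits > 0 && dtype.lanes > 0);
  return (static_cast<int64_t>(dtype.bits) * dtype.lanes + 7) / 8;
}
```

A 32-bit float with 4 lanes yields 16 bytes per element, and the same type with 1 lane yields 4.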

nbytes

int64_t holoscan::Tensor::nbytes() const

Get the total number of bytes for the Tensor’s data.

Returns: The size of the Tensor’s data in bytes.

to_dlpack

DLManagedTensor * holoscan::Tensor::to_dlpack() const

Get a DLPack managed tensor pointer to the Tensor.

Returns: A DLManagedTensor* pointer corresponding to the Tensor.

to_dlpack_versioned

DLManagedTensorVersioned * holoscan::Tensor::to_dlpack_versioned() const

Get a DLPack versioned managed tensor pointer to the Tensor.

Returns: A DLManagedTensorVersioned* pointer corresponding to the Tensor.

dl_ctx

std::shared_ptr<DLManagedTensorContext> & holoscan::Tensor::dl_ctx()

Get the internal DLManagedTensorContext of the Tensor.

Returns: A shared pointer to the Tensor’s DLManagedTensorContext.

set_deallocation_stream

bool holoscan::Tensor::set_deallocation_stream(
cudaStream_t stream
)

Set the CUDA stream for stream-aware memory deallocation.

For sink operators that don’t emit data, this method should be called with the operator’s working CUDA stream to ensure allocators (like BlockMemoryPool) defer memory reuse until GPU operations on the stream complete. This prevents race conditions where memory is returned to the pool while GPU kernels are still reading from it.

This method only works for tensors whose memory is managed by a Holoscan/GXF allocator (i.e., tensors received from upstream operators in the pipeline). For tensors created from external sources via the DLPack interface (e.g., from CuPy or PyTorch), this method returns false and has no effect.

Returns: true if the stream was set successfully, false if the tensor’s memory is not managed by a Holoscan/GXF allocator.

Parameters

stream
cudaStream_t

The CUDA stream that last accessed this tensor’s data.


Member variables

Name | Type | Description
dl_ctx_ | std::shared_ptr<DLManagedTensorContext> | The DLManagedTensorContext object.
memory_buffer_ptr_ | nvidia::gxf::MemoryBuffer * | Pointer to the underlying MemoryBuffer when the tensor is from a GXF allocator.