Holoscan SDK v4.0.0

Class Tensor

class Tensor

Tensor class.

A Tensor is a multi-dimensional array of elements of a single data type.

The Tensor class is a wrapper around the DLManagedTensorContext struct, which holds the DLManagedTensor object (https://dmlc.github.io/dlpack/latest/c_api.html#c.DLManagedTensor).

This class provides a primary interface to access Tensor data and is interoperable with other frameworks that support DLManagedTensor.

Public Functions

Tensor() = default
inline explicit Tensor(std::shared_ptr<DLManagedTensorContext> &ctx, nvidia::gxf::MemoryBuffer *memory_buffer_ptr = nullptr)

Construct a new Tensor from an existing DLManagedTensorContext.

Parameters
  • ctx – A shared pointer to the DLManagedTensorContext to be used in Tensor construction.

  • memory_buffer_ptr – Optional pointer to the underlying nvidia::gxf::MemoryBuffer. When provided (for tensors from GXF allocators), enables stream-aware deallocation via set_deallocation_stream(). Pass nullptr for external DLPack tensors.

explicit Tensor(DLManagedTensor *dl_managed_tensor_ptr)

Construct a new Tensor from an existing DLManagedTensor pointer.

Parameters

dl_managed_tensor_ptr – A pointer to the DLManagedTensor to be used in Tensor construction.

explicit Tensor(DLManagedTensorVersioned *dl_managed_tensor_ver_ptr)

Construct a new Tensor from an existing DLManagedTensorVersioned pointer.

Note that currently holoscan::Tensor does not support versioned tensors from the C++ API, so any version information and flags from DLPack >= 1.0 will not be stored.

Parameters

dl_managed_tensor_ver_ptr – A pointer to the DLManagedTensorVersioned to be used in Tensor construction.

virtual ~Tensor() = default
inline void *data() const

Get a pointer to the underlying data.

Returns

The pointer to the Tensor’s data.

inline DLDevice device() const

Get the device information of the Tensor.

Returns

The device information of the Tensor.

inline DLDataType dtype() const

Get the Tensor’s data type information.

For details of the DLDataType struct see the DLPack documentation: https://dmlc.github.io/dlpack/latest/c_api.html#_CPPv410DLDataType

Returns

The DLDataType struct containing DLPack dtype information for the tensor.

std::vector<int64_t> shape() const

Get the shape of the Tensor data.

Returns

The vector containing the Tensor’s shape.

std::vector<int64_t> strides() const

Get the strides of the Tensor data.

Note that, unlike DLTensor.strides, the strides returned by this method are in bytes, not elements (for consistency with NumPy/CuPy strides).

Returns

The vector containing the Tensor’s strides.

bool is_contiguous() const

Check if the tensor has a contiguous, row-major memory layout.

Returns

true if the tensor is contiguous, false otherwise.

int64_t size() const

Get the size (number of elements) in the Tensor.

The size is defined as the number of elements, not the number of bytes. For the latter, see nbytes.

If the underlying DLDataType contains multiple lanes, all lanes are considered as a single element. For example, a float4 vectorized type is counted as a single element, not four elements.

Returns

The size of the tensor in number of elements.

inline int32_t ndim() const

Get the number of dimensions of the Tensor.

Returns

The number of dimensions.

inline uint8_t itemsize() const

Get the itemsize of a single Tensor data element.

If the underlying DLDataType contains multiple lanes, itemsize takes this into account. For example, a Tensor containing (vectorized) float4 elements would have itemsize 16, not 4.

Returns

The itemsize of the Tensor’s data.

inline int64_t nbytes() const

Get the total number of bytes for the Tensor’s data.

Returns

The size of the Tensor’s data in bytes.

DLManagedTensor *to_dlpack()

Get a DLPack managed tensor pointer to the Tensor.

Returns

A DLManagedTensor* pointer corresponding to the Tensor.

DLManagedTensorVersioned *to_dlpack_versioned()

Get a DLPack versioned managed tensor pointer to the Tensor.

Returns

A DLManagedTensorVersioned* pointer corresponding to the Tensor.

inline std::shared_ptr<DLManagedTensorContext> &dl_ctx()

Get the internal DLManagedTensorContext of the Tensor.

Returns

A shared pointer to the Tensor’s DLManagedTensorContext.

bool set_deallocation_stream(cudaStream_t stream)

Set the CUDA stream for stream-aware memory deallocation.

For sink operators that don’t emit data, this method should be called with the operator’s working CUDA stream to ensure allocators (like BlockMemoryPool) defer memory reuse until GPU operations on the stream complete. This prevents race conditions where memory is returned to the pool while GPU kernels are still reading from it.

This method only works for tensors whose memory is managed by a Holoscan/GXF allocator (i.e., tensors received from upstream operators in the pipeline). For tensors created from external sources via the DLPack interface (e.g., from CuPy or PyTorch), this method returns false and has no effect.

Parameters

stream – The CUDA stream that last accessed this tensor’s data.

Returns

true if the stream was set successfully, false if the tensor’s memory is not managed by a Holoscan/GXF allocator.

Protected Attributes

std::shared_ptr<DLManagedTensorContext> dl_ctx_

The DLManagedTensorContext object.

nvidia::gxf::MemoryBuffer *memory_buffer_ptr_ = nullptr

Pointer to the underlying MemoryBuffer when tensor is from a GXF allocator. nullptr for external DLPack tensors (from CuPy, PyTorch, etc.). This enables stream-aware deallocation via set_deallocation_stream().

© Copyright 2022-2026, NVIDIA. Last updated on Mar 9, 2026