Class Tensor
Defined in File tensor.hpp
-
class Tensor
Tensor class.
A Tensor is a multi-dimensional array of elements of a single data type.
The Tensor class is a wrapper around the DLManagedTensorContext struct that holds the DLManagedTensor object. (https://dmlc.github.io/dlpack/latest/c_api.html#c.DLManagedTensor).
This class provides a primary interface to access Tensor data and is interoperable with other frameworks that support DLManagedTensor.
Public Functions
-
Tensor() = default
Construct a new, empty Tensor.
-
explicit Tensor(std::shared_ptr<DLManagedTensorContext> ctx, nvidia::gxf::MemoryBuffer *memory_buffer_ptr = nullptr)
Construct a new Tensor from an existing DLManagedTensorContext.
- Parameters
ctx – A shared pointer to the DLManagedTensorContext to be used in Tensor construction.
memory_buffer_ptr – Optional pointer to the underlying nvidia::gxf::MemoryBuffer. When provided (for tensors from GXF allocators), enables stream-aware deallocation via set_deallocation_stream(). Pass nullptr for external DLPack tensors.
-
explicit Tensor(DLManagedTensor *dl_managed_tensor_ptr)
Construct a new Tensor from an existing DLManagedTensor pointer.
- Parameters
dl_managed_tensor_ptr – A pointer to the DLManagedTensor to be used in Tensor construction.
-
explicit Tensor(DLManagedTensorVersioned *dl_managed_tensor_ver_ptr)
Construct a new Tensor from an existing DLManagedTensorVersioned pointer.
Note that currently holoscan::Tensor does not support versioned tensors from the C++ API, so any version information and flags from DLPack >= 1.0 will not be stored.
- Parameters
dl_managed_tensor_ver_ptr – A pointer to the DLManagedTensorVersioned to be used in Tensor construction.
-
virtual ~Tensor() = default
-
inline void *data() const
Get a pointer to the underlying data.
- Returns
The pointer to the Tensor’s data.
-
inline DLDevice device() const
Get the device information of the Tensor.
- Returns
The device information of the Tensor.
-
inline DLDataType dtype() const
Get the Tensor’s data type information.
For details of the DLDataType struct see the DLPack documentation: https://dmlc.github.io/dlpack/latest/c_api.html#_CPPv410DLDataType
- Returns
The DLDataType struct containing DLPack dtype information for the tensor.
-
std::vector<int64_t> shape() const
Get the shape of the Tensor data.
- Returns
The vector containing the Tensor’s shape.
-
std::vector<int64_t> strides() const
Get the strides of the Tensor data.
Note that, unlike DLTensor.strides, the strides this method returns are in number of bytes, not elements (to be consistent with NumPy/CuPy’s strides).
- Returns
The vector containing the Tensor’s strides.
-
bool is_contiguous() const
Check if the tensor has a contiguous, row-major memory layout.
- Returns
true if the tensor is contiguous, false otherwise.
-
int64_t size() const
Get the size (number of elements) in the Tensor.
The size is defined as the number of elements, not the number of bytes. For the latter, see nbytes.
If the underlying DLDataType contains multiple lanes, all lanes are considered as a single element. For example, a float4 vectorized type is counted as a single element, not four elements.
- Returns
The size of the tensor in number of elements.
-
inline int32_t ndim() const
Get the number of dimensions of the Tensor.
- Returns
The number of dimensions.
-
inline uint8_t itemsize() const
Get the itemsize of a single Tensor data element.
If the underlying DLDataType contains multiple lanes, itemsize takes this into account. For example, a Tensor containing (vectorized) float4 elements would have itemsize 16, not 4.
- Returns
The itemsize of the Tensor’s data.
-
inline int64_t nbytes() const
Get the total number of bytes for the Tensor’s data.
- Returns
The size of the Tensor’s data in bytes.
-
DLManagedTensor *to_dlpack()
Get a DLPack managed tensor pointer to the Tensor.
- Returns
A DLManagedTensor* pointer corresponding to the Tensor.
-
DLManagedTensorVersioned *to_dlpack_versioned()
Get a DLPack versioned managed tensor pointer to the Tensor.
- Returns
A DLManagedTensorVersioned* pointer corresponding to the Tensor.
-
inline std::shared_ptr<DLManagedTensorContext> &dl_ctx()
Get the internal DLManagedTensorContext of the Tensor.
- Returns
A shared pointer to the Tensor’s DLManagedTensorContext.
-
bool set_deallocation_stream(cudaStream_t stream)
Set the CUDA stream for stream-aware memory deallocation.
For sink operators that don’t emit data, this method should be called with the operator’s working CUDA stream to ensure allocators (like BlockMemoryPool) defer memory reuse until GPU operations on the stream complete. This prevents race conditions where memory is returned to the pool while GPU kernels are still reading from it.
This method only works for tensors whose memory is managed by a Holoscan/GXF allocator (i.e., tensors received from upstream operators in the pipeline). For tensors created from external sources via the DLPack interface (e.g., from CuPy or PyTorch), this method returns false and has no effect.
- Parameters
stream – The CUDA stream that last accessed this tensor’s data.
- Returns
true if the stream was set successfully, false if the tensor’s memory is not managed by a Holoscan/GXF allocator.
Protected Attributes
-
std::shared_ptr<DLManagedTensorContext> dl_ctx_
The DLManagedTensorContext object.
-
nvidia::gxf::MemoryBuffer *memory_buffer_ptr_ = nullptr
Pointer to the underlying MemoryBuffer when tensor is from a GXF allocator. nullptr for external DLPack tensors (from CuPy, PyTorch, etc.). This enables stream-aware deallocation via set_deallocation_stream().