Tensor#

Multi-dimensional array descriptors and memory management for tensor data.

Overview#

The Tensor library provides lightweight data structures for describing and allocating multi-dimensional arrays (tensors) with type-safe memory management. It supports various numeric types including integers, floating-point numbers, and complex values.

Core Concepts#

Tensor Info#

TensorInfo describes tensor properties including data type and dimensions. It provides compatibility validation and element count calculation.

Creating a Tensor Descriptor#

// Create tensor descriptor for a 3D array of float32 values
const framework::tensor::TensorInfo tensor(framework::tensor::TensorR32F, {10, 20, 30});

// Get tensor properties
const auto type = tensor.get_type();
const auto &dimensions = tensor.get_dimensions();
const auto total_elements = tensor.get_total_elements();
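The element count of a dense tensor is the product of its dimension extents. The following is a minimal sketch of that calculation; `total_elements` is a hypothetical free function used for illustration only, and the library's actual implementation of `get_total_elements()` may differ (for example in how it treats an empty dimension list).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for TensorInfo::get_total_elements(); the library's
// own implementation is not shown in this document.
inline std::size_t total_elements(const std::vector<std::size_t> &dimensions) {
    // Multiply all extents together, starting from 1.
    return std::accumulate(dimensions.begin(), dimensions.end(), std::size_t{1},
                           std::multiplies<std::size_t>{});
}

inline void element_count_example() {
    // Same shape as the 3D descriptor above: 10 * 20 * 30 elements.
    assert(total_elements({10, 20, 30}) == 6000);
}
```

Note that this sketch returns 1 for an empty dimension list, which is a property of the sketch and not a documented behavior of TensorInfo.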

Compatibility Checking#

// Create two tensor descriptors
const framework::tensor::TensorInfo tensor1(framework::tensor::TensorR32F, {100, 200});
const framework::tensor::TensorInfo tensor2(framework::tensor::TensorR32F, {100, 200});
const framework::tensor::TensorInfo tensor3(framework::tensor::TensorC32F, {100, 200});

// Check compatibility
const auto compatible_same = tensor1.is_compatible_with(tensor2);
const auto compatible_different = tensor1.is_compatible_with(tensor3);
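This page does not spell out the exact compatibility rule. The sketch below assumes the simplest plausible one: two descriptors are compatible only when both the element type and every dimension match. `ElemType`, `Descriptor`, and `is_compatible` are hypothetical names introduced for illustration and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class ElemType { R32F, C32F }; // hypothetical miniature of NvDataType

struct Descriptor { // hypothetical stand-in for TensorInfo
    ElemType type;
    std::vector<std::size_t> dims;
};

// Assumed rule: compatible only when type and all dimensions are equal.
inline bool is_compatible(const Descriptor &a, const Descriptor &b) noexcept {
    return a.type == b.type && a.dims == b.dims;
}

inline void compatibility_example() {
    const Descriptor real_a{ElemType::R32F, {100, 200}};
    const Descriptor real_b{ElemType::R32F, {100, 200}};
    const Descriptor complex_c{ElemType::C32F, {100, 200}};
    assert(is_compatible(real_a, real_b));     // same type, same shape
    assert(!is_compatible(real_a, complex_c)); // same shape, different type
}
```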

Data Types#

NvDataType defines supported tensor element types for integers, floating-point values, and complex numbers with various precisions.

Type Names and Traits#

// Get string representation of data types
const auto *const float_name =
        framework::tensor::nv_get_data_type_string(framework::tensor::TensorR32F);
const auto *const complex_name =
        framework::tensor::nv_get_data_type_string(framework::tensor::TensorC32F);
const auto *const int_name =
        framework::tensor::nv_get_data_type_string(framework::tensor::TensorR32I);

The library provides compile-time type traits for mapping between NvDataType enumeration values and C++ types:

// Use type traits to get C++ types from tensor types
using FloatType = framework::tensor::data_type_traits<framework::tensor::TensorR32F>::Type;
using IntType = framework::tensor::data_type_traits<framework::tensor::TensorR32I>::Type;

// Reverse mapping from C++ types to tensor types
constexpr auto FLOAT_TYPE = framework::tensor::type_to_tensor_type<float>::VALUE;
constexpr auto INT_TYPE = framework::tensor::type_to_tensor_type<int>::VALUE;

// Check type sizes
const auto float_size = sizeof(FloatType);
const auto int_size = sizeof(IntType);
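To illustrate how such a pair of traits fits together, here is a small self-contained sketch of the pattern. `Kind`, `kind_traits`, and `type_to_kind` are hypothetical miniatures of NvDataType, data_type_traits, and type_to_tensor_type; they only mirror the shape of the library's templates, not their full contents.

```cpp
#include <type_traits>

enum class Kind { R32F, R32I }; // hypothetical miniature of NvDataType

// Forward mapping: enumeration value -> C++ type.
template <Kind K> struct kind_traits;
template <> struct kind_traits<Kind::R32F> { using Type = float; };
template <> struct kind_traits<Kind::R32I> { using Type = int; };

// Reverse mapping: C++ type -> enumeration value.
template <typename T> struct type_to_kind;
template <> struct type_to_kind<float> { static constexpr Kind VALUE = Kind::R32F; };
template <> struct type_to_kind<int> { static constexpr Kind VALUE = Kind::R32I; };

// True when mapping K to a type and back yields K again.
template <Kind K>
constexpr bool round_trips = type_to_kind<typename kind_traits<K>::Type>::VALUE == K;

static_assert(std::is_same<kind_traits<Kind::R32F>::Type, float>::value, "forward mapping");
static_assert(round_trips<Kind::R32F> && round_trips<Kind::R32I>, "round trip");
```

Because both mappings resolve at compile time, a mismatch between the two tables shows up as a failed `static_assert` and not as a runtime error.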

Storage Element Size#

// Get storage size for different data types
const auto float_size =
        framework::tensor::get_nv_type_storage_element_size(framework::tensor::TensorR32F);
const auto complex_size =
        framework::tensor::get_nv_type_storage_element_size(framework::tensor::TensorC32F);
const auto half_size =
        framework::tensor::get_nv_type_storage_element_size(framework::tensor::TensorR16F);

Tensor Arena#

TensorArena provides RAII-based memory allocation for tensor data with support for device and host-pinned memory.

Device Memory#

// Allocate device memory arena
const std::size_t arena_size = static_cast<std::size_t>(1024) * 1024; // 1 MiB
framework::tensor::TensorArena arena(arena_size, framework::tensor::MemoryType::Device);

// Get typed pointers to regions at byte offsets within the arena
auto *float_ptr = arena.allocate_at<float>(0);
auto *int_ptr = arena.allocate_at<int>(1024);

// Get arena properties
const auto total_bytes = arena.total_bytes();
const auto mem_type = arena.memory_type();
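Because regions are addressed by byte offset, placing several tensors in one arena means choosing offsets that do not overlap. The sketch below shows one way to plan such offsets; `align_up` and the 256-byte alignment are assumptions made for illustration and are not part of the library.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the library): round `offset` up to the next
// multiple of `alignment`, which must be a nonzero power of two.
constexpr std::size_t align_up(const std::size_t offset, const std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Plan two regions inside one arena: 100 floats followed by 50 ints.
constexpr std::size_t kAlignment = 256; // illustrative choice, an assumption
constexpr std::size_t float_offset = 0;
constexpr std::size_t int_offset = align_up(float_offset + 100 * sizeof(float), kAlignment);
constexpr std::size_t arena_bytes = int_offset + 50 * sizeof(int);

static_assert(int_offset % kAlignment == 0, "second region starts on an aligned boundary");
static_assert(int_offset >= 100 * sizeof(float), "regions do not overlap");
```

The resulting `arena_bytes` would be the size passed to the arena constructor, and `float_offset` and `int_offset` the offsets passed to `allocate_at`.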

Host Pinned Memory#

// Allocate pinned host memory arena for efficient CPU-GPU transfers
const std::size_t buffer_size = 4096;
framework::tensor::TensorArena host_arena(
        buffer_size, framework::tensor::MemoryType::HostPinned);

// Get raw memory pointer for CUDA operations
void *raw_memory = host_arena.raw_ptr();

const auto mem_type = host_arena.memory_type();

Complete Example#

// Create tensor descriptor
const framework::tensor::TensorInfo tensor_desc(framework::tensor::TensorR32F, {128, 256});
const auto element_count = tensor_desc.get_total_elements();
const auto element_size =
        framework::tensor::get_nv_type_storage_element_size(framework::tensor::TensorR32F);
const auto total_size = element_count * element_size;

// Allocate device memory for tensor
framework::tensor::TensorArena device_memory(total_size, framework::tensor::MemoryType::Device);
auto *tensor_data = device_memory.allocate_at<float>(0);

// Allocate host pinned memory for transfers
framework::tensor::TensorArena host_memory(
        total_size, framework::tensor::MemoryType::HostPinned);
auto *host_data = host_memory.allocate_at<float>(0);

Additional Examples#

For more examples, see:

framework/tensor/tests/tensor_sample_tests.cpp - Documentation examples and unit tests

API Reference#

enum framework::tensor::NvDataType#

Data type enumeration for NV operations.

This enumeration defines the supported data types for NV operations, including various integer, floating-point, and complex number formats. The values are compatible with CUDA library types where applicable.

See also

CUDA library types for compatibility information

Values:

enumerator TensorVoid#: uninitialized type

enumerator TensorBit#: 1-bit value

enumerator TensorR8I#: 8-bit signed integer real values

enumerator TensorC8I#: 8-bit signed integer complex values

enumerator TensorR8U#: 8-bit unsigned integer real values

enumerator TensorC8U#: 8-bit unsigned integer complex values

enumerator TensorR16I#: 16-bit signed integer real values

enumerator TensorC16I#: 16-bit signed integer complex values

enumerator TensorR16U#: 16-bit unsigned integer real values

enumerator TensorC16U#: 16-bit unsigned integer complex values

enumerator TensorR32I#: 32-bit signed integer real values

enumerator TensorC32I#: 32-bit signed integer complex values

enumerator TensorR32U#: 32-bit unsigned integer real values

enumerator TensorC32U#: 32-bit unsigned integer complex values

enumerator TensorR16F#: half precision (16-bit) real values

enumerator TensorC16F#: half precision (16-bit) complex values

enumerator TensorR32F#: single precision (32-bit) real values

enumerator TensorC32F#: single precision (32-bit) complex values

enumerator TensorR64F#: double precision (64-bit) real values

enumerator TensorC64F#: double precision (64-bit) complex values

enum class framework::tensor::MemoryType#

Memory allocation type for tensor arenas.

Values:

enumerator Device#: GPU device memory.

enumerator HostPinned#: CPU pinned (page-locked) memory.

enum class framework::tensor::TensorDimension : std::uint8_t#

Tensor dimension counts and limits

Values:

enumerator Dim1#: 1-D tensor dimension count

enumerator Dim2#: 2-D tensor dimension count

enumerator Dim3#: 3-D tensor dimension count

enumerator Dim4#: 4-D tensor dimension count

enumerator Dim5#: 5-D tensor dimension count

enumerator Max#: Maximum tensor dimensions supported.

constexpr const char *framework::tensor::nv_get_data_type_string(const NvDataType type) noexcept#

Get string representation of NV data type

This function returns a human-readable string representation of the given NV data type enumeration value. Useful for debugging, logging, and error reporting.

See also

NvDataType for available data type enumeration values

Example:

NvDataType type = TensorR32F;
const char* typeStr = nv_get_data_type_string(type);
printf("Data type: %s\n", typeStr); // Output: "Data type: TensorR32F"

Note

This function is marked [[nodiscard]] to encourage checking the return value

Note

The returned pointer points to static string literals and does not need to be freed

Parameters:

type – [in] The NV data type to convert to string

Return values:

TensorVoid – for uninitialized type
TensorBit – for 1-bit values
TensorR8I, TensorR8U, TensorR16I, TensorR16U, TensorR32I, TensorR32U – for integer real number types
TensorR16F, TensorR32F, TensorR64F – for floating-point real number types
TensorC8I, TensorC8U, TensorC16I, TensorC16U, TensorC32I, TensorC32U – for integer complex number types
TensorC16F, TensorC32F, TensorC64F – for floating-point complex number types
UNKNOWN_TYPE – for invalid or unrecognized data types

Returns:

const char* String representation of the data type
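A function of this kind is commonly written as a `constexpr` switch over the enumeration. The sketch below shows that pattern on a hypothetical three-value enum; `MiniType` and `mini_type_string` are illustrative names, and this is not the library's source.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical subset of a data type enum; the fixed underlying type keeps
// out-of-range values well defined when cast from an integer.
enum MiniType : int { MiniVoid, MiniR32F, MiniC32F };

[[nodiscard]] constexpr const char *mini_type_string(const MiniType type) noexcept {
    switch (type) {
    case MiniVoid: return "MiniVoid";
    case MiniR32F: return "MiniR32F";
    case MiniC32F: return "MiniC32F";
    }
    return "UNKNOWN_TYPE"; // reached only for values outside the enum
}

inline void type_string_example() {
    std::printf("Data type: %s\n", mini_type_string(MiniR32F));
    // An unrecognized value falls through to the fallback string.
    assert(std::strcmp(mini_type_string(static_cast<MiniType>(99)), "UNKNOWN_TYPE") == 0);
}
```

Returning string literals is what makes the result safe to keep without freeing it, as the note above describes.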

constexpr bool framework::tensor::type_is_sub_byte(const NvDataType type) noexcept#

Check if data type is sub-byte precision

Determines whether the given NV data type represents a sub-byte data type, meaning multiple values can be packed into a single byte. Currently only the TensorBit type is considered sub-byte.

See also

NvDataType for available data type enumeration values

See also

get_nv_type_storage_element_size for related storage size information

Parameters:

type – [in] The NV data type to check

Return values:

true – for TensorBit type (1-bit values)
false – for all other data types

Returns:

true if the type is sub-byte precision, false otherwise

std::size_t framework::tensor::get_nv_type_storage_element_size(const NvDataType type) noexcept#

Get storage element size for NV data type

Returns the size in bytes of the storage element for a given NV data type. In general, this is the size of the type used to store the given NvDataType. However, for sub-byte types, multiple elements are stored in a machine word. For these types, the returned size is the size of a machine word which stores multiple elements.

See also

NvDataType for available data type enumeration values

See also

type_is_sub_byte to check if a type is sub-byte precision

See also

data_type_traits for type mapping information

Note

For sub-byte types, the storage size represents the container size, not the logical element size

Parameters:

type – [in] The NV data type to get storage size for

Return values:

0 – for TensorVoid (uninitialized type)
1 – for 8-bit types (TensorR8I, TensorR8U)
2 – for 8-bit complex types (TensorC8I, TensorC8U) and 16-bit types (TensorR16I, TensorR16U, TensorR16F)
4 – for TensorBit (sub-byte type stored in uint32_t), 16-bit complex types (TensorC16I, TensorC16U, TensorC16F), and 32-bit types (TensorR32I, TensorR32U, TensorR32F)
8 – for 32-bit complex types (TensorC32I, TensorC32U, TensorC32F) and 64-bit real types (TensorR64F)
16 – for 64-bit complex types (TensorC64F)

Returns:

Size in bytes of the storage element
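The distinction between container size and logical element size matters when sizing a buffer. The sketch below contrasts the two cases; `dense_bytes` and `packed_bit_bytes` are hypothetical helpers, and the layout of 32 one-bit elements per 32-bit word is an assumption based on the uint32_t storage described above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes for `count` elements of a byte-addressable type.
constexpr std::size_t dense_bytes(const std::size_t count, const std::size_t element_size) {
    return count * element_size;
}

// Hypothetical helper: bytes for `count` 1-bit elements, assuming they are
// packed 32 per std::uint32_t word and the last word is padded.
constexpr std::size_t packed_bit_bytes(const std::size_t count) {
    constexpr std::size_t bits_per_word = 32;
    const std::size_t words = (count + bits_per_word - 1) / bits_per_word;
    return words * sizeof(std::uint32_t);
}

// A 128 x 256 tensor of 4-byte elements occupies 131072 bytes.
static_assert(dense_bytes(128 * 256, 4) == 131072, "dense size");
// 33 bits need two 32-bit words.
static_assert(packed_bit_bytes(33) == 8, "packed size");
```

Multiplying the element count by the storage element size is therefore only correct for byte-addressable types; for a sub-byte type it would overestimate the buffer by the packing factor.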

template<NvDataType T> struct data_type_traits#

Type traits for NV data types

Template structure providing type mapping from NvDataType enumeration values to their corresponding C++ types. Each specialization defines a Type alias that represents the actual C++ type used for storage and computation.

See also

NvDataType for available enumeration values

See also

type_to_tensor_type for reverse mapping from C++ types to NvDataType

Example:

using FloatType = data_type_traits<TensorR32F>::Type;   // Resolves to float
using ComplexType = data_type_traits<TensorC32F>::Type; // Resolves to cuda::std::complex<float>

Note

TensorBit uses std::uint32_t for storage as multiple bits are packed into a single word

Note

Integer and half-precision complex types use CUDA vector types (e.g., char2, int2, __half2); single- and double-precision complex types use cuda::std::complex

Template Parameters:: T – The NvDataType enumeration value

template<> struct data_type_traits<TensorBit>#

#include <data_types.hpp>

Type traits specialization for bit-packed tensor data.

Public Types

using Type = std::uint32_t#: 32-bit storage for packed bits

template<> struct data_type_traits<TensorC16F>#

#include <data_types.hpp>

Type traits specialization for 16-bit half-precision complex tensor data.

Public Types

using Type = __half2#: 16-bit half-precision complex

template<> struct data_type_traits<TensorC16I>#

#include <data_types.hpp>

Type traits specialization for 16-bit signed integer complex tensor data.

Public Types

using Type = short2#: 16-bit signed integer complex

template<> struct data_type_traits<TensorC16U>#

#include <data_types.hpp>

Type traits specialization for 16-bit unsigned integer complex tensor data.

Public Types

using Type = ushort2#: 16-bit unsigned integer complex

template<> struct data_type_traits<TensorC32F>#

#include <data_types.hpp>

Type traits specialization for 32-bit single-precision complex tensor data.

Public Types

using Type = cuda::std::complex<float>#: 32-bit single-precision complex

template<> struct data_type_traits<TensorC32I>#

#include <data_types.hpp>

Type traits specialization for 32-bit signed integer complex tensor data.

Public Types

using Type = int2#: 32-bit signed integer complex

template<> struct data_type_traits<TensorC32U>#

#include <data_types.hpp>

Type traits specialization for 32-bit unsigned integer complex tensor data.

Public Types

using Type = uint2#: 32-bit unsigned integer complex

template<> struct data_type_traits<TensorC64F>#

#include <data_types.hpp>

Type traits specialization for 64-bit double-precision complex tensor data.

Public Types

using Type = cuda::std::complex<double>#: 64-bit double-precision complex

template<> struct data_type_traits<TensorC8I>#

#include <data_types.hpp>

Type traits specialization for 8-bit signed integer complex tensor data.

Public Types

using Type = char2#: 8-bit signed integer complex

template<> struct data_type_traits<TensorC8U>#

#include <data_types.hpp>

Type traits specialization for 8-bit unsigned integer complex tensor data.

Public Types

using Type = uchar2#: 8-bit unsigned integer complex

template<> struct data_type_traits<TensorR16F>#

#include <data_types.hpp>

Type traits specialization for 16-bit half-precision real tensor data.

Public Types

using Type = __half#: 16-bit half-precision float

template<> struct data_type_traits<TensorR16I>#

#include <data_types.hpp>

Type traits specialization for 16-bit signed integer real tensor data.

Public Types

using Type = short#: 16-bit signed integer

template<> struct data_type_traits<TensorR16U>#

#include <data_types.hpp>

Type traits specialization for 16-bit unsigned integer real tensor data.

Public Types

using Type = unsigned short#: 16-bit unsigned integer

template<> struct data_type_traits<TensorR32F>#

#include <data_types.hpp>

Type traits specialization for 32-bit single-precision real tensor data.

Public Types

using Type = float#: 32-bit single-precision float

template<> struct data_type_traits<TensorR32I>#

#include <data_types.hpp>

Type traits specialization for 32-bit signed integer real tensor data.

Public Types

using Type = int#: 32-bit signed integer

template<> struct data_type_traits<TensorR32U>#

#include <data_types.hpp>

Type traits specialization for 32-bit unsigned integer real tensor data.

Public Types

using Type = unsigned int#: 32-bit unsigned integer

template<> struct data_type_traits<TensorR64F>#

#include <data_types.hpp>

Type traits specialization for 64-bit double-precision real tensor data.

Public Types

using Type = double#: 64-bit double-precision float

template<> struct data_type_traits<TensorR8I>#

#include <data_types.hpp>

Type traits specialization for 8-bit signed integer real tensor data.

Public Types

using Type = signed char#: 8-bit signed integer

template<> struct data_type_traits<TensorR8U>#

#include <data_types.hpp>

Type traits specialization for 8-bit unsigned integer real tensor data.

Public Types

using Type = unsigned char#: 8-bit unsigned integer

template<> struct data_type_traits<TensorVoid>#

#include <data_types.hpp>

Type traits specialization for void/uninitialized tensor data.

Public Types

using Type = void#: Void type for uninitialized.

class TensorArena#

#include <tensor_arena.hpp>

Type-safe memory arena for tensor data.

Provides type-safe allocation and access to tensor memory regions. Supports both device and pinned host memory allocation.

Public Functions

explicit TensorArena(std::size_t total_bytes, MemoryType memory_type = MemoryType::Device)#

Constructor - allocates memory arena.

Parameters:

total_bytes – [in] Total size of arena in bytes
memory_type – [in] Type of memory to allocate (Device or HostPinned)

~TensorArena()#: Destructor - cleanup memory.

TensorArena(const TensorArena&) = delete#

TensorArena &operator=(const TensorArena&) = delete#

TensorArena(TensorArena &&other) noexcept#

Move constructor.

Parameters:: other – Arena to move from

TensorArena &operator=(TensorArena &&other) noexcept#

Move assignment operator.

Parameters:: other – Arena to move from
Returns:: Reference to this arena

template<typename T> inline T *allocate_at(const std::size_t offset_bytes)#

Return a typed pointer to an already allocated memory region at a specific byte offset.

Template Parameters:: T – Type to allocate
Parameters:: offset_bytes – [in] Byte offset from arena start
Throws:: std::runtime_error – if allocation would exceed arena bounds
Returns:: Type-safe pointer to allocated region
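The bounds check described above can be sketched with plain host memory. `HostArena` and `view_at` are hypothetical stand-ins written for illustration: they use `std::unique_ptr` where the real class allocates CUDA memory, and the alignment check is an extra assumption that this page does not document.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

class HostArena { // hypothetical host-only stand-in for TensorArena
public:
    explicit HostArena(const std::size_t total_bytes)
            : data_(std::make_unique<std::byte[]>(total_bytes)), total_bytes_(total_bytes) {}

    // Return a typed pointer into the arena, or throw if one T does not fit.
    template <typename T> T *view_at(const std::size_t offset_bytes) {
        if (offset_bytes > total_bytes_ || total_bytes_ - offset_bytes < sizeof(T)) {
            throw std::runtime_error("region exceeds arena bounds");
        }
        if (offset_bytes % alignof(T) != 0) { // extra check, an assumption
            throw std::runtime_error("offset is not aligned for T");
        }
        return reinterpret_cast<T *>(data_.get() + offset_bytes);
    }

    std::size_t total_bytes() const noexcept { return total_bytes_; }

private:
    std::unique_ptr<std::byte[]> data_; // released automatically (RAII)
    std::size_t total_bytes_;
};

inline void arena_example() {
    HostArena arena(2048);
    float *values = arena.view_at<float>(0);
    values[0] = 1.5F;
    assert(arena.view_at<float>(0)[0] == 1.5F); // same region, same data
}
```

Writing the size check as `total_bytes_ - offset_bytes < sizeof(T)` after the first comparison avoids the unsigned overflow that `offset_bytes + sizeof(T)` could produce for very large offsets.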

inline void *raw_ptr()#

Get raw memory pointer for external APIs.

Returns:: Pointer to raw memory

inline const void *raw_ptr() const#

Get const raw memory pointer.

Returns:: Const pointer to raw memory

inline void *raw_ptr_mutable() const#

Get mutable raw memory pointer from const context.

Provides mutable access to arena memory from const member functions when the operation is logically const but requires mutable pointer for external APIs (e.g., CUDA memory transfer operations where dst must be non-const).

Returns:: Mutable pointer to raw memory

inline std::size_t total_bytes() const#

Get total bytes allocated for this arena.

Returns:: Total arena size in bytes

inline MemoryType memory_type() const#

Get memory type of this arena.

Returns:: Memory type (Device or HostPinned)

class TensorInfo#

#include <tensor_info.hpp>

Describes tensor properties for ABI validation between modules.

This class encapsulates all the necessary information about a tensor, including its data type, dimensions, and other metadata to ensure compatibility between modules.

Public Types

using DataType = NvDataType#: Data type alias for tensor elements.

Public Functions

TensorInfo() = default#: Default constructor.

TensorInfo(DataType type, std::vector<std::size_t> dimensions)#

Constructor with data type and dimensions.

Parameters:

type – [in] The data type of the tensor
dimensions – [in] The dimensions of the tensor

Throws:

std::invalid_argument – if any dimension is zero

DataType get_type() const noexcept#

Get the data type of the tensor.

Returns:: The data type

const std::vector<std::size_t> &get_dimensions() const noexcept#

Get the dimensions of the tensor.

Returns:: A const reference to the dimensions vector

bool is_compatible_with(const TensorInfo &other) const noexcept#

Check if this TensorInfo is compatible with another.

Parameters:: other – [in] The TensorInfo to check compatibility with
Returns:: true if compatible, false otherwise

std::size_t get_total_elements() const#

Get the total number of elements in the tensor.

Returns:: The total number of elements

void set_size_bytes(std::size_t size_bytes)#

Set the total size in bytes for this tensor. This is typically called after calculating the size based on data type and shape.

Parameters:: size_bytes – [in] The total size in bytes

std::size_t get_total_size_in_bytes() const#

Get the total size in bytes for this tensor.

Returns:: The total size in bytes (0 if not set)

template<typename T> struct type_to_tensor_type#

Reverse type mapping from C++ types to NV data types

Template structure providing reverse type mapping from C++ types to their corresponding NvDataType enumeration values. Each specialization defines a VALUE constant that represents the NvDataType for the given C++ type.

See also

data_type_traits for forward mapping from NvDataType to C++ types

See also

NvDataType for available enumeration values

Example:

constexpr auto floatType = type_to_tensor_type<float>::VALUE; // TensorR32F
constexpr auto complexType = type_to_tensor_type<cuda::std::complex<float>>::VALUE; // TensorC32F

Note

This template provides compile-time type-to-enum mapping

Note

Not all C++ types have corresponding NvDataType values

Template Parameters:: T – The C++ type to map to NvDataType
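Since not every C++ type has a mapping, generic code may want to detect the unmapped case at compile time. The sketch below shows one way to do that with the detection idiom; `Tag`, `type_to_tag`, and `has_tag` are hypothetical names, and the sketch assumes the primary template is declared but left undefined, which this page does not confirm for the library.

```cpp
#include <type_traits>

enum class Tag { R32F }; // hypothetical miniature of NvDataType

// Primary template declared but not defined (an assumption of this sketch).
template <typename T> struct type_to_tag;
template <> struct type_to_tag<float> { static constexpr Tag VALUE = Tag::R32F; };

// Detection idiom: true only when type_to_tag<T>::VALUE is a valid expression.
template <typename T, typename = void> struct has_tag : std::false_type {};
template <typename T>
struct has_tag<T, std::void_t<decltype(type_to_tag<T>::VALUE)>> : std::true_type {};

static_assert(has_tag<float>::value, "float has a mapping");
static_assert(!has_tag<long double>::value, "long double has no mapping");
```

A generic function could branch on `has_tag<T>::value` with `if constexpr` to report an unsupported element type with a clear message.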

template<> struct type_to_tensor_type<__half>#

#include <data_types.hpp>

Reverse type mapping for 16-bit half-precision float.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR16F#: 16-bit half-precision float

template<> struct type_to_tensor_type<__half2>#

#include <data_types.hpp>

Reverse type mapping for 16-bit half-precision complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC16F#: 16-bit half-precision complex

template<> struct type_to_tensor_type<char2>#

#include <data_types.hpp>

Reverse type mapping for 8-bit signed integer complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC8I#: 8-bit signed integer complex

template<> struct type_to_tensor_type<cuda::std::complex<double>>#

#include <data_types.hpp>

Reverse type mapping for 64-bit double-precision complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC64F#: 64-bit double-precision complex

template<> struct type_to_tensor_type<cuda::std::complex<float>>#

#include <data_types.hpp>

Reverse type mapping for 32-bit single-precision complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC32F#: 32-bit single-precision complex

template<> struct type_to_tensor_type<double>#

#include <data_types.hpp>

Reverse type mapping for 64-bit double-precision float.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR64F#: 64-bit double-precision float

template<> struct type_to_tensor_type<float>#

#include <data_types.hpp>

Reverse type mapping for 32-bit single-precision float.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR32F#: 32-bit single-precision float

template<> struct type_to_tensor_type<int>#

#include <data_types.hpp>

Reverse type mapping for 32-bit signed integer.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR32I#: 32-bit signed integer

template<> struct type_to_tensor_type<int2>#

#include <data_types.hpp>

Reverse type mapping for 32-bit signed integer complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC32I#: 32-bit signed integer complex

template<> struct type_to_tensor_type<short>#

#include <data_types.hpp>

Reverse type mapping for 16-bit signed integer.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR16I#: 16-bit signed integer

template<> struct type_to_tensor_type<short2>#

#include <data_types.hpp>

Reverse type mapping for 16-bit signed integer complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC16I#: 16-bit signed integer complex

template<> struct type_to_tensor_type<signed char>#

#include <data_types.hpp>

Reverse type mapping for 8-bit signed integer.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR8I#: 8-bit signed integer

template<> struct type_to_tensor_type<uchar2>#

#include <data_types.hpp>

Reverse type mapping for 8-bit unsigned integer complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC8U#: 8-bit unsigned integer complex

template<> struct type_to_tensor_type<uint2>#

#include <data_types.hpp>

Reverse type mapping for 32-bit unsigned integer complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC32U#: 32-bit unsigned integer complex

template<> struct type_to_tensor_type<unsigned char>#

#include <data_types.hpp>

Reverse type mapping for 8-bit unsigned integer.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR8U#: 8-bit unsigned integer

template<> struct type_to_tensor_type<unsigned int>#

#include <data_types.hpp>

Reverse type mapping for 32-bit unsigned integer.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR32U#: 32-bit unsigned integer

template<> struct type_to_tensor_type<unsigned short>#

#include <data_types.hpp>

Reverse type mapping for 16-bit unsigned integer.

Public Static Attributes

static constexpr NvDataType VALUE = TensorR16U#: 16-bit unsigned integer

template<> struct type_to_tensor_type<ushort2>#

#include <data_types.hpp>

Reverse type mapping for 16-bit unsigned integer complex.

Public Static Attributes

static constexpr NvDataType VALUE = TensorC16U#: 16-bit unsigned integer complex

template<> struct type_to_tensor_type<void>#

#include <data_types.hpp>

Reverse type mapping for void type.

Public Static Attributes

static constexpr NvDataType VALUE = TensorVoid#: Void type mapping.