holoscan::MatXAllocator

Beta

Wrap a holoscan::Allocator for use with MatX’s custom allocator interface.

MatX (v0.9.3+) detects custom allocators via SFINAE: any type providing allocate(size_t) -> void* and deallocate(void*, size_t) -> void is accepted. This class bridges the holoscan::Allocator API to satisfy that interface.
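The detection described above can be illustrated with a small, self-contained trait. This is only a sketch of the general SFINAE technique: has_allocator_interface, MallocAllocator, and NotAnAllocator are hypothetical names introduced here, and MatX's actual trait may be written differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>
#include <utility>

// Hypothetical trait (not MatX's own): true when T provides
// allocate(size_t) -> void* and deallocate(void*, size_t) -> void.
template <typename T, typename = void>
struct has_allocator_interface : std::false_type {};

template <typename T>
struct has_allocator_interface<
    T,
    typename std::enable_if<
        std::is_same<decltype(std::declval<T&>().allocate(std::size_t{})),
                     void*>::value &&
        std::is_same<decltype(std::declval<T&>().deallocate(
                         static_cast<void*>(nullptr), std::size_t{})),
                     void>::value>::type> : std::true_type {};

// Hypothetical allocator that satisfies the interface using malloc/free.
struct MallocAllocator {
  void* allocate(std::size_t size) { return std::malloc(size); }
  void deallocate(void* ptr, std::size_t /*size*/) noexcept { std::free(ptr); }
};

// A type without the two member functions is rejected by the trait.
struct NotAnAllocator {};

static_assert(has_allocator_interface<MallocAllocator>::value,
              "MallocAllocator should be detected as an allocator");
static_assert(!has_allocator_interface<NotAnAllocator>::value,
              "NotAnAllocator should not be detected as an allocator");

// Allocate and release a small buffer through the conforming type.
inline bool round_trip() {
  MallocAllocator alloc;
  void* p = alloc.allocate(64);
  assert(p != nullptr);
  alloc.deallocate(p, 64);
  return true;
}
```

Because the check is purely structural, an adapter only needs to expose these two member functions; no base class or registration is involved in this sketch.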

The adapter supports stream-aware allocation when the underlying allocator is a CudaAllocator (e.g., RMMAllocator, StreamOrderedAllocator). For non-CudaAllocator types (e.g., BlockMemoryPool), synchronous allocation is used, but stream-aware deallocation is still leveraged via the GXF-level free(ptr, stream) when a stream is bound.

Allocator behavior matrix (behavior depends on whether a CUDA stream is passed to MatXAllocator, not on the allocator class itself):

“Stream” = whether a non-null cudaStream_t is passed to MatXAllocator.

Example usage:

#include <holoscan/matx_allocator.hpp>

Rows 1-2 refer to the same allocator type; the distinction is whether MatXAllocator is constructed with a non-null stream.

“Sync” in the Allocation/Deallocation columns means not stream-ordered (no cudaMallocAsync/cudaFreeAsync). It does NOT mean that each allocation forces a GPU sync. For BlockMemoryPool, allocation from the preallocated pool is CPU bookkeeping only (mutex + stack).

This class does NOT own the Allocator. The caller must ensure the Allocator outlives the MatXAllocator and any tensors allocated through it.

Async allocation (CudaAllocator + stream) only supports device memory (MemoryStorageType::kDevice). Constructing with a non-kDevice storage type and a CudaAllocator + stream throws std::invalid_argument.

MatX’s make_tensor with a custom allocator does NOT accept a CUDA stream parameter. To enable stream-ordered allocation, bind the stream when constructing the MatXAllocator. Use with_stream() to create allocators for different streams without reconstructing from scratch.

Example

// Inside an operator's compute() method, where allocator_ is a
// Parameter<std::shared_ptr<Allocator>> registered in setup():
holoscan::MatXAllocator matx_alloc(allocator_.get(), cuda_stream);
auto tensor = matx::make_tensor<float>({1024, 1024}, matx_alloc);
// Direct construction via MetaParameter's implicit conversion also works:
holoscan::MatXAllocator matx_alloc2(allocator_, cuda_stream);

Constructors

MatXAllocator

inline explicit
holoscan::MatXAllocator::MatXAllocator(
Allocator *allocator,
MemoryStorageType storage_type = MemoryStorageType::kDevice,
cudaStream_t stream = nullptr
)

Construct a MatXAllocator with full control over memory type and stream.

Throws: std::invalid_argument if allocator is null.

Throws: std::invalid_argument if a CudaAllocator + stream is used with a non-kDevice storage type (async allocation is device-only).

Parameters

allocator
Allocator *

Pointer to a holoscan::Allocator (must not be null).

storage_type
MemoryStorageType, defaults to MemoryStorageType::kDevice

Memory type for allocations (default: kDevice). When using a CudaAllocator with a stream, only kDevice is supported.

stream
cudaStream_t, defaults to nullptr

Optional CUDA stream for async allocation/deallocation. When non-null and the allocator supports it, async APIs are used.
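The two documented failure conditions can be sketched with stand-in types. Everything below is hypothetical scaffolding for illustration (AdapterSketch, StreamHandle, StorageType, and the empty Allocator/CudaAllocator classes); it mirrors only the validation rules stated above, not the real Holoscan implementation.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-ins for illustration only (assumptions, not Holoscan types).
using StreamHandle = void*;  // plays the role of cudaStream_t
enum class StorageType { kHost, kDevice, kSystem };

struct Allocator {
  virtual ~Allocator() = default;
};
struct CudaAllocator : Allocator {};

class AdapterSketch {
 public:
  explicit AdapterSketch(Allocator* allocator,
                         StorageType storage_type = StorageType::kDevice,
                         StreamHandle stream = nullptr)
      : allocator_(allocator),
        storage_type_(storage_type),
        stream_(stream),
        cuda_allocator_(dynamic_cast<CudaAllocator*>(allocator)) {
    // Rule 1: a null allocator is rejected.
    if (allocator_ == nullptr) {
      throw std::invalid_argument("allocator must not be null");
    }
    // Rule 2: the async path (CudaAllocator + stream) is device-only.
    if (cuda_allocator_ != nullptr && stream_ != nullptr &&
        storage_type_ != StorageType::kDevice) {
      throw std::invalid_argument("async allocation requires kDevice");
    }
  }

 private:
  Allocator* allocator_;
  StorageType storage_type_;
  StreamHandle stream_;
  CudaAllocator* cuda_allocator_;
};

// Helper: reports whether the callable throws std::invalid_argument.
template <typename F>
bool throws_invalid_argument(F&& make) {
  try {
    make();
  } catch (const std::invalid_argument&) {
    return true;
  }
  return false;
}
```

In this sketch, a non-CudaAllocator combined with a stream and a non-device storage type is accepted, since the documented restriction applies only to the async path.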


Assignment operators

operator=

MatXAllocator & holoscan::MatXAllocator::operator=(
const MatXAllocator &
) = default

Methods

allocate

void * holoscan::MatXAllocator::allocate(
size_t size
)

Allocate memory (satisfy MatX’s allocator interface).

Dispatch strategy:

  1. If size is 0, return nullptr (no allocation needed).
  2. If the underlying allocator is a CudaAllocator and a stream is bound, use allocate_async(size, stream).
  3. Otherwise, use allocate(size, storage_type) (synchronous).
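The three dispatch steps above can be sketched as follows. The Allocator and CudaAllocator classes here are simplified, malloc-backed stand-ins that borrow only the method names from the text; DispatchSketch, StreamHandle, StorageType, and the call counters are hypothetical and exist purely to make the control flow observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Stand-ins for illustration only (assumptions, not Holoscan types).
using StreamHandle = void*;  // plays the role of cudaStream_t
enum class StorageType { kHost, kDevice, kSystem };

struct Allocator {
  virtual ~Allocator() = default;
  int sync_calls = 0;
  virtual void* allocate(std::size_t size, StorageType /*type*/) {
    ++sync_calls;
    return std::malloc(size);
  }
  virtual void free(void* ptr) { std::free(ptr); }
};

struct CudaAllocator : Allocator {
  int async_calls = 0;
  void* allocate_async(std::size_t size, StreamHandle /*stream*/) {
    ++async_calls;
    return std::malloc(size);
  }
};

class DispatchSketch {
 public:
  DispatchSketch(Allocator* allocator, StorageType storage_type,
                 StreamHandle stream)
      : allocator_(allocator),
        storage_type_(storage_type),
        stream_(stream),
        cuda_allocator_(dynamic_cast<CudaAllocator*>(allocator)) {}

  void* allocate(std::size_t size) {
    // Step 1: zero-size requests return nullptr without touching the allocator.
    if (size == 0) return nullptr;
    // Step 2: stream-ordered path when a CudaAllocator and a stream are present.
    // Step 3: otherwise the synchronous path with the configured storage type.
    void* ptr = (cuda_allocator_ != nullptr && stream_ != nullptr)
                    ? cuda_allocator_->allocate_async(size, stream_)
                    : allocator_->allocate(size, storage_type_);
    if (ptr == nullptr) throw std::bad_alloc();
    return ptr;
  }

 private:
  Allocator* allocator_;
  StorageType storage_type_;
  StreamHandle stream_;
  CudaAllocator* cuda_allocator_;
};
```

Caching the dynamic_cast result at construction, as the cuda_allocator_ member described later suggests, keeps the per-allocation cost to a pair of pointer checks.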

Returns: Pointer to allocated memory.

Throws: std::bad_alloc if allocation fails.

Parameters

size
size_t

Number of bytes to allocate. Zero returns nullptr without error.

deallocate

void holoscan::MatXAllocator::deallocate(
void *ptr,
size_t size
) noexcept

Deallocate memory (satisfy MatX’s allocator interface).

This method is noexcept to be safe when called from destructors (e.g., when a MatX tensor is destroyed). Any internal errors are logged but not propagated.

Dispatch strategy:

  1. If the underlying allocator is a CudaAllocator and a stream is bound, call the GXF-level free_async(ptr, stream) for status checking, with fallback to synchronous free(ptr) on failure.
  2. Else if a stream is bound (e.g., BlockMemoryPool), call GXF-level free(ptr, stream) for stream-aware deferred deallocation. Fall back to synchronous free(ptr) if unavailable.
  3. Otherwise, use free(ptr) (synchronous).

The fallback from stream-aware free to synchronous free is safe for all current Holoscan allocators: BlockMemoryPool uses CUDA-event-based deferred free and UnboundedAllocator’s default free_abi(ptr, stream) delegates to free_abi(ptr). If a future allocator differentiates these, revisit this logic.

The underlying allocator’s free() implementation must not throw. If it does, the exception is caught and logged but the pointer may be leaked. All current Holoscan allocators satisfy this requirement.

If the GXF allocator handle is unavailable, deallocation falls back to allocator_->free() as a best-effort path. In this case, status cannot be queried from GXF.
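The deallocation dispatch, its fallbacks, and the noexcept guarantee can be sketched with stand-in types. The names deallocate_sketch, free_on_stream, StreamHandle, ThrowingAllocator, and the counters are hypothetical; the bool return values stand in for the GXF-level status described above, and the real code paths go through GXF rather than malloc/free.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>

// Stand-ins for illustration only (assumptions, not Holoscan types).
using StreamHandle = void*;  // plays the role of cudaStream_t

struct Allocator {
  virtual ~Allocator() = default;
  int sync_frees = 0;
  int stream_frees = 0;
  virtual void free(void* ptr) {
    ++sync_frees;
    std::free(ptr);
  }
  // Stand-in for the GXF-level free(ptr, stream); false means "unavailable".
  virtual bool free_on_stream(void* ptr, StreamHandle /*stream*/) {
    ++stream_frees;
    std::free(ptr);
    return true;
  }
};

struct CudaAllocator : Allocator {
  int async_frees = 0;
  bool fail_async = false;  // lets the sketch exercise the fallback path
  bool free_async(void* ptr, StreamHandle /*stream*/) {
    if (fail_async) return false;
    ++async_frees;
    std::free(ptr);
    return true;
  }
};

// An allocator whose free() throws, to show that nothing propagates.
struct ThrowingAllocator : Allocator {
  void free(void* /*ptr*/) override { throw std::runtime_error("free failed"); }
};

inline void deallocate_sketch(Allocator* allocator, StreamHandle stream,
                              void* ptr) noexcept {
  if (ptr == nullptr) return;  // null is a no-op
  try {
    auto* cuda = dynamic_cast<CudaAllocator*>(allocator);
    if (cuda != nullptr && stream != nullptr) {
      // Case 1: async free, falling back to synchronous free on failure.
      if (!cuda->free_async(ptr, stream)) cuda->free(ptr);
    } else if (stream != nullptr) {
      // Case 2: stream-aware free, falling back if unavailable.
      if (!allocator->free_on_stream(ptr, stream)) allocator->free(ptr);
    } else {
      // Case 3: plain synchronous free.
      allocator->free(ptr);
    }
  } catch (...) {
    // Errors are logged, never propagated; the pointer may be leaked.
    std::fputs("deallocate failed; pointer may be leaked\n", stderr);
  }
}
```

Swallowing the exception inside a noexcept function matters because an escaping exception would otherwise call std::terminate, which is especially unwelcome when the call originates from a tensor destructor.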

Parameters

ptr
void *

Pointer to memory to deallocate (null is a no-op).

size
size_t

Size of allocation (required by MatX interface, unused).

allocator

Allocator * holoscan::MatXAllocator::allocator() const noexcept

Return the underlying Holoscan allocator.

storage_type

MemoryStorageType holoscan::MatXAllocator::storage_type() const noexcept

Return the configured memory storage type.

stream

cudaStream_t holoscan::MatXAllocator::stream() const noexcept

Return the bound CUDA stream (nullptr if none).

with_stream

MatXAllocator holoscan::MatXAllocator::with_stream(
cudaStream_t stream
) const

Create a copy of this allocator bound to a different CUDA stream.

Return a new MatXAllocator sharing the same underlying Allocator and storage type, but associated with a different stream. Useful in multi-stream pipelines where the same allocator serves multiple streams.

Returns: A new MatXAllocator bound to the given stream.

Throws: std::invalid_argument if the new stream + storage_type combination is invalid (see primary constructor).

Parameters

stream
cudaStream_t

The CUDA stream to bind to the new allocator.
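The copy-and-rebind behavior can be sketched with a stand-in adapter. AdapterSketch, StreamHandle, and the empty Allocator struct are hypothetical; the sketch shows only that the original object is left untouched and that both objects share the same non-owning allocator pointer. The real method also re-validates the stream and storage type combination, which is omitted here.

```cpp
#include <cassert>

// Stand-ins for illustration only (assumptions, not Holoscan types).
using StreamHandle = void*;  // plays the role of cudaStream_t
struct Allocator {};

class AdapterSketch {
 public:
  AdapterSketch(Allocator* allocator, StreamHandle stream)
      : allocator_(allocator), stream_(stream) {}

  // Returns a new adapter with the same allocator and a different stream.
  AdapterSketch with_stream(StreamHandle stream) const {
    return AdapterSketch(allocator_, stream);
  }

  Allocator* allocator() const noexcept { return allocator_; }
  StreamHandle stream() const noexcept { return stream_; }

 private:
  Allocator* allocator_;  // non-owning
  StreamHandle stream_;
};

// One adapter per stream, all backed by the same allocator.
inline bool shares_allocator(const AdapterSketch& a, const AdapterSketch& b) {
  return a.allocator() == b.allocator();
}
```

Since the adapter holds only a raw pointer and a stream handle, per-stream copies are cheap, but every copy is subject to the same lifetime requirement on the underlying allocator.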


Member variables

| Name | Type | Description |
| --- | --- | --- |
| allocator_ | Allocator * | non-owning pointer to the Holoscan allocator |
| storage_type_ | MemoryStorageType | memory type for synchronous allocations |
| stream_ | cudaStream_t | optional CUDA stream for async operations |
| cuda_allocator_ | CudaAllocator * | cached dynamic_cast (nullptr if not CudaAllocator) |