Class MatXAllocator
Defined in File matx_allocator.hpp
-
class MatXAllocator
Wrap a holoscan::Allocator for use with MatX’s custom allocator interface.
MatX (v0.9.3+) detects custom allocators via SFINAE: any type providing
allocate(size_t) -> void* and deallocate(void*, size_t) -> void is accepted. This class bridges the holoscan::Allocator API to satisfy that interface.
The adapter supports stream-aware allocation when the underlying allocator is a CudaAllocator (e.g., RMMAllocator, StreamOrderedAllocator). For non-CudaAllocator types (e.g., BlockMemoryPool), synchronous allocation is used, but stream-aware deallocation is still leveraged via the GXF-level free(ptr, stream) when a stream is bound.
Allocator behavior matrix (the “Stream” column refers to whether a CUDA stream is passed to MatXAllocator, not to a property of the allocator class itself):
Allocator                               Stream  Allocation  Deallocation
RMMAllocator / StreamOrderedAllocator   Yes     Async       Async
RMMAllocator / StreamOrderedAllocator   No      Sync        Sync
BlockMemoryPool                         Yes     Sync        Deferred (event)
BlockMemoryPool                         No      Sync        Sync
UnboundedAllocator                      Any     Sync        Sync
“Stream” = whether a non-null cudaStream_t is passed to MatXAllocator.
Example usage:
// Inside an operator's compute() method, where allocator_ is a
// Parameter<std::shared_ptr<Allocator>> registered in setup():
holoscan::MatXAllocator matx_alloc(allocator_.get(), cuda_stream);
auto tensor = matx::make_tensor<float>({1024, 1024}, matx_alloc);

// Direct construction via MetaParameter's implicit conversion also works:
holoscan::MatXAllocator matx_alloc2(allocator_, cuda_stream);
- Since
4.0.0
Note Rows 1-2 refer to the same allocator type; the distinction is whether MatXAllocator is constructed with a non-null stream.
Note “Sync” in the Allocation/Deallocation columns means not stream-ordered (no cudaMallocAsync/cudaFreeAsync). It does NOT mean that each allocation forces a GPU sync. For BlockMemoryPool, allocation from the preallocated pool is CPU bookkeeping only (mutex + stack).
Note This class does NOT own the Allocator. The caller must ensure the Allocator outlives the MatXAllocator and any tensors allocated through it.
Note Async allocation (CudaAllocator + stream) only supports device memory (MemoryStorageType::kDevice). Constructing with a non-kDevice storage type and a CudaAllocator + stream throws std::invalid_argument.
Note MatX’s make_tensor with a custom allocator does NOT accept a CUDA stream parameter. To enable stream-ordered allocation, bind the stream when constructing the MatXAllocator. Use with_stream() to create allocators for different streams without reconstructing from scratch.
Public Functions
-
inline explicit MatXAllocator(Allocator *allocator, MemoryStorageType storage_type = MemoryStorageType::kDevice, cudaStream_t stream = nullptr)
Construct a MatXAllocator with full control over memory type and stream.
- Parameters
allocator – Pointer to a holoscan::Allocator (must not be null).
storage_type – Memory type for allocations (default: kDevice). When using a CudaAllocator with a stream, only kDevice is supported.
stream – Optional CUDA stream for async allocation/deallocation. When non-null and the allocator supports it, async APIs are used.
- Throws
std::invalid_argument – if allocator is null.
std::invalid_argument – if a CudaAllocator + stream is used with a non-kDevice storage type (async allocation is device-only).
-
inline MatXAllocator(Allocator *allocator, cudaStream_t stream)
Construct with device memory and a CUDA stream (convenience overload).
Equivalent to MatXAllocator(allocator, MemoryStorageType::kDevice, stream).
- Parameters
allocator – Pointer to a holoscan::Allocator (must not be null).
stream – CUDA stream for async allocation/deallocation.
-
Construct from a std::shared_ptr<Allocator>.
Extract the raw pointer via shared_ptr::get() and delegate to the Allocator* constructor. The MatXAllocator does NOT retain or extend the lifetime of the shared_ptr; only the raw pointer is stored. The caller must ensure the Allocator outlives the MatXAllocator.
This overload enables ergonomic construction from Parameter<std::shared_ptr<Allocator>>:
holoscan::MatXAllocator alloc(allocator_.get());  // shared_ptr
holoscan::MatXAllocator alloc(allocator_);        // implicit conversion
- Parameters
allocator – Shared pointer to a holoscan::Allocator (must not be null).
storage_type – Memory type for allocations (default: kDevice).
stream – Optional CUDA stream for async operations.
- Throws
std::invalid_argument – if the underlying pointer is null.
std::invalid_argument – if a CudaAllocator + stream is used with a non-kDevice storage type.
-
Construct from a std::shared_ptr<Allocator> with a CUDA stream (convenience overload).
Equivalent to MatXAllocator(allocator.get(), MemoryStorageType::kDevice, stream).
- Parameters
allocator – Shared pointer to a holoscan::Allocator (must not be null).
stream – CUDA stream for async allocation/deallocation.
-
MatXAllocator(const MatXAllocator&) = default
-
MatXAllocator &operator=(const MatXAllocator&) = default
-
MatXAllocator(MatXAllocator&&) = default
-
MatXAllocator &operator=(MatXAllocator&&) = default
-
inline void *allocate(size_t size)
Allocate memory (satisfy MatX’s allocator interface).
Dispatch strategy:
If size is 0, return nullptr (no allocation needed).
If the underlying allocator is a CudaAllocator and a stream is bound, use allocate_async(size, stream).
Otherwise, use allocate(size, storage_type) (synchronous).
- Parameters
size – Number of bytes to allocate. Zero returns nullptr without error.
- Throws
std::bad_alloc – if allocation fails.
- Returns
Pointer to allocated memory.
-
inline void deallocate(void *ptr, size_t size) noexcept
Deallocate memory (satisfy MatX’s allocator interface).
This method is noexcept to be safe when called from destructors (e.g., when a MatX tensor is destroyed). Any internal errors are logged but not propagated.
Dispatch strategy:
If the underlying allocator is a CudaAllocator and a stream is bound, call the GXF-level free_async(ptr, stream) for status checking, with fallback to synchronous free(ptr) on failure.
Else if a stream is bound (e.g., BlockMemoryPool), call GXF-level free(ptr, stream) for stream-aware deferred deallocation. Fall back to synchronous free(ptr) if unavailable.
Otherwise, use free(ptr) (synchronous).
Note The fallback from stream-aware free to synchronous free is safe for all current Holoscan allocators: BlockMemoryPool uses CUDA-event-based deferred free and UnboundedAllocator’s default free_abi(ptr, stream) delegates to free_abi(ptr). If a future allocator differentiates these, revisit this logic.
Note The underlying allocator’s free() implementation must not throw. If it does, the exception is caught and logged, but the pointer may be leaked. All current Holoscan allocators satisfy this requirement.
Note If the GXF allocator handle is unavailable, deallocation falls back to allocator_->free() as a best-effort path. In this case, status cannot be queried from GXF.
- Parameters
ptr – Pointer to memory to deallocate (null is a no-op).
size – Size of allocation (required by MatX interface, unused).
-
inline Allocator *allocator() const noexcept
Return the underlying Holoscan allocator.
-
inline MemoryStorageType storage_type() const noexcept
Return the configured memory storage type.
-
inline cudaStream_t stream() const noexcept
Return the bound CUDA stream (nullptr if none).
-
inline MatXAllocator with_stream(cudaStream_t stream) const
Create a copy of this allocator bound to a different CUDA stream.
Return a new MatXAllocator sharing the same underlying Allocator and storage type, but associated with a different stream. Useful in multi-stream pipelines where the same allocator serves multiple streams.
- Parameters
stream – The CUDA stream to bind to the new allocator.
- Throws
std::invalid_argument – if the new stream + storage_type combination is invalid (see primary constructor).
- Returns
A new MatXAllocator bound to the given stream.