holoscan::MatXAllocator
Wrap a holoscan::Allocator for use with MatX’s custom allocator interface.
MatX (v0.9.3+) detects custom allocators via SFINAE: any type providing allocate(size_t) -> void* and deallocate(void*, size_t) -> void is accepted. This class bridges the holoscan::Allocator API to satisfy that interface.
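As an illustration of the kind of SFINAE check described above, here is a minimal sketch. The trait name has_allocator_interface and both test types are hypothetical and are not taken from MatX; MatX's actual detection code may be written differently.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical trait (not MatX's implementation). The primary template is
// chosen when the expressions in the specialization below are ill-formed.
template <typename T, typename = void>
struct has_allocator_interface : std::false_type {};

// Selected only when T has allocate(size_t) -> void* and
// deallocate(void*, size_t) -> void.
template <typename T>
struct has_allocator_interface<
    T,
    typename std::enable_if<
        std::is_same<decltype(std::declval<T&>().allocate(std::size_t{})),
                     void*>::value &&
        std::is_same<decltype(std::declval<T&>().deallocate(
                         std::declval<void*>(), std::size_t{})),
                     void>::value>::type> : std::true_type {};

// Hypothetical type that provides both members.
struct GoodAllocator {
  void* allocate(std::size_t) { return nullptr; }
  void deallocate(void*, std::size_t) {}
};

// Hypothetical type that lacks deallocate(), so detection fails.
struct MissingDeallocate {
  void* allocate(std::size_t) { return nullptr; }
};

static_assert(has_allocator_interface<GoodAllocator>::value,
              "type with both members is accepted");
static_assert(!has_allocator_interface<MissingDeallocate>::value,
              "type without deallocate() is rejected");
```

Because the check is purely structural, an adapter only needs to expose these two member functions; no base class or registration is involved in this sketch.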
The adapter supports stream-aware allocation when the underlying allocator is a CudaAllocator (e.g., RMMAllocator, StreamOrderedAllocator). For non-CudaAllocator types (e.g., BlockMemoryPool), allocation is synchronous, but when a stream is bound, deallocation is still stream-aware through the GXF-level free(ptr, stream).
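Allocator behavior matrix (behavior depends on whether a CUDA stream is passed to MatXAllocator, not on the allocator class itself):

| Allocator | Stream | Allocation | Deallocation |
|---|---|---|---|
| CudaAllocator (e.g., RMMAllocator, StreamOrderedAllocator) | Yes | Async: allocate_async(size, stream) | Async: free_async(ptr, stream) |
| CudaAllocator (e.g., RMMAllocator, StreamOrderedAllocator) | No | Sync: allocate(size, storage_type) | Sync: free(ptr) |
| Non-CudaAllocator (e.g., BlockMemoryPool) | Yes | Sync: allocate(size, storage_type) | Stream-aware: free(ptr, stream) |
| Non-CudaAllocator (e.g., BlockMemoryPool) | No | Sync: allocate(size, storage_type) | Sync: free(ptr) |

“Stream” = whether a non-null cudaStream_t is passed to MatXAllocator.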
Rows 1-2 refer to the same allocator type; the distinction is whether MatXAllocator is constructed with a non-null stream.
“Sync” in the Allocation/Deallocation columns means not stream-ordered (no cudaMallocAsync/cudaFreeAsync). It does NOT mean that each allocation forces a GPU sync. For BlockMemoryPool, allocation from the preallocated pool is CPU bookkeeping only (mutex + stack).
Async allocation (CudaAllocator + stream) only supports device memory (MemoryStorageType::kDevice). Constructing with a non-kDevice storage type and a CudaAllocator + stream throws std::invalid_argument.
MatX’s make_tensor with a custom allocator does NOT accept a CUDA stream parameter. To enable stream-ordered allocation, bind the stream when constructing the MatXAllocator. Use with_stream() to create allocators for different streams without reconstructing from scratch.
Example
Constructors
MatXAllocator
Overload 1
Overload 2
Copy (with allocator and storage type and stream)
Copy (with allocator and stream)
Copy
Move
Construct a MatXAllocator with full control over memory type and stream.
Throws: std::invalid_argument if allocator is null.
Throws: std::invalid_argument if a CudaAllocator + stream is used with a non-kDevice storage type (async allocation is device-only).
Parameters
Pointer to a holoscan::Allocator (must not be null).
Memory type for allocations (default: kDevice). When using a CudaAllocator with a stream, only kDevice is supported.
Optional CUDA stream for async allocation/deallocation. When non-null and the allocator supports it, async APIs are used.
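The two documented failure conditions can be sketched with stand-in types. Everything in this block (Stream, the MemoryStorageType enumerators, MockAllocator, MockCudaAllocator, AdapterSketch, construction_throws) is hypothetical and only mirrors the rules stated above; it is not the Holoscan implementation.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-ins for illustration only; not the Holoscan or CUDA types.
using Stream = void*;  // plays the role of cudaStream_t
enum class MemoryStorageType { kHost, kDevice, kSystem };

struct MockAllocator {
  virtual ~MockAllocator() = default;
};
struct MockCudaAllocator : MockAllocator {};  // models a stream-capable allocator

class AdapterSketch {
 public:
  explicit AdapterSketch(MockAllocator* allocator,
                         MemoryStorageType storage_type = MemoryStorageType::kDevice,
                         Stream stream = nullptr)
      : allocator_(allocator), storage_type_(storage_type), stream_(stream) {
    // Rule 1: a null allocator is rejected.
    if (allocator_ == nullptr) {
      throw std::invalid_argument("allocator must not be null");
    }
    // Rule 2: stream-capable allocator + bound stream requires device memory.
    const bool is_cuda = dynamic_cast<MockCudaAllocator*>(allocator_) != nullptr;
    if (is_cuda && stream_ != nullptr &&
        storage_type_ != MemoryStorageType::kDevice) {
      throw std::invalid_argument("async allocation supports device memory only");
    }
  }

  MockAllocator* allocator() const { return allocator_; }
  MemoryStorageType storage_type() const { return storage_type_; }
  Stream stream() const { return stream_; }

 private:
  MockAllocator* allocator_;
  MemoryStorageType storage_type_;
  Stream stream_;
};

// Helper: reports whether construction raises std::invalid_argument.
inline bool construction_throws(MockAllocator* a, MemoryStorageType t, Stream s) {
  try {
    AdapterSketch adapter(a, t, s);
    (void)adapter;
    return false;
  } catch (const std::invalid_argument&) {
    return true;
  }
}
```

In this sketch a non-kDevice storage type is still accepted when no stream is bound, or when the allocator is not stream-capable, matching the wording of the second Throws clause.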
Assignment operators
operator=
Copy assign
Move assign
Methods
allocate
Allocate memory (satisfy MatX’s allocator interface).
Dispatch strategy:
- If size is 0, return nullptr (no allocation needed).
- If the underlying allocator is a CudaAllocator and a stream is bound, use allocate_async(size, stream).
- Otherwise, use allocate(size, storage_type) (synchronous).
Returns: Pointer to allocated memory.
Throws: std::bad_alloc if allocation fails.
Parameters
Number of bytes to allocate. Zero returns nullptr without error.
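The dispatch strategy above can be sketched as a free function over stand-in types. The names Stream, MockAllocator, MockCudaAllocator, and dispatch_allocate are hypothetical, and std::malloc stands in for the real device allocation so the example runs without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Stand-ins for illustration only; not the Holoscan or CUDA types.
using Stream = void*;  // plays the role of cudaStream_t
enum class MemoryStorageType { kHost, kDevice, kSystem };

struct MockAllocator {
  virtual ~MockAllocator() = default;
  // Synchronous path; host malloc replaces the real allocation here.
  virtual void* allocate(std::size_t size, MemoryStorageType) {
    ++sync_calls;
    return std::malloc(size);
  }
  void free(void* ptr) { std::free(ptr); }
  int sync_calls = 0;
};

struct MockCudaAllocator : MockAllocator {
  // Stream-ordered path, available only on the stream-capable allocator.
  void* allocate_async(std::size_t size, Stream) {
    ++async_calls;
    return std::malloc(size);
  }
  int async_calls = 0;
};

inline void* dispatch_allocate(MockAllocator& allocator, MemoryStorageType type,
                               Stream stream, std::size_t size) {
  if (size == 0) return nullptr;  // zero bytes: nothing to allocate, no error
  void* ptr = nullptr;
  auto* cuda = dynamic_cast<MockCudaAllocator*>(&allocator);
  if (cuda != nullptr && stream != nullptr) {
    ptr = cuda->allocate_async(size, stream);  // stream-capable + bound stream
  } else {
    ptr = allocator.allocate(size, type);  // synchronous fallback
  }
  if (ptr == nullptr) throw std::bad_alloc();  // failure surfaces as bad_alloc
  return ptr;
}
```

Note that the async path is taken only when both conditions hold; a stream-capable allocator without a bound stream uses the same synchronous call as any other allocator.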
deallocate
Deallocate memory (satisfy MatX’s allocator interface).
This method is noexcept to be safe when called from destructors (e.g., when a MatX tensor is destroyed). Any internal errors are logged but not propagated.
Dispatch strategy:
- If the underlying allocator is a CudaAllocator and a stream is bound, call the GXF-level free_async(ptr, stream) for status checking, with fallback to synchronous free(ptr) on failure.
- Else if a stream is bound (e.g., BlockMemoryPool), call GXF-level free(ptr, stream) for stream-aware deferred deallocation. Fall back to synchronous free(ptr) if unavailable.
- Otherwise, use free(ptr) (synchronous).
The fallback from stream-aware free to synchronous free is safe for all current Holoscan allocators: BlockMemoryPool uses CUDA-event-based deferred free and UnboundedAllocator’s default free_abi(ptr, stream) delegates to free_abi(ptr). If a future allocator differentiates these, revisit this logic.
The underlying allocator’s free() implementation must not throw. If it does, the exception is caught and logged but the pointer may be leaked. All current Holoscan allocators satisfy this requirement.
If the GXF allocator handle is unavailable, deallocation falls back to allocator_->free() as a best-effort path. In this case, status cannot be queried from GXF.
Parameters
Pointer to memory to deallocate (null is a no-op).
Size of allocation (required by MatX interface, unused).
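The three-way dispatch and its fallbacks can be sketched as follows. All names here are hypothetical stand-ins: free_on_stream models the GXF-level free(ptr, stream), the boolean return values model a status check, and std::free replaces the real deallocation so the example runs without CUDA or GXF.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

using Stream = void*;  // stand-in for cudaStream_t

struct MockAllocator {
  virtual ~MockAllocator() = default;
  // Models the GXF-level free(ptr, stream); returns false when unavailable.
  virtual bool free_on_stream(void* ptr, Stream) {
    if (!stream_free_available) return false;
    ++stream_frees;
    std::free(ptr);
    return true;
  }
  // Models the synchronous free(ptr).
  virtual void free(void* ptr) {
    ++sync_frees;
    std::free(ptr);
  }
  bool stream_free_available = true;
  int stream_frees = 0;
  int sync_frees = 0;
};

struct MockCudaAllocator : MockAllocator {
  // Models free_async(ptr, stream); returns false to simulate a failed status.
  bool free_async(void* ptr, Stream) {
    if (!async_free_ok) return false;
    ++async_frees;
    std::free(ptr);
    return true;
  }
  bool async_free_ok = true;
  int async_frees = 0;
};

inline void dispatch_deallocate(MockAllocator& allocator, Stream stream,
                                void* ptr, std::size_t /*size, unused*/) noexcept {
  if (ptr == nullptr) return;  // null is a no-op
  try {
    if (stream != nullptr) {
      auto* cuda = dynamic_cast<MockCudaAllocator*>(&allocator);
      const bool freed = (cuda != nullptr)
                             ? cuda->free_async(ptr, stream)
                             : allocator.free_on_stream(ptr, stream);
      if (freed) return;
    }
    allocator.free(ptr);  // no stream, or the stream-aware path did not succeed
  } catch (...) {
    // Never propagate: this may run inside a destructor.
    std::fputs("deallocate: free failed; pointer may be leaked\n", stderr);
  }
}
```

Catching everything inside a noexcept function is what keeps a misbehaving free() from terminating the program when a tensor is destroyed; the cost, as the text notes, is a possible leak.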
allocator
Return the underlying Holoscan allocator.
storage_type
Return the configured memory storage type.
stream
Return the bound CUDA stream (nullptr if none).
with_stream
Create a copy of this allocator bound to a different CUDA stream.
Return a new MatXAllocator sharing the same underlying Allocator and storage type, but associated with a different stream. Useful in multi-stream pipelines where the same allocator serves multiple streams.
Returns: A new MatXAllocator bound to the given stream.
Throws: std::invalid_argument if the new stream + storage_type combination is invalid (see primary constructor).
Parameters
The CUDA stream to bind to the new allocator.
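A minimal sketch of the rebinding pattern, assuming with_stream() simply re-runs the primary constructor's validation with the new stream. AdapterSketch, MockAllocator, its is_cuda_allocator flag, and Stream are hypothetical stand-ins, not the Holoscan types.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-ins for illustration only; not the Holoscan or CUDA types.
using Stream = void*;  // plays the role of cudaStream_t
enum class MemoryStorageType { kHost, kDevice, kSystem };

struct MockAllocator {
  bool is_cuda_allocator = false;  // hypothetical flag replacing a type check
};

class AdapterSketch {
 public:
  AdapterSketch(MockAllocator* allocator, MemoryStorageType storage_type,
                Stream stream)
      : allocator_(allocator), storage_type_(storage_type), stream_(stream) {
    if (allocator_ == nullptr) {
      throw std::invalid_argument("allocator must not be null");
    }
    if (allocator_->is_cuda_allocator && stream_ != nullptr &&
        storage_type_ != MemoryStorageType::kDevice) {
      throw std::invalid_argument("async allocation supports device memory only");
    }
  }

  // Same allocator and storage type, different stream. Going through the
  // constructor means an invalid combination is rejected here as well.
  AdapterSketch with_stream(Stream stream) const {
    return AdapterSketch(allocator_, storage_type_, stream);
  }

  MockAllocator* allocator() const { return allocator_; }
  MemoryStorageType storage_type() const { return storage_type_; }
  Stream stream() const { return stream_; }

 private:
  MockAllocator* allocator_;  // shared, not owned, in this sketch
  MemoryStorageType storage_type_;
  Stream stream_;
};
```

In a multi-stream pipeline this lets one base adapter be configured once and then specialized per stream, while the original adapter keeps its own stream binding.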