Class StreamOrderedAllocator
Defined in File stream_ordered_allocator.hpp
Base Type
public holoscan::CudaAllocator (Class CudaAllocator)
-
class StreamOrderedAllocator : public holoscan::CudaAllocator
CUDA device memory allocator using stream-ordered allocation.
StreamOrderedAllocator uses CUDA’s stream-ordered memory allocator (cudaMallocAsync/cudaFreeAsync) to dynamically allocate device memory. Stream-ordered allocation ties memory operations to specific CUDA streams, so allocation and deallocation do not block the host or other streams.
See the CUDA Programming Guide section on Stream-Ordered Memory Allocator for details on the underlying CUDA feature.
This allocator only supports CUDA device memory. If host memory is also needed, see RMMAllocator, which provides both device and pinned host memory pools.
Because it is a CudaAllocator, it supports both synchronous (allocate, free) and asynchronous (allocate_async, free_async) APIs for memory allocation.
The values for the memory parameters, such as device_memory_initial_size, must be specified as a string containing a non-negative integer value followed by a suffix representing the units. Supported units are B, KB, MB, GB and TB, where the values are powers of 1024 bytes (e.g. MB = 1024 * 1024 bytes). Examples of valid values are “512MB”, “256 KB”, “1 GB”. If a floating point number is specified, the decimal portion will be truncated (i.e. the value is rounded down to the nearest integer).
==Parameters==
device_memory_initial_size (std::string, optional): The initial size of the device memory pool. See above for the format accepted. Defaults to “8MB” on aarch64 and “16MB” on x86_64.
device_memory_max_size (std::string, optional): The maximum size of the device memory pool. See above for the format accepted. The default is to use twice the value set for device_memory_initial_size.
release_threshold (std::string, optional): The amount of reserved memory to hold onto before trying to release memory back to the OS. See above for the format accepted. The default value is “4MB”.
dev_id (int32_t, optional): The CUDA device id specifying which device the memory pool will use. (Default: 0)
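To make the size-string format concrete, the following is a minimal sketch of how such a string could be converted to a byte count. The functions parse_memory_size and default_max_size are hypothetical helpers written for illustration; they are not part of the Holoscan API, and the SDK's actual parsing may differ (for example in case sensitivity or error handling). The sketch assumes that "truncating the decimal portion" means dropping the fraction before the unit is applied, so "1.5GB" is read as 1 GB.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Holoscan function): converts a size string such
// as "512MB" or "256 KB" into bytes, with units as powers of 1024.
std::uint64_t parse_memory_size(const std::string& text) {
  auto is_digit = [&](std::size_t k) {
    return k < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[k])) != 0;
  };
  if (!is_digit(0)) {
    throw std::invalid_argument("expected a non-negative number: " + text);
  }

  // Integer part of the value (overflow is not checked in this sketch).
  std::size_t i = 0;
  std::uint64_t whole = 0;
  for (; is_digit(i); ++i) {
    whole = whole * 10 + static_cast<std::uint64_t>(text[i] - '0');
  }

  // Assumption: a fractional part is simply discarded (rounded down).
  if (i < text.size() && text[i] == '.') {
    ++i;
    while (is_digit(i)) ++i;
  }

  // Optional whitespace between the number and the unit, as in "256 KB".
  while (i < text.size() &&
         std::isspace(static_cast<unsigned char>(text[i])) != 0) {
    ++i;
  }

  // Map the unit suffix to a power of two: KB = 2^10, MB = 2^20, and so on.
  const std::string unit = text.substr(i);
  unsigned shift = 0;
  if (unit == "B") {
    shift = 0;
  } else if (unit == "KB") {
    shift = 10;
  } else if (unit == "MB") {
    shift = 20;
  } else if (unit == "GB") {
    shift = 30;
  } else if (unit == "TB") {
    shift = 40;
  } else {
    throw std::invalid_argument("unsupported unit in: " + text);
  }
  return whole << shift;
}

// Mirrors the documented default for device_memory_max_size:
// twice the value of device_memory_initial_size.
std::uint64_t default_max_size(std::uint64_t initial_bytes) {
  return 2 * initial_bytes;
}
```

Under these assumptions, parse_memory_size("16MB") gives 16 * 1024 * 1024 bytes, and default_max_size applied to that value gives the 32 MB that the x86_64 default for device_memory_max_size would correspond to.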
Public Functions
-
template<typename ArgT, typename ...ArgsT, typename = std::enable_if_t<!std::is_base_of_v<::holoscan::Resource, std::decay_t<ArgT>> && (std::is_same_v<::holoscan::Arg, std::decay_t<ArgT>> || std::is_same_v<::holoscan::ArgList, std::decay_t<ArgT>>)>>
inline explicit StreamOrderedAllocator(ArgT &&arg, ArgsT&&... args)
-
StreamOrderedAllocator() = default
-
StreamOrderedAllocator(const std::string &name, nvidia::gxf::StreamOrderedAllocator *component)
-
inline virtual const char *gxf_typename() const override
-
virtual void setup(ComponentSpec &spec) override
Define the resource specification.
- Parameters
spec – The reference to the component specification.
-
nvidia::gxf::StreamOrderedAllocator *get() const