Date: 2024-07-03

An allocator that uses dynamic host or device memory allocation without an upper bound. This allocator does not take any user-specified parameters. This memory pool is easy to use and is recommended for initial prototyping. Once an application is working, switching to a BlockMemoryPool instead may help provide additional performance.

BlockMemoryPool is a memory pool that provides a user-specified number of equally sized blocks of memory. With this pool, memory blocks are allocated once and then reused on each subsequent call to an Operator’s compute method. This avoids the overhead of allocating memory again each time compute is called. For each built-in operator that accepts a memory pool parameter, its API docstring contains a section titled “Device Memory Requirements” which provides guidance on the num_blocks and block_size needed for use with this memory pool.
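The allocate-once, reuse-afterwards behavior described above can be illustrated with a small pure-Python model. This is only a conceptual sketch: the class ToyBlockPool and its methods are hypothetical names invented for illustration, it uses ordinary bytearray objects rather than CUDA memory, and its error handling is an assumption rather than the behavior of the real memory pool.

```python
class ToyBlockPool:
    """Conceptual model of a fixed pool of equally sized blocks.

    Hypothetical illustration only; this is not the SDK implementation.
    """

    def __init__(self, block_size: int, num_blocks: int) -> None:
        self.block_size = block_size
        # All blocks are allocated once, up front.
        self._free = [bytearray(block_size) for _ in range(num_blocks)]

    def allocate(self, nbytes: int) -> bytearray:
        # A request must fit into a single block.
        if nbytes > self.block_size:
            raise ValueError("request does not fit into one block")
        # Assumption of this sketch: fail when every block is in use.
        if not self._free:
            raise MemoryError("all blocks are in use")
        return self._free.pop()

    def release(self, block: bytearray) -> None:
        # Returning a block makes it available for the next request.
        self._free.append(block)


pool = ToyBlockPool(block_size=1024, num_blocks=2)
first = pool.allocate(512)
pool.release(first)
second = pool.allocate(1024)
reused = second is first  # the same block object is handed out again
```

In this model no new memory is created after construction, which is the property that saves allocation overhead on repeated compute calls.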

The storage_type parameter determines the memory storage type used by the operator. It can be 0 for page-locked host memory (allocated with cudaMallocHost), 1 for device memory (allocated with cudaMalloc), or 2 for system memory (allocated with C++ new).
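To avoid bare integers in application code, the three storage_type values can be given readable names. The following is a minimal sketch: the StorageType enumeration is a hypothetical name defined here for illustration, and the SDK may provide its own enumeration that should be preferred where available.

```python
from enum import IntEnum


class StorageType(IntEnum):
    """Hypothetical names for the storage_type values listed above."""

    HOST = 0    # page-locked host memory (cudaMallocHost)
    DEVICE = 1  # device memory (cudaMalloc)
    SYSTEM = 2  # system memory (C++ new)


# Example: request device memory; int() yields the raw parameter value.
storage_type = int(StorageType.DEVICE)
```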

The block_size parameter sets the size, in bytes, of a single block in the memory pool. Every allocation request made to this allocator must fit within one block.

The num_blocks parameter controls the total number of blocks that are allocated in the memory pool.
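As a rough sizing illustration, block_size can be estimated from the largest buffer an operator needs to allocate, and the total pool footprint is block_size times num_blocks. The sketch below assumes a hypothetical 1920x1080 RGBA frame with one byte per channel and two blocks; the helper name block_size_bytes and all of these values are assumptions, and the real requirements should be taken from the operator's “Device Memory Requirements” docstring section.

```python
def block_size_bytes(width: int, height: int, channels: int,
                     bytes_per_element: int) -> int:
    """Bytes needed to hold one densely packed frame (hypothetical helper)."""
    return width * height * channels * bytes_per_element


# Assumed example: one 1920x1080 RGBA frame of 8-bit values per block.
block_size = block_size_bytes(1920, 1080, 4, 1)  # 8294400 bytes

# Assumed example: two blocks in the pool.
num_blocks = 2

# Total memory reserved by the pool for these settings.
total_bytes = block_size * num_blocks  # 16588800 bytes
```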

The optional dev_id parameter specifies the CUDA ID of the device on which the memory pool will be created.

This allocator creates a pool of CUDA streams.