TensorRT 8.2.1
nvinfer1::IGpuAllocator Class Reference (abstract)

Application-implemented class for controlling allocation on the GPU.

#include <NvInferRuntimeCommon.h>

Public Member Functions

virtual void * allocate (uint64_t const size, uint64_t const alignment, AllocatorFlags const flags) noexcept=0
 
virtual TRT_DEPRECATED void free (void *const memory) noexcept=0
 
virtual ~IGpuAllocator ()=default
 
virtual void * reallocate (void *, uint64_t, uint64_t) noexcept
 
virtual bool deallocate (void *const memory) noexcept
 

Detailed Description

Application-implemented class for controlling allocation on the GPU.

Constructor & Destructor Documentation

◆ ~IGpuAllocator()

virtual nvinfer1::IGpuAllocator::~IGpuAllocator ( )
virtual, default

Destructor declared virtual as general good practice for a class with virtual methods. TensorRT never calls the destructor for an IGpuAllocator defined by the application.

Member Function Documentation

◆ allocate()

virtual void * nvinfer1::IGpuAllocator::allocate ( uint64_t const  size,
uint64_t const  alignment,
AllocatorFlags const  flags 
)
pure virtual, noexcept

A thread-safe callback implemented by the application to handle acquisition of GPU memory.

Parameters
size: The size of the memory required.
alignment: The required alignment of memory. Alignment will be zero or a power of 2 not exceeding the alignment guaranteed by cudaMalloc, so this allocator can be safely implemented with cudaMalloc/cudaFree. An alignment value of zero indicates that any alignment is acceptable.
flags: Reserved for future use. In the current release, 0 will be passed.

If an allocation request of size 0 is made, nullptr should be returned.

If an allocation request cannot be satisfied, nullptr should be returned.

Note
The implementation must guarantee thread safety for concurrent allocate/free/reallocate/deallocate requests.

Usage

  • Allowed context for the API call
    • Thread-safe: Yes, this method is required to be thread-safe and may be called from multiple threads.
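The allocate() contract above (nullptr for a zero-size request, nullptr on failure, alignment of zero or a power of 2) can be sketched as follows. This is a minimal illustration under stated assumptions, not TensorRT code: the names sketchAllocate, sketchDeallocate, and isAcceptableAlignment are hypothetical, AllocatorFlags is a local stand-in type, and std::malloc stands in for cudaMalloc so that the sketch builds and runs without CUDA or TensorRT. Rejecting a non-power-of-2 alignment is a defensive choice of this sketch. In a real implementation the two functions would be overrides on an IGpuAllocator subclass.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Stand-in for nvinfer1::AllocatorFlags so the sketch builds without TensorRT.
using AllocatorFlags = uint32_t;

// Hypothetical helper: true when alignment is zero or a power of two.
inline bool isAcceptableAlignment(uint64_t alignment) noexcept
{
    return (alignment & (alignment - 1)) == 0;
}

// Hypothetical version of the allocate() callback.
// std::malloc stands in for cudaMalloc (assumption made to run without a GPU).
void* sketchAllocate(uint64_t const size, uint64_t const alignment, AllocatorFlags const /*flags*/) noexcept
{
    // A zero-size request yields nullptr; so does an alignment this sketch rejects.
    if (size == 0 || !isAcceptableAlignment(alignment))
    {
        return nullptr;
    }
    // Zero means "any alignment"; pointer alignment keeps the header slot below valid.
    uint64_t const align = alignment < alignof(void*) ? alignof(void*) : alignment;
    uint64_t const total = size + align + sizeof(void*);
    if (total < size)
    {
        return nullptr; // Size computation overflowed: request cannot be satisfied.
    }
    void* const raw = std::malloc(static_cast<std::size_t>(total));
    if (raw == nullptr)
    {
        return nullptr; // Backend failure is reported as nullptr.
    }
    // Round up to the next multiple of align, leaving room for one pointer.
    auto const start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    auto const aligned = (start + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
    assert(aligned % align == 0);
    // Store the raw pointer just below the aligned block for the release path.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical release path matching sketchAllocate().
bool sketchDeallocate(void* const memory) noexcept
{
    if (memory != nullptr)
    {
        std::free(static_cast<void**>(memory)[-1]);
    }
    return true;
}
```

Because std::malloc and std::free are themselves thread-safe and the sketch keeps no shared state, concurrent calls need no extra locking here; an allocator with its own bookkeeping would have to add it.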

◆ deallocate()

virtual bool nvinfer1::IGpuAllocator::deallocate ( void *const  memory)
inline, virtual, noexcept

A thread-safe callback implemented by the application to handle release of GPU memory.

TensorRT may pass a nullptr to this function if it was previously returned by allocate().

Parameters
memory: The acquired memory.
Returns
True if the acquired memory is released successfully.
Note
The implementation must guarantee thread safety for concurrent allocate/free/reallocate/deallocate requests.
If user-implemented free() might hit an error condition, the user should override deallocate() as the primary implementation and override free() to call deallocate() for backwards compatibility.
See also
free()

Usage

  • Allowed context for the API call
    • Thread-safe: Yes, this method is required to be thread-safe and may be called from multiple threads.
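The note above recommends making deallocate() the primary implementation and having free() forward to it. A minimal sketch of that pattern follows. The names ReleaseCallbacks, ForwardingAllocator, and backendRelease are hypothetical; ReleaseCallbacks is a local stand-in that mirrors only the two release callbacks documented here so the sketch builds without the TensorRT headers, its default deallocate() body is an assumption, and std::free stands in for cudaFree. Treating nullptr as a successful release is also a choice of this sketch.

```cpp
#include <cassert>
#include <cstdlib>

// Stand-in for the release callbacks of nvinfer1::IGpuAllocator (assumption:
// declared locally so the sketch compiles without TensorRT).
class ReleaseCallbacks
{
public:
    virtual ~ReleaseCallbacks() = default;
    virtual void free(void* const memory) noexcept = 0;
    // Assumed default: forward to free() and report success.
    virtual bool deallocate(void* const memory) noexcept
    {
        this->free(memory);
        return true;
    }
};

// Hypothetical backend release call standing in for cudaFree; returns 0 on success.
int backendRelease(void* memory) noexcept
{
    std::free(memory);
    return 0;
}

class ForwardingAllocator : public ReleaseCallbacks
{
public:
    // Primary implementation: reports whether the backend released the memory.
    bool deallocate(void* const memory) noexcept override
    {
        if (memory == nullptr)
        {
            return true; // nullptr may be passed in; there is nothing to release.
        }
        return backendRelease(memory) == 0;
    }

    // Deprecated entry point kept for backwards compatibility; forwards to deallocate().
    void free(void* const memory) noexcept override
    {
        static_cast<void>(deallocate(memory));
    }
};
```

With this layout an error in the backend surfaces through the return value of deallocate(), while callers that still use free() keep working and simply lose the status.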

◆ free()

virtual TRT_DEPRECATED void nvinfer1::IGpuAllocator::free ( void *const  memory)
pure virtual, noexcept

A thread-safe callback implemented by the application to handle release of GPU memory.

TensorRT may pass a nullptr to this function if it was previously returned by allocate().

Parameters
memory: The acquired memory.
Note
The implementation must guarantee thread safety for concurrent allocate/free/reallocate/deallocate requests.
See also
deallocate()
Deprecated:
Superseded by deallocate() and will be removed in TensorRT 10.0.

Usage

  • Allowed context for the API call
    • Thread-safe: Yes, this method is required to be thread-safe and may be called from multiple threads.

◆ reallocate()

virtual void * nvinfer1::IGpuAllocator::reallocate ( void *  ,
uint64_t  ,
uint64_t   
)
inline, virtual, noexcept

A thread-safe callback implemented by the application to resize an existing allocation.

Only allocations which were allocated with AllocatorFlag::kRESIZABLE will be resized.

Options are one of:

  • resize in place leaving min(oldSize, newSize) bytes unchanged and return the original address
  • move min(oldSize, newSize) bytes to a new location of sufficient size and return its address
  • return nullptr, to indicate that the request could not be fulfilled.

If nullptr is returned, TensorRT will assume that reallocate() is not implemented, and that the allocation at baseAddr is still valid.

This method is made available for use cases where delegating the resize strategy to the application provides an opportunity to improve memory management. One possible implementation is to allocate a large virtual device buffer and progressively commit physical memory with cuMemMap. CU_MEM_ALLOC_GRANULARITY_RECOMMENDED is suggested in this case.

TensorRT may call reallocate() to increase the buffer by relatively small amounts.

Parameters
baseAddr: The address of the original allocation.
alignment: The alignment used by the original allocation.
newSize: The new memory size required.
Returns
The address of the reallocated memory, or nullptr if the request could not be fulfilled.
Note
The implementation must guarantee thread safety for concurrent allocate/free/reallocate/deallocate requests.

Usage

  • Allowed context for the API call
    • Thread-safe: Yes, this method is required to be thread-safe and may be called from multiple threads.
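The second and third options listed above (move min(oldSize, newSize) bytes to a new location, or return nullptr and leave the original allocation valid) can be sketched as follows. ResizingAllocator and its helpers are hypothetical names. Host memory and std::memcpy stand in for device memory and a device-to-device copy so the sketch runs without CUDA, and both alignment handling and the AllocatorFlag::kRESIZABLE check are omitted for brevity. The size map guarded by a mutex is one possible way to recover oldSize, not something TensorRT prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <mutex>
#include <unordered_map>

// Hypothetical allocator that records allocation sizes so reallocate() can move data.
class ResizingAllocator
{
public:
    void* allocate(uint64_t const size) noexcept
    {
        if (size == 0)
        {
            return nullptr;
        }
        void* const p = std::malloc(static_cast<std::size_t>(size));
        if (p != nullptr && !remember(p, size))
        {
            std::free(p); // Bookkeeping failed: treat as an unsatisfied request.
            return nullptr;
        }
        return p;
    }

    void* reallocate(void* const baseAddr, uint64_t const /*alignment*/, uint64_t const newSize) noexcept
    {
        uint64_t oldSize = 0;
        if (newSize == 0 || !lookup(baseAddr, oldSize))
        {
            return nullptr; // Request not fulfilled; the original stays valid.
        }
        void* const moved = allocate(newSize);
        if (moved == nullptr)
        {
            return nullptr; // Out of memory; baseAddr is left untouched.
        }
        // Preserve min(oldSize, newSize) bytes, then release the old block.
        std::memcpy(moved, baseAddr, static_cast<std::size_t>(std::min(oldSize, newSize)));
        static_cast<void>(deallocate(baseAddr));
        return moved;
    }

    bool deallocate(void* const memory) noexcept
    {
        if (memory == nullptr)
        {
            return true;
        }
        {
            std::lock_guard<std::mutex> lock(mMutex);
            if (mSizes.erase(memory) == 0)
            {
                return false; // Not an address handed out by this allocator.
            }
        }
        std::free(memory);
        return true;
    }

private:
    bool remember(void* p, uint64_t size) noexcept
    {
        try
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mSizes[p] = size;
            return true;
        }
        catch (...)
        {
            return false; // Map insertion may throw; the callbacks are noexcept.
        }
    }

    bool lookup(void* p, uint64_t& size) noexcept
    {
        std::lock_guard<std::mutex> lock(mMutex);
        auto const it = mSizes.find(p);
        if (it == mSizes.end())
        {
            return false;
        }
        size = it->second;
        return true;
    }

    std::mutex mMutex;                          // Guards mSizes across threads.
    std::unordered_map<void*, uint64_t> mSizes; // Address -> allocation size.
};
```

Since the text above notes that growth requests can be small and frequent, copying on every call may be costly; the in-place option based on reserved virtual address space avoids the copy at the price of a more involved implementation.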
