TensorRT 10.6.0
#include <NvInferRuntime.h>
Public Member Functions
virtual TRT_DEPRECATED void * | allocate (uint64_t const size, uint64_t const alignment, AllocatorFlags const flags) noexcept=0
A thread-safe callback implemented by the application to handle acquisition of GPU memory.
~IGpuAllocator () override=default
IGpuAllocator ()=default
virtual void * | reallocate (void *const, uint64_t, uint64_t) noexcept
A thread-safe callback implemented by the application to resize an existing allocation.
virtual TRT_DEPRECATED bool | deallocate (void *const memory) noexcept=0
A thread-safe callback implemented by the application to handle release of GPU memory.
virtual void * | allocateAsync (uint64_t const size, uint64_t const alignment, AllocatorFlags const flags, cudaStream_t) noexcept
A thread-safe callback implemented by the application to handle stream-ordered acquisition of GPU memory.
virtual bool | deallocateAsync (void *const memory, cudaStream_t) noexcept
A thread-safe callback implemented by the application to handle stream-ordered release of GPU memory.
InterfaceInfo | getInterfaceInfo () const noexcept override
Return version information associated with this interface. Applications must not override this method.
Public Member Functions inherited from nvinfer1::IVersionedInterface
virtual APILanguage | getAPILanguage () const noexcept
The language used to build the implementation of this Interface.
virtual | ~IVersionedInterface () noexcept=default
Additional Inherited Members
Protected Member Functions inherited from nvinfer1::IVersionedInterface
IVersionedInterface ()=default
IVersionedInterface (IVersionedInterface const &)=default
IVersionedInterface (IVersionedInterface &&)=default
IVersionedInterface & | operator= (IVersionedInterface const &) &=default
IVersionedInterface & | operator= (IVersionedInterface &&) &=default
~IGpuAllocator() [override, default]
IGpuAllocator() [default]
allocate() [pure virtual, noexcept]
A thread-safe callback implemented by the application to handle acquisition of GPU memory.
size | The size of the memory block required (in bytes). |
alignment | The required alignment of memory. Alignment will be zero or a power of 2 not exceeding the alignment guaranteed by cudaMalloc. Thus this allocator can be safely implemented with cudaMalloc/cudaFree. An alignment value of zero indicates any alignment is acceptable. |
flags | Reserved for future use. In the current release, 0 will be passed. |
Usage considerations
Allowed context for the API call
Implemented in nvinfer1::v_1_0::IGpuAsyncAllocator.
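The alignment rule above (zero, or a power of two) can be checked with a couple of small helpers. This is a minimal sketch for illustration only; `isAcceptableAlignment` and `satisfiesAlignment` are hypothetical names, not part of TensorRT.

```cpp
#include <cstdint>

// Hypothetical helper: true when `alignment` is zero (any alignment is
// acceptable) or a power of two. For zero, alignment - 1 wraps around and the
// bitwise AND is still zero, so both cases share one expression.
inline bool isAcceptableAlignment(std::uint64_t alignment) noexcept
{
    return (alignment & (alignment - 1)) == 0;
}

// Hypothetical helper: true when `ptr` meets the requested alignment.
// An alignment of zero places no constraint on the address.
inline bool satisfiesAlignment(void const* ptr, std::uint64_t alignment) noexcept
{
    if (alignment == 0)
    {
        return true;
    }
    return (reinterpret_cast<std::uintptr_t>(ptr) % alignment) == 0;
}
```

An implementation of allocate() might use checks like these in debug builds to confirm that the pointer it returns honors the requested alignment.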
allocateAsync() [inline, virtual, noexcept]
A thread-safe callback implemented by the application to handle stream-ordered acquisition of GPU memory.
By default this method calls allocate(), which is synchronous and therefore forfeits the performance benefits of asynchronous allocation. To obtain those benefits, see the discussion of IGpuAsyncAllocator vs. IGpuAllocator in the documentation for nvinfer1::IGpuAllocator.
size | The size of the memory block required (in bytes). |
alignment | The required alignment of memory. Alignment will be zero or a power of 2 not exceeding the alignment guaranteed by cudaMalloc. Thus this allocator can be safely implemented with cudaMalloc/cudaFree. An alignment value of zero indicates any alignment is acceptable. |
flags | Reserved for future use. In the current release, 0 will be passed. |
stream | Specifies the CUDA stream for asynchronous usage. |
Usage considerations
Reimplemented in nvinfer1::v_1_0::IGpuAsyncAllocator.
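The fallback described above can be illustrated with a stand-in interface. This is a sketch, not the TensorRT header: `SketchAllocator`, `SyncOnlyAllocator`, `StreamHandle`, and the `AllocatorFlags` alias are hypothetical names chosen so the example compiles without CUDA or TensorRT, and `std::malloc` stands in for a device allocation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

using AllocatorFlags = std::uint32_t; // hypothetical stand-in for nvinfer1::AllocatorFlags
using StreamHandle = void*;           // hypothetical stand-in for cudaStream_t

// Mirrors the shape of the interface above: the synchronous pair is pure
// virtual, and the stream-ordered pair has a default that delegates to it.
class SketchAllocator
{
public:
    virtual ~SketchAllocator() = default;

    virtual void* allocate(std::uint64_t size, std::uint64_t alignment, AllocatorFlags flags) noexcept = 0;
    virtual bool deallocate(void* memory) noexcept = 0;

    // Default: ignore the stream and fall back to the synchronous call.
    virtual void* allocateAsync(
        std::uint64_t size, std::uint64_t alignment, AllocatorFlags flags, StreamHandle /*stream*/) noexcept
    {
        return allocate(size, alignment, flags);
    }

    // Default: ignore the stream and fall back to the synchronous call.
    virtual bool deallocateAsync(void* memory, StreamHandle /*stream*/) noexcept
    {
        return deallocate(memory);
    }
};

// Implements only the synchronous pair, so async requests take the fallback.
class SyncOnlyAllocator : public SketchAllocator
{
public:
    void* allocate(std::uint64_t size, std::uint64_t /*alignment*/, AllocatorFlags /*flags*/) noexcept override
    {
        ++syncAllocations; // counts how often the synchronous path is taken
        // Host stand-in for a device allocation; the alignment argument is
        // ignored here for brevity.
        return std::malloc(static_cast<std::size_t>(size));
    }

    bool deallocate(void* memory) noexcept override
    {
        std::free(memory);
        return true;
    }

    int syncAllocations{0};
};
```

Calling `allocateAsync` on a `SyncOnlyAllocator` increments `syncAllocations`, which shows the request was served synchronously. An allocator that wants stream-ordered behavior would override the async pair instead, for example with `cudaMallocAsync` and `cudaFreeAsync` on the supplied stream.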
deallocate() [pure virtual, noexcept]
A thread-safe callback implemented by the application to handle release of GPU memory.
TensorRT may pass a nullptr to this function if it was previously returned by allocate().
memory | A memory address that was previously returned by an allocate() or reallocate() call of the same allocator object. |
Usage considerations
Implemented in nvinfer1::v_1_0::IGpuAsyncAllocator.
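A thread-safe allocate()/deallocate() pair that tolerates nullptr might look like the sketch below. It is backed by host memory so that it compiles without CUDA or TensorRT; a real allocator would derive from nvinfer1::IGpuAllocator and call cudaMalloc/cudaFree instead. `HostBackedAllocator` and the `AllocatorFlags` alias are hypothetical, and treating the bool result as a success indicator (with nullptr counting as success) is an assumption, since the text above does not define the return value.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <new>
#include <unordered_map>

using AllocatorFlags = std::uint32_t; // hypothetical stand-in for nvinfer1::AllocatorFlags

class HostBackedAllocator
{
public:
    void* allocate(std::uint64_t size, std::uint64_t alignment, AllocatorFlags /*flags*/) noexcept
    {
        // Reject anything that is neither zero nor a power of two.
        if ((alignment & (alignment - 1)) != 0)
        {
            return nullptr;
        }
        // Zero means "any alignment"; fall back to the platform default.
        std::size_t const effective
            = alignment == 0 ? alignof(std::max_align_t) : static_cast<std::size_t>(alignment);
        void* ptr = ::operator new(static_cast<std::size_t>(size), std::align_val_t{effective}, std::nothrow);
        if (ptr == nullptr)
        {
            return nullptr;
        }
        try
        {
            // Record the block so deallocate() can validate and free it.
            std::lock_guard<std::mutex> lock(mMutex);
            mLive.emplace(ptr, effective);
        }
        catch (...)
        {
            // Bookkeeping failed: release the block rather than leak it.
            ::operator delete(ptr, std::align_val_t{effective});
            return nullptr;
        }
        return ptr;
    }

    bool deallocate(void* memory) noexcept
    {
        if (memory == nullptr)
        {
            return true; // assumption: releasing nullptr is a successful no-op
        }
        std::size_t alignment = 0;
        {
            std::lock_guard<std::mutex> lock(mMutex);
            auto const it = mLive.find(memory);
            if (it == mLive.end())
            {
                return false; // not a live block from this allocator object
            }
            alignment = it->second;
            mLive.erase(it);
        }
        ::operator delete(memory, std::align_val_t{alignment});
        return true;
    }

private:
    std::mutex mMutex;                            // guards mLive across threads
    std::unordered_map<void*, std::size_t> mLive; // live block -> alignment used
};
```

Tracking live blocks in a map is one design choice among several; it makes it cheap to reject pointers that did not come from the same allocator object, at the cost of a lock on every call.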
deallocateAsync() [inline, virtual, noexcept]
A thread-safe callback implemented by the application to handle stream-ordered release of GPU memory.
By default this method calls deallocate(), which is synchronous and therefore forfeits the performance benefits of asynchronous deallocation. To obtain those benefits, see the discussion of IGpuAsyncAllocator vs. IGpuAllocator in the documentation for nvinfer1::IGpuAllocator.
TensorRT may pass a nullptr to this function if it was previously returned by allocate().
memory | A memory address that was previously returned by an allocate() or reallocate() call of the same allocator object. |
stream | Specifies the CUDA stream for asynchronous usage. |
Usage considerations
Reimplemented in nvinfer1::v_1_0::IGpuAsyncAllocator.
getInterfaceInfo() [inline, override, virtual, noexcept]
Return version information associated with this interface. Applications must not override this method.
Implements nvinfer1::IVersionedInterface.
Reimplemented in nvinfer1::v_1_0::IGpuAsyncAllocator.
reallocate() [inline, virtual, noexcept]
A thread-safe callback implemented by the application to resize an existing allocation.
Only allocations made with AllocatorFlag::kRESIZABLE will be resized.
Options are one of:
If nullptr is returned, TensorRT will assume that reallocate() is not implemented, and that the allocation at baseAddr is still valid.
This method is made available for use cases where delegating the resize strategy to the application provides an opportunity to improve memory management. One possible implementation is to allocate a large virtual device buffer and progressively commit physical memory with cuMemMap. CU_MEM_ALLOC_GRANULARITY_RECOMMENDED is suggested in this case.
TensorRT may call reallocate() to increase the buffer by relatively small amounts.
baseAddr | The address of the original allocation, which will have been returned by previously calling allocate() or reallocate() on the same object. |
alignment | The alignment used by the original allocation. This will be the same value that was previously passed to the allocate() or reallocate() call that returned baseAddr. |
newSize | The new memory size required (in bytes). |
Usage considerations
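The reserve-then-commit strategy mentioned above can be modeled with plain bookkeeping. The sketch below is host-only and hypothetical (`GrowableRegion` and `roundUp` are illustrative names): it tracks how much of a fixed reservation is committed and grows that amount in whole granules. A CUDA version would presumably reserve the address range once (for example with `cuMemAddressReserve`), query the granularity with `cuMemGetAllocationGranularity`, and back each newly committed span with `cuMemCreate` and `cuMemMap`.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Rounds `bytes` up to the next multiple of `granularity` (must be > 0).
inline std::uint64_t roundUp(std::uint64_t bytes, std::uint64_t granularity) noexcept
{
    return ((bytes + granularity - 1) / granularity) * granularity;
}

// Bookkeeping for one resizable buffer: a fixed reservation whose committed
// prefix only ever grows, always in whole granules.
class GrowableRegion
{
public:
    GrowableRegion(void* base, std::uint64_t reservedBytes, std::uint64_t granularity) noexcept
        : mBase(base)
        , mGranularity(granularity)
    {
        assert(granularity > 0);
        // Keep the reservation a whole number of granules so rounding a
        // valid request up never exceeds it.
        mReserved = (reservedBytes / granularity) * granularity;
    }

    // Shaped like reallocate(): returns baseAddr when the request is met in
    // place, or nullptr when it cannot be met (the original stays valid).
    void* reallocate(void* baseAddr, std::uint64_t /*alignment*/, std::uint64_t newSize) noexcept
    {
        std::lock_guard<std::mutex> lock(mMutex);
        if (baseAddr != mBase || newSize > mReserved)
        {
            return nullptr;
        }
        std::uint64_t const needed = roundUp(newSize, mGranularity);
        if (needed > mCommitted)
        {
            // A CUDA implementation would map physical memory for the span
            // [mCommitted, needed) at this point.
            mCommitted = needed;
        }
        return mBase;
    }

    std::uint64_t committedBytes() const noexcept
    {
        std::lock_guard<std::mutex> lock(mMutex);
        return mCommitted;
    }

private:
    mutable std::mutex mMutex;
    void* mBase;
    std::uint64_t mReserved{0};
    std::uint64_t mGranularity;
    std::uint64_t mCommitted{0};
};
```

Because growth happens in whole granules, a run of small size increases usually lands inside memory that is already committed, so most calls return the base address without mapping anything new.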
Copyright © 2024 NVIDIA Corporation