Defined in File trtserver.h
typedef TRTSERVER_Error *(*TRTSERVER_ResponseAllocatorAllocFn_t)(TRTSERVER_ResponseAllocator *allocator, const char *tensor_name, size_t byte_size, TRTSERVER_Memory_Type memory_type, int64_t memory_type_id, void *userp, void **buffer, void **buffer_userp, TRTSERVER_Memory_Type *actual_memory_type, int64_t *actual_memory_type_id)
Type for the allocation function that allocates a buffer to hold a result tensor. The ‘allocator’ argument is the TRTSERVER_ResponseAllocator object representing a memory allocator for inference response tensors.
Return in ‘buffer’ a pointer to a contiguous memory block of size ‘byte_size’ for the result tensor called ‘tensor_name’. The buffer should preferably be allocated in the memory type identified by ‘memory_type’ and ‘memory_type_id’. The ‘userp’ data is the same value that was supplied in the call to TRTSERVER_ServerInferAsync.
Return in ‘buffer_userp’ a user-specified value to associate with the buffer. This value will be provided in the call to TRTSERVER_ResponseAllocatorReleaseFn_t.
The function will be called once for each result tensor, even if the ‘byte_size’ required for that tensor is zero. When ‘byte_size’ is zero the function does not need to allocate any memory but may perform other tasks associated with the result tensor. In this case the function should return success and set ‘buffer’ to nullptr.
If the function is called with a non-zero ‘byte_size’, it should allocate a contiguous buffer of the requested size. If possible the function should allocate the buffer in the requested ‘memory_type’ and ‘memory_type_id’, but it is free to allocate the buffer in any memory. The function must return in ‘actual_memory_type’ and ‘actual_memory_type_id’ the memory type and ID where the buffer was actually allocated.
The function should return a TRTSERVER_Error object if a failure occurs while attempting an allocation. If an error object is returned for a result tensor, the inference server will assume allocation is not possible for the result buffer and will abort the inference request.
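The contract described above can be illustrated with a minimal sketch of a CPU-only allocation callback. Several things here are assumptions rather than part of trtserver.h: the stand-in type declarations exist only so the sketch compiles without the real header, MakeError is a hypothetical helper for producing an error object, kMaxBytes is a hypothetical size cap used to demonstrate the failure path, and a nullptr return value is assumed to signal success.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Stand-ins for declarations that normally come from trtserver.h. They are
// here only so the sketch compiles on its own; remove them when building
// against the real header.
struct TRTSERVER_Error { const char* msg; };
struct TRTSERVER_ResponseAllocator;
enum TRTSERVER_Memory_Type { TRTSERVER_MEMORY_CPU, TRTSERVER_MEMORY_GPU };

// Hypothetical helper: the real API has its own way to create error objects.
static TRTSERVER_Error* MakeError(const char* msg) {
  return new TRTSERVER_Error{msg};
}

// Hypothetical cap on a single result buffer, used to show the error path.
constexpr size_t kMaxBytes = 1 << 20;

// Callback with the same parameter list as TRTSERVER_ResponseAllocatorAllocFn_t.
TRTSERVER_Error* CpuResponseAlloc(
    TRTSERVER_ResponseAllocator* allocator, const char* tensor_name,
    size_t byte_size, TRTSERVER_Memory_Type memory_type,
    int64_t memory_type_id, void* userp, void** buffer, void** buffer_userp,
    TRTSERVER_Memory_Type* actual_memory_type,
    int64_t* actual_memory_type_id) {
  // Unused in this sketch; a real callback might log or look these up.
  (void)allocator; (void)tensor_name; (void)userp;
  (void)memory_type; (void)memory_type_id;

  // This sketch always uses CPU memory, whatever was requested, and
  // reports that choice through the 'actual_*' out-parameters.
  *actual_memory_type = TRTSERVER_MEMORY_CPU;
  *actual_memory_type_id = 0;
  *buffer = nullptr;
  *buffer_userp = nullptr;  // no per-buffer state is tracked here

  // Zero-size tensor: succeed without allocating, leaving 'buffer' null.
  if (byte_size == 0) {
    return nullptr;  // assumed to mean success
  }

  // Failure path: return an error object instead of a buffer.
  if (byte_size > kMaxBytes) {
    return MakeError("result tensor exceeds allocator limit");
  }
  void* block = std::malloc(byte_size);
  if (block == nullptr) {
    return MakeError("failed to allocate result buffer");
  }

  *buffer = block;
  return nullptr;
}
```

A buffer obtained this way would need to be freed with std::free by the matching release function. Reporting CPU memory unconditionally keeps the sketch simple; an allocator that supports device memory would instead try the requested ‘memory_type’ first and fall back only when that fails.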