Typedef TRTSERVER_ResponseAllocatorAllocFn_t

Typedef Documentation

typedef TRTSERVER_Error *(*TRTSERVER_ResponseAllocatorAllocFn_t)(TRTSERVER_ResponseAllocator *allocator, void **buffer, void **buffer_userp, const char *tensor_name, size_t byte_size, TRTSERVER_Allocator_Region region, int64_t region_id, void *userp)

Type for allocation function that allocates a buffer to hold a result tensor.

Return in ‘buffer’ a pointer to a contiguous memory block of size ‘byte_size’ for the result tensor named ‘tensor_name’. The buffer must be allocated in the memory region identified by ‘region’ and ‘region_id’. The ‘userp’ data is the same value supplied in the call to TRTSERVER_ServerInferAsync.

Return in ‘buffer_userp’ a user-specified value to associate with the buffer. This value will be provided in the call to TRTSERVER_ResponseAllocatorReleaseFn_t.

The function will be called for each result tensor, even if the ‘byte_size’ required for that tensor is zero. When ‘byte_size’ is zero the function does not need to allocate any memory but may perform other tasks associated with the result tensor. In this case the function should return success and set ‘buffer’ == nullptr.

If the function is called with a non-zero ‘byte_size’ but cannot allocate in the requested ‘region’, the function should return success and set ‘buffer’ == nullptr to indicate that an allocation in that ‘region’ is not possible. In this case the function may be called again for the same ‘tensor_name’ but with a different ‘region’.

The function should return a TRTSERVER_Error object if a failure occurs while attempting an allocation. If an error object is returned, or if ‘buffer’ == nullptr is returned on all attempts for a result tensor, the inference server will assume allocation is not possible for the result buffer and will abort the inference request.
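A minimal sketch of an allocation callback matching this typedef is shown below. It is illustrative only: it allocates system (CPU) memory with std::malloc and otherwise declines by returning success with ‘buffer’ == nullptr. The function name ResponseAlloc is arbitrary, the region check is left as a comment because the TRTSERVER_Allocator_Region enumerators are not listed on this page, and TRTSERVER_ErrorNew with TRTSERVER_ERROR_INTERNAL is assumed to be available from the rest of the C API for reporting a genuine failure. Returning nullptr is taken to indicate success, following the convention of other TRTSERVER_Error*-returning functions.

#include <cstdlib>
#include "trtserver.h"  // assumed header providing the TRTSERVER C API declarations

// Illustrative allocator callback; not part of the documented API surface.
static TRTSERVER_Error*
ResponseAlloc(
    TRTSERVER_ResponseAllocator* allocator, void** buffer, void** buffer_userp,
    const char* tensor_name, size_t byte_size,
    TRTSERVER_Allocator_Region region, int64_t region_id, void* userp)
{
  *buffer = nullptr;
  *buffer_userp = nullptr;

  // A zero-byte result needs no memory; report success with a null buffer.
  if (byte_size == 0) {
    return nullptr;  // success
  }

  // If this allocator cannot serve the requested 'region'/'region_id',
  // return success with *buffer == nullptr so the server may retry with a
  // different region. (The actual region check is omitted here because the
  // region enumerators are not shown on this page.)

  void* ptr = std::malloc(byte_size);
  if (ptr == nullptr) {
    // A genuine failure: return an error object so the server aborts the
    // inference request. TRTSERVER_ErrorNew / TRTSERVER_ERROR_INTERNAL are
    // assumed from elsewhere in the C API.
    return TRTSERVER_ErrorNew(
        TRTSERVER_ERROR_INTERNAL, "failed to allocate result buffer");
  }

  *buffer = ptr;
  // Record whatever the matching release callback will need; here the raw
  // pointer itself is enough for std::free.
  *buffer_userp = ptr;
  return nullptr;  // success
}

In a typical setup this function would be paired with a matching TRTSERVER_ResponseAllocatorReleaseFn_t that receives the ‘buffer_userp’ value recorded here and releases the memory (in this sketch, with std::free).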