SDK Library#
The SDK library consists of the Maxine SDK C API and library, which an application uses to communicate with the server. Several applications can connect to the same Triton server, where the individual requests are dynamically batched for concurrent processing. Each application can send a single request or a batch of requests at a time.
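Dynamic batching is a standard Triton Inference Server feature that is enabled per model in its `config.pbtxt`. The fragment below is a generic, illustrative Triton configuration (the field values are assumptions, not Maxine defaults); it tells the server to merge requests from concurrent clients into batches of up to 8, waiting at most 100 microseconds for a batch to fill:

```
max_batch_size: 8
dynamic_batching {
  max_queue_delay_microseconds: 100
}
```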
The SDK library communicates with the server over gRPC, and video data between the server and the SDK library is transferred as raw frames. We recommend running the server and the application on the same machine, not on separate machines connected over a network, because transferring raw video frames over gRPC across a network cannot sustain real-time video processing. This is not a problem when the server and the application are on the same machine. Additionally, on the same machine, applications can choose to share image buffers between the server and the SDK library using CUDA shared memory (https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/protocol/extension_shared_memory.html), which provides significantly faster data transfer.
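A back-of-envelope calculation shows why raw frames over a network are impractical. The resolution and frame rate below are illustrative assumptions (1080p RGB at 30 fps), not Maxine requirements:

```python
# Bandwidth needed to stream uncompressed video frames, one direction.
# Assumed illustrative values: 1080p, 3 bytes/pixel (RGB), 30 fps.
width, height, bytes_per_pixel, fps = 1920, 1080, 3, 30

frame_bytes = width * height * bytes_per_pixel       # bytes per raw frame
throughput_mb_s = frame_bytes * fps / 1e6            # megabytes per second
throughput_gbit_s = frame_bytes * fps * 8 / 1e9      # gigabits per second

print(f"{frame_bytes} bytes/frame")
print(f"{throughput_mb_s:.1f} MB/s, {throughput_gbit_s:.2f} Gbit/s per direction")
```

With input and output streams combined, this roughly saturates a 1 Gbit/s link before any other traffic, while on a single machine loopback or CUDA shared memory avoids the network entirely.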
Depending on which APIs are called, the SDK library can also run the features locally rather than on a server. In this case, the application cannot take advantage of the Triton server's benefits, such as dynamic batching. The details of API usage are explained in the programming guide section.