This project demonstrates how to run C API applications using Triton Inference Server as a shared library. We also show how to build and execute such applications on Jetson.
In our example, we placed the contents of the downloaded release directory under `/opt/tritonserver`.
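For reference, the commands below sketch one way to put the release contents in that location. The tarball name is a placeholder, not an actual file name; substitute the Jetson release archive you downloaded for your JetPack version.

```shell
# Illustrative only: replace the placeholder tarball name with the actual
# Jetson release archive downloaded from the Triton releases page.
sudo mkdir -p /opt/tritonserver
sudo tar -xzf tritonserver<version>-jetpack<version>.tgz -C /opt/tritonserver
```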
## Part 1. Concurrent inference and dynamic batching
The purpose of the sample located under `concurrency_and_dynamic_batching` is to demonstrate two important features of Triton Inference Server: concurrent model execution and dynamic batching. To do that, we implemented a people detection application using the C API and Triton Inference Server as a shared library.
## Part 2. Analyzing model performance with perf_analyzer
To analyze model performance on Jetson, we use the perf_analyzer tool. perf_analyzer is included in the release tar file, or it can be compiled from source.
From this directory of the repository, execute the following to evaluate model performance:
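As a rough illustration, a perf_analyzer invocation in C API mode might look like the sketch below. The model name (`peoplenet`), model repository path, and concurrency range are assumptions based on the people detection sample and should be adjusted to match your setup; the C API mode options (`--service-kind`, `--triton-server-directory`, `--model-repository`) are standard perf_analyzer flags.

```shell
# Hypothetical example: model name, repository path, and concurrency range
# are placeholders; adjust them to your environment.
perf_analyzer -m peoplenet \
    --service-kind=triton_c_api \
    --model-repository=$(pwd)/concurrency_and_dynamic_batching/trtis_model_repo_sample_1 \
    --triton-server-directory=/opt/tritonserver \
    --concurrency-range 1:6
```

Running perf_analyzer in C API mode exercises the model through the same shared-library path the sample application uses, so the reported latency and throughput exclude any HTTP/gRPC overhead.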