Currently there is no CI testing enabled for the open-source version of the Triton Inference Server. We will enable CI testing in a future update.
However, there is a set of tests in the qa/ directory that can be run manually to provide extensive testing. Before running these tests you must first generate a few model repositories containing the models needed by the tests.
Generate QA Model Repositories
The QA model repositories contain some simple models that are used to verify the correctness of the inference server. To generate the QA model repositories:
$ cd qa/common
$ ./gen_qa_model_repository
$ ./gen_qa_custom_ops
This creates multiple model repositories in /tmp/<version>/qa_* (for example, /tmp/19.08/qa_model_repository). The TensorRT models are created for the GPU that CUDA considers device 0 (zero). If your system has multiple GPUs, see the documentation in the scripts for how to target a specific GPU.
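Before moving on, it can be useful to confirm that the scripts actually produced the repositories. The following is a minimal sketch, not part of the Triton sources: list_qa_repos is a hypothetical helper that prints every qa_* directory under a given root and returns non-zero if none exist.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the inference server):
# print each qa_* directory under the given root directory.
# Returns 0 if at least one directory was found, 1 otherwise.
list_qa_repos() {
    root="$1"
    found=1
    for dir in "$root"/qa_*; do
        # When nothing matches, the glob stays unexpanded; skip it.
        [ -d "$dir" ] || continue
        echo "$dir"
        found=0
    done
    return "$found"
}
```

For the example path mentioned above, the call would be list_qa_repos /tmp/19.08; substitute the version directory that the scripts created on your system.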
Build QA Container
Next, build a QA version of the inference server container. This container holds the inference server, the QA tests, and all the dependencies needed to run them. You must first build the tritonserver_client, tritonserver_cbe, tritonserver_build and tritonserver containers as described in Getting the Client Libraries and Building, and then build the QA container:
$ docker build -t tritonserver_qa -f Dockerfile.QA .
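Because the QA build depends on the four containers listed above, a quick check that their images exist locally can save a failed build. This is a sketch under the assumption that the images are tagged with exactly those names; missing_images is a hypothetical helper, and it relies only on docker image inspect exiting non-zero for an unknown image.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every image argument that
# docker cannot find locally. Prints nothing if all images are present.
missing_images() {
    for image in "$@"; do
        # docker image inspect exits non-zero when the image is absent.
        if ! docker image inspect "$image" >/dev/null 2>&1; then
            echo "$image"
        fi
    done
}
```

A possible invocation is missing_images tritonserver_client tritonserver_cbe tritonserver_build tritonserver; any name it prints still needs to be built before running the docker build command for the QA container.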
Run QA Container
Now run the QA container and mount the QA model repositories into the container so the tests will be able to access them:
$ nvidia-docker run -it --rm -v/tmp:/data/inferenceserver tritonserver_qa
Within the container the QA tests are in /opt/tensorrtserver/qa. To run a test:
$ cd <test directory>
$ ./test.sh
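To run more than one test in a single pass, the two commands above can be wrapped in a loop. The sketch below is illustrative only: run_qa_tests is a hypothetical helper, the qa_run.log file name is invented for this example, and it assumes that each test directory contains an executable test.sh that exits non-zero on failure.

```shell
#!/bin/sh
# Hypothetical helper: run test.sh in every immediate subdirectory of
# the given QA root and print PASS or FAIL for each one.
# Assumption: test.sh signals failure through a non-zero exit status.
run_qa_tests() {
    qa_root="$1"
    failed=0
    for script in "$qa_root"/*/test.sh; do
        # Skip the unexpanded glob when no test.sh files exist.
        [ -f "$script" ] || continue
        test_dir=$(dirname "$script")
        # Use a subshell so the cd does not affect the caller.
        # Output goes to qa_run.log (a name chosen for this sketch).
        if (cd "$test_dir" && ./test.sh >qa_run.log 2>&1); then
            echo "PASS $(basename "$test_dir")"
        else
            echo "FAIL $(basename "$test_dir")"
            failed=$((failed + 1))
        fi
    done
    # Succeed only if no test failed.
    [ "$failed" -eq 0 ]
}
```

Inside the QA container, the argument would be the directory that holds the QA tests. Running tests one at a time, as shown above, remains the simplest way to investigate an individual failure.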