Building

The inference server is built using Docker and the TensorFlow and PyTorch containers from NVIDIA GPU Cloud (NGC). Before building, you must install Docker and nvidia-docker and log in to the NGC registry by following the instructions in Installing Prebuilt Containers.
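
For reference, logging in to the NGC registry typically uses $oauthtoken as the username and your NGC API key as the password, for example:

$ docker login nvcr.io
Username: $oauthtoken
Password: <your NGC API key>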

Building the Server

To build a release version of the TensorRT Inference Server container, change to the root directory of the repo and issue the following command:

$ docker build --pull -t tensorrtserver .
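
As a quick check of the resulting image you can, for example, start the server against an existing model repository (here assumed to be at /path/to/model/repository):

$ nvidia-docker run --rm -p8000:8000 -p8001:8001 -v/path/to/model/repository:/models tensorrtserver trtserver --model-store=/models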

Incremental Builds

For typical development, you will want to run the build container with your local repo’s source files mounted so that your local changes can be built incrementally. To do this, first build the tensorrtserver_build container:

$ docker build --pull -t tensorrtserver_build --target trtserver_build .

Mount /path/to/tensorrtserver/src into the container at /workspace/src so that changes to your local repo are reflected inside the container:

$ nvidia-docker run -it --rm -v/path/to/tensorrtserver/src:/workspace/src tensorrtserver_build

Within the container you can perform an incremental server build with:

# cd /workspace
# bazel build -c opt --config=cuda src/servers/trtserver
# cp /workspace/bazel-bin/src/servers/trtserver /opt/tensorrtserver/bin/trtserver

Similarly, within the container you can perform an incremental build of the C++ and Python client libraries and example executables with:

# cd /workspace
# bazel build -c opt --config=cuda src/clients/...
# mkdir -p /opt/tensorrtserver/bin
# cp bazel-bin/src/clients/c++/image_client /opt/tensorrtserver/bin/.
# cp bazel-bin/src/clients/c++/perf_client /opt/tensorrtserver/bin/.
# cp bazel-bin/src/clients/c++/simple_client /opt/tensorrtserver/bin/.
# mkdir -p /opt/tensorrtserver/lib
# cp bazel-bin/src/clients/c++/librequest.so /opt/tensorrtserver/lib/.
# cp bazel-bin/src/clients/c++/librequest.a /opt/tensorrtserver/lib/.
# mkdir -p /opt/tensorrtserver/pip
# bazel-bin/src/clients/python/build_pip /opt/tensorrtserver/pip/.
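
If the wheel build succeeds, the Python client library wheel (tensorrtserver-*.whl) should now be present under /opt/tensorrtserver/pip, which you can confirm with:

# ls /opt/tensorrtserver/pip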

Some source changes can confuse bazel so that it does not correctly rebuild all required sources. You can force bazel to rebuild all of the inference server source, without requiring a complete rebuild of the TensorFlow and Caffe2 components, by removing the following directory before issuing the above build command:

# rm -fr bazel-bin/src

Building the Client Libraries and Examples

The provided Dockerfile can be used to build just the client libraries and examples. Issue the following command to build the C++ client library, C++ and Python examples, and a Python wheel file for the Python client library:

$ docker build -t tensorrtserver_clients --target trtserver_build --build-arg "PYVER=<ver>" --build-arg "BUILD_CLIENTS_ONLY=1" .

The PYVER build argument is optional; use it to select the Python version that the Python client library is built for (the default is 3.5).
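
For example, to build the clients for Python 3.6 you might issue:

$ docker build -t tensorrtserver_clients --target trtserver_build --build-arg "PYVER=3.6" --build-arg "BUILD_CLIENTS_ONLY=1" .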

After the build completes, the easiest way to extract the built libraries and examples from the docker image is to mount a host directory and then copy them out from within the container:

$ docker run -it --rm -v/tmp:/tmp/host tensorrtserver_clients
# cp /opt/tensorrtserver/bin/image_client /tmp/host/.
# cp /opt/tensorrtserver/bin/perf_client /tmp/host/.
# cp /opt/tensorrtserver/bin/simple_client /tmp/host/.
# cp /opt/tensorrtserver/pip/tensorrtserver-*.whl /tmp/host/.
# cp /opt/tensorrtserver/lib/librequest.* /tmp/host/.

You can now access the files from /tmp on the host system. To run the C++ examples, you must install some dependencies on your host system:

$ apt-get install curl libcurl3-dev libopencv-dev libopencv-core-dev python-pil

To run the Python examples, you will additionally need to install the client wheel file and some other dependencies:

$ apt-get install python3 python3-pip
$ pip3 install --user --upgrade tensorrtserver-*.whl pillow
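
Assuming the example scripts are available from your repo checkout (for example under src/clients/python) and a server is running, you can then try one of the Python examples, such as:

$ python3 src/clients/python/simple_client.py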

Building the Documentation

The inference server documentation is found in the docs/ directory and is based on Sphinx. Doxygen, integrated via Exhale, is used for the C++ API documentation.

To build the docs, first install the required dependencies. As of 10/22/2018, Sphinx version < 1.8 must be used to avoid a bug related to Exhale:

$ apt-get update
$ apt-get install -y --no-install-recommends doxygen
$ pip install --upgrade 'sphinx<1.8' sphinx-rtd-theme nbsphinx exhale

To generate the Python client library API docs, the TensorRT Inference Server Python package must be installed:

$ pip install --upgrade tensorrtserver-*.whl

Then use Sphinx to build the documentation into the build/html directory:

$ cd docs
$ make clean html
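
If the build succeeds, you can view the generated documentation by opening the top-level page, build/html/index.html, in a browser, for example:

$ xdg-open build/html/index.html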