Alternative Installation Methods

pip

pip install tritonclient

perf_analyzer -m <model>

Warning: If any runtime dependencies are missing, Perf Analyzer exits with an error identifying them. You will need to install the missing packages manually.
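One way to see which shared-library dependencies the loader cannot resolve is to inspect the binary with `ldd`. A minimal sketch (the `check_missing_libs` helper name is illustrative, and it assumes a Linux system with `ldd` available):

```shell
# Print shared libraries a binary links against that the dynamic
# loader cannot find; each "not found" entry is a missing dependency.
check_missing_libs() {
  ldd "$1" 2>/dev/null | awk '/not found/ {print $1}'
}

# Run it against the installed binary, e.g.:
#   check_missing_libs "$(command -v perf_analyzer)"
# A system binary with all dependencies resolved prints nothing:
check_missing_libs /bin/ls
```

Any library names it prints can then be installed through your distribution's package manager.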

Build from Source

docker run --rm --gpus=all -it --net=host ubuntu:24.04

# inside container, install build/runtime dependencies
apt update && DEBIAN_FRONTEND=noninteractive apt install -y cmake g++ git libssl-dev nvidia-cuda-toolkit python3 rapidjson-dev zlib1g-dev

git clone --depth=1 https://github.com/triton-inference-server/perf_analyzer.git

mkdir perf_analyzer/build

cmake -B perf_analyzer/build -S perf_analyzer

cmake --build perf_analyzer/build -- -j8

export PATH=$(pwd)/perf_analyzer/build/perf_analyzer/src/perf-analyzer-build:$PATH

perf_analyzer -m <model>
  • To enable OpenAI mode, add -D TRITON_ENABLE_PERF_ANALYZER_OPENAI=ON to the first cmake command.

  • To enable C API mode, add -D TRITON_ENABLE_PERF_ANALYZER_C_API=ON to the first cmake command.

  • To enable TorchServe backend, add -D TRITON_ENABLE_PERF_ANALYZER_TS=ON to the first cmake command.

  • To enable TensorFlow Serving backend, add -D TRITON_ENABLE_PERF_ANALYZER_TFS=ON to the first cmake command.

  • To disable CUDA shared memory support and the dependency on CUDA toolkit libraries, add -D TRITON_ENABLE_GPU=OFF to the first cmake command.
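For example, to configure a build with OpenAI mode enabled and the CUDA toolkit dependency disabled, the first cmake command above becomes (a sketch combining the flags listed; any subset of the options can be mixed the same way):

```shell
# Configure with OpenAI mode on and GPU/CUDA support off; run from the
# directory containing the cloned perf_analyzer source tree.
cmake -B perf_analyzer/build -S perf_analyzer \
  -D TRITON_ENABLE_PERF_ANALYZER_OPENAI=ON \
  -D TRITON_ENABLE_GPU=OFF
```

With TRITON_ENABLE_GPU=OFF, the nvidia-cuda-toolkit package in the apt install step is no longer needed.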