Profile OpenAI-Compatible Text APIs Using AIPerf
Profile OpenAI-Compatible Text APIs Using AIPerf
This guide covers profiling OpenAI-compatible Chat Completions and Completions endpoints with vLLM and AIPerf.
Start a vLLM server
Pull and start a vLLM server using Docker:
Verify the server is ready:
Profile Chat Completions API
The Chat Completions API uses the /v1/chat/completions endpoint.
Profile with synthetic inputs
Run AIPerf against the Chat Completions endpoint using synthetic inputs:
Sample Output (Successful Run):
Profile with custom input file
Create a JSONL input file:
Run AIPerf against the Chat Completions endpoint using the custom input file:
Profile Completions API
The Completions API uses the /v1/completions endpoint.
Profile with synthetic inputs
Run AIPerf against the Completions endpoint using synthetic inputs:
Sample Output (Successful Run):
Profile with custom input file
Create a JSONL input file:
Run AIPerf against the Completions endpoint using the custom input file: