Profile OpenAI-Compatible Text APIs Using AIPerf
Profile OpenAI-Compatible Text APIs Using AIPerf
Profile OpenAI-Compatible Text APIs Using AIPerf
This guide covers profiling OpenAI-compatible Chat Completions and Completions endpoints with vLLM and AIPerf.
Pull and start a vLLM server using Docker:
Verify the server is ready:
The Chat Completions API uses the /v1/chat/completions endpoint.
Run AIPerf against the Chat Completions endpoint using synthetic inputs:
Sample Output (Successful Run):
Create a JSONL input file:
Run AIPerf against the Chat Completions endpoint using the custom input file:
The Completions API uses the /v1/completions endpoint.
Run AIPerf against the Completions endpoint using synthetic inputs:
Sample Output (Successful Run):
Create a JSONL input file:
Run AIPerf against the Completions endpoint using the custom input file: