Profile Ranking Models with GenAI-Perf
GenAI-Perf can profile ranking models that are compatible with the re-ranker API of Hugging Face’s Text Embeddings Inference (TEI).
Start a Hugging Face Re-Ranker-Compatible Server
To start a Hugging Face re-ranker-compatible server, run the following commands:
model=BAAI/bge-reranker-base
docker run --gpus all -p 8080:80 --pull always ghcr.io/huggingface/text-embeddings-inference:1.3 --model-id $model --port 80
To target the Hugging Face server, the benchmarking commands below include --endpoint rerank and --extra-inputs rankings:tei.
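Before profiling, it can help to confirm that the server answers re-rank requests. The following is a minimal sketch, assuming the container started above is reachable at localhost:8080 and exposes TEI's /rerank route, which accepts a JSON body with a query field and a texts list. The helper name check_rerank and the sample query and texts are hypothetical and only serve as an illustration.

```shell
# Hypothetical helper: sends one re-rank request to a TEI-compatible server.
# Assumes the server from the docker command above listens on localhost:8080.
RERANK_URL="${RERANK_URL:-http://localhost:8080/rerank}"

# Request body: one query plus the candidate texts to score against it.
RERANK_PAYLOAD='{"query": "What is deep learning?", "texts": ["Deep learning is a subfield of machine learning.", "Cheese is made from milk."]}'

check_rerank() {
  # --fail makes curl exit non-zero on HTTP error responses.
  curl --silent --fail -X POST "$RERANK_URL" \
    -H 'Content-Type: application/json' \
    -d "$RERANK_PAYLOAD"
}
```

Calling check_rerank once the model has finished loading should print a JSON list with one index and score entry per text. A non-zero exit code suggests the server is not ready or the route differs from the one assumed here.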
Approach 1. Profile Using Synthetic Inputs
To profile ranking models using synthetic inputs, use the following command:
genai-perf profile \
-m BAAI/bge-reranker-base \
--tokenizer BAAI/bge-reranker-base \
--service-kind openai \
--endpoint-type rankings \
--endpoint rerank \
--input-file synthetic:queries,passages \
-u localhost:8080 \
--extra-inputs rankings:tei \
--synthetic-input-tokens-mean 100 \
--batch-size-text 2
Approach 2. Profile Using Custom Inputs
Create a Sample Rankings Input Directory
To create a sample rankings input directory, follow these steps:
Create a directory called rankings_jsonl:
mkdir rankings_jsonl
Inside this directory, create a JSONL file named queries.jsonl with queries data:
echo '{"text": "What was the first car ever driven?"}
{"text": "Who served as the 5th President of the United States of America?"}
{"text": "Is the Sydney Opera House located in Australia?"}
{"text": "In what state did they film Shrek 2?"}' > rankings_jsonl/queries.jsonl
Create another JSONL file named passages.jsonl with passages data:
echo '{"text": "Eric Anderson (born January 18, 1968) is an American sociologist and sexologist."}
{"text": "Kevin Loader is a British film and television producer."}
{"text": "Francisco Antonio Zea Juan Francisco Antonio Hilari was a Colombian journalist, botanist, diplomat, politician, and statesman who served as the 1st Vice President of Colombia."}
{"text": "Daddys Home 2 Principal photography on the film began in Massachusetts in March 2017 and it was released in the United States by Paramount Pictures on November 10, 2017. Although the film received unfavorable reviews, it has grossed over $180 million worldwide on a $69 million budget."}' > rankings_jsonl/passages.jsonl
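Because a stray quote or missing brace in either file is easy to overlook, a quick format check before profiling may save a failed run. The sketch below assumes each entry sits on its own line in the {"text": "..."} form used above. The helper name check_jsonl is hypothetical, and it performs a pattern match only, not a full JSON parse.

```shell
# Hypothetical helper: checks that every non-empty line of a JSONL file
# looks like a JSON object with a single "text" field (pattern check only).
check_jsonl() {
  file="$1"
  # Number of non-empty lines in the file.
  total=$(grep -c -v '^[[:space:]]*$' "$file" || true)
  # Number of non-empty lines that do NOT match {"text": "..."}.
  bad=$(grep -v '^[[:space:]]*$' "$file" | grep -c -v '^{"text": *".*"}$' || true)
  echo "$file: $total entries, $bad malformed"
  # Succeed only when no malformed line was found.
  [ "$bad" -eq 0 ]
}
```

For the files created above, this could be run as check_jsonl rankings_jsonl/queries.jsonl && check_jsonl rankings_jsonl/passages.jsonl.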
Run GenAI-Perf
To profile ranking models using the custom inputs created above, use the following command:
genai-perf profile \
-m BAAI/bge-reranker-base \
--tokenizer BAAI/bge-reranker-base \
--service-kind openai \
--endpoint-type rankings \
--endpoint rerank \
--input-file rankings_jsonl/ \
-u localhost:8080 \
--extra-inputs rankings:tei
Review the Output
Example output:
Rankings Metrics
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━┓
┃ Statistic ┃ avg ┃ min ┃ max ┃ p99 ┃ p90 ┃ p75 ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━┩
│ Request latency (ms) │ 5.48 │ 2.50 │ 23.91 │ 10.27 │ 8.34 │ 6.07 │
└──────────────────────┴──────┴──────┴───────┴───────┴──────┴──────┘
Request throughput (per sec): 180.11
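As a rough sanity check on these numbers, throughput should be close to the reciprocal of the average latency when requests are issued one at a time. That concurrency of 1 is an assumption about this run. The output above does not state it.

```shell
# Assumption: one request in flight at a time (concurrency 1).
avg_latency_ms=5.48

# Requests per second = 1000 ms divided by the average latency in ms.
estimated_throughput=$(awk -v ms="$avg_latency_ms" 'BEGIN { printf "%.2f", 1000 / ms }')
echo "Estimated throughput: $estimated_throughput requests/sec"
```

The estimate of about 182 requests per second is close to the reported 180.11. The small gap is plausibly time spent between requests that the latency figure does not include.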