The template endpoint provides a flexible way to benchmark custom APIs that don’t match standard OpenAI formats. You define request payloads using Jinja2 templates and optionally specify how to extract responses using JMESPath queries.
Use the template endpoint when:
Benchmark an API that accepts text in a custom format:
Sample Output (Successful Run):
Configure the template endpoint using --extra-inputs:
payload_template: Jinja2 template defining the request payload format. The value can be:
- nv-embedqa (a built-in template name)
- /path/to/template.json (a path to a template file)
- '{"text": {{ text|tojson }}}' (an inline template string)

response_field: JMESPath query to extract data from responses, for example:
- data[0].embedding

Any other --extra-inputs fields are merged into every request payload:
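As a rough illustration of that merge, the sketch below parses a rendered payload and adds extra fields to it. The field names `truncate` and `input_type`, the helper `merge_extra_fields`, and the choice to merge at the top level are assumptions made for the example; how key collisions are resolved is not stated here, so the example avoids them.

```python
import json

# Hypothetical values, for illustration only.
rendered_template = '{"text": "hello world"}'  # payload after template rendering
extra_fields = {"truncate": "END", "input_type": "query"}  # made-up extra inputs


def merge_extra_fields(rendered: str, extra: dict) -> dict:
    """Parse the rendered payload and add the extra fields at the top level.

    Assumption: keys do not collide, so precedence does not matter here.
    """
    payload = json.loads(rendered)
    return {**payload, **extra}


payload = merge_extra_fields(rendered_template, extra_fields)
print(json.dumps(payload, indent=2))
```

Every request built from the same template would carry the same extra fields, which makes this a convenient place for constant API options.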
The following variables are available in templates:
- text: First text content (or None)
- texts: List of all text contents
- image, audio, video: First media content (or None)
- images, audios, videos: Lists of all media contents
- query: First query text
- queries: All query texts
- passage: First passage text
- passages: All passage texts
- texts_by_name: Dict mapping content names to text lists
- images_by_name, audios_by_name, videos_by_name: Dicts for media
- model: Model name
- max_tokens: Output token limit
- stream: Whether streaming is enabled
- role: Message role
- turn: Current turn object
- turns: List of all turns
- request_info: Full request context

Auto-detection tries to extract in this order: embeddings, rankings, then text.
- text, content, response, output, result
- choices[0].text, choices[0].message.content
- data[].embedding
- embeddings, embedding
- rankings, results

Specify a JMESPath query to extract specific fields:
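To make concrete what such a query selects, the sketch below applies the plain-Python equivalents of two of the queries mentioned on this page to a hypothetical embeddings-style response. The response body is invented for illustration, and the indexing shown is only an equivalent of the JMESPath expressions, not how the tool evaluates them.

```python
# Hypothetical embeddings-style response body, for illustration only.
response = {
    "data": [
        {"index": 0, "embedding": [0.12, -0.03, 0.57]},
        {"index": 1, "embedding": [0.44, 0.09, -0.21]},
    ]
}

# Plain-Python equivalent of the JMESPath query `data[0].embedding`:
# take the first element of "data", then its "embedding" field.
first_embedding = response["data"][0]["embedding"]

# Plain-Python equivalent of the projection `data[].embedding`:
# collect the "embedding" field of every element of "data".
all_embeddings = [item["embedding"] for item in response["data"]]

print(first_embedding)
print(len(all_embeddings))
```

With the `jmespath` Python package installed, `jmespath.search("data[0].embedding", response)` evaluates the first query directly, which is a quick way to check an expression before passing it as response_field.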
Using the built-in nv-embedqa template:
Note: The nv-embedqa template expands to {"text": {{ texts|tojson }}}.
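The effect of the `|tojson` filter in that expansion can be sketched with the standard library. Here `json.dumps` stands in for `|tojson` (Jinja2's filter additionally escapes a few HTML-sensitive characters), and `render_text_payload` is a hypothetical helper, not part of the tool. The contrast case shows why interpolating raw text without escaping tends to produce invalid JSON.

```python
import json


def render_text_payload(texts):
    """Mimic {"text": {{ texts|tojson }}}, with json.dumps standing in for |tojson."""
    return '{"text": ' + json.dumps(texts) + "}"


# Inputs containing characters that must be escaped in JSON.
texts = ['He said "hi"', "line one\nline two"]

rendered = render_text_payload(texts)
parsed = json.loads(rendered)  # parses cleanly because the values were escaped

# For contrast: naive interpolation without escaping.
naive = '{"text": "' + texts[0] + '"}'
try:
    json.loads(naive)
    naive_is_valid = True
except json.JSONDecodeError:
    naive_is_valid = False

print(rendered)
print("naive payload valid:", naive_is_valid)
```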
Create chat_template.json:
Use it:
- Use |tojson for string/list values to properly escape JSON
- Use -v or -vv to see debug logs with formatted payloads
- Check artifacts/<run-name>/inputs.json to see all formatted request payloads
- response_field

Template didn’t render valid JSON
- Use the |tojson filter for string or nullable values

Response not parsed correctly
- Use -vv to see raw responses in logs
- Set response_field with a JMESPath query

Variables not available
- Use the request_info and turn objects for nested data