Testing Endpoint Compatibility#

This guide shows how to verify that your hosted endpoint is compatible with the OpenAI API by sending curl requests for different task types. Models deployed using nemo-evaluator-launcher should pass these tests.

To test your endpoint, run the provided command and check the model's response. Set FULL_ENDPOINT_URL and API_KEY, and replace <YOUR_MODEL_NAME> with your own values.

Tip

If your model is not gated, omit the line with the authorization header:

-H "Authorization: Bearer ${API_KEY}"

from the commands below.
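Rather than editing each command by hand, the header can be added only when a key is set. The following is a minimal sketch, not part of nemo-evaluator-launcher: `post_json` is a hypothetical helper name, and it assumes that an empty or unset API_KEY means the endpoint is not gated.

```shell
# Hypothetical helper: POST a JSON payload to FULL_ENDPOINT_URL.
# Assumption: an empty or unset API_KEY means the endpoint is not gated.
post_json() {
  payload=$1
  # Reuse the function's positional parameters as the curl argument list.
  set -- -sS -X POST "${FULL_ENDPOINT_URL}" -H "Content-Type: application/json"
  if [ -n "${API_KEY:-}" ]; then
    set -- "$@" -H "Authorization: Bearer ${API_KEY}"
  fi
  curl "$@" -d "${payload}"
}
```

Call it with any of the request bodies in this guide as its single argument.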

General Requirements#

Your endpoint should support the following parameters:

  • top_p

  • temperature

  • max_tokens
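To exercise these three parameters with different values, it can help to generate the request body from a small function. The sketch below is illustrative only: `chat_payload` is a hypothetical helper, the prompt text is made up, and it assumes the model name contains no characters that need JSON escaping.

```shell
# Hypothetical helper: print a minimal chat payload with explicit sampling parameters.
# Arguments: model name, temperature, top_p, max_tokens.
# Assumption: the model name needs no JSON escaping.
chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "Say hello."}], "temperature": %s, "top_p": %s, "max_tokens": %s, "stream": false}\n' \
    "$1" "$2" "$3" "$4"
}

# Print payloads for two different sampling settings.
chat_payload "<YOUR_MODEL_NAME>" 0.0 1.0 16
chat_payload "<YOUR_MODEL_NAME>" 0.6 0.95 256
```

Each printed line can be passed to curl with `-d`. A request that is rejected for one setting but accepted for another points to a parameter the endpoint does not handle.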

Chat endpoint testing#

export FULL_ENDPOINT_URL="https://your-server.com/v1/chat/completions"
export API_KEY="your-api-key-here"

curl -X POST "${FULL_ENDPOINT_URL}" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${API_KEY}" \
-d '{
  "messages": [
    {
      "role": "user",
      "content": "Write Python code that can add a list of numbers together."
    }
  ],
  "model": "<YOUR_MODEL_NAME>",
  "temperature": 0.6,
  "top_p": 0.95,
  "max_tokens": 256,
  "stream": false
}'
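When checking the model's response, a quick shape test can catch error bodies early. The sketch below assumes the endpoint follows the OpenAI chat completions response format, in which the answer sits under `choices` in a `message` object. `looks_like_chat_response` is a hypothetical helper, and the sample body is made up and abbreviated.

```shell
# Hypothetical helper: succeed if a response body looks like a chat completion.
# Assumption: a compatible endpoint returns a "choices" array whose entries
# carry a "message" object, as in the OpenAI chat completions response format.
looks_like_chat_response() {
  printf '%s' "$1" | grep -q '"choices"' &&
    printf '%s' "$1" | grep -q '"message"'
}

# Abbreviated, made-up response used only to illustrate the check.
sample='{"choices": [{"index": 0, "message": {"role": "assistant", "content": "print(sum(nums))"}, "finish_reason": "stop"}]}'
if looks_like_chat_response "${sample}"; then
  echo "chat response shape OK"
else
  echo "unexpected response shape"
fi
```

This is a plain text match rather than a JSON parse, so treat it as a smoke test only.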

Completions endpoint testing#

export FULL_ENDPOINT_URL="https://your-server.com/v1/completions"
export API_KEY="your-api-key-here"

curl -X POST "${FULL_ENDPOINT_URL}" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${API_KEY}" \
-d '{
  "prompt": "Write Python code that can add a list of numbers together.",
  "model": "<YOUR_MODEL_NAME>",
  "temperature": 0.6,
  "top_p": 0.95,
  "max_tokens": 256,
  "stream": false
}'
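If the endpoint does not serve this route, the HTTP status code is often the quickest signal. The sketch below uses curl's `-o` and `-w '%{http_code}'` options to save the body to a file and print only the status. `post_and_report_status` is a hypothetical helper, and it assumes a gated endpoint with FULL_ENDPOINT_URL and API_KEY exported as above.

```shell
# Hypothetical helper: POST a payload and print only the HTTP status code.
# Arguments: JSON payload, path of the file that receives the response body.
post_and_report_status() {
  curl -sS -o "$2" -w '%{http_code}' -X POST "${FULL_ENDPOINT_URL}" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${API_KEY}" \
    -d "$1"
}

# Usage (not run here because it needs a live endpoint):
#   status=$(post_and_report_status "${payload}" response.json)
#   echo "HTTP ${status}"
```

Any status other than 200 suggests inspecting the saved response body for an error message.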

VLM chat endpoint testing#

NeMo Evaluator supports the OpenAI Images API (docs) and vLLM (docs), with the image provided as a base64-encoded string and the following content types:

  • image_url

  • text

export FULL_ENDPOINT_URL="https://your-server.com/v1/chat/completions"
export API_KEY="your-api-key-here"

curl -X POST "${FULL_ENDPOINT_URL}" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Accept: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": ""
            }
          },
          {
            "type": "text",
            "text": "Describe the image:"
          }
        ]
      }
    ],
    "model": "<YOUR_MODEL_NAME>",
    "stream": false,
    "max_tokens": 16,
    "temperature": 0.0,
    "top_p": 1.0
  }'
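The `url` field expects the base64-encoded image. One way to produce it from a local file is sketched below. `data_url` is a hypothetical helper, and the `data:<mime>;base64,` prefix is an assumption that mirrors the common data URL convention, so confirm it against your server's documentation.

```shell
# Hypothetical helper: print a base64 data URL for a local file.
# Arguments: path to the file, MIME type (for example image/png).
# Assumption: the endpoint accepts URLs of the form data:<mime>;base64,<payload>.
data_url() {
  # tr strips the line wrapping that some base64 implementations add.
  printf 'data:%s;base64,%s\n' "$2" "$(base64 < "$1" | tr -d '\n')"
}

# Usage: data_url ./example.png image/png
```

The printed string can then be pasted into the `url` field of the request body.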

Function calling testing#

We support OpenAI-compatible function calling (docs).

Function calling request:

export FULL_ENDPOINT_URL="https://your-server.com/v1/chat/completions"
export API_KEY="your-api-key-here"

curl -X POST "${FULL_ENDPOINT_URL}" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Accept: application/json" \
  -d '{
    "model": "<YOUR_MODEL_NAME>",
    "stream": false,
    "max_tokens": 16,
    "temperature": 0.0,
    "top_p": 1.0,
    "messages": [
      {
        "role": "user",
        "content": "What is the slope of the line which is perpendicular to the line with the equation y = 3x + 2?"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "find_critical_points",
          "description": "Finds the critical points of the function. Note that the provided function is in Python 3 syntax.",
          "parameters": {
            "type": "object",
            "properties": {
              "function": {
                "type": "string",
                "description": "The function to find the critical points for."
              },
              "variable": {
                "type": "string",
                "description": "The variable in the function."
              },
              "range": {
                "type": "array",
                "items": {
                  "type": "number"
                },
                "description": "The range to consider for finding critical points. Optional. Default is [0.0, 3.4]."
              }
            },
            "required": ["function", "variable"]
          }
        }
      }
    ]
  }'
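To see whether the model actually issued a tool call, the response can be searched for a non-empty `tool_calls` array. The sketch below assumes the OpenAI chat completions response format. `has_tool_call` is a hypothetical helper, and the sample body is made up.

```shell
# Hypothetical helper: succeed if the response contains a non-empty tool_calls array.
# Assumption: tool calls appear under a "tool_calls" key in the assistant
# message, as in the OpenAI chat completions response format.
has_tool_call() {
  # Remove all whitespace so pretty-printed and compact JSON match the same pattern.
  printf '%s' "$1" | tr -d '[:space:]' | grep -q '"tool_calls":\[{'
}

# Made-up response used only to illustrate the check.
sample='{"choices": [{"message": {"role": "assistant", "content": null, "tool_calls": [{"id": "call_0", "type": "function", "function": {"name": "find_critical_points", "arguments": "{}"}}]}, "finish_reason": "tool_calls"}]}'
if has_tool_call "${sample}"; then
  echo "tool call present"
else
  echo "no tool call in response"
fi
```

Because the model decides whether to call the tool, a plain-text answer is not necessarily a compatibility failure. A rejected `tools` field is the clearer sign of a problem.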

Audio endpoint testing#

We support audio input with the following content types:

  • audio_url

Example:

export FULL_ENDPOINT_URL="https://your-server.com/v1/chat/completions"
export API_KEY="your-api-key-here"

curl -X POST "${FULL_ENDPOINT_URL}" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Accept: application/json" \
  -d '{
    "max_tokens": 256,
    "model": "<YOUR_MODEL_NAME>",
    "messages": [
        {
            "content": [
                {
                    "audio_url": {
                        "url": "data:audio/wav;base64,"
                    },
                    "type": "audio_url"
                },
                {
                    "text": "Please recognize the speech and only output the recognized content:",
                    "type": "text"
                }
            ],
            "role": "user"
        }
    ],
    "temperature": 0.0,
    "top_p": 1.0
  }'
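Base64-encoded audio can be too large to pass inline on the command line, so one option is to write the request body to a file and send it with curl's `-d @file` form. The sketch below is illustrative: `write_audio_payload` is a hypothetical helper, it assumes a WAV input, and it assumes the model name needs no JSON escaping.

```shell
# Hypothetical helper: write an audio chat payload to a file.
# Arguments: path to a WAV file, model name, output path for the JSON payload.
write_audio_payload() {
  {
    printf '{"model": "%s", "max_tokens": 256, "temperature": 0.0, "top_p": 1.0, ' "$2"
    printf '"messages": [{"role": "user", "content": ['
    printf '{"type": "audio_url", "audio_url": {"url": "data:audio/wav;base64,'
    # Stream the encoded audio directly into the payload, without line wrapping.
    base64 < "$1" | tr -d '\n'
    printf '"}}, '
    printf '{"type": "text", "text": "Please recognize the speech and only output the recognized content:"}'
    printf ']}]}\n'
  } > "$3"
}

# Usage (not run here because it needs a live endpoint):
#   write_audio_payload ./speech.wav "<YOUR_MODEL_NAME>" payload.json
#   curl -X POST "${FULL_ENDPOINT_URL}" \
#     -H "Content-Type: application/json" \
#     -H "Authorization: Bearer ${API_KEY}" \
#     -d @payload.json
```

The generated body carries the same fields as the inline example, only with the audio filled in from the file.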

Next Steps#

  • Run your first evaluation: Choose your path with Quickstart

  • Select benchmarks: Explore available evaluation tasks