Create and Manage Evaluation Targets#
When you run an evaluation in NVIDIA NeMo Evaluator, you create a separate target and configuration for the evaluation.
Tip
Because NeMo Evaluator separates the target and the configuration, you can create a target once, and reuse it multiple times with different configurations (for example, to make a model scorecard). To see what targets and configurations are supported together, refer to Combine Evaluation Targets and Configurations.
NeMo Evaluator provides evaluation capabilities for the following target types:
LLM Models
Retriever Pipelines
RAG Pipelines
Evaluator API URL#
To create a target for an evaluation, send a POST request to the evaluation/targets API.
The URL of the evaluator API depends on where you deploy evaluator and how you configure it.
For more information, refer to NeMo Evaluator Deployment Guide.
The examples in this documentation specify {EVALUATOR_HOSTNAME} in the code.
Store the evaluator hostname as follows so that you can use it in your code.
Important
Replace <your evaluator service endpoint> with your address, such as evaluator.internal.your-company.com, before you run this code.
export EVALUATOR_HOSTNAME="<your evaluator service endpoint>"
import requests
EVALUATOR_HOSTNAME = "<your evaluator service endpoint>"
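As an alternative to hard-coding the hostname in Python, a script can read the value exported in the shell. The following is a minimal sketch, not part of the product; the helper name `evaluator_targets_url` is hypothetical, and the `http://` scheme is assumed only because the examples on this page use it.

```python
import os


def evaluator_targets_url(env=None):
    """Build the targets API URL from the EVALUATOR_HOSTNAME variable."""
    env = os.environ if env is None else env
    hostname = env.get("EVALUATOR_HOSTNAME", "").strip()
    if not hostname or hostname.startswith("<"):
        # The variable is unset or still holds the documentation placeholder.
        raise RuntimeError("Set EVALUATOR_HOSTNAME to your evaluator service endpoint")
    return f"http://{hostname}/v1/evaluation/targets"


# Passing a mapping explicitly keeps the example independent of the real environment.
print(evaluator_targets_url({"EVALUATOR_HOSTNAME": "evaluator.internal.your-company.com"}))
```

Failing early on an unreplaced placeholder avoids sending requests to a host that does not exist.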
Example Target#
The following is the partial structure of the code to create an evaluation target. The rest of this page provides examples and reference information for creating a target specific to your scenario.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "<target-type>",
"name": "<my-target-name>",
"namespace": "<my-namespace>",
// More target details
}'
data = {
"type": "<target-type>",
"name": "<my-target-name>",
"namespace": "<my-namespace>",
# More target details
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
To see a sample response, refer to Create Target Response.
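The request above can also be prepared with the Python standard library, which makes it easy to inspect the method, URL, and body before anything is sent. This is a sketch under stated assumptions: the function names `build_create_target_request` and `send` are hypothetical, and the hostname in the usage lines is a made-up example.

```python
import json
import urllib.request


def build_create_target_request(hostname, target):
    """Prepare (but do not send) the POST request that creates a target."""
    return urllib.request.Request(
        url=f"http://{hostname}/v1/evaluation/targets",
        data=json.dumps(target).encode("utf-8"),
        method="POST",
        headers={"accept": "application/json", "Content-Type": "application/json"},
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON body of the response."""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx status codes.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


target = {"type": "model", "name": "my-target-model-1", "namespace": "my-organization"}
request = build_create_target_request("evaluator.example.com", target)
print(request.get_method(), request.full_url)
```

Separating request construction from sending lets you log or test the payload without a running evaluator service.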
Target JSON Reference#
When you create a target for an evaluation, you send a JSON data structure that contains the information for your target.
Important
Each target is uniquely identified by a combination of namespace and name. For example, my-organization/my-target.
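Because the namespace and name together identify a target, it can help to check a batch of target definitions for clashes before you send them. The following sketch is illustrative only; `find_duplicate_targets` is a hypothetical helper, and it assumes every definition sets both fields explicitly.

```python
def find_duplicate_targets(targets):
    """Return 'namespace/name' references that occur more than once."""
    seen = set()
    duplicates = []
    for target in targets:
        key = (target["namespace"], target["name"])
        if key in seen and "/".join(key) not in duplicates:
            duplicates.append("/".join(key))
        seen.add(key)
    return duplicates


batch = [
    {"namespace": "my-organization", "name": "my-target"},
    {"namespace": "another-organization", "name": "my-target"},
    {"namespace": "my-organization", "name": "my-target"},
]
print(find_duplicate_targets(batch))
```

The same name in a different namespace is not reported, because only the combination has to be unique.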
The following table describes selected fields of the JSON data. For the full API reference, refer to Evaluator API.
Name | Description | Type | Valid Values or Child Objects |
---|---|---|---|
api_endpoint | The endpoint for a model. | Object | - |
api_key | The key to access an API endpoint. | String | — |
cached_outputs | Pre-generated data. | Object | - |
context_ordering | The order for retrieved results. | String | - |
custom_fields | An optional object that you can use to store additional information. | Object | — |
files_url | The URL for a file that contains pre-generated data. Use | String | — |
id | The ID of the target. The ID is returned in the response when you create a target. | String | — |
index_embedding_model | The NIM embedding model used to index documents. | Object | - |
model | The NIM model for an evaluation. | Object | - |
model_id | The ID of the NIM model, as specified in Models. | String | — |
name | An arbitrary name to identify the target. If you don’t specify a name, the default is the ID associated with the target. | String | — |
namespace | An arbitrary organization name, a vendor name, or any other text. If you don’t specify a namespace, the default is | String | — |
pipeline | The pipeline for a retriever or RAG evaluation. | Object | - |
query_embedding_model | The NIM embedding model used for querying. | Object | - |
rag | A RAG pipeline for an evaluation. | Object | - |
reranker_model | The NIM reranker model used to rerank documents. | Object | - |
retriever | A retriever pipeline for an evaluation. | Object | - |
top_k | The number of relevant documents to retrieve for a query, sorted in descending order of relevance score. | Integer | Any positive integer. In practice, this value should usually be less than 100. |
type | The type of the evaluation target. | String | - |
url | The URL for a model endpoint. | String | — |
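In every example on this page, the value of type matches the name of the top-level object that carries the target details (for example, "type": "rag" is paired with a "rag" object). A small client-side check based on that observation might look like the following sketch. The helper `check_target` is hypothetical, and the list of types covers only the values shown on this page; the full API may accept others.

```python
# Type values that appear in the examples on this page (an assumption, not the full API list).
EXAMPLE_TARGET_TYPES = ("model", "cached_outputs", "retriever", "rag")


def check_target(target):
    """Return a list of problems found in a target payload (empty if none)."""
    problems = []
    target_type = target.get("type")
    if target_type not in EXAMPLE_TARGET_TYPES:
        problems.append(f"unexpected type: {target_type!r}")
    elif not isinstance(target.get(target_type), dict):
        problems.append(f"type {target_type!r} expects an object under the {target_type!r} key")
    return problems


ok = {"type": "cached_outputs", "cached_outputs": {"files_url": "hf://datasets/..."}}
bad = {"type": "rag", "model": {}}
print(check_target(ok))
print(check_target(bad))
```

A check like this catches copy-and-paste mistakes locally; the service remains the authority on what is valid.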
LLM Model Targets#
An LLM model target points to a model, such as an LLM model, a chat endpoint, or a data file.
Example Target for an LLM Model Endpoint#
To create an evaluation target pointing to an LLM model running as NIM for LLMs, specify a model that contains the api_endpoint of the model.
For the list of NIM for LLMs models, refer to Models.
Use the following code to create a target for an LLM model.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "model",
"name": "my-target-model-1",
"namespace": "my-organization",
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/completions",
"model_id": "<my-model>"
}
}
}'
data = {
"type": "model",
"name": "my-target-model-1",
"namespace": "my-organization",
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/completions",
"model_id": "<my-model>"
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
Example Target for a Chat Endpoint#
To run an evaluation using a chat endpoint, specify a model.api_endpoint.url that contains a URL that ends with /chat/completions.
Use the following code to create a target for a chat endpoint.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "model",
"name": "my-target-model-2",
"namespace": "my-organization",
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/chat/completions",
"model_id": "<my-model>"
}
}
}'
data = {
"type": "model",
"name": "my-target-model-2",
"namespace": "my-organization",
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/chat/completions",
"model_id": "<my-model>"
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
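The completions and chat targets above differ only in the URL suffix, so a single helper can produce both. This is a minimal sketch; `model_target` and its `chat` parameter are hypothetical names, and the base URL and model ID in the usage lines are taken from the sample response on this page purely for illustration.

```python
def model_target(name, namespace, base_url, model_id, chat=False):
    """Build a model target payload for a completions or chat endpoint."""
    suffix = "/chat/completions" if chat else "/completions"
    return {
        "type": "model",
        "name": name,
        "namespace": namespace,
        "model": {
            "api_endpoint": {
                # Strip a trailing slash so the suffix is not doubled up.
                "url": base_url.rstrip("/") + suffix,
                "model_id": model_id,
            }
        },
    }


data = model_target(
    "my-target-model-2",
    "my-organization",
    "http://nemo-nim-proxy:8000/v1",
    "meta/llama-3.1-8b-instruct",
    chat=True,
)
print(data["model"]["api_endpoint"]["url"])
```

The resulting dictionary can be passed as the JSON body of the POST request shown in the examples.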
Example Target for a Chat Endpoint (OpenAI-compatible Behind Authentication)#
To run an evaluation on an OpenAI-compatible chat endpoint that requires authentication with an API key or token, specify openai for model.api_endpoint.format, and specify the API key for model.api_endpoint.api_key.
Use the following code to create a target for an OpenAI-compatible chat endpoint behind authentication.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "model",
"name": "my-target-model-3",
"namespace": "my-organization",
"model": {
"api_endpoint": {
"url": "<external-openai-compatible-base-url>/chat/completions",
"model_id": "<external-model>",
"api_key": "<my-api-key>",
"format": "openai"
}
}
}'
data = {
"type": "model",
"name": "my-target-model-3",
"namespace": "my-organization",
"model": {
"api_endpoint": {
"url": "<external-openai-compatible-base-url>/chat/completions",
"model_id": "<external-model>",
"api_key": "<my-api-key>",
"format": "openai"
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
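To keep the API key out of source files, one option is to read it from an environment variable when you build the payload. The sketch below assumes a variable named `MY_API_KEY`, which is a hypothetical name chosen for this example, as are the helpers `openai_chat_target` and `redacted`.

```python
import copy
import os


def openai_chat_target(name, namespace, base_url, model_id, env=None, key_var="MY_API_KEY"):
    """Build an authenticated OpenAI-compatible chat target payload."""
    env = os.environ if env is None else env
    api_key = env.get(key_var)
    if not api_key:
        raise RuntimeError(f"Environment variable {key_var} is not set")
    return {
        "type": "model",
        "name": name,
        "namespace": namespace,
        "model": {
            "api_endpoint": {
                "url": base_url.rstrip("/") + "/chat/completions",
                "model_id": model_id,
                "api_key": api_key,
                "format": "openai",
            }
        },
    }


def redacted(target):
    """Return a copy of the payload that is safe to print or log."""
    safe = copy.deepcopy(target)
    safe["model"]["api_endpoint"]["api_key"] = "***"
    return safe


data = openai_chat_target(
    "my-target-model-3", "my-organization", "https://api.example.com/v1",
    "example-model", env={"MY_API_KEY": "not-a-real-key"},
)
print(redacted(data)["model"]["api_endpoint"])
```

Log the redacted copy and send the original, so the key never appears in logs.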
Example Target Offline (Pre-generated)#
An offline (pre-generated) target points to a file that is stored in NeMo Data Store and that contains pre-generated answers. Offline targets are useful for similarity metrics evaluations. For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.
Use the following code to create a target that contains pre-generated answers.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "cached_outputs",
"name": "my-target-model-4",
"namespace": "my-organization",
"cached_outputs": {
"files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
}
}'
data = {
"type": "cached_outputs",
"name": "my-target-model-4",
"namespace": "my-organization",
"cached_outputs": {
"files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
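The files_url value in the example follows the pattern hf://datasets/ plus the dataset namespace, dataset name, and file path. The following sketch assembles that string from its parts; the helper names and the dataset values in the usage lines are hypothetical.

```python
def dataset_files_url(dataset_namespace, dataset_name, file_path):
    """Assemble a files_url in the hf://datasets/... form used on this page."""
    # A leading slash on the file path would produce a double slash in the URL.
    return f"hf://datasets/{dataset_namespace}/{dataset_name}/{file_path.lstrip('/')}"


def cached_outputs_target(name, namespace, files_url):
    """Build an offline (pre-generated) target payload."""
    return {
        "type": "cached_outputs",
        "name": name,
        "namespace": namespace,
        "cached_outputs": {"files_url": files_url},
    }


url = dataset_files_url("my-organization", "my-answers", "outputs/answers.json")
data = cached_outputs_target("my-target-model-4", "my-organization", url)
print(data["cached_outputs"]["files_url"])
```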
Retriever Pipeline Targets#
Retriever pipelines are used to retrieve relevant documents based on a query. For more information, refer to Retriever Pipelines.
Example Target for Embedding Only#
In an embedding-only scenario, an embedding model is used to perform dense retrieval of documents.
Use the following code to create a retriever target with embedding only.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "retriever",
"name": "my-target-retriever-1",
"namespace": "my-organization",
"retriever": {
"pipeline": {
"query_embedding_model": {
"api_endpoint": {
"url": "<my-query-embedding-url>",
"model_id": "<my-query-embedding-model>"
}
},
"index_embedding_model": {
"api_endpoint": {
"url": "<my-index-embedding-url>",
"model_id": "<my-index-embedding-model>"
}
},
"top_k": 5
}
}
}'
data = {
"type": "retriever",
"name": "my-target-retriever-1",
"namespace": "my-organization",
"retriever": {
"pipeline": {
"query_embedding_model": {
"api_endpoint": {
"url": "<my-query-embedding-url>",
"model_id": "<my-query-embedding-model>"
}
},
"index_embedding_model": {
"api_endpoint": {
"url": "<my-index-embedding-url>",
"model_id": "<my-index-embedding-model>"
}
},
"top_k": 5
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
Example Target for Embedding + Reranking#
In an embedding + reranking scenario, the documents retrieved by the embedding model are reranked by the reranking model.
Use the following code to create a retriever target with embedding and reranking.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "retriever",
"name": "my-target-retriever-2",
"namespace": "my-organization",
"retriever": {
"pipeline": {
"query_embedding_model": {
"api_endpoint": {
"url": "<my-query-embedding-url>",
"model_id": "<my-query-embedding-model>"
}
},
"index_embedding_model": {
"api_endpoint": {
"url": "<my-index-embedding-url>",
"model_id": "<my-index-embedding-model>"
}
},
"reranker_model": {
"api_endpoint": {
"url": "<my-ranker-url>",
"model_id": "<my-ranker-model>"
}
},
"top_k": 5
}
}
}'
data = {
"type": "retriever",
"name": "my-target-retriever-2",
"namespace": "my-organization",
"retriever": {
"pipeline": {
"query_embedding_model": {
"api_endpoint": {
"url": "<my-query-embedding-url>",
"model_id": "<my-query-embedding-model>"
}
},
"index_embedding_model": {
"api_endpoint": {
"url": "<my-index-embedding-url>",
"model_id": "<my-index-embedding-model>"
}
},
"reranker_model": {
"api_endpoint": {
"url": "<my-ranker-url>",
"model_id": "<my-ranker-model>"
}
},
"top_k": 5
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
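The two retriever examples differ only in the optional reranker_model entry, so one builder can cover both cases. This is a sketch with hypothetical helper names (`endpoint`, `retriever_pipeline`, `retriever_target`); the URLs and model IDs in the usage lines are placeholders, not real services.

```python
def endpoint(url, model_id):
    """Wrap a URL and model ID in the api_endpoint structure."""
    return {"api_endpoint": {"url": url, "model_id": model_id}}


def retriever_pipeline(query_model, index_model, top_k=5, reranker_model=None):
    """Build the retriever object, adding a reranker only when one is given."""
    if not isinstance(top_k, int) or isinstance(top_k, bool) or top_k <= 0:
        raise ValueError("top_k must be a positive integer")
    pipeline = {
        "query_embedding_model": query_model,
        "index_embedding_model": index_model,
    }
    if reranker_model is not None:
        pipeline["reranker_model"] = reranker_model
    pipeline["top_k"] = top_k
    return {"pipeline": pipeline}


def retriever_target(name, namespace, retriever):
    """Build a retriever target payload."""
    return {"type": "retriever", "name": name, "namespace": namespace, "retriever": retriever}


embedding = endpoint("http://embedding.example.com/v1/embeddings", "example-embedding-model")
reranker = endpoint("http://reranker.example.com/v1/ranking", "example-reranker-model")
data = retriever_target(
    "my-target-retriever-2", "my-organization",
    retriever_pipeline(embedding, embedding, top_k=5, reranker_model=reranker),
)
print(sorted(data["retriever"]["pipeline"]))
```

Validating top_k before sending mirrors the table entry that describes it as a positive number.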
RAG Pipeline Targets#
Retrieval Augmented Generation (RAG) pipelines are built by pipelining NeMo Retriever and LLM. A retriever pipeline is used to retrieve relevant documents based on a query, and the LLM is used to generate answers based on the query and the retrieved documents. For more information, refer to RAG Pipelines.
Example Target for Answer Evaluation#
NeMo Evaluator supports Answer Evaluation RAG pipelines.
The rag pipeline is replaced by a cached_outputs field that contains pre-generated retrieved documents and pre-generated answers.
For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.
Use the following code to create a RAG target for an answer evaluation pipeline.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "rag",
"name": "my-target-rag-1",
"namespace": "my-organization",
"rag": {
"cached_outputs": {
"files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
}
}
}'
data = {
"type": "rag",
"name": "my-target-rag-1",
"namespace": "my-organization",
"rag": {
"cached_outputs": {
"files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
Example Target for Answer Generation + Answer Evaluation#
NeMo Evaluator supports Answer Generation + Answer Evaluation RAG pipelines.
The retriever pipeline is replaced by a cached_outputs field that contains pre-generated retrieved documents.
For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.
Use the following code to create a RAG target for an Answer Generation + Answer Evaluation pipeline.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "rag",
"name": "my-target-rag-2",
"namespace": "my-organization",
"rag": {
"pipeline": {
"retriever": {
"cached_outputs": {
"files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
}
},
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/chat/completions",
"model_id": "<my-model>"
}
}
}
}
}'
data = {
"type": "rag",
"name": "my-target-rag-2",
"namespace": "my-organization",
"rag": {
"pipeline": {
"retriever": {
"cached_outputs": {
"files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
}
},
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/chat/completions",
"model_id": "<my-model>"
}
}
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
Example Target for Retrieval (Embedding only) + Answer Generation + Answer Evaluation#
NeMo Evaluator supports Retrieval + Answer Generation + Answer Evaluation RAG pipelines.
The rag pipeline field contains a retriever pipeline and a model.
Use the following code to create a RAG target for a Retrieval (Embedding only) + Answer Generation + Answer Evaluation pipeline.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "rag",
"name": "my-target-rag-3",
"namespace": "my-organization",
"rag": {
"pipeline": {
"retriever": {
"pipeline": {
"query_embedding_model": {
"api_endpoint": {
"url": "<my-query-embedding-url>",
"model_id": "<my-query-embedding-model>"
}
},
"index_embedding_model": {
"api_endpoint": {
"url": "<my-index-embedding-url>",
"model_id": "<my-index-embedding-model>"
}
},
"top_k": 3
}
},
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/chat/completions",
"model_id": "<my-model>"
}
}
}
}
}'
data = {
"type": "rag",
"name": "my-target-rag-3",
"namespace": "my-organization",
"rag": {
"pipeline": {
"retriever": {
"pipeline": {
"query_embedding_model": {
"api_endpoint": {
"url": "<my-query-embedding-url>",
"model_id": "<my-query-embedding-model>"
}
},
"index_embedding_model": {
"api_endpoint": {
"url": "<my-index-embedding-url>",
"model_id": "<my-index-embedding-model>"
}
},
"top_k": 3
}
},
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/chat/completions",
"model_id": "<my-model>"
}
}
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
Example Target for Retrieval (Embedding + Reranking) + Answer Generation + Answer Evaluation#
NeMo Evaluator supports Retrieval + Answer Generation + Answer Evaluation RAG pipelines.
The rag pipeline field contains a retriever pipeline and a model.
Use the following code to create a RAG target for a Retrieval (Embedding + Reranking) + Answer Generation + Answer Evaluation pipeline.
curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '
{
"type": "rag",
"name": "my-target-rag-4",
"namespace": "my-organization",
"rag": {
"pipeline": {
"retriever": {
"pipeline": {
"query_embedding_model": {
"api_endpoint": {
"url": "<my-query-embedding-url>",
"model_id": "<my-query-embedding-model>"
}
},
"index_embedding_model": {
"api_endpoint": {
"url": "<my-index-embedding-url>",
"model_id": "<my-index-embedding-model>"
}
},
"reranker_model": {
"api_endpoint": {
"url": "<my-ranker-url>",
"model_id": "<my-ranker-model>"
}
},
"top_k": 3
}
},
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/chat/completions",
"model_id": "<my-model>"
}
}
}
}
}'
data = {
"type": "rag",
"name": "my-target-rag-4",
"namespace": "my-organization",
"rag": {
"pipeline": {
"retriever": {
"pipeline": {
"query_embedding_model": {
"api_endpoint": {
"url": "<my-query-embedding-url>",
"model_id": "<my-query-embedding-model>"
}
},
"index_embedding_model": {
"api_endpoint": {
"url": "<my-index-embedding-url>",
"model_id": "<my-index-embedding-model>"
}
},
"reranker_model": {
"api_endpoint": {
"url": "<my-ranker-url>",
"model_id": "<my-ranker-model>"
}
},
"top_k": 3
}
},
"model": {
"api_endpoint": {
"url": "<my-nim-deployment-base-url>/chat/completions",
"model_id": "<my-model>"
}
}
}
}
}
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"
response = requests.post(endpoint, json=data).json()
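The RAG examples in this section fall into two shapes: a rag object that holds only cached_outputs, or a rag object that holds a pipeline with a retriever and a model. The sketch below builds either shape. The function `rag_target` and its keyword arguments are hypothetical, and the rule that the two shapes are mutually exclusive is an assumption drawn from the examples rather than from the API reference.

```python
def rag_target(name, namespace, *, cached_outputs_url=None, retriever=None, model=None):
    """Build a RAG target from cached outputs, or from a retriever plus a model."""
    if cached_outputs_url is not None:
        if retriever is not None or model is not None:
            raise ValueError("give either cached_outputs_url or retriever and model")
        rag = {"cached_outputs": {"files_url": cached_outputs_url}}
    else:
        if retriever is None or model is None:
            raise ValueError("a pipeline needs both a retriever and a model")
        rag = {"pipeline": {"retriever": retriever, "model": model}}
    return {"type": "rag", "name": name, "namespace": namespace, "rag": rag}


# Answer Generation + Answer Evaluation: cached retrieved documents plus a live model.
llm = {"api_endpoint": {"url": "http://llm.example.com/v1/chat/completions", "model_id": "example-model"}}
cached_retriever = {"cached_outputs": {"files_url": "hf://datasets/my-organization/my-docs/retrieved.json"}}
data = rag_target("my-target-rag-2", "my-organization", retriever=cached_retriever, model=llm)
print(list(data["rag"]["pipeline"]))
```

The retriever argument accepts either a cached_outputs object or a full retriever pipeline object, which covers the remaining examples in this section.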
Delete a Target#
To delete an evaluation target, send a DELETE request to the targets endpoint.
You must provide both the namespace and ID of the target as shown in the following code.
Caution
Before you delete a target, ensure that no jobs use it. If a job uses the target, you must delete the job first. To find all jobs that use a target, refer to Example: Filter Jobs by Target.
curl -X "DELETE" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets/<my-namespace>/<my-target-id>" \
-H 'accept: application/json'
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets/<my-namespace>/<my-target-id>"
response = requests.delete(endpoint).json()
response
When you delete a target, the response is similar to the following.
{
"message": "Resource deleted successfully.",
"id": "eval-target-ABCD1234EFGH5678",
"deleted_at": null
}
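The delete request can be prepared the same way as other requests, with the namespace and ID placed in the URL path. The following is a sketch only; `build_delete_target_request` is a hypothetical helper, the hostname is a made-up example, and percent-encoding the path segments is a defensive choice of this sketch rather than a documented requirement.

```python
import urllib.request
from urllib.parse import quote


def build_delete_target_request(hostname, namespace, target_id):
    """Prepare (but do not send) the DELETE request for a target."""
    # quote(..., safe="") encodes characters such as "/" inside a path segment.
    path = f"{quote(namespace, safe='')}/{quote(target_id, safe='')}"
    return urllib.request.Request(
        url=f"http://{hostname}/v1/evaluation/targets/{path}",
        method="DELETE",
        headers={"accept": "application/json"},
    )


request = build_delete_target_request(
    "evaluator.example.com", "my-organization", "eval-target-ABCD1234EFGH5678"
)
print(request.get_method(), request.full_url)
```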
Create Target Response#
When you create a target for an evaluation, the response is similar to the following.
For the full response reference, refer to Evaluator API.
{
"created_at": "2025-03-19T22:23:28.528061",
"updated_at": "2025-03-19T22:23:28.528062",
"id": "eval-target-ABCD1234EFGH5678",
"name": "my-target-model-1",
"namespace": "my-organization",
"type": "model",
"model": {
"schema_version": "1.0",
"id": "model-MvPLX6aEa1zXJq7YMRCosm",
"type_prefix": "model",
"namespace": "default",
"created_at": "2025-03-19T22:23:28.527760",
"updated_at": "2025-03-19T22:23:28.527762",
"custom_fields": {},
"name": "model-MvPLX6aEa1zXJq7YMRCosm",
"version_id": "main",
"version_tags": [],
"api_endpoint": {
"url": "http://nemo-nim-proxy:8000/v1/chat/completions",
"model_id": "meta/llama-3.1-8b-instruct",
"format": "nim"
}
},
"custom_fields": {}
}
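After a target is created, the id and namespace from the response are the values you need later, for example to delete the target. The sketch below pulls the main fields out of a response shaped like the sample above; `summarize_target` is a hypothetical helper, and the dictionary in the usage lines is a trimmed copy of that sample.

```python
from datetime import datetime


def summarize_target(response):
    """Extract the identifying fields from a create-target response."""
    return {
        "ref": f'{response["namespace"]}/{response["name"]}',
        "id": response["id"],
        "type": response["type"],
        # The sample timestamps are ISO 8601 strings without a time zone.
        "created_at": datetime.fromisoformat(response["created_at"]),
    }


sample = {
    "created_at": "2025-03-19T22:23:28.528061",
    "id": "eval-target-ABCD1234EFGH5678",
    "name": "my-target-model-1",
    "namespace": "my-organization",
    "type": "model",
}
summary = summarize_target(sample)
print(summary["ref"], summary["id"])
```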