Create and Manage Evaluation Targets#

When you run an evaluation in NVIDIA NeMo Evaluator, you create a separate target and configuration for the evaluation.

Tip

Because NeMo Evaluator separates the target and the configuration, you can create a target once, and reuse it multiple times with different configurations (for example, to make a model scorecard). To see what targets and configurations are supported together, refer to Combine Evaluation Targets and Configurations.
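
For example, after you create one target, you can pair it with different configurations when you submit evaluation jobs. The following is a minimal, hypothetical sketch: the configuration names are placeholders, it assumes the jobs API accepts target and config references in namespace/name form, and EVALUATOR_HOSTNAME is set up as described later on this page.

import requests

EVALUATOR_HOSTNAME = "<your evaluator service endpoint>"

# One target, evaluated with two different (hypothetical) configurations.
target_ref = "my-organization/my-target-model-1"
for config_ref in ["my-organization/config-academic", "my-organization/config-custom"]:
    job = {"namespace": "my-organization", "target": target_ref, "config": config_ref}
    # Assumes POST /v1/evaluation/jobs creates an evaluation job.
    response = requests.post(f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/jobs", json=job).json()
    print(response.get("id"))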

NeMo Evaluator provides evaluation capabilities for the following target types:

  • LLM Models

  • Retriever Pipelines

  • RAG Pipelines

Evaluator API URL#

To create a target for an evaluation, send a POST request to the evaluation/targets API. The URL of the evaluator API depends on where you deploy evaluator and how you configure it. For more information, refer to NeMo Evaluator Deployment Guide.

The examples in this documentation specify {EVALUATOR_HOSTNAME} in the code. Store the evaluator hostname so that you can use it in your code, as shown in the following examples.

Important

Replace <your evaluator service endpoint> with your address, such as evaluator.internal.your-company.com, before you run this code.

export EVALUATOR_HOSTNAME="<your evaluator service endpoint>"
import requests

EVALUATOR_HOSTNAME = "<your evaluator service endpoint>" 
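
Optionally, verify that the evaluator service is reachable before you create targets. The following sketch continues from the Python setup above and assumes that a GET request to the targets endpoint lists existing targets.

# Sketch: confirm that the evaluator endpoint is reachable.
# Assumes GET /v1/evaluation/targets returns the list of existing targets.
response = requests.get(f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets")
response.raise_for_status()
print(response.json())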

Example Target#

The following code shows the partial structure of a request that creates an evaluation target. Use the rest of this documentation for examples and reference information to create a target specific to your scenario.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '
    {
        "type": "<target-type>",
        "name": "<my-target-name>",
        "namespace": "<my-namespace>",

        // More target details
    }'
data = {
    "type": "<target-type>",
    "name": "<my-target-name>",
    "namespace": "<my-namespace>",

    # More target details
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

To see a sample response, refer to Create Target Response.
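
If you create several targets from a script, you can wrap the request in a small helper. The following is a hypothetical sketch that adds basic error handling around the same POST request; it assumes the setup shown in Evaluator API URL.

import requests

EVALUATOR_HOSTNAME = "<your evaluator service endpoint>"

def create_target(payload: dict) -> dict:
    """Hypothetical helper: POST a target payload and return the parsed response."""
    response = requests.post(
        f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets",
        json=payload,
        headers={"accept": "application/json"},
    )
    # Raise an exception for 4xx/5xx responses instead of parsing an error body.
    response.raise_for_status()
    return response.json()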

Target JSON Reference#

When you create a target for an evaluation, you send a JSON data structure that contains the information for your target.

Important

Each target is uniquely identified by a combination of namespace and name. For example, my-organization/my-target.

The following table contains selected field reference for the JSON data. For the full API reference, refer to Evaluator API.

| Name | Description | Type | Valid Values or Child Objects |
|------|-------------|------|-------------------------------|
| api_endpoint | The endpoint for a model. | Object | url, model_id, api_key |
| api_key | The key to access an API endpoint. | String | |
| cached_outputs | Pre-generated data. | Object | files_url |
| context_ordering | The order for retrieved results. | String | asc, desc |
| custom_fields | An optional object that you can use to store additional information. | Object | |
| files_url | The URL for a file that contains pre-generated data. Use hf://datasets/ as the prefix for files stored in NeMo Data Store. For format information, refer to Use Custom Data with NVIDIA NeMo Evaluator. | String | |
| id | The ID of the target. The ID is returned in the response when you create a target. | String | |
| index_embedding_model | The NIM embedding model that indexes documents. | Object | api_endpoint |
| model | The NIM model for an evaluation. | Object | api_endpoint |
| model_id | The ID of the NIM model, as specified in Models. | String | |
| name | An arbitrary name to identify the target. If you don’t specify a name, the default is the ID associated with the target. | String | |
| namespace | An arbitrary organization name, a vendor name, or any other text. If you don’t specify a namespace, the default is default. | String | |
| pipeline | The pipeline for a retriever or RAG evaluation. | Object | query_embedding_model, index_embedding_model, reranker_model, top_k, retriever, model, context_ordering |
| query_embedding_model | The NIM embedding model that embeds queries. | Object | api_endpoint |
| rag | A RAG pipeline for an evaluation. | Object | pipeline, cached_outputs |
| reranker_model | The NIM reranker model that reranks retrieved documents. | Object | api_endpoint |
| retriever | A retriever pipeline for an evaluation. | Object | pipeline, cached_outputs |
| top_k | The number of relevant documents to retrieve for a query, sorted descending by relevance score. | Integer | Any positive integer. In practice, this value should usually be less than 100. |
| type | The type of the evaluation target. | String | cached_outputs, model, retriever, rag |
| url | The URL for a model endpoint. | String | |

LLM Model Targets#

An LLM model target points to a model, such as a completions endpoint, a chat endpoint, or a file of pre-generated outputs.

Example Target for an LLM Model Endpoint#

To create an evaluation target that points to an LLM model running on NIM for LLMs, specify a model object that contains the api_endpoint of the model. For the list of NIM for LLMs models, refer to Models.

Use the following code to create a target for an LLM model.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "model",
      "name": "my-target-model-1",
      "namespace": "my-organization",
      "model": {
         "api_endpoint": {
            "url": "<my-nim-deployment-base-url>/completions",
            "model_id": "<my-model>"
         }
      }
   }'
data = {
    "type": "model",
    "name": "my-target-model-1",
    "namespace": "my-organization",
    "model": {
        "api_endpoint": {
            "url": "<my-nim-deployment-base-url>/completions",
            "model_id": "<my-model>"
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()
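
The parsed response includes an id field (refer to Create Target Response) that you can keep for later operations, such as deleting the target.

# Capture the generated target ID, for example "eval-target-ABCD1234EFGH5678".
target_id = response["id"]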

Example Target for a Chat Endpoint#

To run an evaluation using a chat endpoint, specify a model.api_endpoint.url that contains a URL that ends with /chat/completions.

Use the following code to create a target for a chat endpoint.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "model",
      "name": "my-target-model-2",
      "namespace": "my-organization",
      "model": {
         "api_endpoint": {
            "url": "<my-nim-deployment-base-url>/chat/completions",
            "model_id": "<my-model>"
         }
      }
   }'
data = {
    "type": "model",
    "name": "my-target-model-2",
    "namespace": "my-organization",
    "model": {
        "api_endpoint": {
            "url": "<my-nim-deployment-base-url>/chat/completions",
            "model_id": "<my-model>"
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for a Chat Endpoint (OpenAI-compatible Behind Authentication)#

To run an evaluation on an OpenAI-compatible chat endpoint that requires authentication with an API key or token, specify openai for model.api_endpoint.format, and specify the API key for model.api_endpoint.api_key.

Use the following code to create a target for an OpenAI-compatible chat endpoint behind authentication.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "model",
      "name": "my-target-model-3",
      "namespace": "my-organization",
      "model": {
         "api_endpoint": {
            "url": "<external-openai-compatible-base-url>/chat/completions",
            "model_id": "<external-model>",
            "api_key": "<my-api-key>",
            "format": "openai"        
         }
      }
   }'
data = {
    "type": "model",
    "name": "my-target-model-3",
    "namespace": "my-organization",
    "model": {
        "api_endpoint": {
            "url": "<external-openai-compatible-base-url>/chat/completions",
            "model_id": "<external-model>",
            "api_key": "<my-api-key>",
            "format": "openai"
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()
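
Rather than hard-coding api_key in the payload, you can read it from an environment variable at run time. The following sketch modifies the data dictionary from the preceding example; the environment variable name MY_API_KEY is hypothetical.

import os

# Hypothetical environment variable; export it before running this code:
#   export MY_API_KEY="<my-api-key>"
data["model"]["api_endpoint"]["api_key"] = os.environ["MY_API_KEY"]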

Example Target Offline (Pre-generated)#

An offline (pre-generated) target points to a file that is stored in NeMo Data Store and that contains pre-generated answers. Offline targets are useful for similarity metrics evaluations. For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.

Use the following code to create a target that contains pre-generated answers.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "cached_outputs",
      "name": "my-target-model-4",
      "namespace": "my-organization",
      "cached_outputs": {
         "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
      }
   }'
data = {
    "type": "cached_outputs",
    "name": "my-target-model-4",
    "namespace": "my-organization",
    "cached_outputs": {
        "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()
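
For illustration only, a filled-in files_url might look like the following; the dataset namespace, name, and file name are hypothetical and must match a dataset that you uploaded to NeMo Data Store.

# Hypothetical example of a resolved files_url for a dataset in NeMo Data Store.
cached_outputs = {
    "files_url": "hf://datasets/my-organization/my-offline-answers/answers.jsonl"
}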

Retriever Pipeline Targets#

Retriever pipelines are used to retrieve relevant documents based on a query. For more information, refer to Retriever Pipelines.

Example Target for Embedding Only#

In an embedding-only scenario, an embedding model is used to perform dense retrieval of documents.

Use the following code to create a retriever target with embedding only.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "retriever",
      "name": "my-target-retriever-1",
      "namespace": "my-organization",
      "retriever": {
         "pipeline": {
            "query_embedding_model": {
               "api_endpoint": {
                  "url": "<my-query-embedding-url>",
                  "model_id": "<my-query-embedding-model>"
               }
            },
            "index_embedding_model": {
               "api_endpoint": {
                  "url": "<my-index-embedding-url>",
                  "model_id": "<my-index-embedding-model>"
               }
            },
            "top_k": 5
         }
      }
   }'
data = {
    "type": "retriever",
    "name": "my-target-retriever-1",
    "namespace": "my-organization",
    "retriever": {
        "pipeline": {
            "query_embedding_model": {
                "api_endpoint": {
                    "url": "<my-query-embedding-url>",
                    "model_id": "<my-query-embedding-model>"
                }
            },
            "index_embedding_model": {
                "api_endpoint": {
                    "url": "<my-index-embedding-url>",
                    "model_id": "<my-index-embedding-model>"
                }
            },
            "top_k": 5
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for Embedding + Reranking#

In an embedding + reranking scenario, the documents retrieved by the embedding model are reranked by the reranking model.

Use the following code to create a retriever target with embedding and reranking.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "retriever",
      "name": "my-target-retriever-2",
      "namespace": "my-organization",
      "retriever": {
         "pipeline": {
            "query_embedding_model": {
               "api_endpoint": {
                  "url": "<my-query-embedding-url>",
                  "model_id": "<my-query-embedding-model>"
               }
            },
            "index_embedding_model": {
               "api_endpoint": {
                  "url": "<my-index-embedding-url>",
                  "model_id": "<my-index-embedding-model>"
               }
            },
            "reranker_model": {
               "api_endpoint": {
                  "url": "<my-ranker-url>",
                  "model_id": "<my-ranker-model>"
               }
            },
            "top_k": 5
         }
      }
   }'
data = {
    "type": "retriever",
    "name": "my-target-retriever-2",
    "namespace": "my-organization",
    "retriever": {
        "pipeline": {
            "query_embedding_model": {
                "api_endpoint": {
                    "url": "<my-query-embedding-url>",
                    "model_id": "<my-query-embedding-model>"
                }
            },
            "index_embedding_model": {
                "api_endpoint": {
                    "url": "<my-index-embedding-url>",
                    "model_id": "<my-index-embedding-model>"
                }
            },
            "reranker_model": {
                "api_endpoint": {
                    "url": "<my-ranker-url>",
                    "model_id": "<my-ranker-model>"
                }
            },
            "top_k": 5
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

RAG Pipeline Targets#

Retrieval Augmented Generation (RAG) pipelines chain a NeMo Retriever pipeline with an LLM. The retriever pipeline retrieves relevant documents based on a query, and the LLM generates answers based on the query and the retrieved documents. For more information, refer to RAG Pipelines.

Example Target for Answer Evaluation#

NeMo Evaluator supports Answer Evaluation RAG pipelines. In this scenario, the rag object contains a cached_outputs field with pre-generated retrieved documents and pre-generated answers instead of a pipeline. For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.

Use the following code to create a RAG target for an answer evaluation pipeline.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "rag",
      "name": "my-target-rag-1",
      "namespace": "my-organization",
      "rag": {
         "cached_outputs": {
            "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
         }
      }
   }'
data = {
    "type": "rag",
    "name": "my-target-rag-1",
    "namespace": "my-organization",
    "rag": {
        "cached_outputs": {
            "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for Answer Generation + Answer Evaluation#

NeMo Evaluator supports Answer Generation + Answer Evaluation RAG pipelines. In this scenario, the retriever contains a cached_outputs field with pre-generated retrieved documents instead of a pipeline. For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.

Use the following code to create a RAG target for an Answer Generation + Answer Evaluation pipeline.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "rag",
      "name": "my-target-rag-2",
      "namespace": "my-organization",
      "rag": {
         "pipeline": {
            "retriever": {
               "cached_outputs": {
                  "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
               }
            },
            "model": {
               "api_endpoint": {
                  "url": "<my-nim-deployment-base-url>/chat/completions",
                  "model_id": "<my-model>"
               }
            }
         }
      }
  }'
data = {
    "type": "rag",
    "name": "my-target-rag-2",
    "namespace": "my-organization",
    "rag": {
        "pipeline": {
            "retriever": {
                "cached_outputs": {
                    "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for Retrieval (Embedding only) + Answer Generation + Answer Evaluation#

NeMo Evaluator supports Retrieval + Answer Generation + Answer Evaluation RAG pipelines. The rag pipeline field contains a retriever pipeline and a model.

Use the following code to create a RAG target for a Retrieval (Embedding only) + Answer Generation + Answer Evaluation pipeline.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "rag",
      "name": "my-target-rag-3",
      "namespace": "my-organization",
      "rag": {
         "pipeline": {
            "retriever": {
               "pipeline": {
                  "query_embedding_model": {
                     "api_endpoint": {
                        "url": "<my-query-embedding-url>",
                        "model_id": "<my-query-embedding-model>"
                     }
                  },
                  "index_embedding_model": {
                     "api_endpoint": {
                        "url": "<my-index-embedding-url>",
                        "model_id": "<my-index-embedding-model>"
                     }
                  },          
                  "top_k": 3
               }
            },
            "model": {
               "api_endpoint": {
                  "api_endpoint": {
                     "url": "<my-nim-deployment-base-url>/chat/completions",
                     "model_id": "<my-model>"
                  }
               }
            }
         }
      }
   }'
data = {
    "type": "rag",
    "name": "my-target-rag-3",
    "namespace": "my-organization",
    "rag": {
        "pipeline": {
            "retriever": {
                "pipeline": {
                    "query_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-query-embedding-url>",
                            "model_id": "<my-query-embedding-model>"
                        }
                    },
                    "index_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-index-embedding-url>",
                            "model_id": "<my-index-embedding-model>"
                        }
                    },
                    "top_k": 3
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for Retrieval (Embedding + Reranking) + Answer Generation + Answer Evaluation#

NeMo Evaluator supports Retrieval + Answer Generation + Answer Evaluation RAG pipelines. The rag pipeline field contains a retriever pipeline and a model.

Use the following code to create a RAG target for a Retrieval (Embedding + Reranking) + Answer Generation + Answer Evaluation pipeline.

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "rag",
      "name": "my-target-rag-4",
      "namespace": "my-organization",
      "rag": {
         "pipeline": {
            "retriever": {
               "pipeline": {
                  "query_embedding_model": {
                        "api_endpoint": {
                           "url": "<my-query-embedding-url>",
                           "model_id": "<my-query-embedding-model>"
                        }
                  },
                  "index_embedding_model": {
                     "api_endpoint": {
                        "url": "<my-index-embedding-url>",
                        "model_id": "<my-index-embedding-model>"
                     }
                  },
                  "reranker_model": {
                     "api_endpoint": {
                        "url": "<my-ranker-url>",
                        "model_id": "<my-ranker-model>"
                     }
                  },
                  "top_k": 3
               }
            },
            "model": {
               "api_endpoint": {
                  "api_endpoint": {
                     "url": "<my-nim-deployment-base-url>/chat/completions",
                     "model_id": "<my-model>"
                  }
               }
            }
         }
      }
   }'
data = {
    "type": "rag",
    "name": "my-target-rag-4",
    "namespace": "my-organization",
    "rag": {
        "pipeline": {
            "retriever": {
                "pipeline": {
                    "query_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-query-embedding-url>",
                            "model_id": "<my-query-embedding-model>"
                        }
                    },
                    "index_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-index-embedding-url>",
                            "model_id": "<my-index-embedding-model>"
                        }
                    },
                    "reranker_model": {
                        "api_endpoint": {
                            "url": "<my-ranker-url>",
                            "model_id": "<my-ranker-model>"
                        }
                    },
                    "top_k": 3
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Delete a Target#

To delete an evaluation target, send a DELETE request to the targets endpoint. You must provide both the namespace and ID of the target as shown in the following code.

Caution

Before you delete a target, ensure that no jobs use it. If a job uses the target, you must delete the job first. To find all jobs that use a target, refer to Example: Filter Jobs by Target.

curl -X "DELETE" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets/<my-namespace>/<my-target-id>" \
  -H 'accept: application/json'
endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets/<my-namespace>/<my-target-id>"
response = requests.delete(endpoint).json()
response

When you delete a target, the response is similar to the following.

{
    "message": "Resource deleted successfully.",
    "id": "eval-target-ABCD1234EFGH5678",
    "deleted_at": null
}
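
If you script target cleanup, you can check the HTTP status before parsing the body. A minimal sketch, assuming the requests import and EVALUATOR_HOSTNAME setup shown earlier:

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets/<my-namespace>/<my-target-id>"
response = requests.delete(endpoint)
# Raise an exception for 4xx/5xx responses, for example if the target does not exist.
response.raise_for_status()
print(response.json()["message"])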

Create Target Response#

When you create a target for an evaluation, the response is similar to the following.

For the full response reference, refer to Evaluator API.

{
    "created_at": "2025-03-19T22:23:28.528061",
    "updated_at": "2025-03-19T22:23:28.528062",
    "id": "eval-target-ABCD1234EFGH5678",
    "name": "my-target-model-1",
    "namespace": "my-organization",
    "type": "model",
    "model": {
        "schema_version": "1.0",
        "id": "model-MvPLX6aEa1zXJq7YMRCosm",
        "type_prefix": "model",
        "namespace": "default",
        "created_at": "2025-03-19T22:23:28.527760",
        "updated_at": "2025-03-19T22:23:28.527762",
        "custom_fields": {},
        "name": "model-MvPLX6aEa1zXJq7YMRCosm",
        "version_id": "main",
        "version_tags": [],
        "api_endpoint": {
            "url": "http://nemo-nim-proxy:8000/v1/chat/completions",
            "model_id": "meta/llama-3.1-8b-instruct",
            "format": "nim"
        }
    },
    "custom_fields": {}
}
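
For example, you can combine the namespace and id from this response with the delete endpoint described earlier to remove the target when you no longer need it. A minimal sketch, assuming response holds the parsed create-target response and the setup code shown earlier:

namespace = response["namespace"]   # for example, "my-organization"
target_id = response["id"]          # for example, "eval-target-ABCD1234EFGH5678"

delete_endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets/{namespace}/{target_id}"
requests.delete(delete_endpoint).json()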