Create and Manage Evaluation Targets#

When you run an evaluation in NVIDIA NeMo Evaluator, you create a separate target and configuration for the evaluation.

Tip

Because NeMo Evaluator separates the target and the configuration, you can create a target once, and reuse it multiple times with different configurations (for example, to make a model scorecard). To see what targets and configurations are supported together, refer to Combine Evaluation Targets and Configurations.

NeMo Evaluator provides evaluation capabilities for the following target types:

LLM Models
Retriever Pipelines
RAG Pipelines

Evaluator API URL#

To create a target for an evaluation, send a POST request to the evaluation/targets API. The URL of the evaluator API depends on where you deploy evaluator and how you configure it. For more information, refer to NeMo Evaluator Deployment Guide.

The examples in this documentation use {EVALUATOR_HOSTNAME} as a placeholder in the code. Store the evaluator hostname as follows so that you can reuse it in your code.

Important

Replace <your evaluator service endpoint> with your address, such as evaluator.internal.your-company.com, before you run this code.

curl

export EVALUATOR_HOSTNAME="<your evaluator service endpoint>"

Python

import requests

EVALUATOR_HOSTNAME = "<your evaluator service endpoint>"

Example Target#

The following code shows the partial structure of a request that creates an evaluation target. For examples and reference information specific to your scenario, see the rest of this documentation.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '
    {
        "type": "<target-type>",
        "name": "<my-target-name>",
        "namespace": "<my-namespace>",

        // More target details
    }'

Python

data = {
    "type": "<target-type>",
    "name": "<my-target-name>",
    "namespace": "<my-namespace>",

    # More target details
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

To see a sample response, refer to Create Target Response.

Target JSON Reference#

When you create a target for an evaluation, you send a JSON data structure that contains the information for your target.

Important

Each target is uniquely identified by a combination of namespace and name. For example, my-organization/my-target.

The following table contains selected field reference for the JSON data. For the full API reference, refer to Evaluator API.

Name	Description	Type	Valid Values or Child Objects
api_endpoint	The endpoint for a model.	Object	- `url` - `model_id` - `api_key`
api_key	The key to access an API endpoint.	String	—
cached_outputs	Pre-generated data.	Object	- `files_url`
context_ordering	The order for retrieved results.	String	- `asc` - `desc`
custom_fields	An optional object that you can use to store additional information.	Object	—
files_url	The URL for a file that contains pre-generated data. Use `hf://datasets/` as the prefix for files stored in NeMo Data Store. For format information, refer to Use Custom Data with NVIDIA NeMo Evaluator.	String	—
id	The ID of the target. The ID is returned in the response when you create a target.	String	—
index_embedding_model	The NIM model for the embedding model to perform indexing of documents.	Object	- `api_endpoint`
model	The NIM model for an evaluation.	Object	- `api_endpoint`
model_id	The ID of the NIM model, as specified in Models.	String	—
name	An arbitrary name to identify the target. If you don’t specify a name, the default is the ID associated with the target.	String	—
namespace	An arbitrary organization name, a vendor name, or any other text. If you don’t specify a namespace, the default is `default`.	String	—
pipeline	The pipeline for a retriever or RAG evaluation.	Object	- `query_embedding_model` - `index_embedding_model` - `reranker_model` - `top_k` - `retriever` - `model` - `context_ordering`
query_embedding_model	The NIM model for the embedding model to perform querying.	Object	- `api_endpoint`
rag	A RAG pipeline for an evaluation.	Object	- `pipeline` - `cached_outputs`
reranker_model	The NIM model for the reranker model to perform reranking documents.	Object	- `api_endpoint`
retriever	A retriever pipeline for an evaluation.	Object	- `pipeline` - `cached_outputs`
top_k	The number of relevant documents to be retrieved based on the query, sorted descending by relevance score.	Integer	Any positive integer. In practice, this value should usually be less than 100.
type	The type of the evaluation target.	String	- `cached_outputs` - `model` - `retriever` - `rag`
url	The URL for a model endpoint.	String	—
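
As a rough illustration of how the fields in the table fit together, the following Python sketch checks a payload on the client side before you send it. The helper names `check_target` and `target_ref` are hypothetical and are not part of the NeMo Evaluator API. The sketch assumes, based on the examples in this documentation, that the object that describes the target is stored under a key with the same name as the `type` value.

```python
# Top-level target types listed in the reference table.
VALID_TYPES = {"cached_outputs", "model", "retriever", "rag"}


def check_target(payload: dict) -> list:
    """Return a list of problems found in a target payload (empty if none)."""
    problems = []
    target_type = payload.get("type")
    if target_type not in VALID_TYPES:
        problems.append(f"unknown type: {target_type!r}")
    elif target_type not in payload:
        # Assumption: the target details live under a key named after the type.
        problems.append(f"missing '{target_type}' object")
    return problems


def target_ref(payload: dict) -> str:
    """Build the namespace/name reference; the namespace falls back to 'default'."""
    return f"{payload.get('namespace', 'default')}/{payload['name']}"
```

A check like this only catches obvious structural mistakes. The service remains the authority on what a valid target is.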

LLM Model Targets#

An LLM model target points to an LLM completions endpoint, a chat endpoint, or a data file that contains pre-generated answers.

Example Target for an LLM Model Endpoint#

To create an evaluation target pointing to an LLM model running as NIM for LLMs, specify a model that contains the api_endpoint of the model. For the list of NIM for LLMs models, refer to Models.

Use the following code to create a target for an LLM model.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "model",
      "name": "my-target-model-1",
      "namespace": "my-organization",
      "model": {
         "api_endpoint": {
            "url": "<my-nim-deployment-base-url>/completions",
            "model_id": "<my-model>"
         }
      }
   }'

Python

data = {
    "type": "model",
    "name": "my-target-model-1",
    "namespace": "my-organization",
    "model": {
        "api_endpoint": {
            "url": "<my-nim-deployment-base-url>/completions",
            "model_id": "<my-model>"
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for a Chat Endpoint#

To run an evaluation using a chat endpoint, specify a model.api_endpoint.url that contains a URL that ends with /chat/completions.

Use the following code to create a target for a chat endpoint.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "model",
      "name": "my-target-model-2",
      "namespace": "my-organization",
      "model": {
         "api_endpoint": {
            "url": "<my-nim-deployment-base-url>/chat/completions",
            "model_id": "<my-model>"
         }
      }
   }'

Python

data = {
    "type": "model",
    "name": "my-target-model-2",
    "namespace": "my-organization",
    "model": {
        "api_endpoint": {
            "url": "<my-nim-deployment-base-url>/chat/completions",
            "model_id": "<my-model>"
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()
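
The completions and chat examples differ only in the URL suffix, so a small helper can build either payload from a base URL. The following is a minimal sketch. The function name `model_target` and its `chat` parameter are hypothetical conveniences and are not part of the Evaluator API.

```python
def model_target(name, namespace, base_url, model_id, chat=True):
    """Build a model target payload for a chat or a completions endpoint."""
    suffix = "/chat/completions" if chat else "/completions"
    return {
        "type": "model",
        "name": name,
        "namespace": namespace,
        "model": {
            "api_endpoint": {
                # Strip a trailing slash so the suffix is not doubled up.
                "url": base_url.rstrip("/") + suffix,
                "model_id": model_id,
            }
        },
    }
```

You can pass the returned dictionary as the `json` argument of `requests.post`, in the same way as the `data` dictionary in the examples.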

Example Target for a Chat Endpoint (OpenAI-compatible Behind Authentication)#

To run an evaluation on an OpenAI-compatible chat endpoint that requires authentication with an API key or token, specify openai for model.api_endpoint.format, and specify the API key for model.api_endpoint.api_key.

Use the following code to create a target for an OpenAI-compatible chat endpoint behind authentication.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "model",
      "name": "my-target-model-3",
      "namespace": "my-organization",
      "model": {
         "api_endpoint": {
            "url": "<external-openai-compatible-base-url>/chat/completions",
            "model_id": "<external-model>",
            "api_key": "<my-api-key>",
            "format": "openai"
         }
      }
   }'

Python

data = {
    "type": "model",
    "name": "my-target-model-3",
    "namespace": "my-organization",
    "model": {
        "api_endpoint": {
            "url": "<external-openai-compatible-base-url>/chat/completions",
            "model_id": "<external-model>",
            "api_key": "<my-api-key>",
            "format": "openai"
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()
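
To avoid hard-coding the key in your script, you can read it from an environment variable when you build the `api_endpoint` object. The following sketch assumes a variable named `MY_MODEL_API_KEY`. Both that name and the helper `authenticated_endpoint` are hypothetical.

```python
import os


def authenticated_endpoint(url, model_id, env_var="MY_MODEL_API_KEY"):
    """Build an api_endpoint object whose key comes from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {
        "url": url,
        "model_id": model_id,
        "api_key": api_key,
        "format": "openai",
    }
```

The returned dictionary takes the place of the `api_endpoint` value in the `data` dictionary of the preceding example.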

Example Target Offline (Pre-generated)#

An offline (pre-generated) target points to a file that is stored in NeMo Data Store and that contains pre-generated answers. Offline targets are useful for similarity metrics evaluations. For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.

Use the following code to create a target that contains pre-generated answers.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "cached_outputs",
      "name": "my-target-model-4",
      "namespace": "my-organization",
      "cached_outputs": {
         "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
      }
   }'

Python

data = {
    "type": "cached_outputs",
    "name": "my-target-model-4",
    "namespace": "my-organization",
    "cached_outputs": {
        "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()
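
The `files_url` value follows the pattern `hf://datasets/<namespace>/<name>/<file path>`. The following sketch assembles it from its parts. The helper name `data_store_url` is hypothetical.

```python
def data_store_url(dataset_namespace, dataset_name, file_path):
    """Assemble a files_url for a file stored in NeMo Data Store."""
    # Drop a leading slash on the file path to avoid a double slash in the URL.
    return f"hf://datasets/{dataset_namespace}/{dataset_name}/{file_path.lstrip('/')}"


def cached_outputs_target(name, namespace, files_url):
    """Build an offline (pre-generated) target payload."""
    return {
        "type": "cached_outputs",
        "name": name,
        "namespace": namespace,
        "cached_outputs": {"files_url": files_url},
    }
```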

Retriever Pipeline Targets#

Retriever pipelines are used to retrieve relevant documents based on a query. For more information, refer to Retriever Pipelines.

Example Target for Embedding Only#

In an embedding-only scenario, an embedding model is used to perform dense retrieval of documents.

Use the following code to create a retriever target with embedding only.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "retriever",
      "name": "my-target-retriever-1",
      "namespace": "my-organization",
      "retriever": {
         "pipeline": {
            "query_embedding_model": {
               "api_endpoint": {
                  "url": "<my-query-embedding-url>",
                  "model_id": "<my-query-embedding-model>"
               }
            },
            "index_embedding_model": {
               "api_endpoint": {
                  "url": "<my-index-embedding-url>",
                  "model_id": "<my-index-embedding-model>"
               }
            },
            "top_k": 5
         }
      }
   }'

Python

data = {
    "type": "retriever",
    "name": "my-target-retriever-1",
    "namespace": "my-organization",
    "retriever": {
        "pipeline": {
            "query_embedding_model": {
                "api_endpoint": {
                    "url": "<my-query-embedding-url>",
                    "model_id": "<my-query-embedding-model>"
                }
            },
            "index_embedding_model": {
                "api_endpoint": {
                    "url": "<my-index-embedding-url>",
                    "model_id": "<my-index-embedding-model>"
                }
            },
            "top_k": 5
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for Embedding + Reranking#

In an embedding + reranking scenario, the documents retrieved by the embedding model are reranked by the reranking model.

Use the following code to create a retriever target with embedding and reranking.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "retriever",
      "name": "my-target-retriever-2",
      "namespace": "my-organization",
      "retriever": {
         "pipeline": {
            "query_embedding_model": {
               "api_endpoint": {
                  "url": "<my-query-embedding-url>",
                  "model_id": "<my-query-embedding-model>"
               }
            },
            "index_embedding_model": {
               "api_endpoint": {
                  "url": "<my-index-embedding-url>",
                  "model_id": "<my-index-embedding-model>"
               }
            },
            "reranker_model": {
               "api_endpoint": {
                  "url": "<my-ranker-url>",
                  "model_id": "<my-ranker-model>"
               }
            },
            "top_k": 5
         }
      }
   }'

Python

data = {
    "type": "retriever",
    "name": "my-target-retriever-2",
    "namespace": "my-organization",
    "retriever": {
        "pipeline": {
            "query_embedding_model": {
                "api_endpoint": {
                    "url": "<my-query-embedding-url>",
                    "model_id": "<my-query-embedding-model>"
                }
            },
            "index_embedding_model": {
                "api_endpoint": {
                    "url": "<my-index-embedding-url>",
                    "model_id": "<my-index-embedding-model>"
                }
            },
            "reranker_model": {
                "api_endpoint": {
                    "url": "<my-ranker-url>",
                    "model_id": "<my-ranker-model>"
                }
            },
            "top_k": 5
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()
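
The two retriever examples differ only in the optional `reranker_model` entry. The following sketch builds the `retriever` object for either case. The helper names `endpoint` and `retriever_pipeline` are hypothetical, and the `top_k` check reflects the reference table rather than any documented server-side validation.

```python
def endpoint(url, model_id):
    """Wrap a URL and model ID in the api_endpoint structure used by the examples."""
    return {"api_endpoint": {"url": url, "model_id": model_id}}


def retriever_pipeline(query_model, index_model, top_k=5, reranker_model=None):
    """Build the value of the retriever field, with or without reranking."""
    if top_k < 1:
        raise ValueError("top_k must be a positive integer")
    pipeline = {
        "query_embedding_model": query_model,
        "index_embedding_model": index_model,
        "top_k": top_k,
    }
    if reranker_model is not None:
        # Only the embedding + reranking scenario includes this key.
        pipeline["reranker_model"] = reranker_model
    return {"pipeline": pipeline}
```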

RAG Pipeline Targets#

Retrieval Augmented Generation (RAG) pipelines combine a NeMo Retriever pipeline with an LLM. The retriever pipeline retrieves relevant documents based on a query, and the LLM generates answers based on the query and the retrieved documents. For more information, refer to RAG Pipelines.

Example Target for Answer Evaluation#

NeMo Evaluator supports Answer Evaluation RAG pipelines. The rag pipeline is replaced by a cached_outputs field that contains pre-generated retrieved documents and pre-generated answers. For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.

Use the following code to create a RAG target for an answer evaluation pipeline.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "rag",
      "name": "my-target-rag-1",
      "namespace": "my-organization",
      "rag": {
         "cached_outputs": {
            "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
         }
      }
   }'

Python

data = {
    "type": "rag",
    "name": "my-target-rag-1",
    "namespace": "my-organization",
    "rag": {
        "cached_outputs": {
            "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for Answer Generation + Answer Evaluation#

NeMo Evaluator supports Answer Generation + Answer Evaluation RAG pipelines. The retriever pipeline is replaced by a cached_outputs field that contains pre-generated retrieved documents. For more information, refer to Use Custom Data with NVIDIA NeMo Evaluator.

Use the following code to create a RAG target for an Answer Generation + Answer Evaluation pipeline.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "rag",
      "name": "my-target-rag-2",
      "namespace": "my-organization",
      "rag": {
         "pipeline": {
            "retriever": {
               "cached_outputs": {
                  "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
               }
            },
            "model": {
               "api_endpoint": {
                  "url": "<my-nim-deployment-base-url>/chat/completions",
                  "model_id": "<my-model>"
               }
            }
         }
      }
   }'

Python

data = {
    "type": "rag",
    "name": "my-target-rag-2",
    "namespace": "my-organization",
    "rag": {
        "pipeline": {
            "retriever": {
                "cached_outputs": {
                    "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for Retrieval (Embedding only) + Answer Generation + Answer Evaluation#

NeMo Evaluator supports Retrieval + Answer Generation + Answer Evaluation RAG pipelines. The rag pipeline field contains a retriever pipeline and a model.

Use the following code to create a RAG target for a Retrieval (Embedding only) + Answer Generation + Answer Evaluation pipeline.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "rag",
      "name": "my-target-rag-3",
      "namespace": "my-organization",
      "rag": {
         "pipeline": {
            "retriever": {
               "pipeline": {
                  "query_embedding_model": {
                     "api_endpoint": {
                        "url": "<my-query-embedding-url>",
                        "model_id": "<my-query-embedding-model>"
                     }
                  },
                  "index_embedding_model": {
                     "api_endpoint": {
                        "url": "<my-index-embedding-url>",
                        "model_id": "<my-index-embedding-model>"
                     }
                  },
                  "top_k": 3
               }
            },
            "model": {
               "api_endpoint": {
                  "url": "<my-nim-deployment-base-url>/chat/completions",
                  "model_id": "<my-model>"
               }
            }
         }
      }
   }'

Python

data = {
    "type": "rag",
    "name": "my-target-rag-3",
    "namespace": "my-organization",
    "rag": {
        "pipeline": {
            "retriever": {
                "pipeline": {
                    "query_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-query-embedding-url>",
                            "model_id": "<my-query-embedding-model>"
                        }
                    },
                    "index_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-index-embedding-url>",
                            "model_id": "<my-index-embedding-model>"
                        }
                    },
                    "top_k": 3
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()

Example Target for Retrieval (Embedding + Reranking) + Answer Generation + Answer Evaluation#

NeMo Evaluator supports Retrieval + Answer Generation + Answer Evaluation RAG pipelines. The rag pipeline field contains a retriever pipeline and a model.

Use the following code to create a RAG target for a Retrieval (Embedding + Reranking) + Answer Generation + Answer Evaluation pipeline.

curl

curl -X "POST" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
   {
      "type": "rag",
      "name": "my-target-rag-4",
      "namespace": "my-organization",
      "rag": {
         "pipeline": {
            "retriever": {
               "pipeline": {
                  "query_embedding_model": {
                     "api_endpoint": {
                        "url": "<my-query-embedding-url>",
                        "model_id": "<my-query-embedding-model>"
                     }
                  },
                  "index_embedding_model": {
                     "api_endpoint": {
                        "url": "<my-index-embedding-url>",
                        "model_id": "<my-index-embedding-model>"
                     }
                  },
                  "reranker_model": {
                     "api_endpoint": {
                        "url": "<my-ranker-url>",
                        "model_id": "<my-ranker-model>"
                     }
                  },
                  "top_k": 3
               }
            },
            "model": {
               "api_endpoint": {
                  "url": "<my-nim-deployment-base-url>/chat/completions",
                  "model_id": "<my-model>"
               }
            }
         }
      }
   }'

Python

data = {
    "type": "rag",
    "name": "my-target-rag-4",
    "namespace": "my-organization",
    "rag": {
        "pipeline": {
            "retriever": {
                "pipeline": {
                    "query_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-query-embedding-url>",
                            "model_id": "<my-query-embedding-model>"
                        }
                    },
                    "index_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-index-embedding-url>",
                            "model_id": "<my-index-embedding-model>"
                        }
                    },
                    "reranker_model": {
                        "api_endpoint": {
                            "url": "<my-ranker-url>",
                            "model_id": "<my-ranker-model>"
                        }
                    },
                    "top_k": 3
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
}

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets"

response = requests.post(endpoint, json=data).json()
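
The RAG examples in this section fall into two shapes: pre-generated answers under `rag.cached_outputs`, or a `rag.pipeline` that pairs a retriever (cached or live) with a model. The following sketch makes that choice explicit. The function name `rag_target` and its keyword parameters are hypothetical, and the sketch does not validate the nested objects that you pass in.

```python
def rag_target(name, namespace, *, answers_url=None, retriever=None, model=None):
    """Build a RAG target payload in one of the shapes shown in this section."""
    if answers_url is not None:
        # Answer Evaluation only: answers and documents are pre-generated.
        rag = {"cached_outputs": {"files_url": answers_url}}
    elif retriever is not None and model is not None:
        # Answer generation: the retriever is either cached_outputs or a pipeline.
        rag = {"pipeline": {"retriever": retriever, "model": model}}
    else:
        raise ValueError("Provide answers_url, or both retriever and model.")
    return {"type": "rag", "name": name, "namespace": namespace, "rag": rag}
```

For example, passing `retriever={"cached_outputs": {"files_url": "..."}}` together with a `model` object corresponds to the Answer Generation + Answer Evaluation scenario.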

Delete a Target#

To delete an evaluation target, send a DELETE request to the targets endpoint. You must provide both the namespace and ID of the target as shown in the following code.

Caution

Before you delete a target, ensure that no jobs use it. If a job uses the target, you must delete the job first. To find all jobs that use a target, refer to Example: Filter Jobs by Target.

curl

curl -X "DELETE" "http://${EVALUATOR_HOSTNAME}/v1/evaluation/targets/<my-namespace>/<my-target-id>" \
  -H 'accept: application/json'

Python

endpoint = f"http://{EVALUATOR_HOSTNAME}/v1/evaluation/targets/<my-namespace>/<my-target-id>"
response = requests.delete(endpoint).json()
print(response)

When you delete a target, the response is similar to the following.

{
    "message": "Resource deleted successfully.",
    "id": "eval-target-ABCD1234EFGH5678",
    "deleted_at": null
}
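
Because the DELETE URL embeds both the namespace and the target ID, it can help to build it in one place. The following is a minimal sketch. The helper name `target_url` is hypothetical, and percent-encoding the path segments is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote


def target_url(hostname, namespace, target_id):
    """Build the URL of a single target from its namespace and ID."""
    # quote(..., safe="") also encodes "/" so that each value stays one path segment.
    ns = quote(namespace, safe="")
    tid = quote(target_id, safe="")
    return f"http://{hostname}/v1/evaluation/targets/{ns}/{tid}"
```

You can pass the resulting URL to `requests.delete`, as in the preceding Python example.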

Create Target Response#

When you create a target for an evaluation, the response is similar to the following.

For the full response reference, refer to Evaluator API.

{
    "created_at": "2025-03-19T22:23:28.528061",
    "updated_at": "2025-03-19T22:23:28.528062",
    "id": "eval-target-ABCD1234EFGH5678",
    "name": "my-target-model-1",
    "namespace": "my-organization",
    "type": "model",
    "model": {
        "schema_version": "1.0",
        "id": "model-MvPLX6aEa1zXJq7YMRCosm",
        "type_prefix": "model",
        "namespace": "default",
        "created_at": "2025-03-19T22:23:28.527760",
        "updated_at": "2025-03-19T22:23:28.527762",
        "custom_fields": {},
        "name": "model-MvPLX6aEa1zXJq7YMRCosm",
        "version_id": "main",
        "version_tags": [],
        "api_endpoint": {
            "url": "http://nemo-nim-proxy:8000/v1/chat/completions",
            "model_id": "meta/llama-3.1-8b-instruct",
            "format": "nim"
        }
    },
    "custom_fields": {}
}
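
In the sample response, the nested `model` object carries its own `id` and `namespace`, which differ from those of the target. The following sketch reads the target's own identifiers from the top level, for example to delete the target later. It uses a trimmed copy of the sample response, and the helper name `target_identity` is hypothetical.

```python
import json

# Trimmed copy of the sample response, keeping only the fields used here.
sample_response = json.loads("""
{
    "id": "eval-target-ABCD1234EFGH5678",
    "name": "my-target-model-1",
    "namespace": "my-organization",
    "type": "model",
    "model": {
        "id": "model-MvPLX6aEa1zXJq7YMRCosm",
        "namespace": "default"
    }
}
""")


def target_identity(response: dict) -> tuple:
    """Return the (namespace, id) pair of the target itself, not of the nested model."""
    return response["namespace"], response["id"]


namespace, target_id = target_identity(sample_response)
print(f"{namespace}/{target_id}")  # my-organization/eval-target-ABCD1234EFGH5678
```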