Model Configuration#

Metrics that use an LLM — such as LLM-as-a-Judge, RAG, and Agentic metrics — require model configuration for their model, judge_model, or embeddings_model fields.

You can specify models in two ways: inline or by reference.

Inline Model#

Define the model endpoint directly in the metric definition:

"model": {
    "url": "<nim-endpoint-url>/v1",
    "name": "meta/llama-3.1-70b-instruct",
    "format": "nim",
    "api_key_secret": "my-secret",  # optional
}

| Field | Required | Description |
|---|---|---|
| url | Yes | The base URL of the inference endpoint. |
| name | Yes | The model name to send in inference requests. |
| format | No | The API format ("nim", "openai", "llama_stack"). Defaults to "nim". |
| api_key_secret | No | Name of a secret containing the API key; the secret must be in the same workspace. |

Use inline models when you want explicit control over the endpoint URL and model name, or when connecting to external APIs.
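For instance, pointing at an external OpenAI-compatible API means overriding the default format. A minimal sketch, where the URL, model name, and secret name are illustrative placeholders rather than values from this documentation:

```python
# Hedged sketch: all values below are placeholders for an external,
# OpenAI-compatible endpoint; substitute your own endpoint details.
inline_model = {
    "url": "https://api.example.com/v1",   # OpenAI-compatible base URL
    "name": "my-chat-model",               # model name the endpoint expects
    "format": "openai",                    # overrides the "nim" default
    "api_key_secret": "my-api-key-secret", # optional; secret in same workspace
}
```

The dictionary can then be passed wherever a model, judge_model, or embeddings_model field accepts an inline model.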

Model Reference#

Reference a model entity that has been registered with the NeMo Platform Models API:

"model": "my-workspace/my-judge-model"

A model reference is a string in the format workspace/model_name that points to an existing model entity. When you use a model reference, the evaluator:

  1. Validates the model entity exists through the Models API.

  2. Builds the Inference Gateway route URL for the model.

  3. Routes all inference requests through the Inference Gateway.
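The resolution steps above can be sketched roughly as follows. This is illustration only: the Models API validation is omitted, and the gateway route pattern shown is an assumption, not the documented URL scheme.

```python
def resolve_model_reference(reference: str, gateway_base: str) -> str:
    # Split "workspace/model_name" into its two parts. The real
    # evaluator would first validate the entity via the Models API.
    workspace, model_name = reference.split("/", 1)
    # Build a gateway route URL; this pattern is illustrative only.
    return f"{gateway_base}/v1/{workspace}/{model_name}"

route = resolve_model_reference(
    "my-workspace/my-judge-model",
    "https://inference-gateway.example.com",  # hypothetical gateway host
)
```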

Model references are useful when you have models registered as model entities and want to:

  • Reuse the same model across multiple metrics without repeating endpoint details.

  • Route inference through the Inference Gateway for centralized model management.

  • Avoid embedding endpoint URLs and credentials directly in metric definitions.
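As an example of the reuse point above, a single reference string can be shared by several metric definitions (a sketch using the metric types shown on this page):

```python
JUDGE = "my-workspace/my-judge-model"  # defined once, reused below

llm_judge_metric = {
    "type": "llm-judge",
    "model": JUDGE,        # LLM-as-a-Judge metrics use the model field
}
rag_metric = {
    "type": "topic_adherence",
    "judge_model": JUDGE,  # RAG/Agentic metrics use judge_model
}
```

Changing the registered model entity later updates every metric that points at it, with no endpoint details repeated.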

Supported Fields#

Both inline models and model references are supported for the following fields:

| Field | Used By |
|---|---|
| model | LLM-as-a-Judge metrics, online benchmark jobs |
| judge_model | RAG and Agentic metrics |
| embeddings_model | RAG metrics that require embeddings (for example, Response Relevancy) |
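A metric that needs both a judge and embeddings sets the two fields side by side. A hedged sketch: the type string "response_relevancy" below is an assumption, so check your metric catalog for the exact identifier.

```python
# Hedged sketch: the metric type string is assumed, not documented here.
metric = {
    "type": "response_relevancy",
    "judge_model": "my-workspace/my-judge-model",           # model reference
    "embeddings_model": "my-workspace/my-embedding-model",  # model reference
}
```

Either field could equally be an inline model dictionary instead of a reference.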

Examples#

LLM-as-a-Judge with Inline Model#

result = client.evaluation.metrics.evaluate(
    metric={
        "type": "llm-judge",
        "model": {
            "url": "https://integrate.api.nvidia.com/v1",
            "name": "meta/llama-3.1-70b-instruct",
            "format": "nim",
            "api_key_secret": "nvidia-api-key",
        },
        "scores": [...],
        "prompt_template": {...},
    },
    dataset={...},
)

LLM-as-a-Judge with Model Reference#

result = client.evaluation.metrics.evaluate(
    metric={
        "type": "llm-judge",
        "model": "my-workspace/my-judge-model",
        "scores": [...],
        "prompt_template": {...},
    },
    dataset={...},
)

RAGAS Metric with Model Reference#

result = client.evaluation.metrics.evaluate(
    metric={
        "type": "topic_adherence",
        "judge_model": "my-workspace/my-judge-model",
    },
    dataset={...},
)