Evaluation Tutorials

Use these tutorials to become familiar with NVIDIA NeMo Evaluator.

Tip

Tutorials are organized by complexity and typically build on one another.

Before You Start

Set up Evaluator before following these tutorials. Refer to the Demo Cluster Setup on minikube guide, or to the production deployment guides for the platform and for Evaluator individually.


Run an Academic LM Harness Eval

Learn how to run an academic LM Harness evaluation.

Run an LLM Judge Eval

Learn how to evaluate a fine-tuned model using the LLM Judge metric with a custom dataset.

Evaluate a Fine-tuned Model

Learn how to evaluate a fine-tuned model.

Customize and Evaluate Large Language Models