Evaluation Tutorials

Use these tutorials to become familiar with NVIDIA NeMo Evaluator.

Tip

Tutorials are organized by complexity and typically build on one another.

Before You Start

Before starting the tutorials below, set up NeMo Evaluator with Docker Compose and deploy the meta/llama-3.2-3b-instruct model.

Run an Academic LM Harness Eval

Learn how to run an academic LM Harness evaluation using built-in metrics.

beginner built-in metrics nemo-evaluator

Run an LLM Judge Eval

Learn how to evaluate a fine-tuned model using the LLM Judge metric with a custom dataset.

intermediate custom evaluation llm judge nemo-evaluator

Evaluate a Fine-tuned Model

Learn how to evaluate a model fine-tuned with NeMo Customizer.

intermediate fine-tuning nemo-evaluator nemo-customizer