Evaluation Tutorials
Use these tutorials to become familiar with NVIDIA NeMo Evaluator.
Tip
Tutorials are organized by complexity and typically build on one another.
Before You Start
Before starting the following tutorials, set up Evaluator with Docker Compose and deploy the meta/llama-3.2-3b-instruct model.
Run an Academic LM Harness Eval
Learn how to run an academic benchmark evaluation with LM Harness.
Run an LLM Judge Eval
Learn how to evaluate a fine-tuned model using the LLM Judge metric with a custom dataset.
The following tutorial requires an Evaluator deployment set up by following either the Demo Cluster Setup on minikube guide or the Kubernetes deployment guides for the platform.
Evaluate a Fine-tuned Model
Learn how to evaluate a fine-tuned model.