Get Started#

Before You Start#

Before you begin, make sure you have:

Python Environment: Python 3.10 or higher (up to 3.13)
OpenAI-Compatible Endpoint: Hosted or self-deployed model API
Docker: For container-based evaluation workflows (optional)
NVIDIA GPU: For local model deployment (optional)

Quick Start Path#

Installation

Install NeMo Evaluator and set up your evaluation environment with all necessary dependencies.

Installation Guide

Quick Start

Deploy your first model and run a simple evaluation in just a few minutes.

Quickstart

Entry Point Decision Guide#

NeMo Evaluator provides three primary entry points, each designed for different user needs and workflows. Use this guide to choose the right approach for your use case.

        flowchart TD
    A[I need to evaluate AI models] --> B{What's your primary goal?}
    
    B -->|Quick evaluations with minimal setup| C[NeMo Evaluator Launcher]
    B -->|Custom integrations and workflows| D[NeMo Evaluator Core]
    B -->|Direct container control| E[Direct Container Usage]
    
    C --> C1[ Unified CLI interface<br/> Multi-backend execution<br/> Built-in result export<br/> 100+ benchmarks ready]
    
    D --> D1[ Programmatic API control<br/> Custom evaluation workflows<br/> Adapter/interceptor system<br/> Framework extensions]
    
    E --> E1[ Maximum flexibility<br/> Custom container workflows<br/> Direct framework access<br/> Advanced users only]
    
    C1 --> F[Start with Launcher Quickstart]
    D1 --> G[Start with Core API Guide]
    E1 --> H[Start with Container Reference]
    
    style C fill:#e1f5fe
    style D fill:#f3e5f5
    style E fill:#fff3e0

What You’ll Learn#

By the end of this section, you’ll be able to:

Install and configure NeMo Evaluator components for your needs
Choose the right approach from the three-tier architecture
Run your first evaluation using hosted or self-deployed endpoints
Configure advanced features like adapters and interceptors
Integrate evaluations into your ML workflows

Typical Workflows#

Launcher Workflow (Most Users)#

Install NeMo Evaluator Launcher
Configure endpoint and benchmarks in YAML
Run evaluations with single CLI command
Export results to MLflow, W&B, or local files

Core API Workflow (Developers)#

Install NeMo Evaluator Core library
Configure adapters and interceptors programmatically
Integrate into existing ML pipelines
Customize evaluation logic and processing

Container Workflow (Container Users)#

Pull pre-built evaluation containers
Run evaluations directly in isolated environments
Mount data and results for persistence
Combine with existing container orchestration