NeMo Evaluator SDK

Tutorials

Master NeMo Evaluator with hands-on tutorials and practical examples.

How-To

Hands-on, step-by-step guides, each showcasing a single feature or use case.

How-To Guides

Evaluation with NeMo Framework

Deploy models and run evaluations using the NeMo Framework container.

Tutorials for NeMo Framework

Evaluate an existing endpoint using the local executor.

Local Evaluation of Existing Endpoint

Copyright © 2025, NVIDIA Corporation.