NeMo Evaluator SDK

All modules for which code is available

  • nemo_evaluator.adapters.adapter_config
  • nemo_evaluator.adapters.interceptors.caching_interceptor
  • nemo_evaluator.adapters.interceptors.endpoint_interceptor
  • nemo_evaluator.adapters.interceptors.logging_interceptor
  • nemo_evaluator.adapters.interceptors.payload_modifier_interceptor
  • nemo_evaluator.adapters.interceptors.progress_tracking_interceptor
  • nemo_evaluator.adapters.interceptors.raise_client_error_interceptor
  • nemo_evaluator.adapters.interceptors.reasoning_interceptor
  • nemo_evaluator.adapters.interceptors.response_stats_interceptor
  • nemo_evaluator.adapters.interceptors.system_message_interceptor
  • nemo_evaluator.adapters.types
  • nemo_evaluator.api.api_dataclasses
  • nemo_evaluator.core.entrypoint
  • nemo_evaluator.core.evaluate
  • nemo_evaluator.core.input
  • nemo_evaluator.core.utils
  • nemo_evaluator.sandbox.base
  • nemo_evaluator.sandbox.ecs_fargate

Copyright © 2025, NVIDIA Corporation.