Copyright © 2026, NVIDIA Corporation.

NeMo Gym
Verification Patterns

Verification is how an environment evaluates agent behavior and computes a score. Every environment implements some form of verification; which pattern you choose depends on your task.

Equivalence Match

Compare the agent’s output to a known reference answer.

coming soon
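
As a rough illustration of the idea, the sketch below normalizes both strings and returns a binary reward. The function names `normalize` and `equivalence_reward` are hypothetical and are not part of any NeMo Gym API; the normalization rules (lowercasing, whitespace collapsing) are assumptions that a real task would likely need to tailor.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, trim, and collapse runs of whitespace (assumed normalization rules)."""
    return re.sub(r"\s+", " ", text.strip().lower())


def equivalence_reward(agent_output: str, reference: str) -> float:
    """Hypothetical verifier: 1.0 if the normalized strings are equal, else 0.0."""
    return 1.0 if normalize(agent_output) == normalize(reference) else 0.0
```

Exact matching after normalization is the strictest variant; tasks with numeric or symbolic answers usually need a task-specific comparison instead.
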
Execution and State Match

Execute the agent’s actions, such as tool calls or generated code, and verify the output or resulting state.

coming soon
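
One way to picture state matching is to replay the agent's tool calls against a copy of the initial state and compare the result with an expected state. The sketch below assumes a toy in-memory key-value store; the tool names `set_item` and `delete_item`, the call format, and the function names are all hypothetical and do not reflect a NeMo Gym interface.

```python
from typing import Any


def apply_tool_call(state: dict[str, Any], call: dict[str, Any]) -> None:
    """Apply one hypothetical tool call to the state in place."""
    if call["name"] == "set_item":
        state[call["key"]] = call["value"]
    elif call["name"] == "delete_item":
        state.pop(call["key"], None)
    else:
        raise ValueError(f"unknown tool: {call['name']}")


def state_match_reward(
    initial_state: dict[str, Any],
    tool_calls: list[dict[str, Any]],
    expected_state: dict[str, Any],
) -> float:
    """Replay the calls on a copy of the state; 1.0 if the final state matches."""
    state = dict(initial_state)  # shallow copy so the caller's state is untouched
    try:
        for call in tool_calls:
            apply_tool_call(state, call)
    except (KeyError, ValueError):
        return 0.0  # malformed or unknown calls earn no reward
    return 1.0 if state == expected_state else 0.0
```

Comparing final state rather than the exact call sequence lets different but equally valid action sequences earn the same reward.
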
LLM-as-Judge

Prompt an LLM to evaluate the agent’s output against rubrics, instructions, or reference answers.
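
A minimal sketch of this pattern, under stated assumptions: the judge is passed in as a plain callable (`call_judge`, hypothetical) that takes a prompt string and returns the judge's text reply, so any model client could be plugged in. The prompt template, the 1 to 5 scale, and the `SCORE:` reply format are illustrative choices, not a format prescribed by NeMo Gym.

```python
import re
from typing import Callable

# Illustrative prompt; the rubric wording and reply format are assumptions.
JUDGE_TEMPLATE = """You are grading an answer against a rubric.
Rubric: {rubric}
Answer: {answer}
Reply with a line of the form "SCORE: <integer from 1 to 5>"."""


def judge_reward(answer: str, rubric: str, call_judge: Callable[[str], str]) -> float:
    """Ask the judge for a 1-5 score and map it linearly onto [0.0, 1.0]."""
    prompt = JUDGE_TEMPLATE.format(rubric=rubric, answer=answer)
    reply = call_judge(prompt)
    match = re.search(r"SCORE:\s*([1-5])\b", reply)
    if match is None:
        return 0.0  # an unparseable verdict is treated as no reward
    return (int(match.group(1)) - 1) / 4.0
```

Returning 0.0 on an unparseable reply is one possible policy; retrying the judge or flagging the sample for review are reasonable alternatives.
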

Reward Model

Use an LLM trained on human preference data to score outputs for alignment, as in RLHF reward modeling.

coming soon
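
To make the shape of this pattern concrete, the sketch below wraps a reward model behind a hypothetical callable `score_fn(prompt, response)` that returns an unbounded scalar, then squashes it into (0, 1) with a sigmoid. The callable, the sigmoid squashing, and the clamping range are all assumptions for illustration; a real reward model may already emit a bounded score.

```python
import math
from typing import Callable


def reward_model_reward(
    prompt: str,
    response: str,
    score_fn: Callable[[str, str], float],
) -> float:
    """Map a raw reward-model score to (0, 1) with a sigmoid."""
    raw = score_fn(prompt, response)
    # Clamp so math.exp cannot overflow on extreme raw scores.
    raw = max(-50.0, min(50.0, raw))
    return 1.0 / (1.0 + math.exp(-raw))
```
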