Verification Patterns
Verification is how an environment evaluates agent behavior and computes a score. Every environment implements some form of verification; which pattern you choose depends on your task.
Equivalence Match
Compare the agent’s output to a known reference answer.
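As a rough illustration of this pattern, the sketch below compares a normalized agent output to a normalized reference string and returns a binary score. The function names `normalize` and `equivalence_match` and the specific normalization rules (lowercasing, stripping ASCII punctuation, collapsing whitespace) are assumptions made for this example, not part of any particular environment API.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def equivalence_match(output: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are equal, otherwise 0.0.

    Hypothetical helper: real environments may need task-specific
    normalization (numeric tolerance, unit handling, and so on).
    """
    return 1.0 if normalize(output) == normalize(reference) else 0.0


if __name__ == "__main__":
    print(equivalence_match("  Paris. ", "paris"))
    print(equivalence_match("London", "paris"))
```

Stripping punctuation is a deliberately blunt choice: it would also treat "3.14" and "31.4" as equal, so numeric answers likely need a dedicated comparison instead.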
coming soon
Execution and State Match
Execute the agent’s actions, such as tool calls or generated code, and verify the output or resulting state.
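One way to picture state matching is to replay the agent's tool calls against a copy of the initial state and compare the result to an expected state. Everything in the sketch below is hypothetical: the two toy tools (`set_item`, `delete_item`), the tool-call shape (`{"name": ..., "arguments": {...}}`), and the choice to score unknown tools or malformed arguments as 0.0.

```python
import copy
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def set_item(state: State, key: str, value: Any) -> None:
    """Toy tool: store a value under a key."""
    state[key] = value


def delete_item(state: State, key: str) -> None:
    """Toy tool: remove a key if it is present."""
    state.pop(key, None)


# Hypothetical tool registry; a real environment would expose its own tools.
TOOLS: Dict[str, Callable[..., None]] = {
    "set_item": set_item,
    "delete_item": delete_item,
}


def state_match(initial: State, tool_calls: List[Dict[str, Any]], expected: State) -> float:
    """Replay tool calls on a copy of `initial` and compare with `expected`."""
    state = copy.deepcopy(initial)  # never mutate the caller's state
    for call in tool_calls:
        tool = TOOLS.get(call.get("name"))
        if tool is None:
            return 0.0  # assumption: an unknown tool fails the episode
        try:
            tool(state, **call.get("arguments", {}))
        except TypeError:
            return 0.0  # assumption: wrong or missing arguments also fail
    return 1.0 if state == expected else 0.0


if __name__ == "__main__":
    calls = [
        {"name": "set_item", "arguments": {"key": "b", "value": 2}},
        {"name": "delete_item", "arguments": {"key": "a"}},
    ]
    print(state_match({"a": 1}, calls, {"b": 2}))
```

Comparing final state rather than the exact call sequence lets different but equally valid action sequences earn the same score.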
coming soon
LLM-as-Judge
Prompt an LLM to evaluate the agent’s output against rubrics, instructions, or reference answers.
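A minimal sketch of the judging loop follows, assuming the judge is asked for a 1 to 5 rating that ends in a `SCORE: <n>` line. The prompt template, the `call_llm` callable (any function that takes a prompt string and returns the model's reply), and the mapping of scores onto the range 0.0 to 1.0 are all assumptions for illustration; no specific LLM client is implied.

```python
import re
from typing import Callable, Optional

# Hypothetical prompt template; real rubrics are usually more detailed.
JUDGE_TEMPLATE = """You are grading an agent's output against a rubric.

Rubric:
{rubric}

Agent output:
{output}

Explain your reasoning briefly, then end with a line of the form
SCORE: <integer from 1 to 5>"""


def build_prompt(output: str, rubric: str) -> str:
    """Fill the template with the rubric and the agent's output."""
    return JUDGE_TEMPLATE.format(rubric=rubric, output=output)


def parse_score(reply: str) -> Optional[int]:
    """Extract an integer score from 1 to 5, or None if none is found."""
    match = re.search(r"SCORE:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def llm_judge(output: str, rubric: str, call_llm: Callable[[str], str]) -> Optional[float]:
    """Ask the judge model for a rating and map 1..5 onto 0.0..1.0.

    Returns None when the reply cannot be parsed, so the caller can
    decide whether to retry or to assign a default score.
    """
    score = parse_score(call_llm(build_prompt(output, rubric)))
    return None if score is None else (score - 1) / 4


if __name__ == "__main__":
    # Stand-in for a real model call, used only to exercise the code path.
    fake_llm = lambda prompt: "The answer is mostly correct.\nSCORE: 4"
    print(llm_judge("42", "The answer must be 42.", fake_llm))
```

Passing the model call in as a function keeps the scoring logic testable without network access, and returning None on an unparsable reply avoids silently rewarding malformed judge output.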
Reward Model
Use an LLM trained on human preferences to score outputs for alignment, as in RLHF reward modeling.
coming soon