Integration with Eval Factory#
This section describes how to integrate your Framework Definition File with the Eval Factory system.
File Location#
Place your FDF in the core_evals/<framework_name>/ directory of your framework package:
your-framework/
    core_evals/
        your_framework/
            framework.yml    # This is your FDF
            output.py        # Output parser (custom)
            __init__.py      # Empty init file
    setup.py                 # Package configuration
    README.md                # Framework documentation
Directory Structure Explanation#
core_evals/: Root directory for evaluation framework definitions. This directory name is required by the Eval Factory system.
your_framework/: Subdirectory named after your framework (must match framework.name from your FDF).
framework.yml: Your Framework Definition File. This exact filename is required.
output.py: Custom output parser for processing evaluation results. This file should implement the parsing logic specific to your framework’s output format.
__init__.py: Empty initialization file that makes the directory a Python package.
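The layout rules above can be checked before packaging. The following is a minimal sketch, not part of Eval Factory: the helper name check_layout is hypothetical, and it only tests that the files exist. It does not verify that the directory name matches framework.name inside the FDF.

```python
from pathlib import Path

# File names taken from the directory layout described above.
REQUIRED_FILES = ("framework.yml", "output.py", "__init__.py")


def check_layout(package_root, framework_name):
    """Hypothetical helper: return a list of problems (empty means the layout looks right)."""
    framework_dir = Path(package_root) / "core_evals" / framework_name
    if not framework_dir.is_dir():
        return [f"missing directory: {framework_dir}"]
    # Report every required file that is absent from the framework directory.
    return [
        f"missing file: {framework_dir / name}"
        for name in REQUIRED_FILES
        if not (framework_dir / name).is_file()
    ]


if __name__ == "__main__":
    for problem in check_layout("your-framework", "your_framework"):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets you see every missing file in a single run.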
Validation#
The FDF is validated by the NeMo Evaluator system when loaded. Validation occurs through Pydantic models that ensure:
Required fields are present (name, pkg_name, command)
Parameter types are correct (strings, integers, floats, lists)
Template syntax is valid (Jinja2 parsing)
Configuration is consistent (endpoint types, parameter references)
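To illustrate how a Pydantic model enforces required fields and types, here is a minimal sketch. The FrameworkSketch model is hypothetical and deliberately flattened: it only carries the three required fields named above, and the real NeMo Evaluator models are not shown in this section and are likely structured differently.

```python
from pydantic import BaseModel, ValidationError


class FrameworkSketch(BaseModel):
    """Hypothetical, flattened stand-in for the real FDF models."""

    name: str
    pkg_name: str
    command: str


def invalid_fields(data):
    """Return the sorted names of fields that are missing or have the wrong type."""
    try:
        FrameworkSketch(**data)
    except ValidationError as exc:
        # Each error records the offending field as the first element of "loc".
        return sorted({str(error["loc"][0]) for error in exc.errors()})
    return []


if __name__ == "__main__":
    # "pkg_name" and "command" are absent, so both are reported.
    print(invalid_fields({"name": "your_framework"}))
```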
Validation Checks#
Schema Validation: Pydantic models ensure required fields exist and have correct types when the FDF is parsed.
Template Validation: Jinja2 templates are rendered with StrictUndefined, which raises errors for undefined variables.
Reference Validation: Template variables must reference valid fields in the Evaluation model (config, target, framework_name, pkg_name).
Consistency Validation: Endpoint types and parameters should be consistent across framework defaults and evaluation-specific configurations.
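The template check can be reproduced locally with Jinja2 itself. The sketch below assumes nothing about Eval Factory internals: check_template is a hypothetical helper, and the context keys are limited to the top-level field names listed above. StrictUndefined, TemplateSyntaxError, and UndefinedError are standard Jinja2 names.

```python
from jinja2 import Environment, StrictUndefined, TemplateSyntaxError, UndefinedError

# StrictUndefined makes any reference to a missing variable raise at render time.
env = Environment(undefined=StrictUndefined)


def check_template(source, context):
    """Hypothetical helper: return (ok, rendered_text_or_error_message)."""
    try:
        return True, env.from_string(source).render(**context)
    except TemplateSyntaxError as exc:
        return False, f"syntax error: {exc.message}"
    except UndefinedError as exc:
        return False, f"undefined variable: {exc.message}"


if __name__ == "__main__":
    context = {"pkg_name": "your_framework", "framework_name": "your_framework"}
    print(check_template("{{ pkg_name }} --framework {{ framework_name }}", context))
    # A misspelled variable name is caught instead of rendering as an empty string.
    print(check_template("{{ pkg_nmae }}", context))
```

Without StrictUndefined, Jinja2 renders a misspelled variable as an empty string, so a typo in a command template would go unnoticed until the command fails.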
Registration#
Once your FDF is properly located and validated, the Eval Factory system automatically:
Discovers your framework during initialization
Parses the FDF and validates its structure
Registers available evaluation types
Makes your framework available via CLI commands
Using Your Framework#
After successful integration, you can use your framework with the Eval Factory CLI:
# List available frameworks and tasks
nemo-evaluator ls
# Run an evaluation
nemo-evaluator run_eval --eval_type your_evaluation --model_id my-model ...
Package Configuration#
Ensure your setup.py or pyproject.toml includes the FDF in package data:
# setup.py
from setuptools import setup, find_packages

setup(
    name="your-framework",
    packages=find_packages(),
    package_data={
        "core_evals": ["**/*.yml"],
    },
    include_package_data=True,
)

# pyproject.toml
[tool.setuptools.package-data]
core_evals = ["**/*.yml"]
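One way to confirm that the package-data setting took effect is to inspect the built wheel, which is an ordinary zip archive. This is a minimal sketch using only the standard library. The helper name wheel_contains_fdf and the wheel file name in the usage line are hypothetical.

```python
import zipfile


def wheel_contains_fdf(wheel_path, framework_name):
    """Return True if the wheel ships core_evals/<framework_name>/framework.yml."""
    expected = f"core_evals/{framework_name}/framework.yml"
    with zipfile.ZipFile(wheel_path) as wheel:
        # Archive member names always use forward slashes, regardless of platform.
        return expected in wheel.namelist()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_wheel.py dist/your_framework-0.1.0-py3-none-any.whl
    print(wheel_contains_fdf(sys.argv[1], "your_framework"))
```

If the check returns False, the YAML file was left out of the build, and the framework would have no FDF to load after installation.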
Best Practices#
Follow the exact directory structure and naming conventions
Test your FDF validation locally before deployment
Document your framework’s output format in README.md
Include example configurations in your documentation
Provide sample commands for common use cases
Version your FDF changes alongside framework updates
Keep the FDF synchronized with your framework’s capabilities