nemo_gym.benchmarks

Benchmark discovery and preparation utilities.

Module Contents

Classes

BenchmarkConfig

PrepareBenchmarkConfig

Prepare benchmark data by running the benchmark’s prepare.py script.

Functions

_load_benchmarks_from_config_paths

list_benchmarks

CLI command: list available benchmarks.

_multiprocess_benchmark_prepare_fn

prepare_benchmark

CLI command: prepare benchmark data.

Data

BENCHMARKS_DIR

API

nemo_gym.benchmarks.BENCHMARKS_DIR

None

class nemo_gym.benchmarks.BenchmarkConfig(/, **data: typing.Any)

Bases: pydantic.BaseModel

name: str

None

path: pathlib.Path

None

agent_name: str

None

num_repeats: int

None

dataset: nemo_gym.config_types.BenchmarkDatasetConfig

None

classmethod from_config_path(config_path: pathlib.Path) -> BenchmarkConfig | None

classmethod from_initial_config_dict(path: pathlib.Path, initial_config_dict: omegaconf.DictConfig) -> BenchmarkConfig | None

nemo_gym.benchmarks._load_benchmarks_from_config_paths(config_paths: List[pathlib.Path]) -> Dict[str, nemo_gym.benchmarks.BenchmarkConfig]
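
The signatures above suggest a common pattern: a factory that may return None for a config path, and a loader that collects the results into a dictionary keyed by benchmark name. The sketch below illustrates that pattern only. `StubBenchmark`, `stub_factory`, and `load_benchmarks` are hypothetical stand-ins, and the rule used to decide when None is returned is an assumption, not the library's documented behavior.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List, Optional


@dataclass
class StubBenchmark:
    # Hypothetical stand-in that mirrors two BenchmarkConfig fields.
    name: str
    path: Path


def stub_factory(config_path: Path) -> Optional[StubBenchmark]:
    # Assumption for illustration: only files named config.yaml qualify,
    # and the benchmark name is the parent directory name.
    if config_path.name != "config.yaml":
        return None
    return StubBenchmark(name=config_path.parent.name, path=config_path)


def load_benchmarks(
    config_paths: List[Path],
    factory: Callable[[Path], Optional[StubBenchmark]],
) -> Dict[str, StubBenchmark]:
    benchmarks: Dict[str, StubBenchmark] = {}
    for config_path in config_paths:
        benchmark = factory(config_path)
        if benchmark is None:
            # Skip paths that do not describe a benchmark.
            continue
        benchmarks[benchmark.name] = benchmark
    return benchmarks


found = load_benchmarks(
    [Path("benchmarks/aime24/config.yaml"), Path("other/settings.yaml")],
    stub_factory,
)
print(sorted(found))
```

Keying the result by name makes lookups direct, at the cost that a later path with the same name silently replaces an earlier one in this sketch.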

nemo_gym.benchmarks.list_benchmarks() -> None

CLI command: list available benchmarks.

class nemo_gym.benchmarks.PrepareBenchmarkConfig(/, **data: typing.Any)

Bases: nemo_gym.config_types.BaseNeMoGymCLIConfig

Prepare benchmark data by running the benchmark’s prepare.py script.

The benchmark is identified from a config_paths entry pointing to a benchmarks/*/config.yaml file.

Examples:

ng_prepare_benchmark "+config_paths=[benchmarks/aime24/config.yaml]"
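
As a rough illustration of how a path could be checked against the benchmarks/*/config.yaml shape, the following sketch uses only the standard library. `benchmark_name_from_config_path` is a hypothetical helper, and treating the directory name as the benchmark name is an assumption made for this example.

```python
from pathlib import PurePosixPath
from typing import Optional


def benchmark_name_from_config_path(config_path: str) -> Optional[str]:
    """Return the directory name for benchmarks/<name>/config.yaml, else None."""
    parts = PurePosixPath(config_path).parts
    # Inspect the last three components so absolute paths also match.
    if len(parts) >= 3 and parts[-3] == "benchmarks" and parts[-1] == "config.yaml":
        return parts[-2]
    return None


print(benchmark_name_from_config_path("benchmarks/aime24/config.yaml"))
print(benchmark_name_from_config_path("configs/other.yaml"))
```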

Initialization

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

use_cached_prepared_benchmarks: bool

‘Field(…)’

num_prepare_benchmark_processes: int

‘Field(…)’

nemo_gym.benchmarks._multiprocess_benchmark_prepare_fn(args)

nemo_gym.benchmarks.prepare_benchmark() -> None

CLI command: prepare benchmark data.
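
A worker that takes a single `args` parameter, paired with a process-count setting such as num_prepare_benchmark_processes, matches the usual `multiprocessing.Pool.map` pattern, where each work item is packed into one tuple. The sketch below shows that general pattern under stated assumptions. `prepare_one` and `prepare_all` are hypothetical names, and the meaning given to the cache flag is a guess for illustration, not the library's behavior.

```python
from multiprocessing import Pool
from typing import List, Tuple


def prepare_one(args: Tuple[str, bool]) -> str:
    # Pool.map passes exactly one argument, so inputs travel as a tuple.
    name, use_cache = args
    action = "reused cache for" if use_cache else "prepared"
    return f"{action} {name}"


def prepare_all(names: List[str], use_cache: bool, num_processes: int) -> List[str]:
    work = [(name, use_cache) for name in names]
    if num_processes <= 1:
        # Run serially when only one process is requested.
        return [prepare_one(item) for item in work]
    with Pool(processes=num_processes) as pool:
        # Results come back in the same order as the inputs.
        return pool.map(prepare_one, work)


if __name__ == "__main__":
    print(prepare_all(["aime24"], use_cache=False, num_processes=1))
```

The worker is defined at module level so that it can be pickled and sent to child processes.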