`bridge.perf_recipes._common`#

Shared helpers for flat performance benchmark recipes.

_benchmark_common applies throughput-measurement defaults. _perf_precision returns a mixed-precision config for a given dtype.

Module Contents#

Functions#

`_benchmark_common`	Apply benchmark-mode defaults that prioritize throughput measurement over convergence.
`_enable_overlap_param_gather_with_optimizer_step`	Enable optimizer-step parameter gather overlap on optimizer and comm-overlap configs.
`_perf_precision`	Return mixed-precision config tuned for perf benchmarks.

API#

bridge.perf_recipes._common._benchmark_common( cfg: megatron.bridge.training.config.ConfigContainer, cross_entropy_impl: str = 'te', ) → None#

Apply benchmark-mode defaults that prioritize throughput measurement over convergence.

Intended for performance benchmark recipes only. Configures short training runs, disables checkpointing and evaluation, tunes the scheduler, and enables performance-oriented kernels.

Must stay in sync with _set_common_perf_overrides in scripts/performance/utils/overrides.py.

Individual recipes may override any of these after calling this function (e.g. Kimi K2 sets grad_reduce_in_fp32 = True).
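The call-then-override ordering described above can be sketched as follows. This is a minimal illustration using stand-in objects, not the library's code: `benchmark_defaults_sketch`, the `SimpleNamespace` config, and every field name and value other than `grad_reduce_in_fp32` are assumptions made for the example.

```python
from types import SimpleNamespace


def benchmark_defaults_sketch(cfg):
    """Hypothetical stand-in for _benchmark_common (fields are illustrative)."""
    cfg.train.train_iters = 50            # short run; the value is assumed
    cfg.checkpoint.save = None            # checkpointing off; field name assumed
    cfg.ddp.grad_reduce_in_fp32 = False   # benchmark-mode default; location assumed


def example_recipe():
    """Hypothetical recipe: apply shared defaults first, then override."""
    cfg = SimpleNamespace(
        train=SimpleNamespace(train_iters=100_000),
        checkpoint=SimpleNamespace(save="/tmp/ckpt"),
        ddp=SimpleNamespace(grad_reduce_in_fp32=True),
    )
    benchmark_defaults_sketch(cfg)
    # Recipe-specific override goes AFTER the shared defaults so it is kept.
    cfg.ddp.grad_reduce_in_fp32 = True
    return cfg


cfg = example_recipe()
```

Placing the override after the shared call matters: if the order were reversed, the shared defaults would silently overwrite the recipe's choice.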

bridge.perf_recipes._common._enable_overlap_param_gather_with_optimizer_step( cfg: megatron.bridge.training.config.ConfigContainer, ) → None#

Enable optimizer-step parameter gather overlap on optimizer and comm-overlap configs.

bridge.perf_recipes._common._perf_precision(compute_dtype: str)#

Return mixed-precision config tuned for perf benchmarks.

Identical to scripts/performance/utils/precision.get_precision_config but importable from the library side. Always sets grad_reduce_in_fp32=False so that callers that replace cfg.mixed_precision after _benchmark_common() still get the benchmark-mode default.
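The reason for always setting `grad_reduce_in_fp32=False` can be illustrated with a small sketch. It uses stand-in types only: `MixedPrecisionSketch`, `perf_precision_sketch`, the `"bf16"` and `"fp8"` dtype strings, and the assumed non-benchmark default of `True` are all hypothetical and are not taken from the library.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class MixedPrecisionSketch:
    """Hypothetical stand-in for a mixed-precision config object."""
    compute_dtype: str
    grad_reduce_in_fp32: bool = True  # assumed non-benchmark default


def perf_precision_sketch(compute_dtype: str) -> MixedPrecisionSketch:
    """Hypothetical stand-in for _perf_precision: always pins the flag to False."""
    return MixedPrecisionSketch(compute_dtype=compute_dtype, grad_reduce_in_fp32=False)


cfg = SimpleNamespace(mixed_precision=MixedPrecisionSketch("bf16"))

# Effect of the shared benchmark defaults (assumed to touch this field).
cfg.mixed_precision.grad_reduce_in_fp32 = False

# A recipe later swaps in a different precision config. Because the helper
# pins the flag itself, the benchmark-mode default survives the replacement.
cfg.mixed_precision = perf_precision_sketch("fp8")
```

Had the replacement object been built with its own default, the earlier benchmark setting would have been lost when `cfg.mixed_precision` was reassigned.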
