bridge.perf_recipes._common
Shared helpers for flat performance benchmark recipes.
_benchmark_common applies throughput-measurement defaults.
_perf_precision returns a mixed-precision config for a given dtype.
Module Contents#
Functions#
| _benchmark_common | Apply benchmark-mode defaults that prioritize throughput measurement over convergence. |
| _enable_overlap_param_gather_with_optimizer_step | Enable optimizer-step parameter gather overlap on optimizer and comm-overlap configs. |
| _perf_precision | Return mixed-precision config tuned for perf benchmarks. |
API#
- bridge.perf_recipes._common._benchmark_common(
      cfg: megatron.bridge.training.config.ConfigContainer,
      cross_entropy_impl: str = 'te',
  )
Apply benchmark-mode defaults that prioritize throughput measurement over convergence.
Intended for performance benchmark recipes only. Configures short training runs, disables checkpointing and evaluation, tunes the scheduler, and enables performance-oriented kernels.
Must stay in sync with _set_common_perf_overrides in scripts/performance/utils/overrides.py. Individual recipes may override any of these after calling this function (e.g. Kimi K2 sets grad_reduce_in_fp32 = True).
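The "apply shared defaults, then override" pattern described above can be sketched with stand-in objects. This is only an illustration: the dataclasses, the field layout (train, ddp), the placeholder iteration count, and the helper names apply_benchmark_defaults and recipe_with_override are hypothetical and are not the real ConfigContainer or _benchmark_common. It also assumes the helper mutates the config in place.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StandInTrain:
    # Hypothetical training section; field names are illustrative only.
    train_iters: int = 100_000
    eval_interval: Optional[int] = 1_000


@dataclass
class StandInDDP:
    # Hypothetical home for the grad_reduce_in_fp32 flag.
    grad_reduce_in_fp32: bool = True


@dataclass
class StandInConfig:
    train: StandInTrain = field(default_factory=StandInTrain)
    ddp: StandInDDP = field(default_factory=StandInDDP)


def apply_benchmark_defaults(cfg: StandInConfig) -> None:
    """Hypothetical analogue of _benchmark_common: mutate cfg in place."""
    cfg.train.train_iters = 50        # placeholder value for a short run
    cfg.train.eval_interval = None    # no evaluation during the benchmark
    cfg.ddp.grad_reduce_in_fp32 = False


def recipe_with_override() -> StandInConfig:
    """A recipe applies the shared defaults first, then its own overrides."""
    cfg = StandInConfig()
    apply_benchmark_defaults(cfg)
    # Recipe-specific override, applied after the shared defaults so it wins.
    cfg.ddp.grad_reduce_in_fp32 = True
    return cfg


cfg = recipe_with_override()
```

Because the override runs after the shared helper, the recipe keeps the short-run settings while restoring FP32 gradient reduction for itself.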
- bridge.perf_recipes._common._enable_overlap_param_gather_with_optimizer_step(
      cfg: megatron.bridge.training.config.ConfigContainer,
  )
Enable optimizer-step parameter gather overlap on optimizer and comm-overlap configs.
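As a rough sketch of what "on optimizer and comm-overlap configs" implies, the helper below flips the same flag on two sub-configs so they cannot disagree. The stand-in classes, the attribute names optimizer and comm_overlap, and the flag name (assumed here to match the function name) are illustrative assumptions, not the library's confirmed fields.

```python
from dataclasses import dataclass, field


@dataclass
class StandInOptimizer:
    # Assumed flag name, mirroring the function name; not confirmed by the source.
    overlap_param_gather_with_optimizer_step: bool = False


@dataclass
class StandInCommOverlap:
    overlap_param_gather_with_optimizer_step: bool = False


@dataclass
class StandInConfig:
    optimizer: StandInOptimizer = field(default_factory=StandInOptimizer)
    comm_overlap: StandInCommOverlap = field(default_factory=StandInCommOverlap)


def enable_overlap(cfg: StandInConfig) -> None:
    """Hypothetical analogue: set the flag in both places in one call."""
    cfg.optimizer.overlap_param_gather_with_optimizer_step = True
    cfg.comm_overlap.overlap_param_gather_with_optimizer_step = True


cfg = StandInConfig()
enable_overlap(cfg)
```

Routing both assignments through a single helper avoids a recipe enabling the overlap on one sub-config and forgetting the other.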
- bridge.perf_recipes._common._perf_precision(compute_dtype: str)
Return mixed-precision config tuned for perf benchmarks.
Identical to scripts/performance/utils/precision.get_precision_config but importable from the library side. Always sets grad_reduce_in_fp32=False so that callers that replace cfg.mixed_precision after _benchmark_common() still get the benchmark-mode default.
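The ordering concern can be illustrated with stand-ins: replacing the whole mixed_precision object after the benchmark defaults ran discards whatever those defaults set on the old object, unless the replacement already carries the same value. Everything below (the classes, the "bf16" dtype string, the assumed library default of True, and the helper names) is hypothetical and only mirrors the behaviour described above.

```python
from dataclasses import dataclass, field


@dataclass
class StandInPrecision:
    compute_dtype: str = "bf16"
    # Assumed non-benchmark default, for illustration only.
    grad_reduce_in_fp32: bool = True


@dataclass
class StandInConfig:
    mixed_precision: StandInPrecision = field(default_factory=StandInPrecision)


def benchmark_defaults(cfg: StandInConfig) -> None:
    """Hypothetical analogue of _benchmark_common for this one flag."""
    cfg.mixed_precision.grad_reduce_in_fp32 = False


def perf_precision(compute_dtype: str) -> StandInPrecision:
    """Hypothetical analogue of _perf_precision: the flag is baked in."""
    return StandInPrecision(compute_dtype=compute_dtype, grad_reduce_in_fp32=False)


cfg = StandInConfig()
benchmark_defaults(cfg)

# Swapping in a freshly built precision object silently reverts the flag.
cfg.mixed_precision = StandInPrecision(compute_dtype="bf16")
flag_after_plain_replace = cfg.mixed_precision.grad_reduce_in_fp32

# Building the replacement through the perf helper keeps the benchmark default.
cfg.mixed_precision = perf_precision("bf16")
flag_after_perf_replace = cfg.mixed_precision.grad_reduce_in_fp32
```

In this sketch the plain replacement ends up with the flag reverted, while the helper-built replacement keeps it at False regardless of when the swap happens.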