core.num_microbatches_calculator#

Megatron Core number of microbatches calculators.

Module Contents#

Classes#

NumMicroBatchesCalculator

Base class for number of microbatches calculator.

ConstantNumMicroBatchesCalculator

Calculator of number of microbatches with constant global batch size.

RampupBatchsizeNumMicroBatchesCalculator

Calculator of number of microbatches with batch size rampup. Over steps = (global_batch_size - start_global_batch_size) / batch_size_increment rampup steps, the global batch size is increased from start_global_batch_size to global_batch_size, with each step consuming ramup_samples / steps samples.

Functions#

get_num_microbatches

Get number of microbatches.

get_current_global_batch_size

Get current global batch size.

get_micro_batch_size

Get micro batch size.

get_current_running_global_batch_size

Get current running global batch size, accounting for the fact that when decrease_batch_size_if_needed is True, the number of DP replicas may be incompatible with the true global batch size, so the running batch size may be smaller.

update_num_microbatches

Update number of microbatches.

unset_num_microbatches_calculator

Unset microbatches calculator.

init_num_microbatches_calculator

Initialize number of microbatches calculator. Retained for backward compatibility.

destroy_num_microbatches_calculator

Destroy number of microbatches calculator.

reconfigure_num_microbatches_calculator

Reconfigure number of microbatches calculator. Retained for backward compatibility.

_configure_global_num_microbatches_calculator

Configure number of microbatches calculator. Can be used for initialization and reconfiguration.

_build_num_microbatches_calculator

Build number of microbatches calculator. Internal helper method.

_round

Round batch_size down to the nearest multiple of divisor.

Data#

API#

core.num_microbatches_calculator.logger#

‘getLogger(…)’

core.num_microbatches_calculator._GLOBAL_NUM_MICROBATCHES_CALCULATOR: Union[core.num_microbatches_calculator.ConstantNumMicroBatchesCalculator, core.num_microbatches_calculator.RampupBatchsizeNumMicroBatchesCalculator]#

None

core.num_microbatches_calculator.get_num_microbatches() int#

Get number of microbatches.

core.num_microbatches_calculator.get_current_global_batch_size() int#

Get current global batch size.

core.num_microbatches_calculator.get_micro_batch_size() int#

Get micro batch size.

core.num_microbatches_calculator.get_current_running_global_batch_size() int#

Get current running global batch size, accounting for the fact that when decrease_batch_size_if_needed is True, the number of DP replicas may be incompatible with the true global batch size, so the running batch size may be smaller.
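The getters above read from the global calculator; as a rough guide to what `get_num_microbatches()` returns in the constant-batch-size case, the arithmetic can be sketched as follows (a hypothetical helper, not the Megatron implementation):

```python
# Sketch of the constant-batch-size arithmetic: each data-parallel replica
# processes global_batch_size / data_parallel_size samples per step, split
# into microbatches of micro_batch_size each.
def num_microbatches(global_batch_size: int, micro_batch_size: int,
                     data_parallel_size: int) -> int:
    micro_batch_times_dp = micro_batch_size * data_parallel_size
    assert global_batch_size % micro_batch_times_dp == 0, (
        "global batch size must be divisible by micro batch size * DP size"
    )
    return global_batch_size // micro_batch_times_dp

print(num_microbatches(global_batch_size=512, micro_batch_size=4,
                       data_parallel_size=8))  # 512 / (4 * 8) = 16
```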

core.num_microbatches_calculator.update_num_microbatches(
consumed_samples: int,
consistency_check: bool = True,
verbose: bool = False,
) None#

Update number of microbatches.

Parameters:
  • consumed_samples (int) – Number of samples consumed.

  • consistency_check (bool, optional) – Option to check current schedule’s consistency. Defaults to True.

  • verbose (bool, optional) – Option to control logging. Defaults to False.

core.num_microbatches_calculator.unset_num_microbatches_calculator()#

Unset microbatches calculator.

Useful for multiple runs. See tests/unit_tests/ckpt_converter/test_ckpt_converter.py for an example.

core.num_microbatches_calculator.init_num_microbatches_calculator(
rank: int,
rampup_batch_size: Optional[List[int]],
global_batch_size: int,
micro_batch_size: int,
data_parallel_size: int,
decrease_batch_size_if_needed: bool = False,
) None#

Initialize number of microbatches calculator. Retained for backward compatibility.

Parameters:
  • rank (int) – Rank of the GPU, only rank 0 will log the information.

  • rampup_batch_size (Optional[List[int]]) – Rampup batch size, given in the format [start_global_batch_size, batch_size_increment, ramup_samples].

  • global_batch_size (int) – Global batch size for the model.

  • micro_batch_size (int) – Micro batch size at initialization.

  • data_parallel_size (int) – Data parallel size.

  • decrease_batch_size_if_needed (bool, optional) – If true, scale down batch size to ensure divisibility by DP size * microbatch size. Defaults to False.
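For illustration, the three-element rampup_batch_size list can be validated and unpacked as sketched below (a hypothetical helper with made-up values, not part of the Megatron API):

```python
# Sketch of how a rampup_batch_size argument is interpreted. The list must
# contain exactly three integers: the starting global batch size, the
# per-step increment, and the total number of rampup samples.
def parse_rampup(rampup_batch_size, global_batch_size):
    assert len(rampup_batch_size) == 3, "expected [start, increment, samples]"
    start_global_batch_size, batch_size_increment, ramup_samples = rampup_batch_size
    # The rampup must start at or below the target and reach it exactly.
    assert start_global_batch_size <= global_batch_size
    assert (global_batch_size - start_global_batch_size) % batch_size_increment == 0
    return start_global_batch_size, batch_size_increment, ramup_samples

print(parse_rampup([32, 16, 500_000], global_batch_size=256))  # (32, 16, 500000)
```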

core.num_microbatches_calculator.destroy_num_microbatches_calculator()#

Destroy number of microbatches calculator.

core.num_microbatches_calculator.reconfigure_num_microbatches_calculator(
rank: int,
rampup_batch_size: Optional[List[int]],
global_batch_size: int,
micro_batch_size: int,
data_parallel_size: int,
decrease_batch_size_if_needed: bool = False,
) None#

Reconfigure number of microbatches calculator. Retained for backward compatibility.

Parameters:
  • rank (int) – Rank of the GPU, only rank 0 will log the information.

  • rampup_batch_size (Optional[List[int]]) – Rampup batch size, given in the format [start_global_batch_size, batch_size_increment, ramup_samples].

  • global_batch_size (int) – Global batch size for the model.

  • micro_batch_size (int) – Micro batch size at initialization.

  • data_parallel_size (int) – Data parallel size.

  • decrease_batch_size_if_needed (bool, optional) – If true, scale down batch size to ensure divisibility by DP size * microbatch size. Defaults to False.

core.num_microbatches_calculator._configure_global_num_microbatches_calculator(
rank: int,
rampup_batch_size: Optional[List[int]],
global_batch_size: int,
micro_batch_size: int,
data_parallel_size: int,
decrease_batch_size_if_needed: bool = False,
init: bool = False,
) None#

Configure number of microbatches calculator. Can be used for initialization and reconfiguration.

Parameters:
  • rank (int) – Rank of the GPU, only rank 0 will log the information.

  • rampup_batch_size (Optional[List[int]]) – Rampup batch size, given in the format [start_global_batch_size, batch_size_increment, ramup_samples].

  • global_batch_size (int) – Global batch size for the model.

  • micro_batch_size (int) – Micro batch size at initialization.

  • data_parallel_size (int) – Data parallel size.

  • decrease_batch_size_if_needed (bool, optional) – If true, scale down batch size to ensure divisibility by DP size * microbatch size. Defaults to False.

  • init (bool, optional) – If true, initialize the calculator. Defaults to False.

core.num_microbatches_calculator._build_num_microbatches_calculator(
rank: int,
rampup_batch_size: Optional[List[int]],
global_batch_size: int,
micro_batch_size: int,
data_parallel_size: int,
decrease_batch_size_if_needed: bool,
) Union[ConstantNumMicroBatchesCalculator, RampupBatchsizeNumMicroBatchesCalculator]#

Build number of microbatches calculator. Internal helper method.

Parameters:
  • rank (int) – Rank of the GPU, only rank 0 will log the information.

  • rampup_batch_size (Optional[List[int]]) – Rampup batch size, given in the format [start_global_batch_size, batch_size_increment, ramup_samples].

  • global_batch_size (int) – Global batch size for the model.

  • micro_batch_size (int) – Micro batch size at initialization.

  • data_parallel_size (int) – Data parallel size.

  • decrease_batch_size_if_needed (bool) – If true, scale down batch size to ensure divisibility by DP size * microbatch size.

core.num_microbatches_calculator._round(batch_size: int, divisor: int) int#

Round batch_size down to the nearest multiple of divisor.
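The rounding this helper performs amounts to integer floor division followed by a multiply; a minimal sketch of the documented behavior:

```python
# Round batch_size down to the nearest multiple of divisor.
def round_down(batch_size: int, divisor: int) -> int:
    return (batch_size // divisor) * divisor

print(round_down(100, 32))  # 96: the largest multiple of 32 not exceeding 100
```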

class core.num_microbatches_calculator.NumMicroBatchesCalculator#

Bases: abc.ABC

Base class for number of microbatches calculator.

Initialization

get() int#

Get number of microbatches.

get_current_global_batch_size() int#

Get current global batch size.

get_micro_batch_size() int#

Get micro batch size.

get_current_running_global_batch_size() int#

Get current running global batch size. If decrease_batch_size_if_needed is False, this simply equals the global batch size.

abstractmethod update(consumed_samples, consistency_check, verbose=False) None#

Update number of microbatches depending on batch size rampup.

class core.num_microbatches_calculator.ConstantNumMicroBatchesCalculator(
global_batch_size: int,
micro_batch_size: int,
data_parallel_size: int,
decrease_batch_size_if_needed: bool,
rank: int,
)#

Bases: core.num_microbatches_calculator.NumMicroBatchesCalculator

Calculator of number of microbatches with constant global batch size.

Parameters:
  • global_batch_size (int) – Global batch size.

  • micro_batch_size (int) – Micro batch size.

  • data_parallel_size (int) – Data parallel size.

  • decrease_batch_size_if_needed (bool) – If true, decrease batch size to ensure divisibility by DP size * microbatch size (if needed).

  • rank (int) – Rank (to determine whether logging should be performed).

Initialization

update(consumed_samples, consistency_check, verbose=False) None#

class core.num_microbatches_calculator.RampupBatchsizeNumMicroBatchesCalculator(
global_batch_size: int,
micro_batch_size: int,
data_parallel_size: int,
decrease_batch_size_if_needed: bool,
rank: int,
start_global_batch_size: int,
batch_size_increment: int,
ramup_samples: int,
)#

Bases: core.num_microbatches_calculator.NumMicroBatchesCalculator

Calculator of number of microbatches with batch size rampup. Over steps = (global_batch_size - start_global_batch_size) / batch_size_increment rampup steps, the global batch size is increased from start_global_batch_size to global_batch_size, with each step consuming ramup_samples / steps samples.

Parameters:
  • global_batch_size (int) – Global batch size post rampup.

  • micro_batch_size (int) – Micro batch size.

  • data_parallel_size (int) – Data parallel size.

  • decrease_batch_size_if_needed (bool) – If true, decrease batch size to ensure divisibility by DP size * microbatch size (if needed).

  • rank (int) – Rank (to determine whether logging should be performed).

  • start_global_batch_size (int) – Global batch size to start with.

  • batch_size_increment (int) – Increment added to the global batch size at each rampup step.

  • ramup_samples (int) – Number of samples over which to ramp up the global batch size from start_global_batch_size to global_batch_size.

Initialization

update(
consumed_samples: int,
consistency_check: bool,
verbose: bool = False,
) None#

Update number of microbatches.

Parameters:
  • consumed_samples (int) – Number of samples consumed.

  • consistency_check (bool) – Option to check current schedule’s consistency.

  • verbose (bool, optional) – Option to control logging. Defaults to False.
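The rampup schedule described for this class can be sketched as follows (a hypothetical helper with illustrative values, not the Megatron implementation): the current global batch size is a step function of consumed samples, capped at the final global batch size.

```python
# Sketch of the rampup arithmetic: the schedule has
#   steps = (global_batch_size - start_global_batch_size) / batch_size_increment
# rampup steps, each consuming ramup_samples / steps samples.
def rampup_global_batch_size(consumed_samples: int,
                             start_global_batch_size: int,
                             batch_size_increment: int,
                             global_batch_size: int,
                             ramup_samples: int) -> int:
    steps = (global_batch_size - start_global_batch_size) // batch_size_increment
    samples_per_step = ramup_samples // steps
    # Completed rampup steps so far, capped at the total number of steps.
    completed = min(consumed_samples // samples_per_step, steps)
    return start_global_batch_size + completed * batch_size_increment

# Ramp from 32 to 256 in increments of 16 over 1,400,000 samples:
# 14 steps of 100,000 samples each.
print(rampup_global_batch_size(0, 32, 16, 256, 1_400_000))         # 32
print(rampup_global_batch_size(250_000, 32, 16, 256, 1_400_000))   # 64
print(rampup_global_batch_size(2_000_000, 32, 16, 256, 1_400_000)) # 256
```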