core.num_microbatches_calculator#
Megatron Core number of microbatches calculators.
Module Contents#
Classes#
NumMicroBatchesCalculator | Base class for number of microbatches calculator.
ConstantNumMicroBatchesCalculator | Calculator of number of microbatches with constant global batch size.
RampupBatchsizeNumMicroBatchesCalculator | Calculator of number of microbatches with batch size rampup.
Functions#
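get_num_microbatches | Get number of microbatches.
get_current_global_batch_size | Get current global batch size.
get_micro_batch_size | Get micro batch size.
get_current_running_global_batch_size | Get current running global batch size, which accounts for the number of DP replicas possibly being incompatible with the true global batch size when decrease_batch_size_if_needed is True.
update_num_microbatches | Update number of microbatches.
unset_num_microbatches_calculator | Unset microbatches calculator.
init_num_microbatches_calculator | Initialize number of microbatches calculator. Supporting backward compatibility.
destroy_num_microbatches_calculator | Destroy number of microbatches calculator.
reconfigure_num_microbatches_calculator | Reconfigure number of microbatches calculator. Supporting backward compatibility.
_configure_global_num_microbatches_calculator | Configure number of microbatches calculator. Can be used for initialization and reconfiguration.
_build_num_microbatches_calculator | Build number of microbatches calculator. Internal helper method.
_round | Round batch_size down to nearest batch size divisible by divisor.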
Data#
API#
- core.num_microbatches_calculator.logger#
‘getLogger(…)’
- core.num_microbatches_calculator._GLOBAL_NUM_MICROBATCHES_CALCULATOR: Union[core.num_microbatches_calculator.ConstantNumMicroBatchesCalculator, core.num_microbatches_calculator.RampupBatchsizeNumMicroBatchesCalculator]#
None
- core.num_microbatches_calculator.get_num_microbatches() -> int#
Get number of microbatches.
- core.num_microbatches_calculator.get_current_global_batch_size() -> int#
Get current global batch size.
- core.num_microbatches_calculator.get_micro_batch_size() -> int#
Get micro batch size.
- core.num_microbatches_calculator.get_current_running_global_batch_size() -> int#
Get the current running global batch size, which accounts for the number of DP replicas possibly being incompatible with the true global batch size when decrease_batch_size_if_needed is True.
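The values returned by these getters are related by simple arithmetic. The following is a minimal standalone sketch, not the library code: the helper name `expected_num_microbatches` is hypothetical, and it assumes the running global batch size is an exact multiple of micro batch size times data-parallel size.

```python
def expected_num_microbatches(
    running_global_batch_size: int,
    micro_batch_size: int,
    data_parallel_size: int,
) -> int:
    """Hypothetical helper: microbatches each data-parallel replica runs per step."""
    # Samples consumed when every DP replica processes one microbatch.
    samples_per_pass = micro_batch_size * data_parallel_size
    if running_global_batch_size % samples_per_pass != 0:
        raise ValueError(
            "running global batch size must be divisible by "
            "micro_batch_size * data_parallel_size"
        )
    return running_global_batch_size // samples_per_pass


# 512 samples per step, 4 samples per microbatch, 8 DP replicas.
print(expected_num_microbatches(512, 4, 8))  # 512 / (4 * 8) = 16
```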
- core.num_microbatches_calculator.update_num_microbatches(consumed_samples: int, consistency_check: bool = True, verbose: bool = False)#
Update number of microbatches.
- Parameters:
consumed_samples (int) – Number of samples consumed.
consistency_check (bool, optional) – Option to check current schedule’s consistency. Defaults to True.
verbose (bool, optional) – Option to control logging. Defaults to False.
- core.num_microbatches_calculator.unset_num_microbatches_calculator()#
Unset microbatches calculator.
Useful for multiple runs. See tests/unit_tests/ckpt_converter/test_ckpt_converter.py for an example.
- core.num_microbatches_calculator.init_num_microbatches_calculator(rank: int, rampup_batch_size: Optional[List[int]], global_batch_size: int, micro_batch_size: int, data_parallel_size: int, decrease_batch_size_if_needed: bool = False)#
Initialize number of microbatches calculator. Supporting backward compatibility.
- Parameters:
rank (int) – Rank of the GPU; only rank 0 logs the information.
rampup_batch_size (Optional[List[int]]) – Rampup batch size, in the format [start_global_batch_size, batch_size_increment, ramup_samples].
global_batch_size (int) – Global batch size for the model.
micro_batch_size (int) – Micro batch size at initialization.
data_parallel_size (int) – Data parallel size.
decrease_batch_size_if_needed (bool, optional) – If true, scale down batch size to ensure divisibility by DP size * microbatch size. Defaults to False.
- core.num_microbatches_calculator.destroy_num_microbatches_calculator()#
Destroy number of microbatches calculator.
- core.num_microbatches_calculator.reconfigure_num_microbatches_calculator(rank: int, rampup_batch_size: Optional[List[int]], global_batch_size: int, micro_batch_size: int, data_parallel_size: int, decrease_batch_size_if_needed: bool = False)#
Reconfigure number of microbatches calculator. Supporting backward compatibility.
- Parameters:
rank (int) – Rank of the GPU; only rank 0 logs the information.
rampup_batch_size (Optional[List[int]]) – Rampup batch size, in the format [start_global_batch_size, batch_size_increment, ramup_samples].
global_batch_size (int) – Global batch size for the model.
micro_batch_size (int) – Micro batch size at initialization.
data_parallel_size (int) – Data parallel size.
decrease_batch_size_if_needed (bool, optional) – If true, scale down batch size to ensure divisibility by DP size * microbatch size. Defaults to False.
- core.num_microbatches_calculator._configure_global_num_microbatches_calculator(rank: int, rampup_batch_size: Optional[List[int]], global_batch_size: int, micro_batch_size: int, data_parallel_size: int, decrease_batch_size_if_needed: bool = False, init: bool = False)#
Configure number of microbatches calculator. Can be used for initialization and reconfiguration.
- Parameters:
rank (int) – Rank of the GPU; only rank 0 logs the information.
rampup_batch_size (Optional[List[int]]) – Rampup batch size, in the format [start_global_batch_size, batch_size_increment, ramup_samples].
global_batch_size (int) – Global batch size for the model.
micro_batch_size (int) – Micro batch size at initialization.
data_parallel_size (int) – Data parallel size.
decrease_batch_size_if_needed (bool, optional) – If true, scale down batch size to ensure divisibility by DP size * microbatch size. Defaults to False.
init (bool, optional) – If true, initialize the calculator. Defaults to False.
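The relationship between initializing, reconfiguring, and unsetting a module-level calculator can be sketched as follows. This is an illustration only: every name in it is hypothetical, the stored state is a plain integer rather than a calculator object, and it assumes that initialization refuses to overwrite existing state while reconfiguration replaces it, which the text above does not state.

```python
from typing import Optional

# Stand-in for the module-level calculator slot (hypothetical; a plain int here).
_NUM_MICROBATCHES: Optional[int] = None


def _configure(global_batch_size: int, micro_batch_size: int,
               data_parallel_size: int, init: bool = False) -> None:
    """Set the global state; with init=True, refuse to overwrite existing state."""
    global _NUM_MICROBATCHES
    if init and _NUM_MICROBATCHES is not None:
        raise RuntimeError("calculator is already initialized")
    _NUM_MICROBATCHES = global_batch_size // (micro_batch_size * data_parallel_size)


def init_calculator(global_batch_size: int, micro_batch_size: int,
                    data_parallel_size: int) -> None:
    """Thin wrapper: first-time setup."""
    _configure(global_batch_size, micro_batch_size, data_parallel_size, init=True)


def reconfigure_calculator(global_batch_size: int, micro_batch_size: int,
                           data_parallel_size: int) -> None:
    """Thin wrapper: replace whatever state is currently stored."""
    _configure(global_batch_size, micro_batch_size, data_parallel_size, init=False)


def unset_calculator() -> None:
    """Reset to the uninitialized state, e.g. between runs in one process."""
    global _NUM_MICROBATCHES
    _NUM_MICROBATCHES = None


def current_num_microbatches() -> Optional[int]:
    """Return the stored value, or None when nothing is configured."""
    return _NUM_MICROBATCHES


init_calculator(512, 4, 8)         # stores 16
reconfigure_calculator(256, 4, 8)  # replaces it with 8
unset_calculator()                 # back to uninitialized
```

Routing both public entry points through one configuration function keeps the "already initialized" check in a single place.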
- core.num_microbatches_calculator._build_num_microbatches_calculator(rank: int, rampup_batch_size: Optional[List[int]], global_batch_size: int, micro_batch_size: int, data_parallel_size: int, decrease_batch_size_if_needed: bool)#
Build number of microbatches calculator. Internal helper method.
- Parameters:
rank (int) – Rank of the GPU; only rank 0 logs the information.
rampup_batch_size (Optional[List[int]]) – Rampup batch size, in the format [start_global_batch_size, batch_size_increment, ramup_samples].
global_batch_size (int) – Global batch size for the model.
micro_batch_size (int) – Micro batch size at initialization.
data_parallel_size (int) – Data parallel size.
decrease_batch_size_if_needed (bool) – If true, scale down batch size to ensure divisibility by DP size * microbatch size.
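How the rampup_batch_size argument might select between the two calculator classes can be sketched as below. The helper names are hypothetical, and the sketch assumes that None selects a constant global batch size and that any other value must hold exactly three entries in the documented order.

```python
from typing import List, Optional, Tuple


def parse_rampup_batch_size(
    rampup_batch_size: Optional[List[int]],
) -> Optional[Tuple[int, int, int]]:
    """Hypothetical helper: validate and unpack the three rampup fields."""
    if rampup_batch_size is None:
        return None
    if len(rampup_batch_size) != 3:
        raise ValueError(
            "expected [start_global_batch_size, batch_size_increment, ramup_samples]"
        )
    start_global_batch_size, batch_size_increment, ramup_samples = (
        int(value) for value in rampup_batch_size
    )
    return start_global_batch_size, batch_size_increment, ramup_samples


def select_calculator_kind(rampup_batch_size: Optional[List[int]]) -> str:
    """Return which kind of calculator the argument calls for (assumed rule)."""
    return "constant" if parse_rampup_batch_size(rampup_batch_size) is None else "rampup"


print(select_calculator_kind(None))             # constant
print(select_calculator_kind([32, 32, 3000]))   # rampup
```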
- core.num_microbatches_calculator._round(batch_size: int, divisor: int) -> int#
Round batch_size down to nearest batch size divisible by divisor.
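Rounding down to the nearest multiple can be done with integer division. The function below is a standalone sketch of that arithmetic under the name `round_down` (hypothetical), not a copy of the library helper.

```python
def round_down(batch_size: int, divisor: int) -> int:
    """Largest multiple of divisor that does not exceed batch_size."""
    return (batch_size // divisor) * divisor


print(round_down(100, 32))  # 96
print(round_down(96, 32))   # 96, already divisible
```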
- class core.num_microbatches_calculator.NumMicroBatchesCalculator#
Bases: abc.ABC
Base class for number of microbatches calculator.
Initialization
- get() -> int#
Get number of microbatches.
- get_current_global_batch_size() -> int#
Get current global batch size.
- get_micro_batch_size() -> int#
Get micro batch size.
- get_current_running_global_batch_size() -> int#
Get current running global batch size. If decrease_batch_size_if_needed is False, this just equals global batch size.
- abstractmethod update(consumed_samples, consistency_check, verbose=False) -> None#
Update number of microbatches depending on batch size rampup.
- class core.num_microbatches_calculator.ConstantNumMicroBatchesCalculator(global_batch_size: int, micro_batch_size: int, data_parallel_size: int, decrease_batch_size_if_needed: bool, rank: int)#
Bases: core.num_microbatches_calculator.NumMicroBatchesCalculator
Calculator of number of microbatches with constant global batch size.
- Parameters:
global_batch_size (int) – Global batch size.
micro_batch_size (int) – Micro batch size.
data_parallel_size (int) – Data parallel size.
decrease_batch_size_if_needed (bool) – If true, decrease batch size to ensure divisibility by DP size * microbatch size (if needed).
rank (int) – Rank (to determine whether logging should be performed).
Initialization
- update(consumed_samples, consistency_check, verbose=False) -> None#
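The effect of decrease_batch_size_if_needed on a constant schedule can be illustrated with a small standalone sketch. The names `ConstantSchedule` and `constant_schedule` are hypothetical, and raising an error when the batch size is not divisible and the flag is off is an assumption about the behavior, not something the text above specifies.

```python
from dataclasses import dataclass


@dataclass
class ConstantSchedule:
    num_micro_batches: int
    running_global_batch_size: int


def constant_schedule(
    global_batch_size: int,
    micro_batch_size: int,
    data_parallel_size: int,
    decrease_batch_size_if_needed: bool,
) -> ConstantSchedule:
    """Hypothetical sketch of a constant global batch size schedule."""
    samples_per_pass = micro_batch_size * data_parallel_size
    if decrease_batch_size_if_needed:
        # Scale down to the nearest multiple of DP size * micro batch size.
        running = (global_batch_size // samples_per_pass) * samples_per_pass
    else:
        if global_batch_size % samples_per_pass != 0:
            raise ValueError(
                "global_batch_size is not divisible by "
                "micro_batch_size * data_parallel_size"
            )
        running = global_batch_size
    return ConstantSchedule(running // samples_per_pass, running)


# 100 is not a multiple of 4 * 8 = 32, so the running batch size drops to 96.
print(constant_schedule(100, 4, 8, decrease_batch_size_if_needed=True))
```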
- class core.num_microbatches_calculator.RampupBatchsizeNumMicroBatchesCalculator(global_batch_size: int, micro_batch_size: int, data_parallel_size: int, decrease_batch_size_if_needed: bool, rank: int, start_global_batch_size: int, batch_size_increment: int, ramup_samples: int)#
Bases: core.num_microbatches_calculator.NumMicroBatchesCalculator
Calculator of number of microbatches with batch size rampup. Over steps = (global-batch-size - start-batch-size) / batch_size_increment increments, the batch size grows from start-batch-size to global-batch-size, with each increment lasting rampup-samples / steps samples.
- Parameters:
global_batch_size (int) – Global batch size post rampup.
micro_batch_size (int) – Micro batch size.
data_parallel_size (int) – Data parallel size.
decrease_batch_size_if_needed (bool) – If true, decrease batch size to ensure divisibility by DP size * microbatch size (if needed).
rank (int) – Rank (to determine whether logging should be performed).
start_global_batch_size (int) – Global batch size to start with.
batch_size_increment (int) – Global batch size increments.
ramup_samples (int) – Number of samples over which to ramp up the global batch size from start_global_batch_size to global_batch_size.
Initialization
- update(consumed_samples: int, consistency_check: bool, verbose: bool = False)#
Update number of microbatches.
- Parameters:
consumed_samples (int) – Number of samples consumed.
consistency_check (bool) – Option to check current schedule’s consistency.
verbose (bool, optional) – Option to control logging. Defaults to False.
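The rampup rule in the class description can be worked through with a short standalone function. This is a sketch under stated assumptions rather than the library code: the function name is hypothetical, the full batch size is assumed to apply once consumed_samples exceeds ramup_samples, and the input validation is illustrative.

```python
def rampup_global_batch_size(
    consumed_samples: int,
    global_batch_size: int,
    start_global_batch_size: int,
    batch_size_increment: int,
    ramup_samples: int,
) -> int:
    """Hypothetical sketch: global batch size after consumed_samples samples."""
    diff = global_batch_size - start_global_batch_size
    if diff < 0 or batch_size_increment <= 0 or diff % batch_size_increment != 0:
        raise ValueError("rampup range must be a non-negative multiple of the increment")
    steps = diff // batch_size_increment
    # No rampup needed, or rampup already finished: use the final batch size.
    if steps == 0 or ramup_samples <= 0 or consumed_samples > ramup_samples:
        return global_batch_size
    samples_per_increment = ramup_samples / steps
    completed_increments = int(consumed_samples / samples_per_increment)
    current = start_global_batch_size + completed_increments * batch_size_increment
    return min(current, global_batch_size)


# Ramp from 32 to 128 in increments of 32 over 3000 samples:
# steps = (128 - 32) / 32 = 3, so each increment lasts 1000 samples.
for consumed in (0, 999, 1000, 2500, 3000, 3001):
    print(consumed, rampup_global_batch_size(consumed, 128, 32, 32, 3000))
```

In a training loop, the result of such a schedule would be divided by micro batch size times data-parallel size to obtain the number of microbatches for the current step.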