bridge.utils.vocab_utils
Module Contents
Functions
calculate_padded_vocab_size | Calculate padded vocab size for tensor parallelism. |
_calculate_padded_vocab_size_cached | Cached computation of padded vocab size. |
API
bridge.utils.vocab_utils.calculate_padded_vocab_size(
    vocab_size: int,
    make_vocab_size_divisible_by: int,
    tensor_model_parallel_size: int,
    logging_enabled: bool = True,
) -> int
Calculate padded vocab size for tensor parallelism.
This function pads the vocabulary size to ensure it's divisible by the required multiple for efficient tensor parallel operations.
- Parameters:
vocab_size – The original (unpadded) vocabulary size
make_vocab_size_divisible_by – Base divisibility requirement (e.g., 128)
tensor_model_parallel_size – Number of tensor parallel ranks
logging_enabled – Whether to log the padding information
- Returns:
The padded vocabulary size
- Return type:
int
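The page does not spell out the "required multiple". A minimal sketch of the arithmetic, assuming the multiple is `make_vocab_size_divisible_by * tensor_model_parallel_size` (the usual convention in Megatron-style code, but an assumption here). The name `pad_vocab_size` and its shortened parameter names are hypothetical and are not part of this module:

```python
def pad_vocab_size(vocab_size: int, divisible_by: int, tp_size: int) -> int:
    """Round vocab_size up to the next multiple of divisible_by * tp_size.

    Hypothetical helper for illustration; the multiple used by the real
    function is assumed, not confirmed by this page.
    """
    if vocab_size <= 0 or divisible_by <= 0 or tp_size <= 0:
        raise ValueError("all arguments must be positive")
    multiple = divisible_by * tp_size
    # Integer ceiling division avoids floating-point rounding.
    return -(-vocab_size // multiple) * multiple


padded = pad_vocab_size(50257, 128, 8)
# Prints: 51200 943 6400
print(padded, padded - 50257, padded // 8)
```

Under that assumption, a 50257-token vocabulary with a base requirement of 128 and 8 tensor parallel ranks is padded by 943 entries to 51200, so each rank holds an equal 6400-row shard of the embedding table.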
bridge.utils.vocab_utils._calculate_padded_vocab_size_cached(
    vocab_size: int,
    make_vocab_size_divisible_by: int,
    tensor_model_parallel_size: int,
)
Cached computation of padded vocab size.