bridge.utils.vocab_utils#

Module Contents#

Functions#

calculate_padded_vocab_size

Calculate padded vocab size for tensor parallelism.

_calculate_padded_vocab_size_cached

Cached computation of padded vocab size.

API#

bridge.utils.vocab_utils.calculate_padded_vocab_size(
vocab_size: int,
make_vocab_size_divisible_by: int,
tensor_model_parallel_size: int,
logging_enabled: bool = True,
) int#

Calculate padded vocab size for tensor parallelism.

This function pads the vocabulary size to ensure it’s divisible by the required multiple for efficient tensor parallel operations.

Parameters:
  • vocab_size – The original (unpadded) vocabulary size

  • make_vocab_size_divisible_by – Base divisibility requirement (e.g., 128)

  • tensor_model_parallel_size – Number of tensor parallel ranks

  • logging_enabled – Whether to log the padding information

Returns:

The padded vocabulary size

Return type:

int

bridge.utils.vocab_utils._calculate_padded_vocab_size_cached(
vocab_size: int,
make_vocab_size_divisible_by: int,
tensor_model_parallel_size: int,
) int#

Cached computation of padded vocab size.