bridge.diffusion.data.common.sequence_packing_utils

Module Contents

Functions

find_first_bin_that_fits

Finds the first bin in a list of bins that has enough space to fit a sequence of size ‘s’.

first_fit

Packs sequences of varying lengths into bins using the First-Fit algorithm.

first_fit_decreasing

Packs sequences of varying lengths into bins using the First-Fit Decreasing algorithm.

API

bridge.diffusion.data.common.sequence_packing_utils.find_first_bin_that_fits(
bins: List[List[int]],
s: int,
bin_size: int,
) → int

Finds the first bin in a list of bins that has enough space to fit a sequence of size ‘s’.

Parameters:
  • bins – A list of lists, where each inner list represents a bin and contains the current elements in that bin.

  • s – The size of the sequence to be placed in a bin.

  • bin_size – The maximum capacity of each bin.

Returns:

The index of the first bin that can fit the sequence ‘s’, or -1 if no such bin exists.
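The lookup described above can be sketched as follows. This is a minimal illustration, not the module's actual source: the name find_first_bin_sketch is hypothetical, and the sketch assumes that each inner list holds the sizes of the elements already placed in that bin, so that the free space is bin_size minus their sum.

```python
from typing import List


def find_first_bin_sketch(bins: List[List[int]], s: int, bin_size: int) -> int:
    """Hypothetical re-implementation for illustration only.

    Assumes each inner list stores the sizes of the items already in the bin.
    """
    for i, current in enumerate(bins):
        # A bin fits if its current load plus the new size stays within capacity.
        if sum(current) + s <= bin_size:
            return i
    # No existing bin has enough free space.
    return -1


bins = [[5, 4], [3], [8]]
result_fit = find_first_bin_sketch(bins, 4, 10)   # bin 0 is too full, bin 1 has room
result_none = find_first_bin_sketch(bins, 9, 10)  # nothing fits
print(result_fit, result_none)
```

In this sketch a return value of -1 is the caller's signal to open a new bin.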

bridge.diffusion.data.common.sequence_packing_utils.first_fit(
seqlens: List[int],
pack_size: int,
) → List[List[int]]

Packs sequences of varying lengths into bins using the First-Fit algorithm.

Parameters:
  • seqlens – A list of integers, representing the lengths of the sequences to be packed.

  • pack_size – The maximum capacity of each bin.

Returns:

A list of lists, where each inner list represents a bin and contains the indices of the sequences assigned to that bin.
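As an illustration of the First-Fit strategy, the sketch below places each sequence into the first bin with enough remaining capacity and opens a new bin when none fits. The name first_fit_sketch and the separate loads list are assumptions made for this example. It returns sequence indices, as the Returns description states, but it has not been checked against the module's actual implementation.

```python
from typing import List


def first_fit_sketch(seqlens: List[int], pack_size: int) -> List[List[int]]:
    """Hypothetical First-Fit packing that returns sequence indices per bin."""
    bins: List[List[int]] = []  # indices of the sequences in each bin
    loads: List[int] = []       # total length currently stored in each bin
    for idx, length in enumerate(seqlens):
        for b, load in enumerate(loads):
            if load + length <= pack_size:
                bins[b].append(idx)
                loads[b] += length
                break
        else:
            # No existing bin had room, so open a new one.
            # In this sketch, a sequence longer than pack_size gets its own bin.
            bins.append([idx])
            loads.append(length)
    return bins


packed = first_fit_sketch([4, 8, 1, 4, 2, 1], 10)
print(packed)  # [[0, 2, 3, 5], [1, 4]]
```

Here the first bin holds lengths 4 + 1 + 4 + 1 = 10 and the second holds 8 + 2 = 10, so both bins are exactly full.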

bridge.diffusion.data.common.sequence_packing_utils.first_fit_decreasing(
seqlens: List[int],
pack_size: int,
) → List[List[int]]

Packs sequences of varying lengths into bins using the First-Fit Decreasing algorithm.

This is a variation of the First-Fit algorithm where the sequences are sorted by decreasing length before packing.

Parameters:
  • seqlens – A list of integers, representing the lengths of the sequences to be packed.

  • pack_size – The maximum capacity of each bin.

Returns:

A list of lists, similar to the output of the ‘first_fit’ function.
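The sorting step can be sketched like this. The function name first_fit_decreasing_sketch is hypothetical, and the sketch assumes the bins hold indices into the original seqlens list. The module's actual return contents have not been verified here.

```python
from typing import List


def first_fit_decreasing_sketch(seqlens: List[int], pack_size: int) -> List[List[int]]:
    """Hypothetical First-Fit Decreasing packing that returns original indices."""
    # Visit sequences from longest to shortest; ties keep their original order.
    order = sorted(range(len(seqlens)), key=lambda i: seqlens[i], reverse=True)
    bins: List[List[int]] = []  # original indices in each bin
    loads: List[int] = []       # total length currently stored in each bin
    for idx in order:
        length = seqlens[idx]
        for b, load in enumerate(loads):
            if load + length <= pack_size:
                bins[b].append(idx)
                loads[b] += length
                break
        else:
            # No existing bin had room, so open a new one.
            bins.append([idx])
            loads.append(length)
    return bins


packed = first_fit_decreasing_sketch([3, 3, 3, 7, 7, 7], 10)
print(packed)  # [[3, 0], [4, 1], [5, 2]]
```

For this input, processing the sequences in their given order would put the three 3s together (load 9) and then need a separate bin for each 7, four bins in total. Sorting first pairs each 7 with a 3 and uses three bins.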