bridge.diffusion.data.common.diffusion_task_encoder_with_sp#

Module Contents#

Classes#

Functions#

cook

Processes a raw sample dictionary from an energon dataset and returns a new dictionary with specific keys.

API#

bridge.diffusion.data.common.diffusion_task_encoder_with_sp.cook(sample: dict) → dict#

Processes a raw sample dictionary from an energon dataset and returns a new dictionary with specific keys.

Parameters:

sample (dict) – The input dictionary containing the raw sample data.

Returns:

A new dictionary containing the processed sample data, with the following keys:

- All keys from the result of basic_sample_keys(sample)
- 'json': metadata such as resolution, aspect ratio, and fps
- 'pth': the video latent tensor
- 'pickle': the text embeddings

Return type:

dict
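
The key selection described above can be sketched as follows. This is a minimal illustration, not the module's actual code: `basic_sample_keys_stub` is a hypothetical stand-in for energon's `basic_sample_keys` (assumed here to keep only the bookkeeping keys that start with a double underscore), and `cook_sketch` is likewise a hypothetical name.

```python
def basic_sample_keys_stub(sample: dict) -> dict:
    # Hypothetical stand-in for basic_sample_keys: keep only bookkeeping
    # entries, assumed here to be the keys that start with "__".
    return {k: v for k, v in sample.items() if k.startswith("__")}


def cook_sketch(sample: dict) -> dict:
    # Build a new dictionary with the bookkeeping keys plus the three
    # payload entries named in the documentation above.
    return dict(
        **basic_sample_keys_stub(sample),
        json=sample["json"],      # metadata (resolution, aspect ratio, fps, ...)
        pth=sample["pth"],        # video latent tensor
        pickle=sample["pickle"],  # text embeddings
    )


# Placeholder values stand in for real tensors and embeddings.
raw = {
    "__key__": "clip_0001",
    "json": {"fps": 24},
    "pth": [0.1, 0.2],
    "pickle": [0.3],
    "extra": "dropped",
}
cooked = cook_sketch(raw)
```

Any entry of the raw sample that is neither a bookkeeping key nor one of the three payload keys (such as `extra` here) does not appear in the result.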

class bridge.diffusion.data.common.diffusion_task_encoder_with_sp.DiffusionTaskEncoderWithSequencePacking(
*args,
max_frames: int = None,
text_embedding_max_length: int = 512,
seq_length: int = None,
patch_spatial: int = 2,
patch_temporal: int = 1,
packing_buffer_size: int = None,
**kwargs,
)#

Bases: megatron.energon.DefaultTaskEncoder, abc.ABC

cookers#

None

abstractmethod encode_sample(sample: dict) → dict#

select_samples_to_pack(
samples: List[megatron.bridge.diffusion.data.common.diffusion_sample.DiffusionSample],
) → List[List[megatron.bridge.diffusion.data.common.diffusion_sample.DiffusionSample]]#

Selects sequences to pack for mixed image-video training.
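
The selection strategy used by this method is not described here. As one possible illustration of grouping variable-length sequences under a length budget, the sketch below uses a first-fit-decreasing heuristic. The function name `first_fit_decreasing`, the use of plain integer lengths, and the `capacity` parameter (playing the role that `seq_length` might play) are all assumptions made for the example.

```python
from typing import List


def first_fit_decreasing(lengths: List[int], capacity: int) -> List[List[int]]:
    """Group item indices into bins whose total length is at most capacity."""
    # Visit the longest sequences first; ties keep their original order.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    bins: List[List[int]] = []
    free: List[int] = []  # remaining capacity of each bin
    for i in order:
        if lengths[i] > capacity:
            raise ValueError(f"sequence {i} is longer than the capacity")
        for b, room in enumerate(free):
            if lengths[i] <= room:
                # Place the sequence in the first bin with enough room.
                bins[b].append(i)
                free[b] -= lengths[i]
                break
        else:
            # No existing bin fits, so open a new one.
            bins.append([i])
            free.append(capacity - lengths[i])
    return bins


groups = first_fit_decreasing([6, 4, 3, 3, 2], capacity=8)
# groups == [[0, 4], [1, 2], [3]]
```

Each inner list holds the indices of the samples that would be packed together, mirroring the List[List[...]] shape of the return type.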

pack_selected_samples(
samples: List[megatron.bridge.diffusion.data.common.diffusion_sample.DiffusionSample],
) → megatron.bridge.diffusion.data.common.diffusion_sample.DiffusionSample#

Constructs a new DiffusionSample by concatenating the sequences of the selected samples.
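
To illustrate what concatenating sequences into one packed sample can look like, the sketch below joins several token lists and records cumulative sequence boundaries so the original sequences remain recoverable. It uses plain Python lists rather than tensors, and the names `pack_sequences`, `tokens`, and `cu_seqlens` are assumptions for this example, not fields confirmed to exist on DiffusionSample.

```python
from itertools import accumulate
from typing import Dict, List


def pack_sequences(seqs: List[List[float]]) -> Dict[str, list]:
    # Flatten all sequences into a single packed sequence.
    tokens = [t for s in seqs for t in s]
    # Cumulative lengths: sequence k occupies tokens[cu[k]:cu[k + 1]].
    cu_seqlens = [0, *accumulate(len(s) for s in seqs)]
    return {"tokens": tokens, "cu_seqlens": cu_seqlens}


packed = pack_sequences([[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]])
# packed["cu_seqlens"] == [0, 3, 4, 6]
```

Boundary offsets of this kind are a common way to keep attention from crossing between the packed sequences, since each original sequence can be sliced back out of the packed one.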

abstractmethod batch(
samples: List[megatron.bridge.diffusion.data.common.diffusion_sample.DiffusionSample],
) → dict#