`bridge.diffusion.data.common.diffusion_sample`

Module Contents

Classes

DiffusionSample

Data class representing a sample for diffusion tasks.

API

class bridge.diffusion.data.common.diffusion_sample.DiffusionSample#

Bases: megatron.energon.Sample

Data class representing a sample for diffusion tasks.

Attributes:

video (torch.Tensor): Video latents (C T H W).

context_embeddings (torch.Tensor): Text embeddings (S D).

context_mask (torch.Tensor): Mask for text embeddings.

loss_mask (torch.Tensor): Mask indicating valid positions for loss computation.

image_size (Optional[torch.Tensor]): Tensor containing image dimensions.

fps (Optional[torch.Tensor]): Frame rate of the video.

num_frames (Optional[torch.Tensor]): Number of frames in the video.

padding_mask (Optional[torch.Tensor]): Mask indicating padding positions.

seq_len_q (Optional[torch.Tensor]): Sequence length for query embeddings.

seq_len_q_padded (Optional[torch.Tensor]): Sequence length for query embeddings after padding.

seq_len_kv (Optional[torch.Tensor]): Sequence length for key/value embeddings.

seq_len_kv_padded (Optional[torch.Tensor]): Sequence length for key/value embeddings after padding.

pos_ids (Optional[torch.Tensor]): Positional IDs.

latent_shape (Optional[torch.Tensor]): Shape of the latent tensor.

video_metadata (Optional[dict]): Metadata of the video.

video: torch.Tensor#: None

context_embeddings: torch.Tensor#: None

context_mask: torch.Tensor#: None

image_size: Optional[torch.Tensor]#: None

loss_mask: torch.Tensor#: None

fps: Optional[torch.Tensor]#: None

num_frames: Optional[torch.Tensor]#: None

padding_mask: Optional[torch.Tensor]#: None

seq_len_q: Optional[torch.Tensor]#: None

seq_len_q_padded: Optional[torch.Tensor]#: None

seq_len_kv: Optional[torch.Tensor]#: None

seq_len_kv_padded: Optional[torch.Tensor]#: None

pos_ids: Optional[torch.Tensor]#: None

latent_shape: Optional[torch.Tensor]#: None

video_metadata: Optional[dict]#: None

to_dict() → dict#: Converts the sample to a dictionary.
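
To illustrate what a dictionary conversion of this kind typically looks like, here is a minimal sketch using a hypothetical stand-in class, `ToySample`, with plain lists in place of tensors. It assumes one key per dataclass field with values passed through unchanged; the actual `DiffusionSample.to_dict()` may rename keys, drop unset fields, or differ in other ways not documented here.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ToySample:
    """Hypothetical stand-in for DiffusionSample; lists replace tensors."""

    video: list                  # stands in for the video latents
    context_embeddings: list     # stands in for the text embeddings
    fps: Optional[int] = None    # optional field, left unset below

    def to_dict(self) -> dict:
        # Assumption: one key per dataclass field, values passed through as-is.
        return {f.name: getattr(self, f.name) for f in fields(self)}


sample = ToySample(video=[0.0, 1.0], context_embeddings=[0.5])
as_dict = sample.to_dict()
```

In this sketch the unset optional field still appears in the result, with the value `None`.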

__add__(other: Any) → int#: Adds this sample’s sequence length to another sample’s sequence length or to an integer.

__radd__(other: Any) → int#: Handles reverse addition, so that samples can be summed starting from an integer.

__lt__(other: Any) → bool#: Returns whether this sample’s sequence length is less than that of another sample, or less than an integer.
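
Operators defined this way let a list of samples be totalled with `sum()` and ordered with `sorted()`, which is convenient when packing samples by length. The sketch below uses a hypothetical stand-in class, `ToySeqSample`, with an assumed plain-integer `seq_len` field. Which sequence-length field the real `DiffusionSample` uses, and how it handles tensors, is not stated on this page.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToySeqSample:
    """Hypothetical stand-in; `seq_len` is an assumed plain-int field."""

    seq_len: int

    def __add__(self, other: Any) -> int:
        # sample + sample, or sample + int
        if isinstance(other, ToySeqSample):
            return self.seq_len + other.seq_len
        return self.seq_len + other

    def __radd__(self, other: Any) -> int:
        # int + sample; sum() starts from 0, so its first step lands here.
        return self.seq_len + other

    def __lt__(self, other: Any) -> bool:
        # sample < sample, or sample < int; sorted() relies on this.
        if isinstance(other, ToySeqSample):
            return self.seq_len < other.seq_len
        return self.seq_len < other


samples = [ToySeqSample(512), ToySeqSample(128), ToySeqSample(256)]

total_tokens = sum(samples)                             # total sequence length
shortest_first = [s.seq_len for s in sorted(samples)]   # ascending by length
pair_length = samples[0] + samples[1]                   # combined length of two samples
fits_budget = samples[1] < 200                          # compare against an int budget
```

Because `__add__` returns an `int`, every step of `sum()` after the first is an `int + sample` addition, which is why `__radd__` is needed alongside `__add__`.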
