bridge.diffusion.data.common.diffusion_sample#
Module Contents#
Classes#
DiffusionSample: Data class representing a sample for diffusion tasks.
API#
- class bridge.diffusion.data.common.diffusion_sample.DiffusionSample#
Bases: megatron.energon.Sample
Data class representing a sample for diffusion tasks.
.. attribute:: video
Video latents (C T H W).
- Type:
torch.Tensor
.. attribute:: context_embeddings
Text embeddings (S D).
- Type:
torch.Tensor
.. attribute:: context_mask
Mask for text embeddings.
- Type:
torch.Tensor
.. attribute:: loss_mask
Mask indicating valid positions for loss computation.
- Type:
torch.Tensor
.. attribute:: image_size
Tensor containing image dimensions.
- Type:
Optional[torch.Tensor]
.. attribute:: fps
Frame rate of the video.
- Type:
Optional[torch.Tensor]
.. attribute:: num_frames
Number of frames in the video.
- Type:
Optional[torch.Tensor]
.. attribute:: padding_mask
Mask indicating padding positions.
- Type:
Optional[torch.Tensor]
.. attribute:: seq_len_q
Sequence length for query embeddings.
- Type:
Optional[torch.Tensor]
.. attribute:: seq_len_q_padded
Sequence length for query embeddings after padding.
- Type:
Optional[torch.Tensor]
.. attribute:: seq_len_kv
Sequence length for key/value embeddings.
- Type:
Optional[torch.Tensor]
.. attribute:: seq_len_kv_padded
Sequence length for key/value embeddings after padding.
- Type:
Optional[torch.Tensor]
.. attribute:: pos_ids
Positional IDs.
- Type:
Optional[torch.Tensor]
.. attribute:: latent_shape
Shape of the latent tensor.
- Type:
Optional[torch.Tensor]
.. attribute:: video_metadata
Metadata of the video.
- Type:
Optional[dict]
- video: torch.Tensor#
None
- context_embeddings: torch.Tensor#
None
- context_mask: torch.Tensor#
None
- image_size: Optional[torch.Tensor]#
None
- loss_mask: torch.Tensor#
None
- fps: Optional[torch.Tensor]#
None
- num_frames: Optional[torch.Tensor]#
None
- padding_mask: Optional[torch.Tensor]#
None
- seq_len_q: Optional[torch.Tensor]#
None
- seq_len_q_padded: Optional[torch.Tensor]#
None
- seq_len_kv: Optional[torch.Tensor]#
None
- seq_len_kv_padded: Optional[torch.Tensor]#
None
- pos_ids: Optional[torch.Tensor]#
None
- latent_shape: Optional[torch.Tensor]#
None
- video_metadata: Optional[dict]#
None
- to_dict() → dict#
Converts the sample to a dictionary.
- __add__(other: Any) → int#
Adds this sample’s sequence length to that of another sample or to an integer.
- __radd__(other: Any) → int#
Handles reverse addition, where an integer is the left operand (for example `0 + sample`, which is how the built-in sum() starts).
- __lt__(other: Any) → bool#
Returns whether this sample’s sequence length is less than that of another sample or an integer.
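The following is a minimal sketch of how methods like these make samples usable with `sum()` and `sorted()`. It does not import the library: `ToySample` is a hypothetical stand-in, it stores plain integers rather than tensors, and the choice of `seq_len_q` as the length that gets added and compared is an assumption, since the reference above only says "sequence length".

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ToySample:
    """Hypothetical stand-in for DiffusionSample (not the library class)."""

    name: str
    seq_len_q: int  # assumption: the length used for addition and comparison
    video_metadata: Optional[dict] = None

    def to_dict(self) -> dict:
        # Return all fields as a plain dictionary.
        return asdict(self)

    def __add__(self, other: Any) -> int:
        # sample + sample or sample + int both yield a plain integer.
        if isinstance(other, ToySample):
            return self.seq_len_q + other.seq_len_q
        if isinstance(other, int):
            return self.seq_len_q + other
        return NotImplemented

    def __radd__(self, other: Any) -> int:
        # sum() starts from the integer 0, so `0 + sample` is dispatched here.
        if isinstance(other, int):
            return other + self.seq_len_q
        return NotImplemented

    def __lt__(self, other: Any) -> bool:
        # sorted() only needs `<`, so this is enough to order samples by length.
        if isinstance(other, ToySample):
            return self.seq_len_q < other.seq_len_q
        if isinstance(other, int):
            return self.seq_len_q < other
        return NotImplemented


samples = [ToySample("a", 30), ToySample("b", 10), ToySample("c", 20)]

total = sum(samples)       # total sequence length across the samples
ordered = sorted(samples)  # shortest sample first
print(total, [s.name for s in ordered])
```

Because addition returns an integer rather than a new sample, a running total of sequence lengths can be computed with `sum()` directly, and the `<` comparison lets samples be ordered by length, for instance before grouping them into batches of similar size.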