Utilities
Helper classes for common tasks in diffusion model training and sampling.
API Reference
StackedRandomGenerator
- class physicsnemo.diffusion.utils.StackedRandomGenerator(device: device, seeds: Sequence[int])
Wrapper for torch.Generator that allows specifying a different random seed for each sample in a minibatch.
- Parameters:
device (torch.device) – Device to use for the random number generator.
seeds (Sequence[int]) – Sequence (e.g. list or tuple) of random seeds for each sample in the minibatch. Its length defines the batch size of generated samples.
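The per-sample seeding idea can be sketched with plain PyTorch. The helper below, `stacked_randn`, is hypothetical and is not the library's implementation; it assumes one `torch.Generator` per seed, draws each sample separately with shape `size[1:]`, and stacks the results. It illustrates that the sample tied to a given seed does not depend on which other seeds share the minibatch.

```python
from typing import Sequence

import torch


def stacked_randn(
    seeds: Sequence[int], size: Sequence[int], device: str = "cpu"
) -> torch.Tensor:
    """Hypothetical helper: draw one standard normal sample per seed and stack them."""
    if size[0] != len(seeds):
        raise ValueError("size[0] must equal the number of seeds")
    samples = []
    for seed in seeds:
        # One independent generator per batch element (assumed design).
        gen = torch.Generator(device=device).manual_seed(int(seed))
        samples.append(torch.randn(tuple(size[1:]), generator=gen, device=device))
    return torch.stack(samples)


# Seed 1 appears in both batches, at different positions.
a = stacked_randn(seeds=[0, 1, 2], size=(3, 4))
b = stacked_randn(seeds=[1, 7], size=(2, 4))
print(torch.equal(a[1], b[0]))
```

Because each batch element has its own generator, changing the batch size or reordering the seeds reorders the samples without changing their values.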
- randint(*args: Any, size: Size | Sequence[int], **kwargs: Any)
Generate stacked samples from a uniform distribution over the integers.
- Parameters:
*args (Any) – Required positional arguments to pass to torch.randint.
size (Sequence[int] | torch.Size) – Size of the output tensor. Accepts any sequence of integers or a torch.Size instance. First dimension must match the number of random seeds.
**kwargs (Any) – Additional keyword arguments to pass to torch.randint.
- Returns:
Stacked samples from a uniform distribution over the integers. Shape matches size.
- Return type:
torch.Tensor
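A minimal sketch of how the positional arguments and `size` might fit together, assuming the positional arguments are the `low` and `high` bounds accepted by `torch.randint` and that each seed produces a slice of shape `size[1:]`. The function `stacked_randint` is a hypothetical stand-in, not the class's method.

```python
from typing import Any, Sequence

import torch


def stacked_randint(
    seeds: Sequence[int], *args: Any, size: Sequence[int], **kwargs: Any
) -> torch.Tensor:
    """Hypothetical helper: one torch.randint draw per seed, stacked on dim 0."""
    if size[0] != len(seeds):
        raise ValueError("size[0] must equal the number of seeds")
    generators = [torch.Generator().manual_seed(int(s)) for s in seeds]
    return torch.stack(
        [
            # *args carries the bounds, e.g. (low, high); size[1:] is the per-sample shape.
            torch.randint(*args, size=tuple(size[1:]), generator=gen, **kwargs)
            for gen in generators
        ]
    )


# Two seeds, so the leading dimension of `size` must be 2.
labels = stacked_randint([3, 4], 0, 10, size=(2, 5))
print(labels.shape)
```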
- randn(size: Size | Sequence[int], **kwargs: Any)
Generate stacked samples from a standard normal distribution. Each sample is generated using a different random seed.
- Parameters:
size (Sequence[int] | torch.Size) – Size of the output tensor. Accepts any sequence of integers or a torch.Size instance. First dimension must match the number of random seeds.
**kwargs (Any) – Additional arguments to pass to torch.randn.
- Returns:
Stacked samples from a standard normal distribution. Shape matches size.
- Return type:
torch.Tensor
- randn_like(input: Tensor)
Generate stacked samples from a standard normal distribution with the same shape and data type as the input tensor.
- Parameters:
input (torch.Tensor) – Tensor whose shape, data type, memory layout, and device the output matches.
- Returns:
Stacked samples from a standard normal distribution. Shape matches input.shape.
- Return type:
torch.Tensor
- randt(nu: int, size: Size | Sequence[int], **kwargs: Any)
Generate stacked samples from a standard Student-t distribution with nu degrees of freedom. This is useful when sampling from heavy-tailed diffusion models.
- Parameters:
nu (int) – Degrees of freedom for the Student-t distribution. Must be > 2.
size (Sequence[int] | torch.Size) – Size of the output tensor. Accepts any sequence of integers or a torch.Size instance. First dimension must match the number of random seeds.
**kwargs (Any) – Additional arguments to pass to torch.randn.
- Returns:
Stacked samples from a standard Student-t distribution. Shape matches size.
- Return type:
torch.Tensor
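For intuition, one textbook way to obtain Student-t samples from normal draws is T = Z / sqrt(V / nu), where Z is standard normal and V is chi-square with nu degrees of freedom; the variance nu / (nu - 2) is finite only for nu > 2. The sketch below applies that construction per seed and elementwise. `stacked_randt` is a hypothetical helper, and the library's method may build its samples differently (for example, by sharing one chi-square draw across all elements of a sample).

```python
from typing import Sequence

import torch


def stacked_randt(seeds: Sequence[int], nu: int, size: Sequence[int]) -> torch.Tensor:
    """Hypothetical helper: elementwise Student-t samples, one generator per seed."""
    if nu <= 2:
        raise ValueError("nu must be > 2")
    if size[0] != len(seeds):
        raise ValueError("size[0] must equal the number of seeds")
    shape = tuple(size[1:])
    samples = []
    for seed in seeds:
        gen = torch.Generator().manual_seed(int(seed))
        z = torch.randn(shape, generator=gen)
        # For integer nu, a chi-square variable is the sum of nu squared standard normals.
        v = torch.randn((nu, *shape), generator=gen).square().sum(dim=0)
        samples.append(z / torch.sqrt(v / nu))
    return torch.stack(samples)


heavy = stacked_randt(seeds=[0, 1], nu=3, size=(2, 1000))
print(heavy.shape)
```

Smaller values of `nu` give heavier tails; as `nu` grows, the samples approach a standard normal distribution.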
InfiniteSampler
- class physicsnemo.diffusion.utils.InfiniteSampler(dataset: Dataset, rank: int = 0, num_replicas: int = 1, shuffle: bool = True, seed: int = 0, window_size: float = 0.5, start_idx: int = 0)
Bases: Sampler[int]
Sampler for torch.utils.data.DataLoader that loops over the dataset indefinitely.
This sampler yields indices indefinitely, optionally shuffling items as it goes. It can also perform distributed sampling when rank and num_replicas are specified.
- Parameters:
dataset (torch.utils.data.Dataset) – The dataset to sample from
rank (int, default=0) – The rank of the current process within num_replicas processes
num_replicas (int, default=1) – The number of processes participating in distributed sampling
shuffle (bool, default=True) – Whether to shuffle the indices
seed (int, default=0) – Random seed for reproducibility when shuffling
window_size (float, default=0.5) – Fraction of dataset to use as window for shuffling. Must be between 0 and 1. A larger window means more thorough shuffling but slower iteration.
start_idx (int, default=0) – The initial index to use for the sampler. This is used for resuming training.
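The behavior described by these parameters can be approximated with a dependency-free generator. This is a sketch under stated assumptions, not the library's code: `infinite_indices` is a hypothetical name, the windowed shuffle is assumed to swap the current position with a random earlier position inside the window, each rank is assumed to take every `num_replicas`-th index, and `start_idx` is assumed to offset only the position counter.

```python
import itertools
import random
from typing import Iterator


def infinite_indices(
    n: int,
    rank: int = 0,
    num_replicas: int = 1,
    shuffle: bool = True,
    seed: int = 0,
    window_size: float = 0.5,
    start_idx: int = 0,
) -> Iterator[int]:
    """Hypothetical sketch: yield dataset indices forever, with windowed reshuffling."""
    order = list(range(n))
    rnd = random.Random(seed)
    window = 0
    if shuffle:
        rnd.shuffle(order)
        window = round(n * window_size)  # window length in items
    idx = start_idx  # assumption: resuming only moves the position counter
    while True:
        i = idx % n
        if idx % num_replicas == rank:
            yield order[i]  # this rank's share of the stream
        if window >= 2:
            # Swap the current slot with a random slot up to `window` positions back.
            j = (i - rnd.randrange(window)) % n
            order[i], order[j] = order[j], order[i]
        idx += 1


# Without shuffling, the sampler simply cycles through the dataset.
first = list(itertools.islice(infinite_indices(5, shuffle=False), 7))
print(first)  # [0, 1, 2, 3, 4, 0, 1]
```

Because iteration never ends, a training loop driven by such a sampler (passed to `torch.utils.data.DataLoader` through its `sampler` argument) has to stop on its own condition, such as a step or image count.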