Utilities

Helper classes for common tasks in diffusion model training and sampling.

API Reference

StackedRandomGenerator

class physicsnemo.diffusion.utils.StackedRandomGenerator(
device: device,
seeds: Sequence[int],
)

Wrapper for torch.Generator that allows specifying a different random seed for each sample in a minibatch.

Parameters:
  • device (torch.device) – Device to use for the random number generator.

  • seeds (Sequence[int]) – Sequence (e.g. list or tuple) of random seeds for each sample in the minibatch. Its length defines the batch size of generated samples.
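
The per-sample seeding idea can be sketched with plain torch. This is a minimal illustration, not the library's implementation: the helper name stacked_randn is hypothetical, and it assumes one torch.Generator per seed whose draws are stacked along the first dimension.

```python
import torch


def stacked_randn(seeds, size, device="cpu", **kwargs):
    """Hypothetical helper: sample i of the batch comes from a generator seeded with seeds[i]."""
    if size[0] != len(seeds):
        raise ValueError("first dimension of size must equal the number of seeds")
    # One independent generator per sample in the minibatch.
    generators = [torch.Generator(device=device).manual_seed(int(s)) for s in seeds]
    # Draw each sample with its own generator, then stack along dim 0.
    return torch.stack(
        [
            torch.randn(tuple(size[1:]), generator=g, device=device, **kwargs)
            for g in generators
        ]
    )


batch = stacked_randn(seeds=[0, 1, 2], size=(3, 4, 4))
single = stacked_randn(seeds=[1], size=(1, 4, 4))

# The sample tied to seed 1 does not depend on which other seeds share the batch.
same_sample = torch.equal(batch[1], single[0])
```

Because each sample depends only on its own seed, a given sample can be regenerated regardless of batch size or batch composition.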

randint(
*args: Any,
size: Size | Sequence[int],
**kwargs: Any,
) -> Tensor

Generate stacked samples from a uniform distribution over the integers.

Parameters:
  • *args (Any) – Required positional arguments to pass to torch.randint, typically the bounds (high, or low and high).

  • size (Sequence[int] | torch.Size) – Size of the output tensor. Accepts any sequence of integers or a torch.Size instance. First dimension must match the number of random seeds.

  • **kwargs (Any) – Additional keyword arguments to pass to torch.randint.

Returns:

Stacked samples from a uniform distribution over the integers. Shape matches size.

Return type:

torch.Tensor

randn(
size: Size | Sequence[int],
**kwargs: Any,
) -> Tensor

Generate stacked samples from a standard normal distribution. Each sample is generated using a different random seed.

Parameters:
  • size (Sequence[int] | torch.Size) – Size of the output tensor. Accepts any sequence of integers or a torch.Size instance. First dimension must match the number of random seeds.

  • **kwargs (Any) – Additional keyword arguments to pass to torch.randn.

Returns:

Stacked samples from a standard normal distribution. Shape matches size.

Return type:

torch.Tensor

randn_like(
input: Tensor,
) -> Tensor

Generate stacked samples from a standard normal distribution with the same shape and data type as the input tensor.

Parameters:

input (torch.Tensor) – Input tensor whose shape, data type, memory layout, and device the output matches.

Returns:

Stacked samples from a standard normal distribution. Shape matches input.shape.

Return type:

torch.Tensor

randt(
nu: int,
size: Size | Sequence[int],
**kwargs: Any,
) -> Tensor

Generate stacked samples from a standard Student-t distribution with nu degrees of freedom. This is useful when sampling from heavy-tailed diffusion models.

Parameters:
  • nu (int) – Degrees of freedom for the Student-t distribution. Must be > 2.

  • size (Sequence[int] | torch.Size) – Size of the output tensor. Accepts any sequence of integers or a torch.Size instance. First dimension must match the number of random seeds.

  • **kwargs (Any) – Additional keyword arguments to pass to torch.randn.

Returns:

Stacked samples from a standard Student-t distribution. Shape matches size.

Return type:

torch.Tensor
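
A standard way to build a Student-t variate is to divide a standard normal draw by the square root of a chi-squared draw scaled by its degrees of freedom. The sketch below shows that construction with one seeded generator per sample. It uses Python's standard random module rather than torch, the helper name student_t_draws is hypothetical, and it is not claimed to be how randt is implemented. The variance of a Student-t distribution, nu / (nu - 2), is finite only for nu > 2, which is presumably the reason for the constraint on nu.

```python
import math
import random


def student_t_draws(seed, nu, n):
    """Hypothetical helper: n Student-t draws from a generator seeded with `seed`."""
    if nu <= 2:
        raise ValueError("nu must be > 2")
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)              # standard normal
        v = rng.gammavariate(nu / 2.0, 2.0)  # chi-squared with nu degrees of freedom
        draws.append(z / math.sqrt(v / nu))  # Student-t with nu degrees of freedom
    return draws


# One seed per sample, stacked into a list of per-sample draws.
seeds = [0, 1, 2]
stacked = [student_t_draws(s, nu=10, n=4) for s in seeds]
```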

InfiniteSampler

class physicsnemo.diffusion.utils.InfiniteSampler(
dataset: Dataset,
rank: int = 0,
num_replicas: int = 1,
shuffle: bool = True,
seed: int = 0,
window_size: float = 0.5,
start_idx: int = 0,
)

Bases: Sampler[int]

Sampler for torch.utils.data.DataLoader that loops over the dataset indefinitely.

This sampler yields indices indefinitely, optionally shuffling items as it goes. It can also perform distributed sampling when rank and num_replicas are specified.

Parameters:
  • dataset (torch.utils.data.Dataset) – The dataset to sample from.

  • rank (int, default=0) – The rank of the current process within num_replicas processes.

  • num_replicas (int, default=1) – The number of processes participating in distributed sampling.

  • shuffle (bool, default=True) – Whether to shuffle the indices.

  • seed (int, default=0) – Random seed for reproducibility when shuffling.

  • window_size (float, default=0.5) – Fraction of the dataset to use as the shuffling window. Must be between 0 and 1. A larger window means more thorough shuffling but slower iteration.

  • start_idx (int, default=0) – The initial index for the sampler, used when resuming training.
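
The behavior described above (endless iteration, windowed shuffling, and sharding across ranks) can be sketched as a plain Python generator. This is one plausible scheme under stated assumptions, not the library's implementation: the function name infinite_indices is hypothetical, the window is assumed to be round(n * window_size) positions wide, and start_idx is omitted.

```python
import itertools
import random


def infinite_indices(n, rank=0, num_replicas=1, shuffle=True, seed=0, window_size=0.5):
    """Hypothetical sketch: yield dataset indices forever, sharded across replicas."""
    if not 0 <= window_size <= 1:
        raise ValueError("window_size must be between 0 and 1")
    order = list(range(n))
    rng = random.Random(seed)
    window = 0
    if shuffle:
        rng.shuffle(order)
        window = round(n * window_size)
    idx = 0
    while True:
        i = idx % n
        # Each rank keeps every num_replicas-th position of the shared stream.
        if idx % num_replicas == rank:
            yield order[i]
        # Swap the current slot with a random slot inside the trailing window,
        # so later passes over the dataset see a different order.
        if window >= 2:
            j = (i - rng.randrange(window)) % n
            order[i], order[j] = order[j], order[i]
        idx += 1


# A single process sees the whole stream; two ranks with the same seed split it.
stream = list(itertools.islice(infinite_indices(10, seed=0), 20))
rank0 = list(itertools.islice(infinite_indices(10, rank=0, num_replicas=2, seed=0), 10))
rank1 = list(itertools.islice(infinite_indices(10, rank=1, num_replicas=2, seed=0), 10))
```

In this sketch every rank advances the same seeded generator on every step, so all ranks agree on the order without communicating, and their outputs interleave to form the single-process stream.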