Convolutional Networks

class physicsnemo.models.pix2pix.pix2pix.Pix2Pix(*args, **kwargs)

Bases: Module

Convolutional encoder-decoder based on pix2pix generator models.

Note

The pix2pix architecture supports 1D, 2D, and 3D fields; the dimensionality is controlled with the dimension parameter.

Parameters:
  • in_channels (int) – Number of input channels

  • out_channels (Union[int, Any], optional) – Number of output channels

  • dimension (int) – Model dimensionality (supports 1, 2, 3).

  • conv_layer_size (int, optional) – Latent channel size after first convolution, by default 64

  • n_downsampling (int, optional) – Number of downsampling blocks, by default 3

  • n_upsampling (int, optional) – Number of upsampling blocks, by default 3

  • n_blocks (int, optional) – Number of residual blocks in middle of model, by default 3

  • activation_fn (Any, optional) – Activation function, by default “relu”

  • batch_norm (bool, optional) – Whether to use batch normalization, by default False

  • padding_type (str, optional) – Padding type (‘reflect’, ‘replicate’ or ‘zero’), by default “reflect”

Example

>>> import torch
>>> import physicsnemo.models.pix2pix
>>> # 2D convolutional encoder-decoder
>>> model = physicsnemo.models.pix2pix.Pix2Pix(
...     in_channels=1,
...     out_channels=2,
...     dimension=2,
...     conv_layer_size=4,
... )
>>> input = torch.randn(4, 1, 32, 32)  # (N, C, H, W)
>>> output = model(input)
>>> output.size()
torch.Size([4, 2, 32, 32])
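
The dimension parameter selects the spatial rank of the field, so the expected input layout changes with it. The helper below is a minimal sketch of that relationship and is not part of PhysicsNeMo: expected_input_shape is a hypothetical name, and the 1D and 3D layouts are assumed to follow the usual PyTorch channels-first convention, since only the 2D case (N, C, H, W) appears in the example above.

```python
def expected_input_shape(dimension: int, batch: int, channels: int, size: int) -> tuple:
    """Return a channels-first shape (N, C, *spatial) for the given dimension.

    Hypothetical helper for illustration only; it assumes every spatial
    axis has the same length `size`.
    """
    if dimension not in (1, 2, 3):
        raise ValueError("dimension must be 1, 2 or 3")
    return (batch, channels) + (size,) * dimension


# Assumed layouts: 1D (N, C, L), 2D (N, C, H, W), 3D (N, C, D, H, W)
for dim in (1, 2, 3):
    print(dim, expected_input_shape(dim, batch=4, channels=1, size=32))
```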

Note

Reference: Isola, Phillip, et al. “Image-To-Image translation with conditional adversarial networks” Conference on Computer Vision and Pattern Recognition, 2017. https://arxiv.org/abs/1611.07004

Reference: Wang, Ting-Chun, et al. “High-Resolution image synthesis and semantic manipulation with conditional GANs” Conference on Computer Vision and Pattern Recognition, 2018. https://arxiv.org/abs/1711.11585

Note

Based on the implementation: NVIDIA/pix2pixHD

class physicsnemo.models.pix2pix.pix2pix.ResnetBlock(
dimension: int,
channels: int,
padding_type: str = 'reflect',
activation: Module = ReLU(),
use_batch_norm: bool = False,
use_dropout: bool = False,
)

Bases: Module

A simple ResNet block

Parameters:
  • dimension (int) – Model dimensionality (supports 1, 2, 3).

  • channels (int) – Number of feature channels

  • padding_type (str, optional) – Padding type (‘reflect’, ‘replicate’ or ‘zero’), by default “reflect”

  • activation (nn.Module, optional) – Activation function, by default nn.ReLU()

  • use_batch_norm (bool, optional) – Whether to use batch normalization, by default False

  • use_dropout (bool, optional) – Whether to use dropout, by default False

class physicsnemo.models.pix2pix.pix2pixunet.Pix2PixUnet(*args, **kwargs)

Bases: Module

Convolutional encoder-decoder based on pix2pix generator models, using a U-Net architecture.

Note

The pix2pix U-Net architecture supports only 2D fields.

Parameters:
  • in_channels (int) – Number of input channels

  • out_channels (Union[int, Any], optional) – Number of output channels

  • n_downsampling (int) – Number of downsampling blocks in the U-Net

  • filter_size (int, optional) – Number of filters in last convolution layer, by default 64

  • norm_layer (optional) – Normalization layer, by default nn.BatchNorm2d

  • use_dropout (bool, optional) – Use dropout layers, by default False

Note

Reference: Isola, Phillip, et al. “Image-To-Image translation with conditional adversarial networks” Conference on Computer Vision and Pattern Recognition, 2017. https://arxiv.org/abs/1611.07004

Reference: Wang, Ting-Chun, et al. “High-Resolution image synthesis and semantic manipulation with conditional GANs” Conference on Computer Vision and Pattern Recognition, 2018. https://arxiv.org/abs/1711.11585

Note

Based on the implementation: junyanz/pytorch-CycleGAN-and-pix2pix
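
A U-Net concatenates encoder and decoder features at matching resolutions, so the input height and width generally need to survive n_downsampling halvings. The sketch below assumes each downsampling block halves both spatial axes exactly, as in the referenced junyanz implementation; whether Pix2PixUnet checks this itself is not stated here, and check_unet_input_size is a hypothetical helper.

```python
def check_unet_input_size(height: int, width: int, n_downsampling: int) -> tuple:
    """Return the bottleneck (height, width) after n_downsampling halvings.

    Assumes each downsampling block halves both spatial axes exactly.
    Raises ValueError if a size is not divisible by 2 ** n_downsampling.
    """
    factor = 2 ** n_downsampling
    for name, size in (("height", height), ("width", width)):
        if size % factor != 0:
            raise ValueError(f"{name}={size} is not divisible by {factor}")
    return height // factor, width // factor


print(check_unet_input_size(256, 256, n_downsampling=7))  # (2, 2)
```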

class physicsnemo.models.srrn.super_res_net.SRResNet(*args, **kwargs)

Bases: Module

3D convolutional super-resolution network.

Parameters:
  • in_channels (int) – Number of input channels.

  • out_channels (int) – Number of output channels.

  • large_kernel_size (int, optional, default=7) – Convolutional kernel size for first and last convolution.

  • small_kernel_size (int, optional, default=3) – Convolutional kernel size for internal convolutions.

  • conv_layer_size (int, optional, default=32) – Latent channel size.

  • n_resid_blocks (int, optional, default=8) – Number of residual blocks.

  • scaling_factor (int, optional, default=8) – Scaling factor to increase the output feature size compared to the input. Must be 2, 4, or 8.

  • activation_fn (str, optional, default="prelu") – Activation function.

Forward:

in_vars (torch.Tensor) – Input tensor of shape \((B, C_{in}, D, H, W)\) where \(B\) is batch size, \(C_{in}\) is the number of input channels, and \(D, H, W\) are the spatial dimensions.

Outputs:

torch.Tensor – Output tensor of shape \((B, C_{out}, D \times s, H \times s, W \times s)\) where \(s\) is the scaling_factor.

Examples

>>> import torch
>>> import physicsnemo.models.srrn
>>> model = physicsnemo.models.srrn.SRResNet(
...     in_channels=1,
...     out_channels=2,
...     conv_layer_size=4,
...     scaling_factor=2,
... )
>>> input = torch.randn(4, 1, 8, 8, 8)  # (B, C, D, H, W)
>>> output = model(input)
>>> output.size()
torch.Size([4, 2, 16, 16, 16])
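
The output shape stated under Outputs can be checked without building the model. The following is a minimal sketch of that arithmetic only; srresnet_output_shape is a hypothetical helper, not a PhysicsNeMo function, and it simply encodes the documented rule that scaling_factor must be 2, 4, or 8.

```python
def srresnet_output_shape(input_shape: tuple, out_channels: int, scaling_factor: int) -> tuple:
    """Compute (B, C_out, D*s, H*s, W*s) from an input shape (B, C_in, D, H, W)."""
    if scaling_factor not in (2, 4, 8):
        raise ValueError("scaling_factor must be 2, 4, or 8")
    if len(input_shape) != 5:
        raise ValueError("expected a 5D shape (B, C_in, D, H, W)")
    batch, _, depth, height, width = input_shape
    s = scaling_factor
    return (batch, out_channels, depth * s, height * s, width * s)


# Same configuration as the doctest above
print(srresnet_output_shape((4, 1, 8, 8, 8), out_channels=2, scaling_factor=2))
# (4, 2, 16, 16, 16)
```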

Notes

Based on the implementation: sgrvinod/a-PyTorch-Tutorial-to-Super-Resolution

class physicsnemo.models.srrn.super_res_net.ConvolutionalBlock3d(
in_channels: int,
out_channels: int,
kernel_size: int,
stride: int = 1,
batch_norm: bool = False,
activation_fn: Module = Identity(),
)

Bases: Module

3D convolutional block with optional batch normalization and activation.

Parameters:
  • in_channels (int) – Number of input channels.

  • out_channels (int) – Number of output channels.

  • kernel_size (int) – Convolutional kernel size.

  • stride (int, optional, default=1) – Convolutional stride.

  • batch_norm (bool, optional, default=False) – Whether to use batch normalization.

  • activation_fn (nn.Module, optional, default=nn.Identity()) – Activation function.

Forward:

input (torch.Tensor) – Input tensor of shape \((B, C_{in}, D, H, W)\).

Outputs:

torch.Tensor – Output tensor of shape \((B, C_{out}, D', H', W')\).
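
The output sizes D', H', W' follow the standard convolution size formula, floor((n + 2p - k) / s) + 1 for input size n, kernel size k, stride s, and padding p (dilation 1). The sketch below evaluates it under the assumption that the block pads by kernel_size // 2, as the referenced super-resolution tutorial does; the padding actually used by ConvolutionalBlock3d is not stated here, and conv_output_size is a hypothetical helper.

```python
def conv_output_size(size: int, kernel_size: int, stride: int = 1, padding=None) -> int:
    """Output length of one spatial axis after a convolution (dilation 1)."""
    if padding is None:
        # Assumption: "same"-style padding of kernel_size // 2
        padding = kernel_size // 2
    return (size + 2 * padding - kernel_size) // stride + 1


print(conv_output_size(8, kernel_size=3))            # 8
print(conv_output_size(8, kernel_size=7))            # 8
print(conv_output_size(8, kernel_size=3, stride=2))  # 4
```

With odd kernels and stride 1, this padding leaves each spatial axis unchanged, so D' = D, H' = H, and W' = W under the stated assumption.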

class physicsnemo.models.srrn.super_res_net.PixelShuffle3d(scale: int)

Bases: Module

3D pixel-shuffle operation for sub-pixel upscaling.

Rearranges elements in a tensor of shape \((B, C \times r^3, D, H, W)\) to a tensor of shape \((B, C, D \times r, H \times r, W \times r)\) where \(r\) is the upscale factor.

Parameters:

scale (int) – Upscale factor. Channel dimension is reduced by scale^3.

Forward:

input (torch.Tensor) – Input tensor of shape \((B, C \times r^3, D, H, W)\).

Outputs:

torch.Tensor – Output tensor of shape \((B, C, D \times r, H \times r, W \times r)\).
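
To make the rearrangement concrete, here is a pure-Python sketch of a 3D pixel shuffle on nested lists for a single sample (the batch axis is omitted). The channel ordering, in which input channel ((c*r + i)*r + j)*r + k lands at offset (i, j, k) within each r x r x r output cell, is an assumption modeled on the 2D torch.nn.PixelShuffle convention; the ordering used by PixelShuffle3d may differ, and pixel_shuffle3d is a hypothetical name.

```python
def pixel_shuffle3d(tensor, r):
    """Rearrange a nested list [C*r^3][D][H][W] into [C][D*r][H*r][W*r]."""
    c_in = len(tensor)
    d, h, w = len(tensor[0]), len(tensor[0][0]), len(tensor[0][0][0])
    if c_in % r ** 3 != 0:
        raise ValueError("channel count must be divisible by r ** 3")
    c_out = c_in // r ** 3
    out = [[[[None] * (w * r) for _ in range(h * r)]
            for _ in range(d * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                for k in range(r):
                    # Assumed ordering: offsets (i, j, k) are packed into the channel index
                    src = ((c * r + i) * r + j) * r + k
                    for z in range(d):
                        for y in range(h):
                            for x in range(w):
                                out[c][z * r + i][y * r + j][x * r + k] = tensor[src][z][y][x]
    return out


# 8 channels of size 1x1x1 become 1 channel of size 2x2x2
sample = [[[[v]]] for v in range(8)]
print(pixel_shuffle3d(sample, r=2))
# [[[[0, 1], [2, 3]], [[4, 5], [6, 7]]]]
```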

Notes

Reference: http://www.multisilicon.com/blog/a25332339.html

class physicsnemo.models.srrn.super_res_net.ResidualConvBlock3d(
n_layers: int = 1,
kernel_size: int = 3,
conv_layer_size: int = 64,
activation_fn: Module = Identity(),
)

Bases: Module

3D residual convolutional block.

Parameters:
  • n_layers (int, optional, default=1) – Number of convolutional layers.

  • kernel_size (int, optional, default=3) – Convolutional kernel size.

  • conv_layer_size (int, optional, default=64) – Latent channel size.

  • activation_fn (nn.Module, optional, default=nn.Identity()) – Activation function.

Forward:

input (torch.Tensor) – Input tensor of shape \((B, C, D, H, W)\).

Outputs:

torch.Tensor – Output tensor of shape \((B, C, D, H, W)\) (same as input).

class physicsnemo.models.srrn.super_res_net.SubPixel_ConvolutionalBlock3d(
kernel_size: int = 3,
conv_layer_size: int = 64,
scaling_factor: int = 2,
)

Bases: Module

Convolutional block with pixel shuffle for sub-pixel upscaling.

Parameters:
  • kernel_size (int, optional, default=3) – Convolutional kernel size.

  • conv_layer_size (int, optional, default=64) – Latent channel size.

  • scaling_factor (int, optional, default=2) – Pixel shuffle scaling factor.

Forward:

input (torch.Tensor) – Input tensor of shape \((B, C, D, H, W)\).

Outputs:

torch.Tensor – Output tensor of shape \((B, C, D \times s, H \times s, W \times s)\) where \(s\) is the scaling_factor.
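
Since the channel count is the same on input and output while each spatial axis grows by s, the block has to create s^3 times as many channels before shuffling. The sketch below traces the shapes under the assumption that the convolution expands C to C * s^3 channels and a 3D pixel shuffle then folds them into space, as in the 2D sub-pixel design of the referenced tutorial; the internal layout of SubPixel_ConvolutionalBlock3d is not confirmed here, and subpixel_block_shapes is a hypothetical helper.

```python
def subpixel_block_shapes(input_shape: tuple, scaling_factor: int = 2) -> tuple:
    """Return the assumed (after_conv, after_shuffle) shapes for input (B, C, D, H, W)."""
    batch, channels, depth, height, width = input_shape
    s = scaling_factor
    # Assumed step 1: convolution expands channels by s ** 3, spatial size unchanged
    after_conv = (batch, channels * s ** 3, depth, height, width)
    # Assumed step 2: pixel shuffle moves the extra channels into the spatial axes
    after_shuffle = (batch, channels, depth * s, height * s, width * s)
    return after_conv, after_shuffle


conv_shape, out_shape = subpixel_block_shapes((4, 64, 8, 8, 8), scaling_factor=2)
print(conv_shape)  # (4, 512, 8, 8, 8)
print(out_shape)   # (4, 64, 16, 16, 16)
```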