Convolutional Networks
- class physicsnemo.models.pix2pix.pix2pix.Pix2Pix(*args, **kwargs)
Bases: Module
Convolutional encoder-decoder based on pix2pix generator models.
Note
The pix2pix architecture supports 1D, 2D, and 3D fields, selected with the dimension parameter.
- Parameters:
in_channels (int) – Number of input channels
out_channels (Union[int, Any], optional) – Number of output channels
dimension (int) – Model dimensionality (supports 1, 2, 3).
conv_layer_size (int, optional) – Latent channel size after first convolution, by default 64
n_downsampling (int, optional) – Number of downsampling blocks, by default 3
n_upsampling (int, optional) – Number of upsampling blocks, by default 3
n_blocks (int, optional) – Number of residual blocks in middle of model, by default 3
activation_fn (Any, optional) – Activation function, by default “relu”
batch_norm (bool, optional) – Batch normalization, by default False
padding_type (str, optional) – Padding type (‘reflect’, ‘replicate’ or ‘zero’), by default “reflect”
Example
>>> import torch
>>> import physicsnemo.models.pix2pix
>>> # 2D convolutional encoder-decoder
>>> model = physicsnemo.models.pix2pix.Pix2Pix(
...     in_channels=1,
...     out_channels=2,
...     dimension=2,
...     conv_layer_size=4)
>>> input = torch.randn(4, 1, 32, 32)  # (N, C, H, W)
>>> output = model(input)
>>> output.size()
torch.Size([4, 2, 32, 32])
Note
Reference: Isola, Phillip, et al. “Image-To-Image translation with conditional adversarial networks” Conference on Computer Vision and Pattern Recognition, 2017. https://arxiv.org/abs/1611.07004
Reference: Wang, Ting-Chun, et al. “High-Resolution image synthesis and semantic manipulation with conditional GANs” Conference on Computer Vision and Pattern Recognition, 2018. https://arxiv.org/abs/1711.11585
Note
Based on the implementation: NVIDIA/pix2pixHD
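The n_downsampling and n_upsampling parameters determine how the spatial size of a field changes through the model. The sketch below traces the size along one axis under stated assumptions: each downsampling block is taken to be a stride-2 convolution (kernel 3, padding 1) and each upsampling block a stride-2 transposed convolution (kernel 3, padding 1, output padding 1), in the style of pix2pixHD, with the residual blocks leaving the size unchanged. The helper names are hypothetical and the actual PhysicsNeMo layers may differ.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of a strided convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output length of a strided transposed convolution along one axis."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def pix2pix_spatial_sizes(size, n_downsampling=3, n_upsampling=3):
    """Trace the size of one spatial axis (assumed layer settings, see above)."""
    sizes = [size]
    for _ in range(n_downsampling):
        sizes.append(conv_out(sizes[-1]))
    # Residual blocks in the middle are assumed to preserve the size.
    for _ in range(n_upsampling):
        sizes.append(conv_transpose_out(sizes[-1]))
    return sizes


print(pix2pix_spatial_sizes(32))        # [32, 16, 8, 4, 8, 16, 32]
print(pix2pix_spatial_sizes(32, 3, 4))  # one extra upsampling block ends at 64
```

With equal counts the output matches the input size, which agrees with the 32 x 32 example above.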
- class physicsnemo.models.pix2pix.pix2pix.ResnetBlock(dimension: int, channels: int, padding_type: str = 'reflect', activation: Module = ReLU(), use_batch_norm: bool = False, use_dropout: bool = False)
Bases: Module
A simple ResNet block.
- Parameters:
dimension (int) – Model dimensionality (supports 1, 2, 3).
channels (int) – Number of feature channels
padding_type (str, optional) – Padding type (‘reflect’, ‘replicate’ or ‘zero’), by default “reflect”
activation (nn.Module, optional) – Activation function, by default nn.ReLU()
use_batch_norm (bool, optional) – Batch normalization, by default False
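The three padding_type options differ in how a field is extended past its boundary. As a rough illustration, the sketch below uses NumPy's np.pad as a stand-in. The mapping from the documented strings to np.pad modes is an assumption made for this example, not the library's code.

```python
import numpy as np

# Assumed mapping from the documented padding_type strings to np.pad modes.
PAD_MODES = {"reflect": "reflect", "replicate": "edge", "zero": "constant"}


def pad_1d(values, width, padding_type="reflect"):
    """Pad a 1D field on both sides using the chosen boundary treatment."""
    return np.pad(np.asarray(values), width, mode=PAD_MODES[padding_type])


field = [1, 2, 3, 4]
for name in PAD_MODES:
    print(name, pad_1d(field, 2, name).tolist())
# reflect   [3, 2, 1, 2, 3, 4, 3, 2]
# replicate [1, 1, 1, 2, 3, 4, 4, 4]
# zero      [0, 0, 1, 2, 3, 4, 0, 0]
```

Reflection mirrors interior values without repeating the edge value, replication repeats the edge value, and zero padding fills with zeros.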
- class physicsnemo.models.pix2pix.pix2pixunet.Pix2PixUnet(*args, **kwargs)
Bases: Module
Convolutional encoder-decoder based on pix2pix generator models using Unet.
Note
The pix2pix architecture with Unet supports only 2D fields.
- Parameters:
in_channels (int) – Number of input channels
out_channels (Union[int, Any], optional) – Number of output channels
n_downsampling (int) – Number of downsampling steps in the UNet
filter_size (int, optional) – Number of filters in last convolution layer, by default 64
norm_layer (optional) – Normalization layer, by default nn.BatchNorm2d
use_dropout (bool, optional) – Use dropout layers, by default False
Note
Reference: Isola, Phillip, et al. “Image-To-Image translation with conditional adversarial networks” Conference on Computer Vision and Pattern Recognition, 2017. https://arxiv.org/abs/1611.07004
Reference: Wang, Ting-Chun, et al. “High-Resolution image synthesis and semantic manipulation with conditional GANs” Conference on Computer Vision and Pattern Recognition, 2018. https://arxiv.org/abs/1711.11585
Note
Based on the implementation: junyanz/pytorch-CycleGAN-and-pix2pix
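The choice of n_downsampling constrains the usable input size. The sketch below assumes each downsampling step halves the height and width (a stride-2 convolution with kernel 4 and padding 1, as in the referenced junyanz implementation) and that the skip connections require every intermediate size to be even. The helper name is hypothetical and the PhysicsNeMo layers may differ.

```python
def unet_encoder_sizes(height, width, n_downsampling):
    """List the (H, W) sizes along the encoder path, assuming halving per step."""
    sizes = [(height, width)]
    for level in range(n_downsampling):
        h, w = sizes[-1]
        if h % 2 or w % 2:
            # An odd size would not be restored exactly by the matching
            # upsampling step, so the skip-connection concat would mismatch.
            raise ValueError(
                f"size {(h, w)} at level {level} is not divisible by 2"
            )
        sizes.append((h // 2, w // 2))
    return sizes


print(unet_encoder_sizes(256, 256, 8)[-1])  # (1, 1)
print(unet_encoder_sizes(64, 32, 3))        # [(64, 32), (32, 16), (16, 8), (8, 4)]
```

Under these assumptions, an input whose height and width are multiples of 2 to the power n_downsampling passes the check.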
- class physicsnemo.models.srrn.super_res_net.SRResNet(*args, **kwargs)
Bases: Module
3D convolutional super-resolution network.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
large_kernel_size (int, optional, default=7) – Convolutional kernel size for first and last convolution.
small_kernel_size (int, optional, default=3) – Convolutional kernel size for internal convolutions.
conv_layer_size (int, optional, default=32) – Latent channel size.
n_resid_blocks (int, optional, default=8) – Number of residual blocks.
scaling_factor (int, optional, default=8) – Scaling factor to increase the output feature size compared to the input. Must be 2, 4, or 8.
activation_fn (str, optional, default="prelu") – Activation function.
- Forward:
in_vars (torch.Tensor) – Input tensor of shape \((B, C_{in}, D, H, W)\) where \(B\) is batch size, \(C_{in}\) is the number of input channels, and \(D, H, W\) are the spatial dimensions.
- Outputs:
torch.Tensor – Output tensor of shape \((B, C_{out}, D \times s, H \times s, W \times s)\) where \(s\) is the scaling_factor.
Examples
>>> import torch
>>> import physicsnemo.models.srrn
>>> model = physicsnemo.models.srrn.SRResNet(
...     in_channels=1,
...     out_channels=2,
...     conv_layer_size=4,
...     scaling_factor=2,
... )
>>> input = torch.randn(4, 1, 8, 8, 8)  # (B, C, D, H, W)
>>> output = model(input)
>>> output.size()
torch.Size([4, 2, 16, 16, 16])
Notes
Based on the implementation: sgrvinod/a-PyTorch-Tutorial-to-Super-Resolution
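The output shape rule above can be checked without building the model. The sketch below computes the expected output shape and, under the assumption that upscaling is done in repeated factor-of-2 stages as in the referenced tutorial, the number of such stages. Both helper names are hypothetical.

```python
def srresnet_output_shape(input_shape, out_channels, scaling_factor=8):
    """Expected output shape for an input of shape (B, C_in, D, H, W)."""
    if scaling_factor not in (2, 4, 8):
        raise ValueError("scaling_factor must be 2, 4, or 8")
    b, _, d, h, w = input_shape
    s = scaling_factor
    return (b, out_channels, d * s, h * s, w * s)


def n_upscaling_stages(scaling_factor):
    """Number of x2 stages, assuming the factor is split into powers of two."""
    return scaling_factor.bit_length() - 1  # log2 for 2, 4, 8


print(srresnet_output_shape((4, 1, 8, 8, 8), out_channels=2, scaling_factor=2))
# (4, 2, 16, 16, 16)
print([n_upscaling_stages(s) for s in (2, 4, 8)])  # [1, 2, 3]
```

The first result matches the documented example. Note that a factor of 8 multiplies the number of voxels by 512, so memory use grows quickly with scaling_factor.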
- class physicsnemo.models.srrn.super_res_net.ConvolutionalBlock3d(in_channels: int, out_channels: int, kernel_size: int, stride: int = 1, batch_norm: bool = False, activation_fn: Module = Identity())
Bases: Module
3D convolutional block with optional batch normalization and activation.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
kernel_size (int) – Convolutional kernel size.
stride (int, optional, default=1) – Convolutional stride.
batch_norm (bool, optional, default=False) – Whether to use batch normalization.
activation_fn (nn.Module, optional, default=nn.Identity()) – Activation function.
- Forward:
input (torch.Tensor) – Input tensor of shape \((B, C_{in}, D, H, W)\).
- Outputs:
torch.Tensor – Output tensor of shape \((B, C_{out}, D', H', W')\).
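The output sizes \(D', H', W'\) follow the standard convolution arithmetic. The sketch below assumes a padding of kernel_size // 2 on each side, as in the tutorial this module family is based on. That padding choice is an assumption, not something stated here, and the helper name is hypothetical.

```python
def conv3d_output_shape(input_shape, out_channels, kernel_size, stride=1, padding=None):
    """Output shape of a 3D convolution for an input of shape (B, C_in, D, H, W)."""
    if padding is None:
        padding = kernel_size // 2  # assumed "same"-style padding
    b, _, *spatial = input_shape
    out_spatial = tuple(
        (n + 2 * padding - kernel_size) // stride + 1 for n in spatial
    )
    return (b, out_channels, *out_spatial)


print(conv3d_output_shape((4, 1, 8, 8, 8), out_channels=4, kernel_size=3))
# (4, 4, 8, 8, 8)
print(conv3d_output_shape((4, 1, 8, 8, 8), out_channels=4, kernel_size=3, stride=2))
# (4, 4, 4, 4, 4)
```

With an odd kernel size and a stride of 1, this padding keeps the spatial dimensions unchanged.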
- class physicsnemo.models.srrn.super_res_net.PixelShuffle3d(scale: int)
Bases: Module
3D pixel-shuffle operation for sub-pixel upscaling.
Rearranges elements in a tensor of shape \((B, C \times r^3, D, H, W)\) to a tensor of shape \((B, C, D \times r, H \times r, W \times r)\) where \(r\) is the upscale factor.
- Parameters:
scale (int) – Upscale factor. The channel dimension is reduced by a factor of scale^3.
- Forward:
input (torch.Tensor) – Input tensor of shape \((B, C \times r^3, D, H, W)\).
- Outputs:
torch.Tensor – Output tensor of shape \((B, C, D \times r, H \times r, W \times r)\).
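The rearrangement can be written as a reshape, an axis permutation, and a second reshape. The NumPy sketch below is an illustration only: it assumes a channel ordering that extends the 2D convention of torch.nn.PixelShuffle to three dimensions, and the ordering used by PhysicsNeMo may differ. The function name is hypothetical.

```python
import numpy as np


def pixel_shuffle_3d(x, r):
    """Rearrange (B, C*r^3, D, H, W) into (B, C, D*r, H*r, W*r)."""
    b, c_in, d, h, w = x.shape
    if c_in % r**3:
        raise ValueError("channel count must be divisible by r**3")
    c = c_in // r**3
    # Split the channel axis into (C, r, r, r).
    x = x.reshape(b, c, r, r, r, d, h, w)
    # Interleave each factor r with its spatial axis: (B, C, D, r, H, r, W, r).
    x = x.transpose(0, 1, 5, 2, 6, 3, 7, 4)
    # Merge each (axis, r) pair into one upscaled axis.
    return x.reshape(b, c, d * r, h * r, w * r)


x = np.arange(8).reshape(1, 8, 1, 1, 1)
y = pixel_shuffle_3d(x, 2)
print(y.shape)             # (1, 1, 2, 2, 2)
print(y.ravel().tolist())  # [0, 1, 2, 3, 4, 5, 6, 7]
```

In this example the eight input channels of a single voxel become the eight voxels of a 2 x 2 x 2 output cube.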
- class physicsnemo.models.srrn.super_res_net.ResidualConvBlock3d(n_layers: int = 1, kernel_size: int = 3, conv_layer_size: int = 64, activation_fn: Module = Identity())
Bases: Module
3D residual convolutional block.
- Parameters:
n_layers (int, optional, default=1) – Number of convolutional layers.
kernel_size (int, optional, default=3) – Convolutional kernel size.
conv_layer_size (int, optional, default=64) – Latent channel size.
activation_fn (nn.Module, optional, default=nn.Identity()) – Activation function.
- Forward:
input (torch.Tensor) – Input tensor of shape \((B, C, D, H, W)\).
- Outputs:
torch.Tensor – Output tensor of shape \((B, C, D, H, W)\) (same as input).
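The output keeps the input shape because a residual block adds its input back to the result of its convolutions. The PyTorch sketch below is a hypothetical stand-in, not the PhysicsNeMo class: it assumes size-preserving convolutions (padding of kernel_size // 2 with an odd kernel) and a plain additive skip connection.

```python
import torch
from torch import nn


class ResidualSketch3d(nn.Module):
    """Hypothetical residual block: y = x + f(x), with f a stack of 3D convs."""

    def __init__(self, n_layers=1, kernel_size=3, conv_layer_size=64, activation_fn=None):
        super().__init__()
        layers = []
        for _ in range(n_layers):
            # Padding of kernel_size // 2 keeps D, H, W fixed for odd kernels.
            layers.append(
                nn.Conv3d(
                    conv_layer_size,
                    conv_layer_size,
                    kernel_size,
                    padding=kernel_size // 2,
                )
            )
            # The same activation module is reused in every layer here.
            layers.append(activation_fn if activation_fn is not None else nn.Identity())
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        # Skip connection: channel count and spatial size must match.
        return x + self.body(x)


block = ResidualSketch3d(n_layers=2, conv_layer_size=4, activation_fn=nn.ReLU())
x = torch.randn(2, 4, 5, 6, 7)  # (B, C, D, H, W)
print(block(x).shape == x.shape)  # True
```

The addition only works when the input and the convolution output agree in every dimension, which is why the block uses a single channel width throughout.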
- class physicsnemo.models.srrn.super_res_net.SubPixel_ConvolutionalBlock3d(kernel_size: int = 3, conv_layer_size: int = 64, scaling_factor: int = 2)
Bases: Module
Convolutional block with pixel shuffle for sub-pixel upscaling.
- Parameters:
kernel_size (int, optional, default=3) – Convolutional kernel size.
conv_layer_size (int, optional, default=64) – Latent channel size.
scaling_factor (int, optional, default=2) – Pixel shuffle scaling factor.
- Forward:
input (torch.Tensor) – Input tensor of shape \((B, C, D, H, W)\).
- Outputs:
torch.Tensor – Output tensor of shape \((B, C, D \times s, H \times s, W \times s)\) where \(s\) is the scaling_factor.
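For the channel count to be unchanged after upscaling, the convolution has to produce extra channels that the pixel shuffle then spends on spatial resolution. The sketch below tracks the shapes under the assumption that the convolution preserves the spatial size and expands the channels by \(s^3\), the 3D analogue of the \(s^2\) expansion in 2D sub-pixel convolution. The helper name is hypothetical.

```python
def subpixel_block_shapes(input_shape, scaling_factor=2):
    """Return the assumed shapes after the convolution and after the pixel shuffle."""
    b, c, d, h, w = input_shape
    s = scaling_factor
    after_conv = (b, c * s**3, d, h, w)              # channels expanded by s^3
    after_shuffle = (b, c, d * s, h * s, w * s)      # channels traded for resolution
    return after_conv, after_shuffle


after_conv, after_shuffle = subpixel_block_shapes((1, 64, 8, 8, 8), scaling_factor=2)
print(after_conv)     # (1, 512, 8, 8, 8)
print(after_shuffle)  # (1, 64, 16, 16, 16)
```

Both intermediate tensors hold the same number of elements, since the shuffle only rearranges values.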