Fully Connected and MLP Layers#

class physicsnemo.nn.module.fully_connected_layers.Conv1dFCLayer(*args, **kwargs)[source]#

Bases: ConvFCLayer

Channel-wise fully connected layer using 1D convolutions.

Applies a 1x1 convolution followed by an optional activation function. This is equivalent to a fully connected layer operating on the channel dimension of 1D signals.

Parameters:
  • in_features (int) – Number of input channels \(C_{in}\).

  • out_features (int) – Number of output channels \(C_{out}\).

  • activation_fn (Union[nn.Module, Callable[[Tensor], Tensor], None], optional, default=None) – Activation function to use. Can be None for no activation.

  • activation_par (Union[nn.Parameter, None], optional, default=None) – Learnable scaling parameter for adaptive activations.

  • weight_norm (bool, optional, default=False) – Whether to apply weight normalization. Not currently supported; setting this to True raises NotImplementedError.

Forward:

x (torch.Tensor) – Input tensor of shape \((B, C_{in}, L)\) where \(B\) is batch size and \(L\) is sequence length.

Outputs:

torch.Tensor – Output tensor of shape \((B, C_{out}, L)\).

forward(x: Tensor) Tensor[source]#

Forward pass through the 1D convolutional layer.

reset_parameters() None[source]#

Reset layer weights to Xavier uniform initialization.
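The channel-wise equivalence noted above can be checked with plain PyTorch. This sketch uses torch.nn directly rather than the physicsnemo class: a 1x1 Conv1d and a Linear sharing the same weights produce identical outputs.

```python
import torch

# A 1x1 Conv1d is a fully connected layer applied independently at each
# position along the sequence: copy the conv weights into a Linear and
# the outputs match.
conv = torch.nn.Conv1d(in_channels=8, out_channels=4, kernel_size=1)
fc = torch.nn.Linear(8, 4)
with torch.no_grad():
    fc.weight.copy_(conv.weight.squeeze(-1))  # (4, 8, 1) -> (4, 8)
    fc.bias.copy_(conv.bias)

x = torch.randn(2, 8, 16)                     # (B, C_in, L)
y_conv = conv(x)                              # (B, C_out, L)
y_fc = fc(x.transpose(1, 2)).transpose(1, 2)  # Linear over the channel dim
assert y_conv.shape == (2, 4, 16)
assert torch.allclose(y_conv, y_fc, atol=1e-5)
```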

class physicsnemo.nn.module.fully_connected_layers.Conv2dFCLayer(*args, **kwargs)[source]#

Bases: ConvFCLayer

Channel-wise fully connected layer using 2D convolutions.

Applies a 1x1 convolution followed by an optional activation function. This is equivalent to a fully connected layer operating on the channel dimension of 2D images.

Parameters:
  • in_channels (int) – Number of input channels \(C_{in}\).

  • out_channels (int) – Number of output channels \(C_{out}\).

  • activation_fn (Union[nn.Module, Callable[[Tensor], Tensor], None], optional, default=None) – Activation function to use. Can be None for no activation.

  • activation_par (Union[nn.Parameter, None], optional, default=None) – Learnable scaling parameter for adaptive activations.

Forward:

x (torch.Tensor) – Input tensor of shape \((B, C_{in}, H, W)\) where \(B\) is batch size, \(H\) is height, and \(W\) is width.

Outputs:

torch.Tensor – Output tensor of shape \((B, C_{out}, H, W)\).

forward(x: Tensor) Tensor[source]#

Forward pass through the 2D convolutional layer.

reset_parameters() None[source]#

Reset layer weights to Xavier uniform initialization.
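The same channel-wise equivalence holds in 2D; here a 1x1 Conv2d is compared against an explicit per-pixel linear map over channels written with einsum (plain PyTorch, not the physicsnemo class).

```python
import torch

# A 1x1 Conv2d over (B, C, H, W) equals a per-pixel linear map on channels.
conv = torch.nn.Conv2d(in_channels=8, out_channels=4, kernel_size=1)
x = torch.randn(2, 8, 5, 7)

w = conv.weight.squeeze(-1).squeeze(-1)  # (4, 8, 1, 1) -> (4, 8)
y_ref = torch.einsum("oi,bihw->bohw", w, x) + conv.bias[None, :, None, None]

assert conv(x).shape == (2, 4, 5, 7)
assert torch.allclose(conv(x), y_ref, atol=1e-5)
```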

class physicsnemo.nn.module.fully_connected_layers.Conv3dFCLayer(*args, **kwargs)[source]#

Bases: ConvFCLayer

Channel-wise fully connected layer using 3D convolutions.

Applies a 1x1x1 convolution followed by an optional activation function. This is equivalent to a fully connected layer operating on the channel dimension of 3D volumes.

Parameters:
  • in_channels (int) – Number of input channels \(C_{in}\).

  • out_channels (int) – Number of output channels \(C_{out}\).

  • activation_fn (Union[nn.Module, Callable[[Tensor], Tensor], None], optional, default=None) – Activation function to use. Can be None for no activation.

  • activation_par (Union[nn.Parameter, None], optional, default=None) – Learnable scaling parameter for adaptive activations.

Forward:

x (torch.Tensor) – Input tensor of shape \((B, C_{in}, D, H, W)\) where \(B\) is batch size, \(D\) is depth, \(H\) is height, and \(W\) is width.

Outputs:

torch.Tensor – Output tensor of shape \((B, C_{out}, D, H, W)\).

forward(x: Tensor) Tensor[source]#

Forward pass through the 3D convolutional layer.

reset_parameters() None[source]#

Reset layer weights to Xavier uniform initialization.

class physicsnemo.nn.module.fully_connected_layers.ConvFCLayer(*args, **kwargs)[source]#

Bases: Module

Base class for 1x1 convolutional layers acting on image channels.

This abstract base class provides activation handling for convolutional layers that act like fully connected layers over the channel dimension.

Parameters:
  • activation_fn (Union[nn.Module, Callable[[Tensor], Tensor], None], optional, default=None) – Activation function to use. Can be None for no activation.

  • activation_par (Union[nn.Parameter, None], optional, default=None) – Learnable scaling parameter for adaptive activations.

Forward:

x (torch.Tensor) – Input tensor (shape depends on subclass).

Outputs:

torch.Tensor – Output tensor with activation applied.

apply_activation(x: Tensor) Tensor[source]#

Apply activation function with optional learnable scaling.

Parameters:

x (torch.Tensor) – Input tensor of arbitrary shape.

Returns:

Tensor with activation applied, same shape as input.

Return type:

torch.Tensor
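The adaptive-activation behavior can be sketched as follows. The placement of activation_par is an assumption here: the usual adaptive-activation formulation scales the pre-activation, i.e. computes f(a·x), and the real class may differ.

```python
import torch

def apply_activation_sketch(x, activation_fn=None, activation_par=None):
    """Hedged sketch of ConvFCLayer.apply_activation. Assumes the
    learnable parameter scales the input before the activation (the
    common adaptive-activation formulation)."""
    if activation_fn is None:
        return x                       # no activation configured
    if activation_par is None:
        return activation_fn(x)        # plain activation
    return activation_fn(activation_par * x)  # adaptive: f(a * x)

x = torch.randn(3, 4)
a = torch.nn.Parameter(torch.tensor(2.0))
y = apply_activation_sketch(x, torch.tanh, a)
assert torch.allclose(y, torch.tanh(2.0 * x))
assert apply_activation_sketch(x).equal(x)  # None activation is identity
```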

class physicsnemo.nn.module.fully_connected_layers.ConvNdFCLayer(*args, **kwargs)[source]#

Bases: ConvFCLayer

Channel-wise fully connected layer with N-dimensional convolutions.

Applies a kernel-1 convolution followed by an optional activation function. For dimensions 1, 2, or 3, use Conv1dFCLayer, Conv2dFCLayer, or Conv3dFCLayer instead for better performance.

Parameters:
  • in_channels (int) – Number of input channels \(C_{in}\).

  • out_channels (int) – Number of output channels \(C_{out}\).

  • activation_fn (Union[nn.Module, None], optional, default=None) – Activation function to use. Can be None for no activation.

  • activation_par (Union[nn.Parameter, None], optional, default=None) – Learnable scaling parameter for adaptive activations.

Forward:

x (torch.Tensor) – Input tensor of shape \((B, C_{in}, *spatial)\) where \(B\) is batch size and \(*spatial\) represents arbitrary spatial dimensions.

Outputs:

torch.Tensor – Output tensor of shape \((B, C_{out}, *spatial)\).

forward(x: Tensor) Tensor[source]#

Forward pass through the N-dimensional convolutional layer.

initialise_parameters(model: Module) None[source]#

Initialize weights and biases for a module.

Parameters:

model (nn.Module) – Module to initialize.

reset_parameters() None[source]#

Reset layer weights by recursively applying Xavier initialization.

class physicsnemo.nn.module.fully_connected_layers.ConvNdKernel1Layer(*args, **kwargs)[source]#

Bases: Module

Kernel-1 convolution layer for N-dimensional inputs.

Implements a 1x1 convolution by reshaping the input to 1D, applying a 1D convolution, and reshaping back. For dimensions 1, 2, or 3, use the specialized layer classes for better performance.

Parameters:
  • in_channels (int) – Number of input channels \(C_{in}\).

  • out_channels (int) – Number of output channels \(C_{out}\).

Forward:

x (torch.Tensor) – Input tensor of shape \((B, C_{in}, *spatial)\) where \(B\) is batch size and \(*spatial\) represents arbitrary spatial dimensions.

Outputs:

torch.Tensor – Output tensor of shape \((B, C_{out}, *spatial)\).

forward(x: Tensor) Tensor[source]#

Forward pass through the N-dimensional kernel-1 convolution.
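The reshape trick described above can be sketched in a few lines of plain PyTorch (the helper name convnd_kernel1 is illustrative, not the physicsnemo API):

```python
import torch

# Kernel-1 N-D convolution via reshape: flatten all spatial dims into one
# axis, run a Conv1d with kernel_size=1, then restore the spatial shape.
def convnd_kernel1(x, conv1d):
    b, c = x.shape[0], x.shape[1]
    spatial = x.shape[2:]
    y = conv1d(x.reshape(b, c, -1))    # (B, C_out, prod(spatial))
    return y.reshape(b, conv1d.out_channels, *spatial)

conv = torch.nn.Conv1d(8, 4, kernel_size=1)
x = torch.randn(2, 8, 3, 4, 5, 6)      # four spatial dimensions
y = convnd_kernel1(x, conv)
assert y.shape == (2, 4, 3, 4, 5, 6)
```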

class physicsnemo.nn.module.fully_connected_layers.FCLayer(*args, **kwargs)[source]#

Bases: Module

Densely connected neural network layer.

A single fully connected layer with optional activation, weight normalization, and weight factorization.

Parameters:
  • in_features (int) – Size of input features \(D_{in}\).

  • out_features (int) – Size of output features \(D_{out}\).

  • activation_fn (Union[nn.Module, Callable[[Tensor], Tensor], None], optional, default=None) – Activation function to use. Can be None for no activation.

  • weight_norm (bool, optional, default=False) – Applies weight normalization to the layer.

  • weight_fact (bool, optional, default=False) – Applies weight factorization to the layer.

  • activation_par (Union[nn.Parameter, None], optional, default=None) – Learnable scaling parameter for adaptive activations.

Forward:

x (torch.Tensor) – Input tensor of shape \((*, D_{in})\) where \(*\) denotes any number of leading batch dimensions.

Outputs:

torch.Tensor – Output tensor of shape \((*, D_{out})\).

forward(x: Tensor) Tensor[source]#

Forward pass through the layer.

reset_parameters() None[source]#

Reset fully connected layer weights to Xavier uniform initialization.
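A minimal sketch of such a dense layer, assuming weight normalization maps onto torch.nn.utils.weight_norm; the argument names mirror the documented parameters, and weight factorization is omitted. The actual FCLayer implementation may differ.

```python
import torch

def make_fc_layer(in_features, out_features, activation_fn=None,
                  weight_norm=False):
    """Illustrative FCLayer-style dense layer: Linear, optional weight
    normalization, optional activation."""
    linear = torch.nn.Linear(in_features, out_features)
    if weight_norm:
        # reparameterize the weight as direction * magnitude
        linear = torch.nn.utils.weight_norm(linear)
    if activation_fn is not None:
        return torch.nn.Sequential(linear, activation_fn)
    return linear

layer = make_fc_layer(16, 8, activation_fn=torch.nn.SiLU(), weight_norm=True)
x = torch.randn(4, 3, 16)   # any number of leading batch dimensions
assert layer(x).shape == (4, 3, 8)
```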

class physicsnemo.nn.module.fully_connected_layers.Linear(*args, **kwargs)[source]#

Bases: Module

Fully connected (dense) layer with customizable initialization.

The layer’s weights and biases can be initialized using custom strategies like "kaiming_normal", and scaled by init_weight and init_bias.

Parameters:
  • in_features (int) – Size of each input sample \(D_{in}\).

  • out_features (int) – Size of each output sample \(D_{out}\).

  • bias (bool, optional) – If True, adds a learnable bias to the output. If False, the layer will not learn an additive bias.
  • init_mode (str, optional, default="kaiming_normal") – The initialization mode for weights and biases. Supported modes are: "xavier_uniform", "xavier_normal", "kaiming_uniform", "kaiming_normal".

  • init_weight (float, optional, default=1) – A scaling factor to multiply with the initialized weights.

  • init_bias (float, optional, default=0) – A scaling factor to multiply with the initialized biases.

  • amp_mode (bool, optional, default=False) – Whether mixed-precision (AMP) training is enabled.

Forward:

x (torch.Tensor) – Input tensor of shape \((*, D_{in})\) where \(*\) denotes any number of leading batch dimensions.

Outputs:

torch.Tensor – Output tensor of shape \((*, D_{out})\).

forward(x: Tensor) Tensor[source]#

Forward pass through the linear layer.
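The documented initialization behavior can be sketched as: pick an initializer by name, then scale the weights by init_weight and the biases by init_bias. This is an illustrative helper, not the physicsnemo implementation.

```python
import torch

def init_linear(linear, init_mode="kaiming_normal", init_weight=1.0,
                init_bias=0.0):
    """Initialize a Linear with the chosen mode, then apply the
    documented scaling factors to weights and biases."""
    inits = {
        "xavier_uniform": torch.nn.init.xavier_uniform_,
        "xavier_normal": torch.nn.init.xavier_normal_,
        "kaiming_uniform": torch.nn.init.kaiming_uniform_,
        "kaiming_normal": torch.nn.init.kaiming_normal_,
    }
    inits[init_mode](linear.weight)
    with torch.no_grad():
        linear.weight.mul_(init_weight)
        if linear.bias is not None:
            linear.bias.mul_(init_bias)
    return linear

lin = init_linear(torch.nn.Linear(8, 4))  # defaults: kaiming_normal
assert torch.all(lin.bias == 0)           # init_bias=0 zeroes the bias
```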

Multi-layer perceptron (MLP) module with optional Transformer Engine support.

class physicsnemo.nn.module.mlp_layers.Mlp(
in_features: int,
hidden_features: int | list[int] | None = None,
out_features: int | None = None,
act_layer: Module | type[Module] | str = torch.nn.GELU,
drop: float = 0.0,
final_dropout: bool = True,
bias: bool = True,
use_batchnorm: bool = False,
spectral_norm: bool = False,
use_te: bool = False,
)[source]#

Bases: Module

Multi-layer perceptron with configurable architecture.

Supports arbitrary depth, dropout, batch normalization, spectral normalization, bias control, and optional Transformer Engine linear layers.

Parameters:
  • in_features (int) – Number of input features.

  • hidden_features (int | list[int] | None, optional) – Hidden layer dimension(s). An int gives a single hidden layer of that width; a list[int] gives multiple hidden layers with the specified widths; None gives a single hidden layer of width in_features. Default is None.

  • out_features (int | None, optional) – Number of output features. If None, defaults to in_features. Default is None.

  • act_layer (nn.Module | type[nn.Module] | str, optional) – Activation function. A str names an activation (e.g., "gelu", "relu", "silu"); a type is an activation class to instantiate (e.g., nn.GELU); an nn.Module is used as a pre-instantiated activation. Default is nn.GELU.

  • drop (float, optional) – Dropout rate applied after each hidden layer. Default is 0.0.

  • final_dropout (bool, optional) – Whether to apply dropout after the final linear layer. Default is True.

  • bias (bool, optional) – Whether to include bias terms in the linear layers. Default is True.

  • use_batchnorm (bool, optional) – If True, applies BatchNorm1d after each linear layer (including the output layer). Default is False.

  • spectral_norm (bool, optional) – If True, applies spectral normalization to all linear layer weights, constraining the spectral norm to 1. Default is False.

  • use_te (bool, optional) – Whether to use Transformer Engine linear layers for optimized performance. Requires Transformer Engine to be installed. Default is False.

Examples

>>> import torch
>>> mlp = Mlp(in_features=64, hidden_features=128, out_features=32)
>>> x = torch.randn(2, 64)
>>> out = mlp(x)
>>> out.shape
torch.Size([2, 32])
>>> mlp = Mlp(in_features=64, hidden_features=[128, 256, 128], out_features=32)
>>> x = torch.randn(2, 64)
>>> out = mlp(x)
>>> out.shape
torch.Size([2, 32])
>>> # With batch normalization and spectral normalization
>>> mlp = Mlp(
...     in_features=10,
...     hidden_features=[32, 16],
...     out_features=4,
...     use_batchnorm=True,
...     spectral_norm=True,
... )
>>> mlp(torch.randn(8, 10)).shape
torch.Size([8, 4])
forward(x: Tensor) Tensor[source]#

Forward pass of the MLP.

Parameters:

x (torch.Tensor) – Input tensor of shape (*, in_features) where * denotes any number of batch dimensions.

Returns:

Output tensor of shape (*, out_features).

Return type:

torch.Tensor