scimba_torch.neural_nets.coordinates_based_nets.mlp¶
Multi-Layer Perceptron (MLP) architectures.
Functions

| factorized_glorot_normal | Initializes parameters. |

Classes

| FactorizedLinear | A linear transformation with factorized parameterization of the weights. |
| GenericMLP | A general Multi-Layer Perceptron (MLP) architecture. |
| GenericMMLP | A general Multiplicative Multi-Layer Perceptron (MMLP) architecture. |
| GenericModulationMLP | A Multi-Layer Perceptron with modulation based on auxiliary input. |
| MultiMLP | A Multi-MLP architecture that creates a separate MLP for each output variable. |
- factorized_glorot_normal(mean=1.0, stddev=0.1)[source]¶
  Initializes parameters using a factorized version of the Glorot normal initialization.
  - Parameters:
    - mean (float) – Mean of the log-normal distribution used to scale the singular values.
    - stddev (float) – Standard deviation of the log-normal distribution.
  - Return type:
    Callable
  - Returns:
    A function that takes a shape tuple and returns the factorized parameters s and v.
  Example
  >>> init_fn = factorized_glorot_normal()
  >>> s, v = init_fn((64, 128))
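To illustrate the idea, here is a minimal sketch of such an initializer. It is an assumption of this sketch, not the library's actual code, that the scale `s` is drawn as `exp(N(mean, stddev))` per column and that `v` is the Glorot sample divided by `s`, so that `s * v` recovers a Glorot-normal matrix:

```python
import torch

def factorized_glorot_normal_sketch(mean=1.0, stddev=0.1):
    """Sketch of a factorized Glorot-normal initializer (not the library code).

    Draws a Glorot (Xavier) normal weight matrix, then splits it into
    per-column log-normal scales ``s`` and a normalized matrix ``v``
    such that ``s * v`` reproduces the Glorot draw.
    """
    def init(shape):
        w = torch.empty(shape)
        torch.nn.init.xavier_normal_(w)                       # Glorot normal draw
        s = torch.exp(mean + stddev * torch.randn(shape[1]))  # column-wise scales
        v = w / s                                             # normalized factor
        return s, v
    return init

init_fn = factorized_glorot_normal_sketch()
s, v = init_fn((64, 128))
```

Multiplying the factors back together (`s * v`) recovers the original Glorot sample, so training the two factors separately reparameterizes, rather than changes, the initial weights.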
- class FactorizedLinear(input_dim, output_dim, has_bias=True)[source]¶
  Bases: Module
  A linear transformation with factorized parameterization of the weights.
  The weight matrix is expressed as the product of two factors:
  - s: a column-wise scaling factor.
  - v: a normalized weight matrix.
  - Parameters:
    - input_dim (int) – Size of each input sample.
    - output_dim (int) – Size of each output sample.
    - has_bias (bool) – Whether to include a bias term (default: True).
  - s¶
    Column-wise scaling factors.
  - v¶
    Normalized weight matrix.
  - bias¶
    Bias vector added to the output.
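A minimal sketch of how such a layer can recombine its factors in the forward pass. The class name, the exact broadcasting of `s`, and the initialization used here are assumptions for illustration, not the library's implementation:

```python
import torch

class FactorizedLinearSketch(torch.nn.Module):
    """Sketch of a linear layer whose weight is the product s * v."""

    def __init__(self, input_dim, output_dim, has_bias=True):
        super().__init__()
        # s: column-wise scales; v: normalized weight matrix (assumed shapes)
        self.s = torch.nn.Parameter(torch.ones(input_dim))
        self.v = torch.nn.Parameter(torch.randn(output_dim, input_dim) * 0.1)
        self.bias = torch.nn.Parameter(torch.zeros(output_dim)) if has_bias else None

    def forward(self, x):
        w = self.s * self.v  # recombine the two factors into the weight matrix
        return torch.nn.functional.linear(x, w, self.bias)

layer = FactorizedLinearSketch(4, 3)
out = layer(torch.randn(5, 4))
```

Because `s` and `v` are separate `Parameter`s, the optimizer updates the scale and the direction of each weight column independently, which is the point of the factorization.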
- class GenericMLP(in_size, out_size, **kwargs)[source]¶
  Bases: ScimbaModule
  A general Multi-Layer Perceptron (MLP) architecture.
  - Parameters:
    - in_size (int) – Dimension of the input.
    - out_size (int) – Dimension of the output.
    - **kwargs – Additional keyword arguments:
      - activation_type (str, default="tanh"): The activation function type to use in hidden layers.
      - activation_output (str, default="id"): The activation function type to use in the output layer.
      - layer_sizes (list[int], default=[20]*6): A list of integers representing the number of neurons in each hidden layer.
      - weights_norm_bool (bool, default=False): If True, applies weight normalization to the layers.
      - random_fact_weights_bool (bool, default=False): If True, applies factorized weights to the layers.
Example
>>> model = GenericMLP(10, 1, activation_type='relu', layer_sizes=[64, 128, 64])
A list of hidden linear layers.
- output_layer¶
The final output linear layer.
- forward(inputs, with_last_layer=True)[source]¶
  Apply the network to the inputs.
  - Parameters:
    - inputs (Tensor) – Input tensor.
    - with_last_layer (bool) – Whether to apply the final output layer.
  - Return type:
    Tensor
  - Returns:
    The result of the network.
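A minimal sketch of this forward pass, including the `with_last_layer` flag that exposes the last hidden features instead of the final output. The class name and the default `tanh` activation are taken from the documentation above; the rest of the structure is an assumption:

```python
import torch

class MLPSketch(torch.nn.Module):
    """Sketch of the GenericMLP forward pass (structure assumed)."""

    def __init__(self, in_size, out_size, layer_sizes=(20,) * 6):
        super().__init__()
        sizes = [in_size, *layer_sizes]
        self.hidden_layers = torch.nn.ModuleList(
            torch.nn.Linear(a, b) for a, b in zip(sizes[:-1], sizes[1:])
        )
        self.output_layer = torch.nn.Linear(sizes[-1], out_size)

    def forward(self, inputs, with_last_layer=True):
        x = inputs
        for layer in self.hidden_layers:
            x = torch.tanh(layer(x))  # default activation_type="tanh"
        # with_last_layer=False returns the last hidden features instead
        return self.output_layer(x) if with_last_layer else x

model = MLPSketch(10, 1, layer_sizes=(64, 128, 64))
y = model(torch.randn(8, 10))
```

Skipping the last layer is useful when the hidden features feed another head, e.g. in the Multi-MLP and modulation variants below.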
Expands the hidden layers of the MLP to new sizes.
The new sizes must match the number of hidden layers in the MLP. The weights of the new layers are initialized to zero, and the weights of the old layers are copied into the new layers. The output layer is also expanded to match the new sizes.
- Parameters:
  - new_layer_sizes (list[int]) – List of integers representing the new sizes of the hidden layers.
  - set_to_zero (bool) – If True, initializes the weights of the new layers to zero. Otherwise, sets them to small random values.
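The copy-into-a-larger-layer step described above can be sketched for a single linear layer as follows; the helper name and the exact placement of the old block are assumptions for illustration:

```python
import torch

def expand_linear(old: torch.nn.Linear, new_in: int, new_out: int,
                  set_to_zero: bool = True) -> torch.nn.Linear:
    """Sketch of widening one linear layer while preserving its function.

    New entries are zero (or scaled-down random values) and the old weights
    are copied into the top-left block, mirroring the expansion above.
    """
    new = torch.nn.Linear(new_in, new_out)
    with torch.no_grad():
        if set_to_zero:
            new.weight.zero_()
            new.bias.zero_()
        else:
            new.weight.mul_(1e-3)  # shrink the random init to small values
            new.bias.mul_(1e-3)
        new.weight[: old.out_features, : old.in_features] = old.weight
        new.bias[: old.out_features] = old.bias
    return new

old = torch.nn.Linear(4, 8)
wide = expand_linear(old, 6, 12)
```

With `set_to_zero=True`, the widened layer computes exactly the old outputs on the old input coordinates, so expansion does not perturb a trained network.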
- class MultiMLP(in_size, out_size, **kwargs)[source]¶
  Bases: ScimbaModule
  A Multi-MLP architecture that creates a separate MLP for each output variable.
  Each output variable is computed by its own MLP that takes all inputs and produces a single output. The outputs are concatenated to form the final output.
  - Parameters:
    - in_size (int) – Dimension of the input.
    - out_size (int) – Dimension of the output (number of output variables).
    - **kwargs – Additional keyword arguments:
      - activation_type (str, default="tanh"): The activation function type to use in hidden layers.
      - activation_output (str, default="id"): The activation function type to use in the output layer.
      - layer_sizes (list[int], default=[20]*6): A list of integers representing the number of neurons in each hidden layer per MLP.
      - weights_norm_bool (bool, default=False): If True, applies weight normalization to the layers.
      - random_fact_weights_bool (bool, default=False): If True, applies factorized weights to the layers.
Example
>>> model = MultiMLP(3, 2, activation_type='relu', layer_sizes=[32, 64, 32])
>>> # Creates 2 MLPs, each taking 3 inputs and producing 1 output
- mlps¶
A list of individual MLPs, one per output variable.
- forward(inputs, with_last_layer=True)[source]¶
  Apply the network to the inputs.
  - Parameters:
    - inputs (Tensor) – Input tensor of shape (batch_size, in_size).
    - with_last_layer (bool) – Whether to apply the final output layer.
  - Return type:
    Tensor
  - Returns:
    The result of the network of shape (batch_size, out_size).
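The one-MLP-per-output design can be sketched as follows; the internal structure of each sub-MLP is an assumption, only the one-output-per-network-then-concatenate behavior comes from the description above:

```python
import torch

class MultiMLPSketch(torch.nn.Module):
    """Sketch: one small MLP per output variable, outputs concatenated."""

    def __init__(self, in_size, out_size, hidden=32):
        super().__init__()
        self.mlps = torch.nn.ModuleList(
            torch.nn.Sequential(
                torch.nn.Linear(in_size, hidden),
                torch.nn.Tanh(),
                torch.nn.Linear(hidden, 1),  # each sub-MLP emits one variable
            )
            for _ in range(out_size)
        )

    def forward(self, inputs):
        # concatenate the scalar outputs into a (batch_size, out_size) tensor
        return torch.cat([mlp(inputs) for mlp in self.mlps], dim=1)

model = MultiMLPSketch(3, 2)
y = model(torch.randn(8, 3))
```

Separate sub-networks decouple the output variables: each one can learn its own feature scales without sharing hidden weights with the others.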
- class GenericMMLP(in_size, out_size, **kwargs)[source]¶
  Bases: ScimbaModule
  A general Multiplicative Multi-Layer Perceptron (MMLP) architecture.
  As proposed by Yanfei Xiang.
  - Parameters:
    - in_size (int) – Dimension of the input.
    - out_size (int) – Dimension of the output.
    - **kwargs – Additional keyword arguments:
      - activation_type (str, default="tanh"): The activation function type to use in hidden layers.
      - activation_output (str, default="id"): The activation function type to use in the output layer.
      - layer_sizes (list[int], default=[10, 20, 20, 20, 5]): A list of integers representing the number of neurons in each hidden layer.
      - weights_norm_bool (bool, default=False): If True, applies weight normalization to the layers.
      - random_fact_weights_bool (bool, default=False): If True, applies factorized weights to the layers.
Example
>>> model = GenericMMLP(
...     10, 5, activation_type='relu', layer_sizes=[64, 128, 64]
... )
A list of hidden linear layers.
A list of multiplicative linear layers.
- output_layer¶
The final output linear layer.
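One common multiplicative-MLP pattern gates each hidden activation elementwise with a parallel multiplicative branch; the exact coupling used by GenericMMLP is not spelled out above, so the rule below (and the class name) is an assumption for illustration only:

```python
import torch

class MMLPSketch(torch.nn.Module):
    """Sketch of a multiplicative MLP: each hidden activation is gated
    elementwise by a parallel multiplicative layer of the raw input.
    The exact coupling used by GenericMMLP is assumed, not confirmed."""

    def __init__(self, in_size, out_size, layer_sizes=(10, 20, 20, 20, 5)):
        super().__init__()
        sizes = [in_size, *layer_sizes]
        self.hidden_layers = torch.nn.ModuleList(
            torch.nn.Linear(a, b) for a, b in zip(sizes[:-1], sizes[1:])
        )
        self.mul_layers = torch.nn.ModuleList(
            torch.nn.Linear(in_size, b) for b in layer_sizes
        )
        self.output_layer = torch.nn.Linear(sizes[-1], out_size)

    def forward(self, inputs):
        x = inputs
        for lin, mul in zip(self.hidden_layers, self.mul_layers):
            # elementwise multiplicative gate driven by the raw input
            x = torch.tanh(lin(x)) * torch.tanh(mul(inputs))
        return self.output_layer(x)

model = MMLPSketch(10, 5)
y = model(torch.randn(4, 10))
```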
- class GenericModulationMLP(x_size, y_size, out_size, **kwargs)[source]¶
  Bases: ScimbaModule
  A Multi-Layer Perceptron with modulation based on auxiliary input.
  Each layer applies the transformation gamma_l(y) * W_l * x + b_l(y), where gamma_l and b_l are small modulation networks that take y as input.
  - Parameters:
    - x_size (int) – Dimension of the main input x.
    - y_size (int) – Dimension of the modulation input y.
    - out_size (int) – Dimension of the output.
    - **kwargs – Additional keyword arguments:
      - activation_type (str, default="tanh"): The activation function type to use in hidden layers.
      - activation_output (str, default="id"): The activation function type to use in the output layer.
      - layer_sizes (list[int], default=[20]*6): A list of integers representing the number of neurons in each hidden layer for x.
      - modulation_layer_sizes (list[int], default=[10, 10]): A list of integers representing the hidden layer sizes for the gamma and b networks.
      - weights_norm_bool (bool, default=False): If True, applies weight normalization to the layers.
      - random_fact_weights_bool (bool, default=False): If True, applies factorized weights to the layers.
Example
>>> model = GenericModulationMLP(
...     x_size=3, y_size=2, out_size=1,
...     layer_sizes=[64, 64], modulation_layer_sizes=[16, 16]
... )
- forward(x, y=None, with_last_layer=True)[source]¶
  Apply the modulated network to the inputs.
  - Parameters:
    - x (Tensor) – Main input data of shape (batch_size, x_size), or, if y is None, concatenated input of shape (batch_size, x_size + y_size).
    - y (Tensor | None) – Modulation input data of shape (batch_size, y_size). Optional if x contains both x and y concatenated.
    - with_last_layer (bool) – Whether to apply the final output layer.
  - Return type:
    Tensor
  - Returns:
    The result of the network of shape (batch_size, out_size).
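The per-layer rule gamma_l(y) * W_l * x + b_l(y) can be sketched as below. The shapes of the modulation networks and the class name are assumptions; only the modulated affine rule itself comes from the description above:

```python
import torch

class ModulationMLPSketch(torch.nn.Module):
    """Sketch of a modulated MLP: each layer computes
    tanh(gamma_l(y) * W_l x + b_l(y)), with gamma_l and b_l produced
    by a small network of the auxiliary input y (structure assumed)."""

    def __init__(self, x_size, y_size, out_size, hidden=64, n_layers=2):
        super().__init__()
        sizes = [x_size] + [hidden] * n_layers
        self.layers = torch.nn.ModuleList(
            torch.nn.Linear(a, b, bias=False)
            for a, b in zip(sizes[:-1], sizes[1:])
        )
        # one small modulation net per layer, emitting [gamma, b] of width b
        self.mods = torch.nn.ModuleList(
            torch.nn.Sequential(
                torch.nn.Linear(y_size, 16), torch.nn.Tanh(),
                torch.nn.Linear(16, 2 * b),
            )
            for b in sizes[1:]
        )
        self.output_layer = torch.nn.Linear(hidden, out_size)

    def forward(self, x, y):
        h = x
        for lin, mod in zip(self.layers, self.mods):
            gamma, b = mod(y).chunk(2, dim=-1)   # split into scale and shift
            h = torch.tanh(gamma * lin(h) + b)   # modulated affine step
        return self.output_layer(h)

model = ModulationMLPSketch(x_size=3, y_size=2, out_size=1)
out = model(torch.randn(8, 3), torch.randn(8, 2))
```

Because y only enters through the scales and shifts, the same trunk W_l can represent a family of functions indexed by y, which is the usual motivation for this kind of modulation.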