I am trying to save my Pytorch model with its activation function used.
Here is a simple example
# define class for neural network
import copy

import torch
import torch.nn as nn

class NNM(nn.Module):
    def __init__(self, num_features, num_hidden):
        super(NNM, self).__init__()
        self.fc1 = nn.Linear(num_features, num_hidden)
        self.fc2 = nn.Linear(num_hidden, 1)
        self.saved_parameters = []

    def forward(self, x):
        x = torch.sigmoid(self.fc1(x))
        return self.fc2(x)

    def save_parameters(self):
        # snapshot the current weights so later training does not mutate them
        self.saved_parameters.append(copy.deepcopy(self.state_dict()))

# create the model a few lines later
model = NNM(28, 100)
Here, the list saved_parameters and the method save_parameters let me save the model's parameters to a list at chosen points during training.
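For example, I take a snapshot every few epochs from my training loop and can restore one later with load_state_dict (a sketch; inputs, targets and the optimizer settings stand in for my real setup):

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)  # inputs/targets are placeholders
    loss.backward()
    optimizer.step()
    if epoch % 10 == 0:
        model.save_parameters()  # keep a snapshot of this epoch's weights

# roll the model back to the first snapshot
model.load_state_dict(model.saved_parameters[0])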
Evaluating model.eval() (which returns the model itself) just shows the model's layers (input-hidden and hidden-output):
NNM(
(fc1): Linear(in_features=28, out_features=100, bias=True)
(fc2): Linear(in_features=100, out_features=1, bias=True)
)
What I need is a representation that also integrates the activation function that was used:
NNM(
(fc1): Linear(in_features=28, out_features=100, bias=True)
(fc2): Linear(in_features=100, out_features=1, bias=True)
(act_func): Sigmoid(fc1-fc2)
)
Alternatively, a simpler solution would be to save the activation function together with its related layers in a dictionary.
The simplest way to get an explicit structure with a sequential architecture is to define all layers (including any linear or flattening layers) in an nn.Sequential module:
import copy

import torch.nn as nn

class NNM(nn.Sequential):
    def __init__(self, num_features, num_hidden):
        super().__init__(
            nn.Linear(num_features, num_hidden),
            nn.Sigmoid(),
            nn.Linear(num_hidden, 1))
        self.saved_parameters = []

    def save_parameters(self):
        # snapshot the current weights, as in your original class
        self.saved_parameters.append(copy.deepcopy(self.state_dict()))
Since nn.Sequential already implements forward (it simply chains the modules in order), you no longer need to write one. Then your instance representation will be explicit:
>>> NNM(28, 100)
NNM(
(0): Linear(in_features=28, out_features=100, bias=True)
(1): Sigmoid()
(2): Linear(in_features=100, out_features=1, bias=True)
)
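If you want named entries in the representation, closer to the (act_func) label you sketched, nn.Sequential also accepts an OrderedDict of named modules. A minimal sketch (the names fc1, act_func and fc2 are just illustrative):

from collections import OrderedDict

import torch.nn as nn

model = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(28, 100)),
    ('act_func', nn.Sigmoid()),
    ('fc2', nn.Linear(100, 1)),
]))

whose representation shows each module under its chosen name:

>>> model
Sequential(
  (fc1): Linear(in_features=28, out_features=100, bias=True)
  (act_func): Sigmoid()
  (fc2): Linear(in_features=100, out_features=1, bias=True)
)

This keeps the activation function stored right next to the layers it sits between.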