I am reading a neural network from here and I don't completely understand the forward pass.
Why, at some point, do we compute x = F.relu(x1) + F.relu(x2) + F.relu(x3)?
It seems that the input of the linear layer lin1 is the sum (!) of the outputs of the previous three stages. This seems strange to me, since I would expect a layer to be fed only the output of the immediately preceding layer.
def forward(self, data):
    x, edge_index, batch = data.x, data.edge_index, data.batch
    edge_attr = None

    # Stage 1: convolution, pooling, then a graph-level readout (max pool + mean pool)
    x = F.relu(self.conv1(x, edge_index, edge_attr))
    x, edge_index, edge_attr, batch = self.pool1(x, edge_index, edge_attr, batch)
    x1 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)

    # Stage 2: same pattern on the coarsened graph
    x = F.relu(self.conv2(x, edge_index, edge_attr))
    x, edge_index, edge_attr, batch = self.pool2(x, edge_index, edge_attr, batch)
    x2 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)

    # Stage 3: final convolution and readout (no further pooling)
    x = F.relu(self.conv3(x, edge_index, edge_attr))
    x3 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)

    # Fuse the three graph-level readouts into one representation
    x = F.relu(x1) + F.relu(x2) + F.relu(x3)

    # Classifier head
    x = F.relu(self.lin1(x))
    x = F.dropout(x, p=self.dropout_ratio, training=self.training)
    x = F.relu(self.lin2(x))
    x = F.dropout(x, p=self.dropout_ratio, training=self.training)
    x = F.log_softmax(self.lin3(x), dim=-1)
    return x
What is summed here is not the raw output of three consecutive layers. Each of x1, x2, and x3 is a graph-level readout: the concatenation of a global max pooling (gmp) and a global mean pooling (gap) over the node features, taken after the first, second, and third conv/pool stage respectively. All three readouts therefore have the same shape, so they can be added elementwise, and their sum fuses information captured at three different depths into one combined representation before it is passed to lin1. This kind of multi-level fusion is common in hierarchical or graph-based models, where shallower and deeper stages capture different structural or contextual information.
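To make the shapes concrete, here is a minimal, self-contained sketch of the same readout-and-sum pattern. It assumes PyTorch Geometric's GCNConv, global_max_pool, and global_mean_pool; the class name, layer sizes, and the choice of GCNConv are illustrative, not the original model's, and the pooling layers are omitted for brevity.

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_max_pool as gmp, global_mean_pool as gap

class MultiLevelReadout(torch.nn.Module):
    def __init__(self, in_dim=8, hidden=16, num_classes=3):  # sizes are illustrative
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.conv3 = GCNConv(hidden, hidden)
        # 2 * hidden because each readout concatenates a max-pool and a mean-pool vector
        self.lin1 = torch.nn.Linear(2 * hidden, num_classes)

    def forward(self, x, edge_index, batch):
        x = F.relu(self.conv1(x, edge_index))
        x1 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)  # readout at depth 1
        x = F.relu(self.conv2(x, edge_index))
        x2 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)  # readout at depth 2
        x = F.relu(self.conv3(x, edge_index))
        x3 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)  # readout at depth 3
        # All readouts have shape [num_graphs, 2 * hidden], so an elementwise sum is valid.
        x = F.relu(x1) + F.relu(x2) + F.relu(x3)
        return F.log_softmax(self.lin1(x), dim=-1)

# Toy usage: a single graph with 4 nodes and a couple of (bidirectional) edges.
x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 0, 3, 2]])
batch = torch.zeros(4, dtype=torch.long)         # every node belongs to graph 0
out = MultiLevelReadout()(x, edge_index, batch)  # -> shape [1, 3]

Because each readout is already a fixed-size graph-level vector, summing them keeps the input dimension of lin1 unchanged; concatenating them instead would be a common alternative if you wanted lin1 to weight each depth separately.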