python pytorch pycaret skorch

Using SKORCH with PyCaret for Regression problems


The fantastic article https://towardsdatascience.com/pycaret-skorch-build-pytorch-neural-networks-using-minimal-code-57079e197f33 has a great example of using SKORCH with PyCaret for classification problems, but I am having trouble getting the same approach to work for regression problems.

import pycaret
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from skorch import NeuralNetRegressor
from sklearn.pipeline import Pipeline
from skorch.helper import DataFrameTransformer
from pycaret.regression import *
from pycaret.datasets import get_data

data = get_data('boston')
target = "medv"

reg1 = setup(data = data, 
            target = target,
            train_size = 0.8,
            fold = 5,
            session_id = 123,
            silent = True)

class RegressorModule(nn.Module):
    def __init__(
            self,
            num_units=100,
            nonlin=F.relu,
    ):
        super(RegressorModule, self).__init__()
        self.num_units = num_units
        self.nonlin = nonlin

        self.dense0 = nn.Linear(14, num_units)
        self.dense1 = nn.Linear(num_units, 10)
        self.output = nn.Linear(10, 1)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = F.relu(self.dense1(X))
        X = self.output(X)
        return X

net_regr = NeuralNetRegressor(
    RegressorModule,
    max_epochs=20,
    lr=0.1,
    device='cuda'
)

nn_pipe = Pipeline(
    [
        ("transform", DataFrameTransformer()),
        ("net", net_regr),
    ]
)

skorch_model = create_model(nn_pipe)

But it errors with:

ValueError: The target data shouldn't be 1-dimensional but instead have 2 dimensions, with the second dimension having the same size as the number of regression targets (usually 1). Please reshape your target data to be 2-dimensional (e.g. y = y.reshape(-1, 1).

If I take the same data, normalise it, reshape it, etc., and pass it straight to SKORCH, it works fine, like so:

X = data.copy().to_numpy().astype(np.float32)
mean = X.mean(axis=0)
X -= mean
std = X.std(axis=0)
X /= std

y = data[target].to_numpy().astype(np.float32)
y = y.reshape(-1, 1)
net_regr.fit(X, y)


So the problem seems to be somewhere in how the PyCaret (DataFrame-based) data is converted by SKORCH for use in PyTorch: y stays one-dimensional, which is fine for the classification model in the article linked above, but not for regression, where it needs to be 2D. Is there any way I can intercept / transform y?

Thanks :)


Solution

  • A related approach is mentioned in "Pytorch Dataset with Skorch", but it will not solve this problem. Instead, if you override the fit method of NeuralNetRegressor like:

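    class MyNet(NeuralNetRegressor):
        def fit(self, X, y, **fit_params):
            # PyCaret passes y as a 1-D pandas Series, while skorch's
            # regressor expects shape (n_samples, 1). np.asarray also
            # accepts plain NumPy arrays, which have no .values attribute.
            # float32 matches the default dtype of the module's output.
            y = np.asarray(y, dtype=np.float32)
            if y.ndim == 1:
                y = y.reshape(-1, 1)
            return super().fit(X, y, **fit_params)

    net_regr = MyNet(
        RegressorModule,
        max_epochs=20,
        lr=0.1,
        train_split=None
    )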
    

    it should work.
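To see what the reshape inside fit does on its own, here is a minimal sketch using only NumPy. The helper name to_2d_float32 is hypothetical (it is not part of skorch or PyCaret), and the three sample values are just illustrative targets; the float32 cast is an assumption based on PyTorch modules producing float32 outputs by default.

```python
import numpy as np


def to_2d_float32(y):
    """Hypothetical helper: return y as a float32 array of shape (n_samples, 1)."""
    arr = np.asarray(y, dtype=np.float32)  # also accepts lists and pandas Series
    if arr.ndim == 1:
        # A 1-D target of length n becomes an (n, 1) column.
        arr = arr.reshape(-1, 1)
    return arr


y_1d = np.array([24.0, 21.6, 34.7])  # 1-D target, as PyCaret would pass it
y_2d = to_2d_float32(y_1d)

print(y_1d.shape, y_1d.dtype)  # (3,) float64
print(y_2d.shape, y_2d.dtype)  # (3, 1) float32
```

Targets that are already 2-D pass through with their shape unchanged, so calling the helper on data you reshaped yourself should be harmless.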