python, matplotlib, scikit-learn

Why does plotting errors vs. actual values with PredictionErrorDisplay raise a ValueError?


I have trained a random forest regression model using sklearn and used it to make predictions on a test dataset. Naturally, the predicted values differ from the actual values; in this case, the model's Mean Absolute Error and Mean Squared Error are quite high.

I want to visualise the errors, so that I can understand whether the errors are consistently large, or whether there are just a few unusually large errors driving up the averages.

I'm trying to use sklearn's PredictionErrorDisplay class to do this, but the following code raises "ValueError: Unable to coerce to Series, length must be 1":

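import matplotlib.pyplot as plt
from sklearn.metrics import PredictionErrorDisplay

# test_targets holds the actual values, test_predictions the model's predictions
errors = PredictionErrorDisplay(y_true=test_targets, y_pred=test_predictions)
errors.plot()
plt.savefig('Output.png')
plt.clf()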

Does anyone know how I can resolve this please? My reading of the error is that I need to convert the object PredictionErrorDisplay creates into a different format, but I'm not sure how to do that, or what the format needs to be exactly.


Solution

  • The solution has been supplied by rayan2338.

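    The problem was that test_targets was a pandas DataFrame, which PredictionErrorDisplay could not work with here. A single-column DataFrame is two-dimensional (shape (n, 1)), whereas test_predictions is presumably a one-dimensional array (shape (n,)). The error message appears to come from pandas when the two are combined during plotting, so the fix is to convert the targets to a 1-D array; the PredictionErrorDisplay object itself does not need converting.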

    The following code produces the required result:

    import numpy as np

    # Convert the DataFrame to a 1-D NumPy array so it matches test_predictions
    test_targets_array = np.array(test_targets).flatten()
    errors = PredictionErrorDisplay(y_true=test_targets_array, y_pred=test_predictions)
    errors.plot()
    plt.savefig('Output.png')
    plt.clf()