python, pandas, machine-learning, logistic-regression, statsmodels

How does prediction for Ordered logit regression work?


I am learning about ordered logit regression and I was wondering how the prediction works mathematically, and how I can do it in Python by myself. I know that in Python I can simply use predict, but I was wondering how I can make a prediction using only the coefficients from model.summary().

import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel


data = pd.DataFrame({
    'score': [3.2, 4.5, 5.6, 6.7, 7.8, 8.9, 9.1],
    'rating': [1,2,3,4,5,6,6]  
})

X = data[['score']]
y = data['rating']


ordinal_model = OrderedModel(y, X, distr='logit')


ordinal_results = ordinal_model.fit(method='bfgs')


print(ordinal_results.summary())

The outcome is:

Time:                        17:05:52                                         
No. Observations:                   7                                         
Df Residuals:                       1                                         
Df Model:                           1                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
score         66.3902   5669.125      0.012      0.991    -1.1e+04    1.12e+04
1/2          285.5835   2.56e+04      0.011      0.991   -4.98e+04    5.04e+04
2/3            4.2698     88.656      0.048      0.962    -169.493     178.032
3/4            4.1879    155.834      0.027      0.979    -301.241     309.617
4/5            4.3867    136.765      0.032      0.974    -263.668     272.442
5/6            3.4706    220.734      0.016      0.987    -429.161     436.102
==============================================================================

Using the coef vector, how do I get the same output as in

ordinal_results.model.predict(ordinal_results.params, exog = (4.3))
[[0.5264086 0.4735914 0.        0.        0.        0.       ]]

I thought that I should simply apply a softmax to the linear combination of the coefficients and the new data, but that didn't work.


Solution

  • You suggested the "linear sum of coef and new data", which is correct, but since you only have one feature it's just the coefficient times the new data value:

    66.3902 * 4.3
    >>> 285.47786
    

    But the other entries of the coef column aren't really coefficients in the traditional sense (and there is no softmax); instead they represent cutoffs for the discrete targets.

    The prediction from the linear model (285.48 above) is taken as the location of a latent variable y. Because the model was fit with distr='logit', y follows a standard logistic distribution centered at that value (the default, distr='probit', would use a standard normal instead), and the probability of each target is the probability that y falls between the associated cutoffs.

    The summary does not label the cutoffs very clearly: the first non-coefficient row (1/2) is the first cutoff as reported, while the remaining rows are the logs of the increments between consecutive cutoffs (statsmodels exponentiates them so the cutoffs are guaranteed to be increasing; the model's transform_threshold_params method converts the raw params into the actual cutoffs). In general

    p_k = P(cut_{k-1} < y < cut_k) = F(cut_k - x*b) - F(cut_{k-1} - x*b)

    where F is the logistic CDF here. For the example above:

    p_1 = P(y < 285.5835) ~= 0.5264086
    p_2 = P(285.5835 < y < 285.5835 + exp(4.2698)) ~= 0.4735914
    p_3 = P(285.5835 + exp(4.2698) < y < 285.5835 + exp(4.2698) + exp(4.1879)) ~= 0
    etc., as reproduced in the sketch below.
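
    To verify this, here is a minimal sketch that rebuilds the predicted probabilities by hand from ordinal_results.params; it assumes the threshold parameterization described above (first cutoff as reported, subsequent ones as exponentiated increments) and uses scipy's logistic CDF, since the model was fit with distr='logit'.

    import numpy as np
    from scipy.stats import logistic

    params = ordinal_results.params.values  # [exog coefficients..., threshold params...]
    n_features = 1                          # only 'score' in this model

    beta = params[:n_features]              # regression coefficient(s)
    th_raw = params[n_features:]            # raw threshold parameters from the summary

    # The first cutoff is reported directly; the rest are log-increments,
    # so exponentiate them and take a cumulative sum to get increasing cutoffs.
    cutoffs = np.cumsum(np.concatenate((th_raw[:1], np.exp(th_raw[1:]))))
    cutoffs = np.concatenate(([-np.inf], cutoffs, [np.inf]))

    x_new = np.array([4.3])
    xb = x_new @ beta                       # linear prediction, ~285.478

    # P(rating = k) = F(cut_k - xb) - F(cut_{k-1} - xb), with F the logistic CDF
    probs = np.diff(logistic.cdf(cutoffs - xb))
    print(probs)
    # roughly [0.5264086, 0.4735914, 0., 0., 0., 0.]

    As a cross-check, ordinal_results.model.transform_threshold_params(ordinal_results.params) should return the same cutoffs (including the -inf/+inf endpoints), and ordinal_results.model.predict(ordinal_results.params, exog=(4.3)) should match probs.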