python, pandas, machine-learning, logistic-regression, statsmodels

How does prediction for Ordered logit regression work?


I am learning about ordered logit regression and I was wondering how the prediction works mathematically, and how I can do it in Python by myself. I know that in Python I can simply use predict, but I was wondering how I can make a prediction using only the coefficients from model.summary().

import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel


data = pd.DataFrame({
    'score': [3.2, 4.5, 5.6, 6.7, 7.8, 8.9, 9.1],
    'rating': [1,2,3,4,5,6,6]  
})

X = data[['score']]
y = data['rating']


ordinal_model = OrderedModel(y, X, distr='logit')


ordinal_results = ordinal_model.fit(method='bfgs')


print(ordinal_results.summary())

The outcome is:

Time:                        17:05:52                                         
No. Observations:                   7                                         
Df Residuals:                       1                                         
Df Model:                           1                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
score         66.3902   5669.125      0.012      0.991    -1.1e+04    1.12e+04
1/2          285.5835   2.56e+04      0.011      0.991   -4.98e+04    5.04e+04
2/3            4.2698     88.656      0.048      0.962    -169.493     178.032
3/4            4.1879    155.834      0.027      0.979    -301.241     309.617
4/5            4.3867    136.765      0.032      0.974    -263.668     272.442
5/6            3.4706    220.734      0.016      0.987    -429.161     436.102
==============================================================================

Using the coef vector, how do I get the same output as in

ordinal_results.model.predict(ordinal_results.params, exog = (4.3))
[[0.5264086 0.4735914 0.        0.        0.        0.       ]]

I thought that I should simply apply a softmax to the linear combination of the coefficients and the new data, but that didn't work.


Solution

  • You suggested the "linear sum of coef and new data", which is correct, but since you only have one feature it's just the coefficient times the new data value:

    66.3902 * 4.3
    >>> 285.47786
    

    But the other entries of the coef column aren't really coefficients in the traditional sense (and there is no softmax); instead they represent cutoffs for the discrete targets.

    The prediction from the linear model (285.48 above) is taken as the location of a latent variable y. Because the model was fit with distr='logit', y follows a standard logistic distribution centered at that value (the default, distr='probit', would use a standard normal instead), and the probability of each target is the probability that y falls between the associated cutoffs.

    The summary does not label the cutoffs very clearly: the first non-coefficient row (1/2) is the first cutoff as reported, while the remaining rows are the logs of the increments between consecutive cutoffs (statsmodels exponentiates them so the cutoffs are guaranteed to be increasing; the model's transform_threshold_params method converts the raw params into the actual cutoffs). In general

    p_k = P(cut_{k-1} < y < cut_k) = F(cut_k - x*b) - F(cut_{k-1} - x*b)

    where F is the logistic CDF here. For the example above:

    p_1 = P(y < 285.5835) ~= 0.5264086
    p_2 = P(285.5835 < y < 285.5835 + exp(4.2698)) ~= 0.4735914
    p_3 = P(285.5835 + exp(4.2698) < y < 285.5835 + exp(4.2698) + exp(4.1879)) ~= 0
    etc., as reproduced in the sketch below.
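
    To verify this, here is a minimal sketch that rebuilds the predicted probabilities by hand from ordinal_results.params; it assumes the threshold parameterization described above (first cutoff as reported, subsequent ones as exponentiated increments) and uses scipy's logistic CDF, since the model was fit with distr='logit'.

    import numpy as np
    from scipy.stats import logistic

    params = ordinal_results.params.values  # [exog coefficients..., threshold params...]
    n_features = 1                          # only 'score' in this model

    beta = params[:n_features]              # regression coefficient(s)
    th_raw = params[n_features:]            # raw threshold parameters from the summary

    # The first cutoff is reported directly; the rest are log-increments,
    # so exponentiate them and take a cumulative sum to get increasing cutoffs.
    cutoffs = np.cumsum(np.concatenate((th_raw[:1], np.exp(th_raw[1:]))))
    cutoffs = np.concatenate(([-np.inf], cutoffs, [np.inf]))

    x_new = np.array([4.3])
    xb = x_new @ beta                       # linear prediction, ~285.478

    # P(rating = k) = F(cut_k - xb) - F(cut_{k-1} - xb), with F the logistic CDF
    probs = np.diff(logistic.cdf(cutoffs - xb))
    print(probs)
    # roughly [0.5264086, 0.4735914, 0., 0., 0., 0.]

    As a cross-check, ordinal_results.model.transform_threshold_params(ordinal_results.params) should return the same cutoffs (including the -inf/+inf endpoints), and ordinal_results.model.predict(ordinal_results.params, exog=(4.3)) should match probs.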