I have to do some work in R for university and I'm stuck. When I use the predicts() function from the glm.predict package, I keep getting an error message that I don't understand. Code that I got from others, or even from AI, doesn't work either.
The data is a Stata .dta dataset.
First, I had to estimate a multinomial regression and convert three of my independent variables (female, class and edlevel) as well as the dependent variable vote to factors.
dput(df_Wahlen)
library(dplyr)  # provides mutate()
df_Wahlen <- df_Wahlen |>
  mutate(
    edlevel = factor(edlevel),
    female = factor(female, labels = c("Mann", "Frau")),
    class = factor(class, labels = c("Mittelschicht", "Arbeiterklasse", "Keine")),
    # levels = c(2,1,3,4,5) makes "Conservative" the reference category
    vote = factor(vote, levels = c(2,1,3,4,5), labels = c("Conservative", "Labour", "LibDem", "Green", "Reform"))
  )
Now, when I try to predict probabilities from the multinomial regression model with predicts() and also compute discrete changes, R tells me either that a subscript is out of bounds or that I specified a numeric value for a factor variable.
library(nnet)  # provides multinom()
model_mlogit <- multinom(vote ~ edlevel + female + class + age + leftright, data = df_Wahlen)
df_pred2 <- predicts(model_mlogit, "0;0;2;20;6", type = "simulation", sim.count = 10000, set.seed = 1848)
R tells me this:
Error in variable[, 1] : subscript out of bounds
When I use the new level labels instead:
df_pred2 <- predicts(model_mlogit, "0;Mann;2;20;6", type = "simulation", sim.count = 10000, set.seed = 1848)
R tells me this:
Error in getValues(values, data) :
female is specified as numeric in the values argument, but it is a factor/character.
And when I try "min":
df_pred2 <- predicts(model_mlogit, "min;min;2;20;6", type = "simulation", sim.count = 10000, set.seed = 1848)
I get this:
Error in getValues(values, data) :
edlevel is specified as numeric in the values argument, but it is a factor/character.
I believe there is a simple solution, but since I am a beginner, I would appreciate some help. Thank you very much!
I don't have the glm.predict package installed on my PC, so I can't test this, but here is a possible solution.
Suppose you want a prediction for a 20-year-old man with the first education level, class "Keine" (the third level), and a leftright value of 5.
Then your code should look like this:
df_pred2 <- predicts(model_mlogit, "1;1;3;20;5", type = "simulation", sim.count = 10000, set.seed = 1848)
One possible issue in your code is that, if the leftright variable has only 5 levels, the value 6 in your call is out of range.
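Since I can't run glm.predict here, a package-independent way to check the predicted probabilities is predict() from nnet, with the scenario built as a data frame whose factor columns use the level labels. The sketch below uses simulated data as a stand-in for df_Wahlen: the names df_sim, model_sim and scenario are hypothetical, and the assumption that edlevel has three levels and leftright runs from 0 to 10 is mine, not something shown in the question. It also computes one discrete change (Frau minus Mann) by hand.

```r
library(nnet)  # multinom() and its predict() method

set.seed(1848)
n <- 600

# Simulated stand-in for df_Wahlen; all values are invented for illustration.
df_sim <- data.frame(
  edlevel   = factor(sample(1:3, n, replace = TRUE)),        # assumed: 3 levels
  female    = factor(sample(0:1, n, replace = TRUE), labels = c("Mann", "Frau")),
  class     = factor(sample(1:3, n, replace = TRUE),
                     labels = c("Mittelschicht", "Arbeiterklasse", "Keine")),
  age       = sample(18:90, n, replace = TRUE),
  leftright = sample(0:10, n, replace = TRUE),                # assumed: 0 to 10
  vote      = factor(sample(1:5, n, replace = TRUE), levels = c(2, 1, 3, 4, 5),
                     labels = c("Conservative", "Labour", "LibDem", "Green", "Reform"))
)

model_sim <- multinom(vote ~ edlevel + female + class + age + leftright,
                      data = df_sim, trace = FALSE)

# Scenario: factor values are given as level labels, numeric values as numbers.
# Two rows (Mann, Frau) so that a discrete change can be computed.
scenario <- data.frame(
  edlevel   = factor(levels(df_sim$edlevel)[1], levels = levels(df_sim$edlevel)),
  female    = factor(c("Mann", "Frau"), levels = levels(df_sim$female)),
  class     = factor("Arbeiterklasse", levels = levels(df_sim$class)),
  age       = 20,
  leftright = 6
)

# One row of predicted probabilities per scenario row, one column per party.
probs <- predict(model_sim, newdata = scenario, type = "probs")
rownames(probs) <- c("Mann", "Frau")

# Discrete change in each party's probability when female goes from Mann to Frau.
discrete_change <- probs["Frau", ] - probs["Mann", ]

print(round(probs, 3))
print(round(discrete_change, 3))
```

This gives point estimates only, without the simulated confidence intervals that predicts() reports. Two further notes, both hedged. In the question's mutate() call the five labels are attached to vote, while leftright enters the model unconverted, so it is worth checking with str(df_Wahlen) whether 6 is really out of range. And the error messages in the question suggest that predicts() rejects bare numbers for factor variables; if I read the glm.predict documentation correctly, a factor position is written as F(k) for the k-th level, so a values string such as "F(1);F(1);F(2);20;6" may be what it expects. I have not run that call.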