I am trying to run a negative binomial regression on the following:
df <- structure(list(Year = c("2018", "2018", "2018", "2018", "2018",
"2018", "2018", "2018", "2018", "2018", "2018", "2018", "2019",
"2019", "2019", "2019", "2019", "2019"), Month = c("1", "10",
"11", "12", "2", "3", "4", "5", "6", "7", "8", "9", "1", "2",
"3", "4", "5", "6"), count = c(109L, 91L, 73L, 74L, 94L, 113L,
92L, 100L, 114L, 111L, 106L, 86L, 116L, 92L, 94L, 84L, 78L, 98L
), year_mon = c("2018 - 1", "2018 - 10", "2018 - 11", "2018 - 12",
"2018 - 2", "2018 - 3", "2018 - 4", "2018 - 5", "2018 - 6", "2018 - 7",
"2018 - 8", "2018 - 9", "2019 - 1", "2019 - 2", "2019 - 3", "2019 - 4",
"2019 - 5", "2019 - 6")), row.names = c(NA, -18L), groups = structure(list(
Year = c("2018", "2019"), .rows = structure(list(1:12, 13:18), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = 1:2, class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
I'm assuming negative binomial is the best regression technique for these data other than Poisson regression. However, I run the following:
library(MASS)
summary(glm.nb(count ~ year_mon, data=df))
...and get this error:
Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
missing value where TRUE/FALSE needed
I'm not sure what exactly I am doing wrong here. I'm not particularly attached to the negative binomial for this, but I want another model to compare against besides Poisson, and this looks like a good fit.
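As @rawr says, you need to convert the predictor variable to some kind of numeric value. Otherwise you have one point per level of the categorical predictor (18 observations, 18 levels), so the model fits every count exactly and there is presumably no residual variation left from which to estimate the dispersion parameter. This works, for example: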
glm.nb(count~as.numeric(factor(year_mon)), data=df)
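... although it's probably better/more readable to modify the variable inside your data frame first (or create a new variable inside the data frame) rather than doing the conversion on the fly. Note also that factor() orders its levels alphabetically, so "2018 - 10" sorts before "2018 - 2" and the resulting index is not strictly chronological.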
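As a minimal sketch of the "create a new variable first" suggestion, the following rebuilds the counts from the question as a plain data frame and adds a chronological month index. The names df2, time_idx, fit_nb and fit_pois are hypothetical, chosen for illustration only, and the Poisson fit is included only because the question asks for a model to compare against.

```r
library(MASS)  # provides glm.nb()

# Same Year/Month/count values as in the question, as an ungrouped data frame
# (df2 is a hypothetical name, used so the original df is left untouched)
df2 <- data.frame(
  Year  = rep(c("2018", "2019"), times = c(12, 6)),
  Month = c("1", "10", "11", "12", "2", "3", "4", "5", "6", "7", "8", "9",
            "1", "2", "3", "4", "5", "6"),
  count = c(109L, 91L, 73L, 74L, 94L, 113L, 92L, 100L, 114L, 111L, 106L, 86L,
            116L, 92L, 94L, 84L, 78L, 98L)
)

# Hypothetical helper column: months elapsed since the start of 2018
# (1 = January 2018, 13 = January 2019), so the order is chronological
df2$time_idx <- (as.integer(df2$Year) - 2018L) * 12L + as.integer(df2$Month)

# Sort by time for readability; the fit itself does not depend on row order
df2 <- df2[order(df2$time_idx), ]

# Negative binomial fit with a single numeric predictor
fit_nb <- glm.nb(count ~ time_idx, data = df2)
summary(fit_nb)

# Poisson fit on the same predictor, for comparison
fit_pois <- glm(count ~ time_idx, family = poisson, data = df2)
AIC(fit_pois, fit_nb)
```

Treating time as a single numeric trend is only one possible choice; it leaves 16 residual degrees of freedom, whereas one factor level per month leaves none.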