I am trying to find out which factors affect knowledge of making juice. I have identified multinomial logistic regression as a suitable method.
I am following the example available here: https://stats.oarc.ucla.edu/r/dae/multinomial-logistic-regression/
I have loaded the required packages as per the tutorial, but there are things I am unsure of.
Is my data format okay, and can I go ahead with the analysis? The tutorial indicates that the multinom function does not require the data to be reshaped, so I think it is okay.
Knowledge of juice (Yes, No or Use To) is the outcome variable with three categories, whereas Age, Gender, Education and Access to resources are the predictor variables. I hope I am correct.
After loading the packages, I loaded the data and previewed it in RStudio.
The next step in the tutorial is to run the code below:
ml$prog2 <- relevel(ml$prog, ref = "academic")
However, I am stuck because I am not sure which category should be the reference in my case, or what the reference really means.
I tried the following:
> Juice$prog2 <- relevel(Juice$Juice_knwolege, ref = "Gender")
Error in relevel.default(Juice$Juice_knwolege, ref = "Gender") :
  'relevel' only for (unordered) factors
I then ran the code as follows:
Juice <- read.csv("C:/Users/Danny/Desktop/R_tests/Juice.csv")
Juice
Juice$Juice_knwolege<- as.factor(Juice$Juice_knwolege)
Juice$Juice <- relevel(Juice$Juice_knwolege, ref = "Yes")
library(nnet)
factors <- multinom(Juice_knwolege ~ Age + Gender + Education + Access_to_resources, data = Juice)
> summary(factors)
Call:
multinom(formula = Juice_knwolege ~ Age + Gender + Education +
Access_to_resources, data = Juice)
Coefficients:
       (Intercept)      Age GenderMale EducationSecondary EducationTertiary
Use To   -120.3615 2.479032   21.80543           74.89392          4.281682
Yes      -172.8887 5.132177 -375.50862           12.05444         77.662962
       Access_to_resourcesYes
Use To              -186.6950
Yes                  111.3332
Std. Errors:
        (Intercept)          Age   GenderMale EducationSecondary EducationTertiary
Use To 2.031427e-24 7.805099e-15 4.112871e-14                NaN               NaN
Yes    3.625640e+02 5.990289e+01 2.498731e-67       1.780224e-46      1.550035e-59
       Access_to_resourcesYes
Use To                    NaN
Yes                   362.564
Residual Deviance: 0.0001394772
AIC: 24.00014
Warning message:
In sqrt(diag(vc)) : NaNs produced
Is this correct? I do not yet know how to interpret the results, and so far they do not appear to make sense. Nowhere does the output show females, primary education, the age, or the No category for juice knowledge.
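I think you need to make sure Juice_knwolege is a factor before you can relevel it:
Juice$Juice_knwolege <- as.factor(Juice$Juice_knwolege)
Then set a reference category. It must be one of the levels of the outcome ("Yes", "No" or "Use To"), not the name of a predictor such as "Gender"; the coefficients for the other levels are then reported relative to it: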
Juice$prog2 <- relevel(Juice$Juice_knwolege, ref = "Yes") # Assuming you want 'Yes' as reference
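To see what relevel() is doing, here is a small sketch on made-up values. The vector `knowledge` is hypothetical and only stands in for your Juice_knwolege column; it is not your data.

```r
# Hypothetical toy values standing in for the Juice_knwolege column
knowledge <- c("Yes", "No", "Use To", "No", "Yes")

# relevel() on a plain character vector fails, as in the question
res <- try(relevel(knowledge, ref = "Yes"), silent = TRUE)
inherits(res, "try-error")

# Convert to a factor first; levels are sorted alphabetically,
# so "No" comes first and is the default reference
knowledge_f <- as.factor(knowledge)
levels(knowledge_f)

# 'ref' has to be an existing level; a predictor name such as "Gender" is not one
bad <- try(relevel(knowledge_f, ref = "Gender"), silent = TRUE)
inherits(bad, "try-error")

# A valid level is moved to the front and becomes the reference
knowledge_yes <- relevel(knowledge_f, ref = "Yes")
levels(knowledge_yes)
```

Which level you pick is up to you: it is simply the category every other category gets compared against, so choose the one that makes the comparisons easiest to explain.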
Then run your regression, using the releveled variable as the outcome:
library(nnet)
model <- multinom(prog2 ~ Age + Gender + Education + Access_to_resources, data = Juice)
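On why Female, Primary and one outcome category never show up: those are the reference levels, and every reported coefficient is a comparison against them. Below is a rough sketch on simulated data (the data frame `toy` and its columns are invented for illustration, not taken from your file) that also computes z statistics and p-values the way the UCLA tutorial does.

```r
library(nnet)

# Simulated, hypothetical data; the values carry no real meaning
set.seed(1)
n <- 300
toy <- data.frame(
  knowledge = factor(sample(c("No", "Use To", "Yes"), n, replace = TRUE)),
  Age       = round(runif(n, 18, 70)),
  Gender    = factor(sample(c("Female", "Male"), n, replace = TRUE))
)

# Make "Yes" the reference outcome category
toy$knowledge <- relevel(toy$knowledge, ref = "Yes")

fit <- multinom(knowledge ~ Age + Gender, data = toy, trace = FALSE)

coefs <- summary(fit)$coefficients     # one row per non-reference outcome level
ses   <- summary(fit)$standard.errors

# Wald z statistics and two-sided p-values
z <- coefs / ses
p <- 2 * (1 - pnorm(abs(z)))

# Exponentiated coefficients: relative risk ratios versus the reference outcome
rrr <- exp(coefs)

rownames(coefs)   # "Yes" is absent because it is the reference outcome
colnames(coefs)   # only "GenderMale" appears; "Female" is the reference gender
```

Separately, the output in your question has a residual deviance of almost 0 and an AIC of about 24, which is 2 times the 12 estimated coefficients. Together with the huge coefficients and NaN standard errors, that pattern usually means the model fits the sample perfectly, which tends to happen when there are too few rows for the number of coefficients. I can't see your data, so this is a guess, but it is worth checking nrow(Juice) and table(Juice$Juice_knwolege) before interpreting anything.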