
Multinomial logistic regression in RStudio


I am trying to identify the factors that affect knowledge of juice making. I have identified multinomial logistic regression as a suitable method.

Here is my original data (the image is not reproduced here).

I am following the example available here: https://stats.oarc.ucla.edu/r/dae/multinomial-logistic-regression/

I have loaded the required packages as per the tutorial; however, there are things I am unsure of.

Is my data format okay, and can I go ahead with the analysis? The tutorial indicates that the multinom function does not require the data to be reshaped, so I think the format is okay.

Knowledge of juice (Yes, No or Use To) is the outcome variable, with three categories, whereas Age, Gender, Education and Access to resources are the predictor variables. I hope I am correct.

After loading the packages, I loaded the data and previewed it in RStudio.

The next step, following the tutorial, is to run the code below:

ml$prog2 <- relevel(ml$prog, ref = "academic")

However, I am stuck because I am not sure which category should be the reference in my case, or what the reference really means.

I tried the following:

> Juice$prog2 <- relevel(Juice$Juice_knwolege, ref = "Gender")
Error in relevel.default(Juice$Juice_knwolege, ref = "Gender") :
  'relevel' only for (unordered) factors

I then ran the code as follows:

Juice <- read.csv("C:/Users/Danny/Desktop/R_tests/Juice.csv")
Juice
Juice$Juice_knwolege<- as.factor(Juice$Juice_knwolege)
Juice$Juice <- relevel(Juice$Juice_knwolege, ref = "Yes")
library(nnet)
factors <- multinom(Juice_knwolege ~ Age + Gender + Education + Access_to_resources, data = Juice)

> summary(factors)
Call:
multinom(formula = Juice_knwolege ~ Age + Gender + Education + 
    Access_to_resources, data = Juice)

Coefficients:
       (Intercept)      Age GenderMale EducationSecondary EducationTertiary
Use To   -120.3615 2.479032   21.80543           74.89392          4.281682
Yes      -172.8887 5.132177 -375.50862           12.05444         77.662962
       Access_to_resourcesYes
Use To              -186.6950
Yes                  111.3332

Std. Errors:
        (Intercept)          Age   GenderMale EducationSecondary EducationTertiary
Use To 2.031427e-24 7.805099e-15 4.112871e-14                NaN               NaN
Yes    3.625640e+02 5.990289e+01 2.498731e-67       1.780224e-46      1.550035e-59
       Access_to_resourcesYes
Use To                    NaN
Yes                   362.564

Residual Deviance: 0.0001394772 
AIC: 24.00014 
Warning message:
In sqrt(diag(vc)) : NaNs produced

Is this correct? I do not know how to interpret the results yet. I am trying to make sense of them, but they do not appear to make sense. Nowhere does the output show females, primary education, the age, or the No category for Juice knowledge.


Solution

  • I think you need to make sure Juice_knwolege is a factor before you can relevel it:

    Juice$Juice_knwolege<- as.factor(Juice$Juice_knwolege)
    

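    Then set a reference category. It must be one of the levels of the outcome itself (Yes, No or Use To), not the name of a predictor such as Gender: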

    Juice$prog2 <- relevel(Juice$Juice_knwolege, ref = "Yes") # Assuming you want 'Yes' as reference
    

    Then run the regression on the releveled outcome:

    library(nnet)
    model <- multinom(prog2 ~ Age + Gender + Education + Access_to_resources, data = Juice)
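
To see what the reference category does, here is a minimal base R sketch. The `toy` data frame and its values are hypothetical and made up for illustration; only the column names follow the question. It reproduces the `relevel` error on a character column, shows how the reference level changes, and lists the dummy columns R builds for the predictors.

```r
# Hypothetical toy data: the values are invented for illustration only.
toy <- data.frame(
  Juice_knwolege = c("No", "Yes", "Use To", "Yes", "No", "Use To"),
  Gender         = c("Female", "Male", "Male", "Female", "Male", "Female"),
  Education      = c("Primary", "Secondary", "Tertiary",
                     "Primary", "Tertiary", "Secondary"),
  stringsAsFactors = FALSE
)

# relevel() on a character column fails, as in the question.
err <- tryCatch(relevel(toy$Juice_knwolege, ref = "Yes"),
                error = function(e) conditionMessage(e))
print(err)

# After conversion, levels are sorted and the first one is the reference.
toy$Juice_knwolege <- as.factor(toy$Juice_knwolege)
print(levels(toy$Juice_knwolege))

# relevel() moves the chosen outcome level to the front.
toy$prog2 <- relevel(toy$Juice_knwolege, ref = "Yes")
print(levels(toy$prog2))

# Predictors are dummy-coded the same way: the first level gets no column.
toy$Gender    <- as.factor(toy$Gender)
toy$Education <- as.factor(toy$Education)
design_cols <- colnames(model.matrix(~ Gender + Education, data = toy))
print(design_cols)
```

This is why the output in the question has no row for No and no columns for Female or Primary: each is the first level of its factor, so it serves as the baseline, and every coefficient is read as a comparison against it. Separately, the near-zero residual deviance and the NaN standard errors in that output usually mean the model reproduces the sample exactly. This tends to happen when there are very few rows relative to the 12 coefficients being estimated, so it may be worth checking the number of observations before interpreting anything.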