mlogit: Error in idx_name.dfidx(x) : More than one idx column


My data are in the data df:

data <- structure(list(personID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), 
        problem = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), choice = c("Right", 
        "Right", "Left", "Middle", "Left", "Middle", "Right", "Right","Right"), 
        valueLeft = c(42.0570675657767, 45.9309219826825, 67.9396886866177, 72.9432649788673, 49.4515911099392, 50.4063604914605, 27.930814474658, 59.6351010950862, 62.4926813609122), 
        valueMiddle = c(44.8488506631671, 46.7573345964733, 34.8353743300335, 53.0737324797719, 53.6425649799281, 22.8107239238157, 70.9399442391851, 61.5264080883177, 52.5180668579461), 
        valueRight = c(63.1719866895542, 49.6082608281623, 62.155490653361, 63.5991448059925, 35.8780394435125, 48.0016025122861, 44.7708824302011, 53.0087719413007, 46.0633500203071)), 
        row.names = c(NA, -9L), class = "data.frame")

I convert it from wide to long format with dfidx and store the result in the data.logit df:

data.logit <- dfidx(data, shape = "wide", 
                    choice = "choice", 
                    drop.index = TRUE,
                    id.var = "personID"
)

Then I build indepvar, which holds the value of the alternative that each row refers to, and append it to data.logit:

indepvar <- c()
for (i in 1:length(data.logit$idx$id2)) {
  ifelse(data.logit$idx$id2[i]=="Left", indepvar[i] <- 
           data.logit$valueLeft[i], 
         ifelse(data.logit$idx$id2[i]=="Middle", indepvar[i] <- 
                  data.logit$valueMiddle[i],
                ifelse(data.logit$idx$id2[i]=="Right", indepvar[i] <- 
                         data.logit$valueRight[i], ""
                )))}
indepvar <- data.frame(indepvar)
data.logit <- cbind(data.logit, indepvar)
remove(indepvar,i)

I then split data.logit by problem and put each of the resulting dfs into a list:

list_menus <- list()
for (i in 1:3) {
  list_menus[[paste0("Problem_",i)]] <- 
    dplyr::filter(data.logit, problem==i)
  remove(i)
}

Finally, I want to use mlogit to estimate the same model on each df in this list, with choice as the dependent and indepvar as the independent variable:

list_estimates <- list()
for (i in 1:length(list_menus)) {
  list_estimates[[i]] <- 
    mlogit(formula = choice ~ 1 + indepvar,
      data = list_menus[[i]],
       drop.index = TRUE,
       id.var = "personID")
  remove(i)
}

This should produce a list of mlogit estimation outputs, one for each df in list_menus. Instead, it fails with:

Error in idx_name.dfidx(x) : More than one idx column

This error did not occur with mlogit 1.1-1 and R 4.4.1 some months ago, but it does occur now with mlogit 1.1-3 and R 4.5.1.


Solution

  • The immediate problem is that calling cbind removes the dfidx class from data.logit, whereas assigning the column with $ keeps it:

    > data.logit$indepvar <- indepvar
    > idx_name(data.logit)
    idx 
      6 
    > class(data.logit)
    [1] "dfidx"      "data.frame"
    > test <- cbind(data.logit, indepvar)
    > class(test)
    [1] "data.frame"
    > idx_name(test)
    Error in UseMethod("idx_name") : 
      no applicable method for 'idx_name' applied to an object of class "data.frame"
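
    The class loss is not specific to dfidx. As a minimal base R sketch, using a made-up class name toy_idx as a stand-in for dfidx (an assumption for illustration only), cbind falls back to the data.frame method and returns a plain data.frame, while assigning with $ keeps the class attribute:

    ```r
    # "toy_idx" is a hypothetical subclass standing in for "dfidx"; no package needed
    x <- data.frame(a = 1:3)
    class(x) <- c("toy_idx", "data.frame")

    # cbind() dispatches to cbind.data.frame and rebuilds a plain data.frame
    y <- cbind(x, b = 4:6)
    class(y)   # "data.frame"

    # in-place assignment leaves the class attribute untouched
    x$b <- 4:6
    class(x)   # "toy_idx" "data.frame"
    ```

    This is why adding the column with data.logit$indepvar <- ... is the safer way to extend an indexed data frame.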
    

    In any case, I ran into all kinds of problems with your approach. I would instead recommend using the varying and sep arguments of dfidx, as in the Fishing example from the help page (https://rdrr.io/cran/mlogit/man/mlogit.html):

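    library(mlogit)  # also attaches dfidx

    data.logit <- dfidx(data, shape = "wide", 
                        choice = "choice", 
                        drop.index = TRUE,
                        varying = 4:6,  # valueLeft, valueMiddle, valueRight
                        sep = "ue"      # splits "valueLeft" into "val" and "Left"; compare "." in Fishing data set column names 2:9
    )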
    # full model on all problems
    summary(mlogit(formula = choice ~ 1 + val,
                   data = data.logit))
    
    # subset: problem 1 only
    mlogit(formula = choice ~ 1 + val,
           data = data.logit[data.logit$problem == 1, ])
    
    
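    # one model per problem, subsetting data.logit directly
    # (list_menus from the question is no longer needed)
    problems <- sort(unique(data.logit$problem))
    list_estimates <- list()
    for (i in problems) {
      list_estimates[[i]] <- 
        mlogit(formula = choice ~ 1 + val,
               data = data.logit[data.logit$problem == i, ])
    }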
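
    To see why sep = "ue" yields a covariate called val, here is a small base R sketch of the name splitting. It assumes that dfidx splits the varying column names on the fixed separator in the same way stats::reshape does; that is my reading of the behaviour, not something taken from the dfidx source:

    ```r
    # column names selected by varying = 4:6
    cols <- c("valueLeft", "valueMiddle", "valueRight")

    # split each name once on the literal separator "ue"
    parts <- do.call(rbind, strsplit(cols, "ue", fixed = TRUE))

    unique(parts[, 1])  # "val"                      -> name of the long-format covariate
    parts[, 2]          # "Left" "Middle" "Right"    -> alternative labels
    ```

    The alternative labels have to match the levels of choice ("Left", "Middle", "Right"), which is why the separator is placed right before the capital letter.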