
mlogit: using a different set of alternatives for each individual in R


I am trying to fit an mlogit model in which the set of alternatives varies from person to person, but I cannot get it to work. If I make the alternatives the same for every person, it works fine. How can I make it work with varying alternatives?

Data:

> dput( df1 )
structure(list(Choice = c(1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 
0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 
1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L), A = c(0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, -1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 1L, 0L, 0L, -1L, 0L, 0L, 1L, 0L, 0L, -1L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L), B = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, -1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, -1L, 0L, 1L, 0L, 0L, -1L, 0L, 
0L), C = c(1L, 0L, 0L, 0L, -1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, -1L, 0L, 0L, 0L, 
1L, 0L, 0L, -1L, 0L, 0L, 0L, 0L, 0L, 0L), D = c(0L, 1L, 0L, 0L, 
0L, -1L, 0L, 0L, 0L, 1L, 0L, 0L, -1L, 0L, 0L, 1L, 0L, 0L, -1L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L), E = c(0L, 0L, 1L, 0L, 0L, 0L, -1L, 0L, 0L, 0L, 1L, 
0L, 0L, -1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, -1L, 0L), F = c(0L, 0L, 
0L, 1L, 0L, 0L, 0L, -1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
0L, 0L, -1L, 0L, 0L, 1L, 0L, 0L, -1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 1L, 0L, 0L, -1L), Alternative = c(1L, 2L, 3L, 4L, 1L, 
2L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L)), row.names = c(NA, -38L), class = "data.frame")

Code:

model = mlogit( Choice ~ B + C + D + E + F | 0, data = df1,
                alt.levels = unique( df1$Alternative ),
                shape = "long")

Error

Error in dfidx::dfidx(data = data, dfa$idx, drop.index = dfa$drop.index,  : 
  the data must be balanced in order to use the levels argument

Solution

  • You need to give mlogit an explicit ID variable that says which participant made each choice, i.e. which rows belong together. It can't infer this from the data.frame you've provided, and, as the error message says, the levels argument can only be used when the data are balanced (every choice set has the same alternatives).

    I'm assuming that, in your reproducible example, each run of rows in which Alternative goes from 1 to 4 or from 1 to 3 is the choice set presented to one individual. If so, you can fit a model like this:

    library(mlogit)
    
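    # Explicitly create an ID variable: the first 2 choice sets have 4 rows each,
    # the remaining 10 have 3 rows each (2 * 4 + 10 * 3 = 38 rows)
    df1$ID <- rep(1:12, times = c(rep(4, 2), rep(3, 10)))
    
    # Convert to dfidx data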
    dfx1 <- mlogit.data(df1, 
                        shape = "long", 
                        choice = "Choice",
                        id.var = "ID")
    
    # Fit a model
    m0 <- mlogit(Choice ~ B + C + D + E + F | 0, 
                 data = dfx1)
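
The ID vector above is written out by hand, which is easy to get wrong on larger data. As a minimal sketch, assuming the rows are ordered so that every choice set starts at Alternative 1 (the same assumption as above), the ID can be derived from the Alternative column with base R alone. The vector `alternative` below is a stand-in that reproduces df1$Alternative from the question; `id` and `set_sizes` are illustrative names.

```r
# Stand-in for df1$Alternative: two choice sets with 4 alternatives,
# followed by ten choice sets with 3 alternatives (38 rows in total).
alternative <- c(rep(1:4, 2), rep(1:3, 10))

# Assumption: each choice set begins at alternative 1, so a running count
# of the 1s labels every row with its choice set.
id <- cumsum(alternative == 1)

# Number of rows per choice set; the unequal sizes are what make the
# data unbalanced.
set_sizes <- table(id)
print(set_sizes)

# The derived vector matches the hand-written rep() call.
stopifnot(identical(id, rep(1:12, times = c(rep(4, 2), rep(3, 10)))))
```

With the real data, replace `alternative` with df1$Alternative and assign the result to df1$ID before calling mlogit.data.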