"nTrials must not be greater..." error in conjoint design


I'm trying to create a list of conjoint cards using R.

I followed my professor's introduction using my own dataset, but I'm stuck on this error and have no idea what is causing it.

library(conjoint)
experiment<-expand.grid(
  ServiceRange = c("RA", "Active", "Passive","Basic"),
  IdentProce = c("high", "mid", "low"),
  Fee = c(1000,500,100),
  Firm = c("KorFin","KorComp","KorStrt", "ForComp")
) 
print(experiment)

design=caFactorialDesign(data=experiment, type="orthogonal")
print(design)

At the "design" line, I keep getting the following error message:

Error in optFederov(~., data, nTrials = i, approximate = FALSE, nRepeats = 50) :    
nTrials must not be greater than the number of rows in data

How do I address this issue?


Solution

  • You're getting this error because experiment has 144 rows, but the nTrials value passed to optFederov(), which is called inside caFactorialDesign(), is larger than 144. The problem stems from the fact that your Fee column has relatively large numeric values.

    I'm not familiar with how the conjoint package is set up, but I can show you how to troubleshoot this error. You can read the conjoint documentation for more on how to select appropriate experimental data.
    (Note that the example data in the documentation always has very low numeric values, usually between 1 and 10. Compare that with your Fee vector, which has values up to 1000.)

    You can see the source code for a function loaded into your RStudio namespace by highlighting the function name (e.g. caFactorialDesign) and hitting Command-Return (on a Mac - probably something similar on PC). You can also just look at the source code on GitHub.
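
    As a small illustration of printing a function's source from the console, the sketch below uses base R's expand.grid as a stand-in so that it runs without conjoint installed. The commented-out line shows the same idea for caFactorialDesign, assuming the package is installed.

    ```r
    # Evaluating a function's name without parentheses prints its definition.
    # capture.output() stores the printed text so it can be inspected.
    src <- capture.output(print(expand.grid))
    head(src, 5)

    # Same idea for the function discussed here (assumes conjoint is installed):
    # conjoint::caFactorialDesign
    ```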

    caFactorialDesign() is implemented here. That link highlights the line (26) that is throwing the error for you:

    temp.design<-optFederov(~., data, nTrials=i, approximate=FALSE, nRepeats=50)
    

    Recall the error message:

    nTrials must not be greater than the number of rows in data

    You've passed in experiment as the data parameter, so nrow(experiment) will tell us what the upper limit on nTrials is:

    nrow(experiment) # 144
    

    We can actually just think of the error for this dataset as:

    nTrials must not be greater than 144

    Ok, so how is the value for nTrials determined? We can see nTrials is actually an argument to optFederov(), and its value is set as i - often a sign that there's a for-loop wrapping an operation. And in fact, that's what we see:

    for (i in ca.number: profiles.number)
    {
       temp.design<-optFederov(~., data, nTrials=i, approximate=FALSE, nRepeats=50)
       ...
    }
    

    This tells us that optFederov() is going to get called for each value of i in the loop, which will start at ca.number and will go up to profiles.number (inclusive).

    How are these two variables assigned? If we look a little higher up in the caFactorialDesign() definition, ca.number is defined on lines 5-9:

    num <- data.frame(data.matrix(data))
    vars.number<-length(num)
    levels.number<-0
    for (i in 1:length(num)) levels.number<-levels.number+max(num[i])
    ca.number<-levels.number-vars.number+1
    

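    You can run these calculations outside of the function - just remember that data == experiment. So just change that first line to num <- data.frame(data.matrix(experiment)), and then run that chunk of code. You'll see that ca.number == 1008: the column maxima are 4, 3, 1000 and 4 (the factor columns are converted to integer codes, while Fee keeps its raw values), so levels.number is 1011 and 1011 - 4 + 1 = 1008.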

    In other words, the very first value of i in the for-loop which calls optFederov() is already way bigger than the max limit: 1008 >> 144.
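
    For reference, here is that calculation as one self-contained chunk. It uses only base R (conjoint is not needed) and copies the lines quoted above from caFactorialDesign(), with data replaced by experiment.

    ```r
    # Rebuild the experiment grid from the question.
    experiment <- expand.grid(
      ServiceRange = c("RA", "Active", "Passive", "Basic"),
      IdentProce = c("high", "mid", "low"),
      Fee = c(1000, 500, 100),
      Firm = c("KorFin", "KorComp", "KorStrt", "ForComp")
    )

    # Lines quoted from caFactorialDesign(), with data -> experiment.
    # data.matrix() turns factor columns into integer codes; Fee stays numeric.
    num <- data.frame(data.matrix(experiment))
    vars.number <- length(num)
    levels.number <- 0
    for (i in 1:length(num)) levels.number <- levels.number + max(num[i])
    ca.number <- levels.number - vars.number + 1

    sapply(num, max)   # per-column maxima: 4, 3, 1000, 4
    ca.number          # 1008
    nrow(experiment)   # 144
    ```

    The per-column maxima make it clear that Fee alone contributes 1000 to levels.number, which is what pushes ca.number past the 144 available rows.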

    You may be able to include these numeric values as factors or strings in your definition of experiment - I'm not sure whether that is an appropriate way to do this analysis. But I hope it's clear that you won't be able to use such large numeric values in caFactorialDesign() unless your data has far more rows.
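
    A minimal sketch of the factor idea, assuming it is acceptable for your analysis to treat Fee as a three-level categorical attribute. The name experiment_f is just an illustrative choice. The chunk only repeats the ca.number calculation in base R. The final call to caFactorialDesign() is left commented out because I haven't verified that an orthogonal design is found for this grid.

    ```r
    # Same grid as in the question, but with Fee declared as a factor.
    experiment_f <- expand.grid(
      ServiceRange = c("RA", "Active", "Passive", "Basic"),
      IdentProce = c("high", "mid", "low"),
      Fee = factor(c(1000, 500, 100)),
      Firm = c("KorFin", "KorComp", "KorStrt", "ForComp")
    )

    # Repeat the ca.number calculation quoted from caFactorialDesign().
    # Fee is now converted to integer codes 1..3 instead of 100..1000.
    num <- data.frame(data.matrix(experiment_f))
    vars.number <- length(num)
    levels.number <- 0
    for (i in 1:length(num)) levels.number <- levels.number + max(num[i])
    ca.number <- levels.number - vars.number + 1

    ca.number                        # 11
    ca.number <= nrow(experiment_f)  # TRUE: the loop now starts within the 144 rows

    # Untested here; requires the conjoint package:
    # design <- caFactorialDesign(data = experiment_f, type = "orthogonal")
    ```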