
How to fix my t.test error message in R that has no missing value?


My data frame is the following:

Df <- structure(list(SES = c("High", "High", "High", "Low", "High", 
"Low", "High", "High", "High", "Low", "Low", "Low", "High", "High", 
"Low", "High", "High", "Low", "High", "High", "Low", "High", 
"Low", "Low", "Low", "Low", "High", "Low", "High", "Low", "High", 
"High", "Low", "High", "Low", "High", "High", "High", "Low", 
"High", "High", "Low", "Low", "High", "Low", "Low", "Low", "Low", 
"High", "High", "Low", "High"), entry_age = c(12, 2.5, 7, 2.5, 
2.5, 12, 9, 2.5, 3, 8, 12, 2.5, 5.5, 6, 2.5, 2.5, 2.5, 16, 12, 
5, 7, 2.5, 12, 2.5, 2.5, 12, 12, 12, 6, 24, 2.5, 2.5, 2, 3.5, 
2.5, 2.5, 2.5, 4, 7, 12, 7, 9, 12, 6, 18, 15, 8, 12, 2.5, 6, 
10, 5)), row.names = c(NA, -52L), class = c("tbl_df", "tbl", 
"data.frame"))

I have a nice difference in means and would like to test its significance with a t-test using the t.test function as follows:

t.test(Df$SES, Df$entry_age)

This should be simple. However, I get the following error, which I don't understand:

Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") : 
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) :
  argument is not numeric or logical: returning NA
2: In var(x) : NAs introduced by coercion

I checked for NA values and there are none.

I could not find the meaning of this error message on Google. What can I try next?


Solution

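  • Look at help('t.test') to understand the usage. Called as t.test(Df$SES, Df$entry_age), it treats x = Df$SES and y = Df$entry_age as two numeric samples to compare, which is not what you want: SES is a character grouping variable, so mean() and var() return NA for it (hence the warnings and the error). To compare entry_age between the SES groups, use the formula interface instead: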

    Df <- structure(list(SES = c("High", "High", "High", "Low", "High", 
    "Low", "High", "High", "High", "Low", "Low", "Low", "High", "High", 
    "Low", "High", "High", "Low", "High", "High", "Low", "High", 
    "Low", "Low", "Low", "Low", "High", "Low", "High", "Low", "High", 
    "High", "Low", "High", "Low", "High", "High", "High", "Low", 
    "High", "High", "Low", "Low", "High", "Low", "Low", "Low", "Low", 
    "High", "High", "Low", "High"), entry_age = c(12, 2.5, 7, 2.5, 
    2.5, 12, 9, 2.5, 3, 8, 12, 2.5, 5.5, 6, 2.5, 2.5, 2.5, 16, 12, 
    5, 7, 2.5, 12, 2.5, 2.5, 12, 12, 12, 6, 24, 2.5, 2.5, 2, 3.5, 
    2.5, 2.5, 2.5, 4, 7, 12, 7, 9, 12, 6, 18, 15, 8, 12, 2.5, 6, 
    10, 5)), row.names = c(NA, -52L), class = c("tbl_df", "tbl", 
    "data.frame"))
    
    t.test(entry_age~SES, data=Df)
    #> 
    #>  Welch Two Sample t-test
    #> 
    #> data:  entry_age by SES
    #> t = -2.9888, df = 35.479, p-value = 0.005059
    #> alternative hypothesis: true difference in means between group High and group Low is not equal to 0
    #> 95 percent confidence interval:
    #>  -6.695627 -1.280563
    #> sample estimates:
    #> mean in group High  mean in group Low 
    #>           5.303571           9.291667
    

    Created on 2022-05-17 by the reprex package (v2.0.1)
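
To see where the NA comes from, and what the two-vector form of t.test would look like, here is a minimal sketch. It uses a small made-up data frame called toy (not the Df from the question), and the names high, low, res_xy and res_formula are only illustrative. The idea is that x and y must each be a numeric sample, so the grouping column is used to split entry_age rather than being passed as a sample itself.

```r
# Small made-up data for illustration only (not the Df from the question)
toy <- data.frame(
  SES = c("High", "High", "High", "Low", "Low", "Low"),
  entry_age = c(2.5, 3, 5, 8, 12, 10)
)

# mean() of a character vector warns and returns NA; this NA is what
# reaches the `if (stderr < ...)` check inside t.test() and triggers
# "missing value where TRUE/FALSE needed"
m <- suppressWarnings(mean(toy$SES))
is.na(m)

# Two-vector form: split the numeric column by group first
high <- toy$entry_age[toy$SES == "High"]
low  <- toy$entry_age[toy$SES == "Low"]
res_xy <- t.test(high, low)

# Formula form, as in the answer above
res_formula <- t.test(entry_age ~ SES, data = toy)

# Both calls run the same Welch test, so the p-values should agree
all.equal(res_xy$p.value, res_formula$p.value)
```

The formula form is usually more convenient because it keeps the data in one data frame, but both calls describe the same comparison.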