How to run a chisq.test() with this data?


I have these data:

> dput(df)
structure(list(Freq = c(41L, 31L, 11L, 0L), group = structure(c(1L, 
1L, 2L, 2L), .Label = c("A", "B"), class = "factor"), Survived = structure(c(2L, 
1L, 2L, 1L), .Label = c("No", "Yes"), class = "factor")), row.names = c(NA, 
4L), class = "data.frame")
  Freq group Survived
1   41     A      Yes
2   31     A       No
3   11     B      Yes
4    0     B       No

I am trying to follow https://data-flair.training/blogs/chi-square-test-in-r/ but I'm not sure how to apply it to my data. For example, when I run chisq.test(df$group, df$Survived) I get:

> chisq.test(df$group, df$Survived)

    Pearson's Chi-squared test

data:  df$group and df$Survived
X-squared = 0, df = 1, p-value = 1

which is meaningless (and it didn't take Freq into account, right?). I want to know whether there is a difference between groups A and B.


Solution

  • First, convert the data frame to a contingency table:

    tab <- xtabs(Freq ~ ., df) # Specifically, xtabs(Freq ~ group + Survived, df)
    
    #      Survived
    # group No Yes
    #     A 31  41
    #     B  0  11
    

    Then pass it into chisq.test():

    chisq.test(tab)
    
    #   Pearson's Chi-squared test with Yates' continuity correction
    # 
    # data:  tab
    # X-squared = 5.8315, df = 1, p-value = 0.01574
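To see why the call in the question returned X-squared = 0, it may help to look at the table that chisq.test(df$group, df$Survived) builds internally. The sketch below rebuilds the question's data with data.frame() (instead of the dput() output) so it runs on its own. The names raw_tab and expected are only illustrative. Since one expected count is below 5, R will usually warn that the chi-squared approximation may be incorrect, so fisher.test() is shown as one possible cross-check, not as a required step.

```r
# Data from the question, rebuilt so the sketch runs on its own
df <- data.frame(
  Freq     = c(41L, 31L, 11L, 0L),
  group    = factor(c("A", "A", "B", "B")),
  Survived = factor(c("Yes", "No", "Yes", "No"), levels = c("No", "Yes"))
)

# What chisq.test(df$group, df$Survived) tabulates: one data frame row
# per cell, so every count is 1 and Freq is ignored
raw_tab <- table(df$group, df$Survived)
print(raw_tab)

# The frequency-weighted table used in the answer
tab <- xtabs(Freq ~ group + Survived, df)

# Expected counts under independence; suppressWarnings() only hides the
# small-expected-count warning while the values are inspected
expected <- suppressWarnings(chisq.test(tab))$expected
print(round(expected, 2))

# Possible cross-check: Fisher's exact test does not rely on the
# chi-squared approximation
print(fisher.test(tab)$p.value)
```

With all four cells of raw_tab equal to 1, the observed and expected counts coincide, which is why the statistic in the question was exactly 0.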