rdplyraggregatemultidplyr

Creating a frequency 2x2 table in R but replacing frequency data with numerical data from another variable


I am having trouble to create a table in a format required to run some analyses.

Here is a simplified example of how my large dataset looks like

        Sample <- c(1,2,2,3,3) 
        Species <- c("sp1","sp2","sp3","sp1","sp1")
        Counts <- c(3,2,4,10,3)
        mydata <- as.data.frame(cbind(Sample,Species,Counts))
        mydata$Counts <-as.integer(mydata$Counts)
        mydata
      Sample Species Counts
    1      1     sp1      3
    2      2     sp2      2
    3      2     sp3      4
    4      3     sp1     10
    5      3     sp1      3
    
        (table_0 <- table(mydata$Sample,mydata$Species))
    sp1 sp2 sp3
  1   1   0   0
  2   0   1   1
  3   2   0   0

The table above is a frequency table, but that is exactly the general 2x2 format I need (i.e. Species by Sample ID). However, I need instead of frequency data, the total counts replacing those frequencies. Exactly this:

    sp1 sp2 sp3
  1   3   0   0
  2   0   2   4
  3   13   0   0

As you can see, Sample 3 have two observations for Species 1 "sp1", one has an abundance of 10 and the other one of three, the sum is 13. How can I generate a table like this for a large data set, so I avoid wasting time and making mistakes while doing it manually?


Solution

  • You could use pivot_wider like this:

    library(tidyverse)
    
    mydata %>% 
        # first we get the sum of `Counts` for each sample and species
        group_by(Sample,Species) %>% 
        summarise(Counts = sum(Counts)) %>%
    
        # then we make the dataframe wider, replacing empty values with zeros 
        pivot_wider(names_from = Species, values_from = Counts, values_fill = 0) %>%
    
        # removing the Sample column, like in the example
        ungroup() %>%
        select(-Sample)
    
    # A tibble: 3 × 3
        sp1   sp2   sp3
      <int> <int> <int>
    1     3     0     0
    2     0     2     4
    3    13     0     0