I am having trouble to create a table in a format required to run some analyses.
Here is a simplified example of how my large dataset looks like
Sample <- c(1,2,2,3,3)
Species <- c("sp1","sp2","sp3","sp1","sp1")
Counts <- c(3,2,4,10,3)
mydata <- as.data.frame(cbind(Sample,Species,Counts))
mydata$Counts <-as.integer(mydata$Counts)
mydata
Sample Species Counts
1 1 sp1 3
2 2 sp2 2
3 2 sp3 4
4 3 sp1 10
5 3 sp1 3
(table_0 <- table(mydata$Sample,mydata$Species))
sp1 sp2 sp3
1 1 0 0
2 0 1 1
3 2 0 0
The table above is a frequency table, but that is exactly the general 2x2 format I need (i.e. Species by Sample ID). However, I need instead of frequency data, the total counts replacing those frequencies. Exactly this:
sp1 sp2 sp3
1 3 0 0
2 0 2 4
3 13 0 0
As you can see, Sample 3 have two observations for Species 1 "sp1", one has an abundance of 10 and the other one of three, the sum is 13. How can I generate a table like this for a large data set, so I avoid wasting time and making mistakes while doing it manually?
You could use pivot_wider
like this:
library(tidyverse)
mydata %>%
# first we get the sum of `Counts` for each sample and species
group_by(Sample,Species) %>%
summarise(Counts = sum(Counts)) %>%
# then we make the dataframe wider, replacing empty values with zeros
pivot_wider(names_from = Species, values_from = Counts, values_fill = 0) %>%
# removing the Sample column, like in the example
ungroup() %>%
select(-Sample)
# A tibble: 3 × 3
sp1 sp2 sp3
<int> <int> <int>
1 3 0 0
2 0 2 4
3 13 0 0