I have a sample dataset:
Species <- c("Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass", "Bass")
FishID <- c("a1", "a1", "a1", "a2", "a2", "a3","a3","a3","a3")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis", "Amphipoden","Mysis","Polychaeten","Mollusca")
df <- data.frame(Species, FishID, Prey)
With Bass as the predator, there are 3 individual fish, identified by FishID: a1, a2 and a3. I would like to calculate the percentage of occurrence of each prey species across individuals, i.e. the share of individual Bass in whose stomach that prey was found.
So in this case: Amphipods occur in all three individuals, so in 100% of the Bass stomachs, and the same goes for Mysis. Polychaetes, however, are found in only two of the three individuals, so this would be 66.7%. Molluscs are found in only one individual, so 33.3%.
As an end result, I am looking for something like this:
Species <- c("Bass", "Bass", "Bass", "Bass")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Mollusca")
Percentage <- c(100, 100, 66.7, 33.3)
df2 <- data.frame(Species, Prey, Percentage)
I tried this:
library(dplyr)
df %>%
  group_by(Species, Prey) %>%
  summarise(n = n()) %>%
  mutate(percent = n / sum(n) * 100)
But it isn't giving me what I want.
Any help is welcome.
Thank you in advance!
library(tidyverse)
df |>
  # only count Prey once per Species/FishID (which I presume always go together)
  distinct(Species, FishID, Prey) |>
  mutate(count = 1) |>
  # complete with missing combinations for each FishID
  complete(nesting(Species, FishID), Prey, fill = list(count = 0)) |>
  summarize(percent = sum(count) / n(), .by = c(Species, Prey))
# A tibble: 4 × 3
  Species Prey        percent
  <chr>   <chr>         <dbl>
1 Bass    Amphipoden    1
2 Bass    Mollusca      0.333
3 Bass    Mysis         1
4 Bass    Polychaeten   0.667
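Note that percent above is a proportion between 0 and 1; multiply by 100 if you want it on the 0 to 100 scale used in the question. As a possible variant, here is a minimal sketch that skips complete() and instead divides the number of fish containing each prey by the number of distinct fish per species. It assumes dplyr 1.1.0 or later (for .by) and that a FishID identifies one individual within a species; the names n_fish, result and Percentage are my own, not from the question.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Species = rep("Bass", 9),
  FishID  = c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3"),
  Prey    = c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis",
              "Amphipoden", "Mysis", "Polychaeten", "Mollusca")
)

result <- df |>
  # count each prey at most once per individual fish
  distinct(Species, FishID, Prey) |>
  # denominator: number of individual fish per species (n_fish is a helper column)
  mutate(n_fish = n_distinct(FishID), .by = Species) |>
  # numerator: number of fish in which the prey occurs, expressed as a percentage
  summarise(Percentage = 100 * n() / first(n_fish), .by = c(Species, Prey))

result
```

For the sample data this should give 100 for Amphipoden and Mysis, about 66.7 for Polychaeten and about 33.3 for Mollusca. One caveat: a fish with an empty stomach would have no rows in df, so it would not be counted in the denominator by either approach unless you add it explicitly.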