r group-by percentage absolute

Calculate the percentage of occurrence of a factor relative to the total number of individuals in a grouping factor


I have a sample dataset:

Species <-c("Bass", "Bass", "Bass", "Bass", "Bass", "Bass","Bass","Bass","Bass")
FishID <- c("a1", "a1", "a1", "a2", "a2", "a3","a3","a3","a3")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis", "Amphipoden","Mysis","Polychaeten","Mollusca")

df <- data.frame(Species, FishID, Prey)

With Bass as the predator, there are 3 individual fish, identified by FishID: a1, a2 and a3. I would like to calculate the percentage of individual Bass (FishID) in which each prey species occurs.

So in this case: Amphipoden occurs in all 3 individuals, so in 100% of the Bass stomachs; the same goes for Mysis. Polychaeten, however, is found in only 2 of the 3 stomachs, so that would be 66.7%. And Mollusca is found in only 1, so 33.3%.

As an end result, I am looking for something like this:

Species <-c("Bass", "Bass", "Bass", "Bass")
Prey <- c("Amphipoden", "Mysis", "Polychaeten", "Mollusca")
Percentage <- c(100, 100, 66.7, 33.3)
df2 <- data.frame(Species,Prey, Percentage)

I tried this:

df %>%
  group_by(Species,Prey) %>% 
  summarise(n = n()) %>%
  mutate(percent = n / sum(n) * 100)

But it isn't giving me what I want.

Any help is welcome.

Thank you in advance!


Solution

    library(tidyverse)
    df |>
      # only count Prey once per Species/FishID (which I presume always go together)
      distinct(Species, FishID, Prey) |>
      mutate(count = 1) |>
      # complete with missing combinations for each FishID
      complete(nesting(Species, FishID), Prey, fill = list(count = 0)) |> 
      summarize(percent = sum(count) / n(), .by = c(Species, Prey))
    
    
    # A tibble: 4 × 3
      Species Prey        percent
      <chr>   <chr>         <dbl>
    1 Bass    Amphipoden    1    
    2 Bass    Mollusca      0.333
    3 Bass    Mysis         1    
    4 Bass    Polychaeten   0.667
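
Note that the `percent` column above is a proportion between 0 and 1, while the question asks for values on a 0-100 scale. As a possible alternative, here is a minimal sketch that counts distinct `FishID` values directly instead of completing the missing combinations. It assumes dplyr 1.1.0 or later (for `.by`) and R 4.1 or later (for `|>`); the names `n_fish`, `Percentage` and `result` are just illustrative choices, and the data is recreated from the question so the block runs on its own.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Species = rep("Bass", 9),
  FishID  = c("a1", "a1", "a1", "a2", "a2", "a3", "a3", "a3", "a3"),
  Prey    = c("Amphipoden", "Mysis", "Polychaeten", "Amphipoden", "Mysis",
              "Amphipoden", "Mysis", "Polychaeten", "Mollusca")
)

result <- df |>
  # denominator: number of distinct fish recorded for each species
  mutate(n_fish = n_distinct(FishID), .by = Species) |>
  # numerator: number of distinct fish in which each prey was found,
  # expressed as a percentage of the fish of that species
  summarize(
    Percentage = 100 * n_distinct(FishID) / first(n_fish),
    .by = c(Species, Prey)
  )

result
```

For the sample data this should give 100 for Amphipoden and Mysis, about 66.7 for Polychaeten and about 33.3 for Mollusca, in the order the prey first appear. Like the answer above, it only counts fish that have at least one row in `df`, so individuals with empty stomachs would need to be added to the denominator separately.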