[SOLVED] Shannon Index for multiple groups in one dataframe

I need to calculate the Shannon Index for multiple samples at multiple sites and I have no idea how to go about it. I'm using R, and the data looks something like this:

Sample	Species	Count
17a	Shark	17
17a	Dolphin	25
17a	Sting Ray	1
17a	Badger	234
17b	Shark	4
17b	Dolphin	6
17b	Sting Ray	19
17b	Badger	25
18a	Shark	45
18a	Dolphin	4
18a	Sting Ray	4
18a	Badger	3

I feel like I need to split by sample somehow, but after that I am completely stuck. I don't use R very often.

Thanks for any help!

Solution

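You can get the Shannon index per Sample by grouping your data frame by Sample and applying vegan::diversity(), which computes the Shannon index by default (index = "shannon", natural logarithm). Note that the .by argument requires dplyr 1.1.0 or later, and the native pipe |> requires R 4.1 or later.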

df |>
    dplyr::mutate(shannon_index = vegan::diversity(Count), .by = Sample)

   Sample   Species Count shannon_index
1     17a     Shark    17     0.5511595
2     17a   Dolphin    25     0.5511595
3     17a Sting Ray     1     0.5511595
4     17a    Badger   234     0.5511595
5     17b     Shark     4     1.1609846
6     17b   Dolphin     6     1.1609846
7     17b Sting Ray    19     1.1609846
8     17b    Badger    25     1.1609846
9     18a     Shark    45     0.7095302
10    18a   Dolphin     4     0.7095302
11    18a Sting Ray     4     0.7095302
12    18a    Badger     3     0.7095302
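
The output above repeats the index on every row of a sample. If one row per Sample is preferred, or vegan is not installed, the same quantity can be sketched in base R from the definition H = -sum(p * log(p)), where p is each species' share of the sample total. The helper name shannon and the result name per_sample are made up for this illustration and are not part of any package.

```r
# Example data, copied from the question
df <- data.frame(
  Sample  = rep(c("17a", "17b", "18a"), each = 4),
  Species = rep(c("Shark", "Dolphin", "Sting Ray", "Badger"), times = 3),
  Count   = c(17L, 25L, 1L, 234L, 4L, 6L, 19L, 25L, 45L, 4L, 4L, 3L)
)

# Hypothetical helper: Shannon index with natural log.
# Zero counts are dropped because they contribute nothing to the sum.
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Apply the helper to the Count values of each Sample: one row per Sample
per_sample <- aggregate(Count ~ Sample, data = df, FUN = shannon)
names(per_sample)[2] <- "shannon_index"
per_sample
```

The values in per_sample should agree with the shannon_index column above. With dplyr, swapping mutate() for summarise() in the original pipeline should give the same one-row-per-Sample shape.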

Used data:

> dput(df)

structure(list(Sample = c("17a", "17a", "17a", "17a", "17b", 
"17b", "17b", "17b", "18a", "18a", "18a", "18a"), Species = c("Shark", 
"Dolphin", "Sting Ray", "Badger", "Shark", "Dolphin", "Sting Ray", 
"Badger", "Shark", "Dolphin", "Sting Ray", "Badger"), Count = c(17L, 
25L, 1L, 234L, 4L, 6L, 19L, 25L, 45L, 4L, 4L, 3L)), row.names = c(NA, 
-12L), class = "data.frame")