I need to calculate the Shannon Index for multiple samples at multiple sites and I have no idea how to go about it. I'm using R and the data looks something like this.
Sample | Species | Count |
---|---|---|
17a | Shark | 17 |
17a | Dolphin | 25 |
17a | Sting Ray | 1 |
17a | Badger | 234 |
17b | Shark | 4 |
17b | Dolphin | 6 |
17b | Sting Ray | 19 |
17b | Badger | 25 |
18a | Shark | 45 |
18a | Dolphin | 4 |
18a | Sting Ray | 4 |
18a | Badger | 3 |
I feel like I need to split by sample somehow, but after that I am completely stuck. I don't use R very often.
Thanks for any help!
You can get the Shannon index per Sample by grouping your data frame by Sample and applying `vegan::diversity()`.
df |>
  dplyr::mutate(shannon_index = vegan::diversity(Count), .by = Sample)
Sample Species Count shannon_index
1 17a Shark 17 0.5511595
2 17a Dolphin 25 0.5511595
3 17a Sting Ray 1 0.5511595
4 17a Badger 234 0.5511595
5 17b Shark 4 1.1609846
6 17b Dolphin 6 1.1609846
7 17b Sting Ray 19 1.1609846
8 17b Badger 25 1.1609846
9 18a Shark 45 0.7095302
10 18a Dolphin 4 0.7095302
11 18a Sting Ray 4 0.7095302
12 18a Badger 3 0.7095302
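To see where these numbers come from, here is a hand check for sample 17a. It assumes the defaults of `vegan::diversity()` (`index = "shannon"` with the natural logarithm); the symbols $n_i$ (count of species $i$), $N$ (total count in the sample) and $S$ (number of species) are introduced here only for illustration.

```latex
H = -\sum_{i=1}^{S} p_i \ln p_i, \qquad p_i = \frac{n_i}{N}

% Sample 17a: N = 17 + 25 + 1 + 234 = 277
H_{17a} = -\left(
    \tfrac{17}{277}\ln\tfrac{17}{277}
  + \tfrac{25}{277}\ln\tfrac{25}{277}
  + \tfrac{1}{277}\ln\tfrac{1}{277}
  + \tfrac{234}{277}\ln\tfrac{234}{277}
\right)
\approx 0.1713 + 0.2171 + 0.0203 + 0.1425 \approx 0.5512
```

This agrees with the 0.5511595 reported for 17a. The index is low because one species (Badger) dominates that sample.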
Data used:
> dput(df)
structure(list(Sample = c("17a", "17a", "17a", "17a", "17b",
"17b", "17b", "17b", "18a", "18a", "18a", "18a"), Species = c("Shark",
"Dolphin", "Sting Ray", "Badger", "Shark", "Dolphin", "Sting Ray",
"Badger", "Shark", "Dolphin", "Sting Ray", "Badger"), Count = c(17L,
25L, 1L, 234L, 4L, 6L, 19L, 25L, 45L, 4L, 4L, 3L)), row.names = c(NA,
-12L), class = "data.frame")
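If you would rather have one row per sample than the index repeated on every species row, a base R sketch along these lines should work. The helper name `shannon` and the result name `per_sample` are made up for this example, and the data frame is rebuilt here only so the snippet runs on its own; it is meant to hold the same values as the `dput()` output above.

```r
# Same data as in the dput() output above, rebuilt so the snippet is self-contained
df <- data.frame(
  Sample  = rep(c("17a", "17b", "18a"), each = 4),
  Species = rep(c("Shark", "Dolphin", "Sting Ray", "Badger"), times = 3),
  Count   = c(17L, 25L, 1L, 234L, 4L, 6L, 19L, 25L, 45L, 4L, 4L, 3L)
)

# Hypothetical helper: Shannon index H = -sum(p * log(p)) with natural log.
# Zero counts are dropped so that log(0) never appears.
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Apply the helper to the counts of each sample; the result has one row per sample
per_sample <- aggregate(Count ~ Sample, data = df, FUN = shannon)
names(per_sample)[2] <- "shannon_index"
per_sample
```

This should reproduce the three values shown in the answer (one each for 17a, 17b and 18a) without needing vegan or dplyr. If you are already using the pipeline from the answer, swapping `dplyr::mutate()` for `dplyr::summarise()` with the same `.by = Sample` argument should collapse the result in the same way, assuming dplyr 1.1.0 or later.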