I have a metadata set where I want to calculate the difference of two numeric columns if they match specific character strings in a different column. For instance, the column patch
from the dput()
contains many different IDs: GL134_UP_1, GL24_DN_2, etc where GL# is an ID, UP or DN is a location, and _1, _2, or _3 are replicates. For each individual GL# of a single location and all replicates (GL134_UP_1, GL134_UP_2, GL134_UP_3) I want to subtract numeric values from the column sn_sp_dist
from the same GL#'s down location (GL134_DN_1, GL134_DN_2, GL134_DN_3).
I imagine I'll be needing group_by()
to specify the character column, but then I'm lost defining specific GL#'s.
library(dplyr)
diff <- met.ex %>%
group_by(patch) %>%
> dput(met.ex)
structure(list(patch = structure(c(4L, 5L, 6L, 1L, 2L, 3L, 10L,
11L, 7L, 8L, 9L), .Label = c("GL134_DN_1", "GL134_DN_2", "GL134_DN_3",
"GL134_UP_1", "GL134_UP_2", "GL134_UP_3", "GL22_DN_1", "GL22_DN_2",
"GL22_DN_3", "GL22_UP_1", "GL22_UP_3"), class = "factor"), sn_sp_dist = structure(c(6L,
7L, 2L, 5L, 9L, 10L, 8L, 4L, 3L, 9L, 1L), .Label = c("123", "165",
"2", "248", "321", "34", "42", "44", "77", "987"), class = "factor")), class = "data.frame", row.names = c(NA,
-11L))
You could do something like this:
colnames(met.ex) <- c("patch", "value")
met.ex %>%
separate(patch, into = c("id", "direction", "rep"), sep = "_", remove = FALSE) %>%
# replace all alphabet characters with nothing
mutate(id = as.numeric(str_match(id, "\\d+")),
rep = as.numeric(rep),
patch = as.numeric(patch)) %>%
select(-patch) %>%
group_by(id, rep) %>%
pivot_wider(names_from = direction, values_from = value) %>%
mutate(diff = UP - DN)
id rep UP DN diff
<dbl> <dbl> <dbl> <dbl> <dbl>
1 134 1 6 5 1
2 134 2 7 9 -2
3 134 3 2 10 -8
4 22 1 8 3 5
5 22 3 4 1 3
6 22 2 NA 9 NA
For more information on cleaning and tidying dataframes read this.