Update
I have untidy data that looks like the example df2
.
df2 = data.frame(label = c("C360", "C360~C183", "C157~C360", "C183", "C157", "C157~C183", "C195~C183"),
values = c(10, 11, 12, 13, 14, 15, 16))
> df2
label values
1 C360 10
2 C360~C183 11
3 C157~C360 12
4 C183 13
5 C157 14
6 C157~C183 15
7 C195~C183 16
I need to sum partial matches that include the letters in different positions, so that I will the following desired output:
string sum
1 C360 33
2 C183 55
3 C157 41
4 C195 16
Old
df1 = data.frame(label = c("a", "a.1", "2.a", "b", "c"),
values = c(10, 11, 12, 13, 14))
> df1
label values
1 a 10
2 a.1 11
3 2.a 12
4 b 13
5 c 14
Expected output:
label sum
1 a 33
2 b 13
3 c 14
New
stack(setNames(strsplit(df2$label, "~"), df2$values)) |>
type.convert(as.is = TRUE) |>
`colnames<-`(names(df2)) |>
aggregate(values ~ label, sum)
label values
1 C157 41
2 C183 55
3 C195 16
4 C360 33
Data:
df2 = data.frame(label = c("C360", "C360~C183", "C157~C360", "C183", "C157", "C157~C183", "C195~C183"),
values = c(10, 11, 12, 13, 14, 15, 16))
Old
For df1
, you might consider
aggregate(df1$values, list(letter = gsub("[^a-z]", "", df1$label)), sum)
letter x
1 a 33
2 b 13
3 c 14
where the appropriate regex (something more concrete than "[^a-z]"
) depends on your letter
(labels
) column.