I have a simple dataframe like the following:
ID, Type, a, b, c, d, e, f, etc.
ob1, 1, 1, 2, 3, 4, 5, 6, etc.
ob1, 2, 3, 4, 5, 6, 7, 1, etc.
I need to add the values of every 3 columns together, to produce new columns with the summed values. This would produce the following output:
ID, Type, sum1, sum2, etc.
ob1, 1, 6, 15, etc.
ob1, 2, 12, 14, etc.
Using sequencing, I can do this manually for individual groups of columns, but I have many columns. How can I perform this summation automatically for every 3 columns (after a set starting column)?
In base R you can do something like this:
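    # Drop the two identifier columns (ID, Type) and keep the value columns
    num_cols <- df[-c(1:2)]
    cbind(df[1:2], do.call(cbind,
      # Start index of each 3-column group (1, 4, ...), named sum1, sum2, ...
      lapply(setNames(seq(1, length(num_cols), 3),
                      paste0("sum", seq(length(num_cols) / 3))), \(a) {
        # Strip commas, convert to numeric, then sum across each row
        apply(num_cols[a:(a + 2)], 1, \(b) sum(as.numeric(gsub(",", "", b))))
      })))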
Because the values contain commas, I used gsub to remove them. setNames is used to give each new column a dynamic name, and apply is used within lapply to sum each row. Output:
       ID. Type. sum1 sum2
    1 ob1,    1,    6   15
    2 ob1,    2,   12   14
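If the value columns are already numeric (no stray commas), the same grouping can probably be written more compactly with split.default and rowSums. The following is only a sketch of an alternative, not the approach above: the sample df is reconstructed from the question for illustration, and the names grp, sums, and result are made up here.

```r
# Hypothetical sample data mirroring the question (values already numeric)
df <- data.frame(
  ID   = c("ob1", "ob1"),
  Type = c(1, 2),
  a = c(1, 3), b = c(2, 4), c = c(3, 5),
  d = c(4, 6), e = c(5, 7), f = c(6, 1)
)

# Value columns only (everything after ID and Type)
num_cols <- df[-(1:2)]

# Group index per column: 1 1 1 2 2 2 ...
grp <- ceiling(seq_along(num_cols) / 3)

# split.default() splits the data frame by columns into one chunk per group;
# rowSums() then sums each chunk across its columns
sums <- do.call(cbind, lapply(split.default(num_cols, grp), rowSums))
colnames(sums) <- paste0("sum", colnames(sums))

result <- cbind(df[1:2], sums)
print(result)
```

One difference from the apply version: when the number of value columns is not a multiple of 3, the last group here simply contains fewer columns instead of raising an error. If the columns do contain commas, they would need to be cleaned first, for example with num_cols[] <- lapply(num_cols, \(x) as.numeric(gsub(",", "", x))).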