Tags: r, dataframe, subset, rowsum

Sum every 3 columns of a dataframe to form new columns


I have a simple dataframe like the following:

ID, Type, a, b, c, d, e, f, etc.
ob1, 1,   1, 2, 3, 4, 5, 6, etc.
ob1, 2,   3, 4, 5, 6, 7, 1, etc.

I need to add the values of every 3 columns together, to produce new columns with the summed values. This would produce the following output:

ID, Type, sum1, sum2,  etc.
ob1, 1,     6,   15,   etc.
ob1, 2,    12,   14,   etc.

Using sequencing, I can do this manually for individual columns, but because I have many columns, how can I perform this summation automatically for every 3 columns (after a set starting point)?


Solution

  • In base R you can do something like this:

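    # Keep only the value columns (everything after ID and Type)
    num_cols <- df[-c(1:2)]

    # Start index of each group of three columns: 1, 4, 7, ...
    starts <- seq(1, length(num_cols), 3)

    sums <- lapply(setNames(starts, paste0("sum", seq_along(starts))), \(a) {
      # Strip stray commas, convert to numeric, then sum across each row
      apply(num_cols[a:(a + 2)], 1, \(b) sum(as.numeric(gsub(",", "", b))))
    })

    cbind(df[1:2], do.call(cbind, sums))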
    
    

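    Because the values contain commas, gsub() strips them before the conversion to numeric. setNames() gives each new column a dynamic name (sum1, sum2, ...), and apply() is used inside lapply() to sum each row of every three-column group. This assumes the number of value columns is a multiple of 3.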

       ID. Type. sum1 sum2
    1 ob1,    1,    6   15
    2 ob1,    2,   12   14
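
If the value columns are already numeric (no stray commas), the same grouping idea can be sketched with split() and rowSums() instead of apply(). This is only an illustrative variant, not the method from the answer above: the example data frame is a hypothetical reconstruction of the question's data, and the names grp, sums and result are made up for the example.

```r
# Hypothetical example data mirroring the question (values already numeric)
df <- data.frame(
  ID = c("ob1", "ob1"), Type = c(1, 2),
  a = c(1, 3), b = c(2, 4), c = c(3, 5),
  d = c(4, 6), e = c(5, 7), f = c(6, 1)
)

# Value columns only (everything after ID and Type)
num_cols <- df[-(1:2)]

# Group label for each value column: 1, 1, 1, 2, 2, 2, ...
grp <- ceiling(seq_along(num_cols) / 3)

# Split the column indices by group and sum each group row-wise
sums <- lapply(split(seq_along(num_cols), grp),
               function(idx) rowSums(num_cols[idx]))
names(sums) <- paste0("sum", names(sums))

result <- cbind(df[1:2], as.data.frame(sums))
result
```

Because the groups come from ceiling(seq_along(num_cols) / 3), a trailing group with fewer than three columns is summed over whatever columns remain rather than raising an error. Whether that is the desired behaviour depends on the data.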