r

How to sum a column dependent on a value in another column


I am wanting to "sum the Length column when mat_Source column = ark" and be able to do this for each of the values in the mat_source.

Example data:

asset_ID length age mat_source con_num
1 500 6.00000 ark 1
2 170 6.00000 bg 1
3 480 9.00000 brnl 2
4 284 10.00000 brnl 2
5 30 4.00000 bg 3
6 500 3.09589 brnl 1
7 270 6.79726 brnl 4
8 476 3.00000 ark 4
9 395 5.00000 brnl 1
10 80 18.90411 ark 2
11 500 14.00000 brnl 1
12 500 6.00000 ark 1
13 500 22.00000 ark 4
14 110 2.00000 brnl 3
15 500 14.00000 brnl 1
16 363 2.00000 brnl 4

I have looked at other questions asked on the internet relating to summing data and they are similar but not exactly what I'm after.


Solution

  • Using the dplyr package. Note you will need to install dplyr prior to running this code:

    library(dplyr)
    
    # Your data
    df <- structure(list(asset_ID = 1:16, length = c(500L, 170L, 480L, 
    284L, 30L, 500L, 270L, 476L, 395L, 80L, 500L, 500L, 500L, 110L, 
    500L, 363L), age = c(6, 6, 9, 10, 4, 3.09589, 6.79726, 3, 5, 
    18.90411, 14, 6, 22, 2, 14, 2), mat_source = c("ark", "bg", 
    "brnl", "brnl", "bg", "brnl", "brnl", "ark", "brnl", "ark", "brnl", 
    "ark", "ark", "brnl", "brnl", "brnl"), con_num = c(1L, 1L, 2L, 
    2L, 3L, 1L, 4L, 4L, 1L, 2L, 1L, 1L, 4L, 3L, 1L, 4L)), class = "data.frame", row.names = c(NA, 
    -16L))
    
    # Result
    result <- df %>%
      group_by(mat_source) %>%
      summarise(sum_length = sum(length))
    
    result
    # A tibble: 3 × 2
      mat_source sum_length
      <chr>           <int>
    1 ark              2056
    2 bg                200
    3 brnl             3402