rdataframerowsum

How to calculate column sum of column values based on condition from another column


I have dataframe

Category = c("Fixed","New" ,"Fixed" ,"Regular", "Fixed", "New", "New")
Day1 = c(313, NA,239, 341, 14,11,521)
Day2=c(113, 510,939, NA, 741, 141,541)
Day3=c(13, 5,3, 41, 74,NA, 520)
Day4 = c(41, 10,23, 34, 401,191,NA)
df <- data.frame(Category, Day1, Day2, Day3, Day4, check.names = FALSE)

For columns from Day 1 to 4 I would like to potentially write a for loop and calculate the total sum of each Day with the condition on column Category where Category = Fixed and Category = New

What i have attempted

Subset the dataframe based on conditions

subset_data <- df[(df$Category=='Fixed' | df$Category=='New')]

then calculate the row sums for the subset_data for each day however at this point am unsure.

what is the correct way to calculate the Day 1 to 4 columns sums based on Category =Fixed or New column condition?

The line of code below attempts colSums() function but it returns an empty array.

specific_sums <- colSums(subset_data[c("Day1")])


Solution

  • You can use aggregate to do the aggregation. Invoke the subset parameter to only select the part that you want:

    aggregate(.~Category, df, sum,subset = Category %in% c('New', 'Fixed'))
    
      Category Day1 Day2 Day3 Day4
    1    Fixed  693 1793  126  465
    2      New  842 1192  636  722