runiqueidentifiercolumn-sum

How to add together certain rows within a column in R by unique IDs?


I'm new and sorry if my question is badly worded.

I'm working in r and I have table called Rent that might look like this :

Rent
       ID      Invoice    Payment      Paid Date
       lucy   7/1/2018     100        9/1/2018
       lucy   7/1/2018     150        10/1/2018
       lucy   8/1/2018     100        11/1/2018

So what I want to do is that since Lucy has two payments on 7/1/2018, I want to combine them together and then sum the payment, and use the latest Paid Date.

What I have so far is that

#to create a row that has the sum of the sales prices 

    Rent[,sum_late:=sum( as.numeric(("Sales Price"))),
    by= c("Id","Invoice Date")]

#take the first of the unique IDs by the max paid date
    head (SD,1) by=c("ID", "Invoice Date", max("Paid Date") 

But when I run the first line all the sum_late column is N/A. I'm not sure what I did wrong. Ideally, I would want a table just like this.

Rent
       ID      Invoice    Payment      Paid Date
       lucy   7/1/2018     250        10/1/2018
       lucy   8/1/2018     100        11/1/2018

Sorry if this is a stupid question, I appreciate any help and feedback!! Thank you all for your time!!


Solution

  • We can change Paid_Date to date class, group_by ID and Invoice, sum Payment and select max Paid_Date.

    library(dplyr)
    Rent %>%
      mutate_at(vars(Invoice, Paid_Date), as.Date, '%d/%m/%Y') %>%
      group_by(ID, Invoice) %>%
      summarise(Payment = sum(Payment), 
                Paid_Date = max(Paid_Date))
    
    #  ID    Invoice    Payment Paid_Date 
    #  <chr> <date>       <int> <date>    
    #1 lucy  2018-01-07     250 2018-01-10
    #2 lucy  2018-01-08     100 2018-01-11
    

    Or if you prefer data.table using the same logic.

    library(data.table)
    setDT(Rent)[, c("Invoice", "Paid_Date") := .(as.IDate(Invoice, '%d/%m/%Y'), 
                                                 as.IDate(Paid_Date, '%d/%m/%Y'))]
    Rent[, .(Payment = sum(Payment), Paid_Date = max(Paid_Date)), .(ID, Invoice)]
    

    data

    Rent <- structure(list(ID = c("lucy", "lucy", "lucy"), Invoice = c("7/1/2018", 
    "7/1/2018", "8/1/2018"), Payment = c(100L, 150L, 100L), Paid_Date = c("9/1/2018", 
    "10/1/2018", "11/1/2018")), class = "data.frame", row.names = c(NA, -3L))