I'm new and sorry if my question is badly worded.
I'm working in r and I have table called Rent that might look like this :
Rent
ID Invoice Payment Paid Date
lucy 7/1/2018 100 9/1/2018
lucy 7/1/2018 150 10/1/2018
lucy 8/1/2018 100 11/1/2018
So what I want to do is that since Lucy has two payments on 7/1/2018, I want to combine them together and then sum the payment, and use the latest Paid Date.
What I have so far is that
#to create a row that has the sum of the sales prices
Rent[,sum_late:=sum( as.numeric(("Sales Price"))),
by= c("Id","Invoice Date")]
#take the first of the unique IDs by the max paid date
head (SD,1) by=c("ID", "Invoice Date", max("Paid Date")
But when I run the first line all the sum_late column is N/A. I'm not sure what I did wrong. Ideally, I would want a table just like this.
Rent
ID Invoice Payment Paid Date
lucy 7/1/2018 250 10/1/2018
lucy 8/1/2018 100 11/1/2018
Sorry if this is a stupid question, I appreciate any help and feedback!! Thank you all for your time!!
We can change Paid_Date
to date class, group_by
ID
and Invoice
, sum
Payment
and select max
Paid_Date
.
library(dplyr)
Rent %>%
mutate_at(vars(Invoice, Paid_Date), as.Date, '%d/%m/%Y') %>%
group_by(ID, Invoice) %>%
summarise(Payment = sum(Payment),
Paid_Date = max(Paid_Date))
# ID Invoice Payment Paid_Date
# <chr> <date> <int> <date>
#1 lucy 2018-01-07 250 2018-01-10
#2 lucy 2018-01-08 100 2018-01-11
Or if you prefer data.table
using the same logic.
library(data.table)
setDT(Rent)[, c("Invoice", "Paid_Date") := .(as.IDate(Invoice, '%d/%m/%Y'),
as.IDate(Paid_Date, '%d/%m/%Y'))]
Rent[, .(Payment = sum(Payment), Paid_Date = max(Paid_Date)), .(ID, Invoice)]
data
Rent <- structure(list(ID = c("lucy", "lucy", "lucy"), Invoice = c("7/1/2018",
"7/1/2018", "8/1/2018"), Payment = c(100L, 150L, 100L), Paid_Date = c("9/1/2018",
"10/1/2018", "11/1/2018")), class = "data.frame", row.names = c(NA, -3L))