Using the data.table package in R, I am trying to create a cartesian product of two data.tables using the merge method as one would do in base R.
In base the following works:
#assume this order data orders <- data.frame(date = as.POSIXct(c('2012-08-28','2012-08-29','2012-09-01')), first.name = as.character(c('John','George','Henry')), last.name = as.character(c('Doe','Smith','Smith')), qty = c(10,50,6)) #and these dates dates <- data.frame(date = seq(from = as.POSIXct('2012-08-28'), to = as.POSIXct('2012-09-07'), by = 'day')) #get the unique customers cust<-unique(orders[,c('first.name','last.name')]) #using merge from base R, get the cartesian product merge(dates, cust, by = integer(0))
However, the same technique does not work using data.table and this error is thrown:
"Error in merge.data.table(dates.dt, cust.dt, by = integer(0)) : A non-empty vector of column names for `by` is required."
#data.table approach library(data.table) orders.dt <- data.table(orders) dates.dt <- data.table(dates) cust.dt <- unique(orders.dt[, list(first.name, last.name)]) #try to use merge (data.table) in the same manner as base merge(dates.dt, cust.dt, by = integer(0))
Error in merge.data.table(dates.dt, cust.dt, by = integer(0)) : A non-empty vector of column names for `by` is required.
I want the result to reflect all customer names for all dates, just like in base, but do it in a data.table-centric way. Is this possible?
If you first construct full names from the first and last in the
cust-dataframe, you can then use
CJ (cross-join). You cannot use all three vectors since there would be 99 items and teh first names would get inappropriately mixed with last names.
> nrow(CJ(dates$date, cust$first.name, cust$last.name ) )  99
This returns the desired data.table object:
> CJ(dates$date,paste(cust$first.name, cust$last.name) ) V1 V2 1: 2012-08-28 George Smith 2: 2012-08-28 Henry Smith 3: 2012-08-28 John Doe 4: 2012-08-29 George Smith 5: 2012-08-29 Henry Smith 6: 2012-08-29 John Doe 7: 2012-08-30 George Smith 8: 2012-08-30 Henry Smith 9: 2012-08-30 John Doe 10: 2012-08-31 John Doe 11: 2012-08-31 George Smith 12: 2012-08-31 Henry Smith 13: 2012-09-01 John Doe 14: 2012-09-01 George Smith 15: 2012-09-01 Henry Smith 16: 2012-09-02 George Smith 17: 2012-09-02 Henry Smith 18: 2012-09-02 John Doe 19: 2012-09-03 Henry Smith 20: 2012-09-03 John Doe 21: 2012-09-03 George Smith 22: 2012-09-04 Henry Smith 23: 2012-09-04 John Doe 24: 2012-09-04 George Smith 25: 2012-09-05 George Smith 26: 2012-09-05 Henry Smith 27: 2012-09-05 John Doe 28: 2012-09-06 George Smith 29: 2012-09-06 Henry Smith 30: 2012-09-06 John Doe 31: 2012-09-07 George Smith 32: 2012-09-07 Henry Smith 33: 2012-09-07 John Doe V1 V2