I want to do an association analysis, but need to bring my data frame into the right format, which only shows transactions. 1) How can I multiply my "Sub Category" column by the amount of "Quantity" column?
2) How can I group the Transactions by Order-ID?
I have this df:
`Order ID` `Sub-Category` `Quantity`
<chr> <chr> <dbl>
1 CA-2017-152156 Bookcases 2
2 CA-2017-152156 Chairs 3
3 CA-2017-138688 Labels 2
1) I want this:
`Order ID` `Sub-Category` `Sub-Category2` `Sub-Category3`
<chr> <chr> <chr> <chr>
1 CA-2017-152156 Bookcases Bookcases NULL
2 CA-2017-152156 Chairs Chairs Chairs
3 CA-2017-138688 Labels Labels NULL
(After that I want to combine same Order IDs. E.g. row 1 and 2. Do you have a hint for that?) Thank you!
The following answers to point 1).
Max <- max(df1$Quantity)
res <- lapply(seq_len(nrow(df1)), function(i){
c(rep(as.character(df1[i, 2]), df1[i, 3]), rep(NA, Max - df1[i, 3]))
})
res <- cbind(df1[1], do.call(rbind, res))
names(res)[-1] <- paste0(names(df1)[2], names(res)[-1])
res
# Order ID Sub-Category1 Sub-Category2 Sub-Category3
#1 CA-2017-152156 Bookcases Bookcases <NA>
#2 CA-2017-152156 Chairs Chairs Chairs
#3 CA-2017-138688 Labels Labels <NA>
Data in dput
format.
df1 <-
structure(list(`Order ID` = structure(c(2L, 2L, 1L),
.Label = c("CA-2017-138688", "CA-2017-152156"),
class = "factor"), `Sub-Category` = structure(1:3,
.Label = c("Bookcases", "Chairs", "Labels"), class =
"factor"), Quantity = c(2L, 3L, 2L)), class = "data.frame",
row.names = c("1", "2", "3"))