[SOLVED] How do I create a nested vector from one data.table column data grouping by another column?

How do I create a nested vector from one data.table column data grouping by another column?

Given the following data.table:

Col_1 <- c("A", "A", "A", "B", "B", "B")
Col_2 <- c("AA", NA, "AA", NA, "BB", "BBB")
dt <- data.table(Col_1, Col_2)

I need to create a new column with a nested vector with the existing elements in Col_2 for each instance of Col_1, providing a desired output such as:

Col_1    c3
<char> <int>
1:      A     c("AA", NA, "AA")
2:      B     c(NA, "BB", "BBB")

I tried the code below...

dt[, .(c3= sapply(Col_2,c))
   , .(Col_1)]

... which outputs:

    Col_1     c3
   <char> <list>
1:      A     AA
2:      A     NA
3:      A     AA
4:      B     NA
5:      B     BB
6:      B    BBB

How do I get the desired result?

Solution

You should use nested .() to define it

> dt[, .(c3 = .(Col_2)), by = Col_1]
    Col_1        c3
   <char>    <list>
1:      A  AA,NA,AA
2:      B NA,BB,BBB

or you can use aggregate from base R like below (be aware of na.action = argument)

> aggregate(cbind(c3 = Col_2) ~ Col_1, dt, list, na.action = na.pass)
  Col_1          c3
1     A  AA, NA, AA
2     B NA, BB, BBB