rdata.tableuniquelong-format-data

get id's of multiple rows in one cell in r data.table


I have a data.table in long format:

dt <- data.table(id=1:3, Q1=c(1,2,1), Q2=c(3,1,3), Q3=c("a","b","a"), Q4=c("d","c","d"))
dt
   id Q1 Q2 Q3 Q4
1:  1  1  3  a  d
2:  2  2  1  b  c
3:  3  1  3  a  d

If I do

unique(dt[,Q1:Q4])
   Q1 Q2 Q3 Q4
1:  1  3  a  d
2:  2  1  b  c

I get unique rows, but I lose the id's information. I would want to keep these in one row only, like this:

    id Q1 Q2 Q3 Q4
1: 1,3  1  3  a  d
2:   2  2  1  b  c

How could I achieve this?

Thank you


Solution

  • dt[, .(id = paste0(id, collapse = ";")), by = .(Q1, Q2, Q3, Q4)]
    #    Q1 Q2 Q3 Q4  id
    # 1:  1  3  a  d 1;3
    # 2:  2  1  b  c   2