I have a dataset that looks like this:
CATA 1 10101
CATA 2 11101
CATA 3 10011
CATB 1 10100
CATB 2 11100
CATB 3 10011
etc.
and I want to combine these different rows into a single, long row like this:
CATA 101011110110011
CATB 101001110010011
I've tried doing this with melt() and then dcast(), but it doesn't seem to work. Does anyone have some simple pieces of code to do this?
Look at the paste() command, and specifically its collapse argument. It's not clear what should happen if/when you have different values for the first column, so I won't venture to guess. Update your question if you get stuck.
dat <- data.frame(V1 = "CATA", V2 = 1:3, V3 = c(10101, 11101, 10011))
paste(dat$V3, collapse = "")
[1] "101011110110011"
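Note that if your codes can start with a zero, you will want that column stored as character from the start (for example, when you read the file in). A numeric column has already lost its leading zeros by the time paste() sees it, so converting afterwards won't bring them back.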
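A minimal sketch of the leading-zero issue. The three codes below are made up for illustration (they are not from the question), and the variable names are hypothetical:

```r
# Hypothetical codes, one of which starts with a zero
codes_num <- c(10101, 01101, 10011)        # numeric: 01101 is stored as 1101
codes_chr <- c("10101", "01101", "10011")  # character: the zero is kept

res_num <- paste(codes_num, collapse = "")  # one digit short
res_chr <- paste(codes_chr, collapse = "")  # all 15 digits preserved

nchar(res_num)
nchar(res_chr)
```

The numeric version comes out one character shorter because the zero was dropped when the value was stored, not when it was pasted.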
EDIT: to address multiple values for the first column
Use plyr's ddply function, which expects a data.frame as input plus one or more grouping variables. We then use the same paste() trick as before along with summarize().
library(plyr)
dat <- data.frame(V1 = sample(c("CATA", "CATB"), 10, TRUE)
, V2 = 1:10
, V3 = sample(0:100, 10, TRUE)
)
ddply(dat, "V1", summarize, newCol = paste(V3, collapse = ""))
V1 newCol
1 CATA 16110
2 CATB 19308974715042
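If you would rather not load plyr, the same grouping can be sketched in base R with aggregate(). This is an alternative to the ddply call, not a replacement for it, and it assumes the six rows shown in the question with V3 stored as character and V2 giving the order in which the pieces should be joined:

```r
# The six rows from the question; V3 is kept as character
dat <- data.frame(
  V1 = rep(c("CATA", "CATB"), each = 3),
  V2 = rep(1:3, times = 2),
  V3 = c("10101", "11101", "10011", "10100", "11100", "10011"),
  stringsAsFactors = FALSE
)

# Sort by group and row index so the pieces are pasted in the intended order
dat <- dat[order(dat$V1, dat$V2), ]

# One collapsed string per value of V1; collapse = "" is passed on to paste()
res <- aggregate(V3 ~ V1, data = dat, FUN = paste, collapse = "")
res
```

Sorting first matters because paste() simply joins the values in whatever order they appear within each group.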