
Converting different rows of a data frame to one single row in R

I have a dataset that looks like this:

CATA 1 10101
CATA 2 11101
CATA 3 10011
CATB 1 10100
CATB 2 11100
CATB 3 10011


and I want to combine these different rows into a single, long row like this:

CATA 101011110110011
CATB 101001110010011

I've tried doing this with melt() and then dcast(), but it doesn't seem to work. Does anyone have some simple pieces of code to do this?


  • Look at the paste() function, and specifically its collapse argument. It's not clear what should happen when the first column contains more than one value, so I won't venture to guess. Update your question if you get stuck.

        dat <- data.frame(V1 = "CATA", V2 = 1:3, V3 = c(10101, 11101, 10011))
        paste(dat$V3, collapse = "")
        [1] "101011110110011"

    Note that you may want to store V3 as character rather than numeric first: a numeric column drops leading zeros (01101 becomes 1101), which would silently shorten the combined string.
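
    To illustrate the leading-zero point, here is a minimal sketch. The two-row input (with a second code of 01101) and the variable names are made up for the example; it assumes the data is read with read.table(), whose colClasses argument can force a column to character.

    ```r
    # Hypothetical two-row input whose second code starts with a zero.
    txt <- "CATA 1 10101\nCATA 2 01101"

    # Default parsing treats the third column as a number and drops the zero.
    as_num <- read.table(text = txt)
    joined_num <- paste(as_num$V3, collapse = "")   # "101011101"

    # Forcing the third column to character keeps every digit.
    as_chr <- read.table(text = txt,
                         colClasses = c("character", "integer", "character"))
    joined_chr <- paste(as_chr$V3, collapse = "")   # "1010101101"
    ```

    The numeric version is one character short, which is easy to miss in long strings.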

    EDIT: to address multiple values for the first column

    Use plyr's ddply() function, which takes a data.frame and one or more grouping variables. Within each group we then apply the same paste() trick as before, via summarize().

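        library(plyr)  # provides ddply() and summarize()

        # Random example data, so the output below will differ between runs
        dat <- data.frame(V1 = sample(c("CATA", "CATB"), 10, TRUE)
                        , V2 = 1:10
                        , V3 = sample(0:100, 10, TRUE))
        ddply(dat, "V1", summarize, newCol = paste(V3, collapse = ""))
            V1         newCol
        1 CATA          16110
        2 CATB 19308974715042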
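
    If you would rather not load plyr, the same grouping can be sketched in base R with aggregate(). This is an alternative, not part of the approach above. It uses the data from the question, assumes the default column names V1, V2, V3 that read.table() assigns, and assumes the rows are already in the order you want them concatenated.

    ```r
    # The six rows from the question, read with V3 kept as character.
    rows <- c("CATA 1 10101", "CATA 2 11101", "CATA 3 10011",
              "CATB 1 10100", "CATB 2 11100", "CATB 3 10011")
    dat <- read.table(text = rows,
                      colClasses = c("character", "integer", "character"))

    # Group by V1 and paste each group's V3 values together;
    # collapse = "" is passed through to paste().
    res <- aggregate(V3 ~ V1, data = dat, FUN = paste, collapse = "")
    res$V3
    # [1] "101011110110011" "101001110010011"
    ```

    The result has one row per value of V1, matching the desired output in the question.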