Tags: r, split, strsplit, splitstackshape

Split multiple columns into rows


I'm working with a very raw set of data and need to reshape it before I can work with it. I am trying to split selected columns on the separator '|'.

d <- data.frame(id = c(022,565,893,415),
     name = c('c|e','m|q','w','w|s|e'), 
     score = c('e','k|e','e|k|e', 'e|o'))

Is it possible to split the data frame in one go, so that every combination of the split values gets its own row and it looks like this in the end?

df <- data.frame(id = c(22,22,565,565,565,565,893,893,893,415,415,415,415,415,415),
            name = c('c','e','m','m','q','q','w','w','w','w','w','s','s','e','e'),
            score = c('e','e','k','e','k','e','e','k','e','e','o','e','o','e','o'))

So far I've tried various string split functions but haven't had much luck :(

Can anybody help?


Solution

  • Here's a simple base R approach in two steps:

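    1) split the columns on the literal "|" (fixed = TRUE is needed because "|" is otherwise read as the regular-expression alternation operator):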

    x <- lapply(d[-1], strsplit, "|", fixed = TRUE)
    

    2) expand each row into every name/score combination with expand.grid, then combine the pieces with rbind:

    d2 <- setNames(do.call(rbind, Map(expand.grid, d$id, x$name, x$score)), names(d)) 
    

    The result is then (the same rows as the requested output, but in a different order within each id):

    #    id name score
    #1   22    c     e
    #2   22    e     e
    #3  565    m     k
    #4  565    q     k
    #5  565    m     e
    #6  565    q     e
    #7  893    w     e
    #8  893    w     k
    #9  893    w     e
    #10 415    w     e
    #11 415    s     e
    #12 415    e     e
    #13 415    w     o
    #14 415    s     o
    #15 415    e     o
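
If the row order of the requested output matters, the two steps above can be rearranged slightly. The following is a minimal sketch, not part of the original answer: it relies on `expand.grid` varying its first argument fastest, so `score` is passed first and the columns are reversed afterwards. The `stringsAsFactors = FALSE` settings are an added assumption so that the columns stay character vectors in R versions before 4.0 as well.

```r
# Example data from the question (id written without the leading zero)
d <- data.frame(id = c(22, 565, 893, 415),
                name = c("c|e", "m|q", "w", "w|s|e"),
                score = c("e", "k|e", "e|k|e", "e|o"),
                stringsAsFactors = FALSE)

# Step 1: split every column except id on the literal "|"
x <- lapply(d[-1], strsplit, "|", fixed = TRUE)

# Step 2: build all combinations per row; score goes first so that it
# varies fastest and name varies slowest within each id
parts <- Map(expand.grid, x$score, x$name, d$id,
             MoreArgs = list(stringsAsFactors = FALSE))

# Stack the per-row pieces, then restore the id, name, score column order
d2 <- do.call(rbind, parts)
d2 <- setNames(d2[3:1], names(d))
```

With this ordering, `d2` should line up row by row with the `df` shown in the question, which can be checked with something like `all(d2$name == df$name)`.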