r casting data.table reshape reshape2

Change default separator in dcast from long to wide formats


Is it possible to change the default separator when cast (dcast) assigns new column headers?

I am converting a file from long to wide formats applying the data.table::dcast function, and I get the following headers:

value_1, value_2, value_3,...  

In reshape::cast, one can set the "sep" parameter (e.g., sep = ""), so that the column headers come out the way I want:

value1, value2, value3,... 

However, reshape takes minutes for my data frame with over 200,000 rows, whereas data.table::dcast takes seconds. dcast also outputs the columns in the order I want, whereas reshape does not. Is there an easy way to change the separator in the dcast output, or do I need to rename the column headers manually?

For example:

example <- data.frame(id = rep(c(1, 2, 3, 4), 4),
                      index = c(rep(1, 4), rep(2, 4), rep(1, 4), rep(2, 4)),
                      variable = c(rep("resp", 8), rep("conc", 8)),
                      value = rnorm(16, 5, 1))
dcast(example, id ~ variable + index)

The example gives the column headers:

conc_1, conc_2, resp_1, resp_2

I want the column headers to read:

conc1, conc2, resp1, resp2

I have tried:

dcast(example,id~variable+index,sep="")

dcast appears to ignore sep entirely, because giving a symbol does not change the output either.


Solution

  • You can't: reshape2::dcast has no sep argument, so the separator cannot be changed in the call itself. It is fairly trivial to rename the columns after running dcast, though.

    casted_data <- reshape2::dcast(example, id ~ variable + index)
    
    library(stringr)
    # str_replace() changes only the first "_" in each name, which is enough here
    names(casted_data) <- str_replace(names(casted_data), "_", ".")
    
    > casted_data
      id   conc.1   conc.2   resp.1   resp.2
    1  1 5.554279 5.225686 5.684371 5.093170
    2  2 4.826810 5.484334 5.270886 4.064688
    3  3 5.650187 3.587773 3.881672 3.983080
    4  4 4.327841 4.851891 5.628488 4.305907
    
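    # If you need to do this often, just wrap dcast in a function and 
    # change the names before returning the result.
    # Note: only the first "_" in each name is replaced, and id columns
    # whose names contain "_" are renamed as well.
    
    f <- function(df, ..., sep = ".") {
        res <- reshape2::dcast(df, ...)
        names(res) <- stringr::str_replace(names(res), "_", sep)
        res
    }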
    
    > f(example, id~variable+index, sep = "")
      id    conc1    conc2    resp1    resp2
    1  1 5.554279 5.225686 5.684371 5.093170
    2  2 4.826810 5.484334 5.270886 4.064688
    3  3 5.650187 3.587773 3.881672 3.983080
    4  4 4.327841 4.851891 5.628488 4.305907
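
The renaming above targets reshape2::dcast, which is what runs when the input is a plain data.frame. As a possible alternative, recent versions of data.table document a sep argument on their own dcast method, so converting the input to a data.table first may avoid the post-processing entirely. The sketch below assumes such a data.table version is installed; it replaces the question's rnorm() values with fixed numbers (an assumption made only so the result is reproducible).

```r
library(data.table)

# Same layout as the question's example, built directly as a data.table.
# The fixed values 1..16 stand in for rnorm(16, 5, 1).
example_dt <- data.table(
  id       = rep(1:4, 4),
  index    = rep(rep(1:2, each = 4), 2),
  variable = rep(c("resp", "conc"), each = 8),
  value    = 1:16
)

# sep = "" joins the right-hand-side variables without an underscore
wide <- dcast(example_dt, id ~ variable + index,
              value.var = "value", sep = "")

names(wide)
```

If the assumption holds, the generated columns should be named conc1, conc2, resp1, resp2, and id columns whose names contain "_" are left alone because no string replacement is involved.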