Merging multiple csvs in R with index based on the file name


I am trying to merge several CSV files with the same columns in R. I tried the solution proposed in "Merging of multiple excel files in R", and it works perfectly. However, I would like to add an index column (row name) that identifies which file each row comes from. I tried:

files <- list.files(pattern="*.csv")

require(purrr)

mainDF <- files %>% map_dfr(read.csv, row.names=files) 

But I get the error:

Error in read.table(file = file, header = header, sep = sep, quote = quote, : invalid 'row.names' length

I would like to get a column similar to this, or ideally just the numbers, e.g. 1, 2, etc.

Any ideas?


Solution

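  • The original call fails because row.names = files hands the entire vector of file names to every read.csv() call, and its length does not match the number of rows in each file. One way to deal with this is the .id argument of map_dfr(). If the vector or list passed to map_dfr() is named, the output gains a column holding the name of the element each row came from. If it is unnamed, the column holds the element's position (1, 2, ...) instead. Either way, every row read from a given .csv carries that file's identifier.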

    So you could do the following. Note that the names(files) line is optional. If you omit the naming, you will get the index (1, 2, ...) instead.

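    # pattern is a regular expression, so escape the dot and anchor the end
    files <- list.files(pattern = "\\.csv$")
    
    # Optional: label the files file_1, file_2, ...
    names(files) <- paste("file", seq_along(files), sep = "_")
    
    library(purrr)  # map_dfr() also needs dplyr installed
    
    mainDF <- files %>% map_dfr(read.csv, .id = "file_ID")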
    

    The resulting data.frame will have a column named file_ID.
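
To check the .id behaviour without touching your own files, here is a minimal sketch that writes two small CSVs to a temporary folder and reads them back. The folder name (csv_demo), the file names (first.csv, second.csv) and the columns x and y are made up for illustration, and it assumes both purrr and dplyr are installed. Naming the paths after the files themselves puts the file name in file_ID; dropping the names gives the positional index instead.

```r
library(purrr)  # map_dfr() relies on dplyr being installed

# Hypothetical example data: two small CSV files in a temporary folder.
tmp <- file.path(tempdir(), "csv_demo")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(x = 1:2, y = c("a", "b")),
          file.path(tmp, "first.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4, y = c("c", "d")),
          file.path(tmp, "second.csv"), row.names = FALSE)

# full.names = TRUE returns paths that read.csv() can open from any
# working directory.
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Variant 1: name each path after its file (extension removed), so the
# file_ID column carries the file name.
names(files) <- tools::file_path_sans_ext(basename(files))
byName <- map_dfr(files, read.csv, .id = "file_ID")

# Variant 2: drop the names, so file_ID carries the position of each file.
# Wrap the column in as.integer() if a numeric index is needed.
byIndex <- map_dfr(unname(files), read.csv, .id = "file_ID")

print(byName)
print(byIndex)
```

As far as I know, map_dfr() is marked as superseded from purrr 1.0.0 onwards. It still works, but the suggested replacement there is map(files, read.csv) followed by list_rbind(names_to = "file_ID"), so check the documentation for the version you have installed.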