I have 3 .csv
files in one directory. The data sets have headers and text; text; text;
format.
`list_data <- list("Red; Green",
"Green; Blue",
"Blue; Yellow")
print(list_data)`
and I would like to merge them into 1 .csv based on the same headers.
files <- list.files(pattern = '\\.csv')
tables <- lapply(files, read.csv, header = TRUE)
combined.df <- do.call(rbind , tables)
str(combined.df)
How to get around this? many thanks in advance.
Making some assumptions about the real data, I'll change the sample data slightly:
list_data <- list(file1="Red; Green\n1;2", file2="Green; Blue\n3;4", file3="Blue; Yellow\n5;6")
I added names to enable the option to preserve file name in the rows of data (far below).
# tables <- lapply(setNames(nm = files), read.csv2) # what you call with a file list
tables <- lapply(list_data, function(z) read.csv2(text = z))
str(tables)
# List of 3
# $ file1:'data.frame': 1 obs. of 2 variables:
# ..$ Red : int 1
# ..$ Green: int 2
# $ file2:'data.frame': 1 obs. of 2 variables:
# ..$ Green: int 3
# ..$ Blue : int 4
# $ file3:'data.frame': 1 obs. of 2 variables:
# ..$ Blue : int 5
# ..$ Yellow: int 6
dplyr::bind_rows(tables)
# Red Green Blue Yellow
# 1 1 2 NA NA
# 2 NA 3 4 NA
# 3 NA NA 5 6
data.table::rbindlist(tables, fill = TRUE, use.name = TRUE)
# Red Green Blue Yellow
# <int> <int> <int> <int>
# 1: 1 2 NA NA
# 2: NA 3 4 NA
# 3: NA NA 5 6
If you want to preserve the file name with each row, then
dplyr::bind_rows(tables, .id = "filename")
# filename Red Green Blue Yellow
# 1 file1 1 2 NA NA
# 2 file2 NA 3 4 NA
# 3 file3 NA NA 5 6
data.table::rbindlist(tables, fill = TRUE, use.name = TRUE, idcol = "filename")
# filename Red Green Blue Yellow
# <char> <int> <int> <int> <int>
# 1: file1 1 2 NA NA
# 2: file2 NA 3 4 NA
# 3: file3 NA NA 5 6