
How to assign data frame names as the first row?


I have several list objects, each containing 31 data frames, which I have named 'file1980' through 'file2010'. These were made by splitting the original data frame (11315 rows) into 31 equal-sized data frames (365 rows each) using the following:

n <- 31
dataList <- split(MainData, factor(sort(rank(row.names(MainData)) %% n)))
names(dataList) <- paste0("file", 1980:2010)

The individual data frames look like this:

   jDate   V1     V2     V3   V4     V5          V6      V7
1   001  -6.83  -5.83  -7.83 0.05 0.8217593   8.101852 100.0
2   002  -6.33  -4.83  -7.83 0.10 2.2453704   9.259259 100.0
3   003  -5.83  -4.83  -6.83 0.30 1.9444444   8.101852  94.7
4   004  -5.83  -4.83  -6.83 0.10 1.0416667   8.101852  97.5
5   005  -6.33  -4.83  -7.83 0.00 1.1226852   9.259259  98.5
6   006  -7.83  -5.83  -9.83 0.03 2.0949074  10.416667 100.0

They will be exported, with row names removed, into *.txt files for use in another piece of software. However, this software reads the first row of each file as the file name. In csv or txt exports that row normally holds the column names, so for the software to run, the first row needs to be the file name instead: 'file1980' and so on.

I'm hoping to split the list into 31 equally sized files that look like this (with sequential file names, file1980 to file2010, in the top row):

    file1980
1   001  -6.83  -5.83  -7.83 0.05 0.8217593   8.101852 100.0
2   002  -6.33  -4.83  -7.83 0.10 2.2453704   9.259259 100.0
3   003  -5.83  -4.83  -6.83 0.30 1.9444444   8.101852  94.7
4   004  -5.83  -4.83  -6.83 0.10 1.0416667   8.101852  97.5
5   005  -6.33  -4.83  -7.83 0.00 1.1226852   9.259259  98.5
6   006  -7.83  -5.83  -9.83 0.03 2.0949074  10.416667 100.0
7   007  -5.33  -4.83  -5.83 0.00 1.4930556   8.101852  97.6
8   008  -7.33  -5.83  -8.83 0.00 0.9027778   9.259259 100.0
9   009  -7.33  -6.83  -7.83 0.03 0.8217593   8.101852  90.2

I have now seen two methods that add a new first row containing the file name, but both either failed to replace/remove the column names or replaced them with NA. On an individual data frame it is easy to just use names(x) <- NULL (where x is the data frame), but this only solves half the problem.

The resulting files all need to be written to a preset file path.


Solution

  • I believe it's better not to edit the data frame for this sort of thing. Instead, first write the required string to the file, then append the data.

    Luckily, write.table() is quite flexible:

    # data prep, taking 2 chunks of iris
    MainData <- head(iris, 10)
    # optional if you want columns with equal width in the csv :
    MainData[] <- lapply(MainData, format)
    n <- 2   
    dataList <- split(
      MainData, 
      factor(sort(rank(row.names(MainData)) %% n))
    )
    names(dataList) <- paste0("file",seq(n))
    dataList
    
    # print to file
    for (file in names(dataList)) {
      # define the path, here in the working directory
      path <- paste0(file, ".csv")
      # print(path) # uncomment to see the path
      # print string to file
      writeLines(file, path)
      # append the data, without headers, quotes or row names
      write.table(
        dataList[[file]], path, 
        col.names = FALSE, row.names = FALSE, quote = FALSE, append = TRUE
      )
    }
    
    # in file1.csv :
    
    # file1
    # 5.1 3.5 1.4 0.2 setosa
    # 4.9 3.0 1.4 0.2 setosa
    # 4.7 3.2 1.3 0.2 setosa
    # 4.6 3.1 1.5 0.2 setosa
    # 5.0 3.6 1.4 0.2 setosa
    
    # cleanup
    file.remove(c("file1.csv", "file2.csv"))
    

    write.table() uses " " as the default separator; use sep = "\t" if you want the output tab-delimited.
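
Since the question asks for *.txt files written to a preset file path, the same approach can be adapted along these lines. This is only a sketch: out_dir is a hypothetical folder (a temporary directory here) standing in for the real path, the two-element dataList built from iris is a stand-in for the real list of 31 data frames, and the tab separator is an assumption about what the target software accepts.

```r
# Hypothetical stand-in for the real list of 31 yearly data frames
dataList <- list(
  file1980 = head(iris, 3),
  file1981 = tail(iris, 3)
)

# Hypothetical output folder; replace with the preset file path
out_dir <- file.path(tempdir(), "weather_export")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

for (nm in names(dataList)) {
  path <- file.path(out_dir, paste0(nm, ".txt"))
  # first line of the file: the file name
  writeLines(nm, path)
  # then the data, tab-delimited, without headers, quotes or row names
  write.table(
    dataList[[nm]], path,
    sep = "\t", col.names = FALSE, row.names = FALSE,
    quote = FALSE, append = TRUE
  )
}

# check: the first line of each file should be its own name
first_lines <- vapply(
  names(dataList),
  function(nm) readLines(file.path(out_dir, paste0(nm, ".txt")), n = 1),
  character(1)
)
stopifnot(identical(unname(first_lines), names(dataList)))
```

file.path() builds the path with the right separator for the platform, so only out_dir should need changing to point at the real destination.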