
List files in R by date


I have a set of NetCDF files in my directory that are organised by date (each file holds one day of data). I read all the files in R using

require(RNetCDF)
# pattern is a regular expression, not a shell glob
files <- list.files(".", pattern = "\\.nc$", full.names = TRUE)

When I run the code, R reads 2014 and 2013, and then part of 2010 ends up at the end (see the sample output from R below).

"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc"
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc"
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc"

"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc"

"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc"

I am trying to generate a daily time series from these files using a loop, so when I apply the rest of my code, the data from June to August 2010 end up at the end of the daily time series. I suspect this has to do with how the files are listed in R.

Is there any way to list files in R and ensure they are ordered by date?


Solution

  • Here are your files, unsorted:

    paths <- c("./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc",
               "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc",
               "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc",
               "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc",
               "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc",
               "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc",
               "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc",
               "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc",
               "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc")
    

    I'm using a regular expression to extract the eight-digit date, YYYYMMDD, from each file name. Because the digits are fixed-width and run year, month, day, sorting them as plain strings already gives chronological order, but you can also convert them into dates.

    ## matches ...Nx.<number of digits = 8>... and captures the stuff in <>
    ## and saves this match to the first capture group, \\1
    pattern <- '.*Nx\\.(\\d{8}).*'
    
    gsub(pattern, '\\1', paths)
    # [1] "19820223" "19820224" "19820225" "20130829" "20130830" "20130831"
    # [7] "20100626" "20100827" "20100828"
    
    sort(gsub(pattern, '\\1', paths))
    # [1] "19820223" "19820224" "19820225" "20100626" "20100827" "20100828"
    # [7] "20130829" "20130830" "20130831"
    
    ## not necessary to convert that into dates but you can
    as.Date(sort(gsub(pattern, '\\1', paths)), '%Y%m%d')
    # [1] "1982-02-23" "1982-02-24" "1982-02-25" "2010-06-26" "2010-08-27"
    # [6] "2010-08-28" "2013-08-29" "2013-08-30" "2013-08-31"
    

    Then order the original paths by those digits:

    ## so you can use the above to order the paths
    paths[order(gsub(pattern, '\\1', paths))]
    # [1] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc"
    # [2] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc"
    # [3] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc"
    # [4] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc"
    # [5] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc"
    # [6] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc"
    # [7] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc"
    # [8] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc"
    # [9] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc"
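
The steps above could be wrapped into a small helper so the ordering is applied right after listing the files. This is only a sketch: `sort_paths_by_date` and `demo_paths` are hypothetical names, and it assumes every file name carries an `Nx.YYYYMMDD` stamp like the ones in the question. It also guards against one quirk of `sub()`/`gsub()`: when the pattern does not match, the input string is returned unchanged, which would otherwise slip into the sort silently.

```r
# Hypothetical helper: order file paths by the YYYYMMDD stamp in their names.
sort_paths_by_date <- function(paths, pattern = ".*Nx\\.(\\d{8}).*") {
  # Look at the file name only, so digits in directory names are ignored
  stamps <- sub(pattern, "\\1", basename(paths))

  # sub() returns its input unchanged when nothing matches,
  # so verify that every result really is eight digits
  bad <- !grepl("^[0-9]{8}$", stamps)
  if (any(bad)) {
    stop("no YYYYMMDD stamp found in: ", paste(paths[bad], collapse = ", "))
  }

  dates <- as.Date(stamps, format = "%Y%m%d")
  # Eight digits that are not a real calendar date become NA
  if (anyNA(dates)) {
    stop("invalid date stamp in: ", paste(paths[is.na(dates)], collapse = ", "))
  }

  ord <- order(dates)
  data.frame(path = paths[ord], date = dates[ord], stringsAsFactors = FALSE)
}

# Three of the paths from the question, deliberately out of order
demo_paths <- c("./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc",
                "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc",
                "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc")

sorted <- sort_paths_by_date(demo_paths)
print(sorted)
```

With real data you would pass the result of `list.files(".", pattern = "\\.nc$", full.names = TRUE)` instead of `demo_paths`, then loop over `sorted$path` in row order. Keeping the parsed `date` column next to each path also gives the loop a ready-made time axis for the daily series.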