I have a set of NetCDF files organised by date in my directory (each file holds one day of data). I read all the files in R using
require(RNetCDF)
## pattern is a regular expression, not a glob, so match the ".nc" extension explicitly
files <- list.files(pattern = '\\.nc$', full.names = TRUE)
When I run the code, R reads 2014 and 2013 first, and parts of 2010 end up at the end (see the sample output from R below):
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc"
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc"
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc"
I am trying to generate a daily time series from these files using a loop, so when I apply the rest of my code, the data from June to August 2010 end up at the end of the daily time series. I suspect this has to do with how the files are listed in R.
Is there any way to list files in R and ensure they are ordered by date?
Here are your files, unsorted:
paths <- c("./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc",
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc",
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc",
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc",
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc",
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc",
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc",
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc",
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc")
I'm using a regular expression to extract the 8-digit date, YYYYMMDD, from each path. Because that format is zero-padded and runs year, month, day, sorting the digit strings as plain text already gives chronological order, but you can also convert them into dates:
## matches ...Nx.<number of digits = 8>... and captures the stuff in <>
## and saves this match to the first capture group, \\1
pattern <- '.*Nx\\.(\\d{8}).*'
gsub(pattern, '\\1', paths)
# [1] "19820223" "19820224" "19820225" "20130829" "20130830" "20130831"
# [7] "20100626" "20100827" "20100828"
sort(gsub(pattern, '\\1', paths))
# [1] "19820223" "19820224" "19820225" "20100626" "20100827" "20100828"
# [7] "20130829" "20130830" "20130831"
## not necessary to convert that into dates but you can
as.Date(sort(gsub(pattern, '\\1', paths)), '%Y%m%d')
# [1] "1982-02-23" "1982-02-24" "1982-02-25" "2010-06-26" "2010-08-27"
# [6] "2010-08-28" "2013-08-29" "2013-08-30" "2013-08-31"
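One reason you might bother with the conversion is that Date objects support arithmetic, which makes gaps in a daily series easy to spot. Here is a minimal sketch using three of the date strings extracted above; the names `stamps`, `dates`, and `gap_days` are only illustrative.

```r
## three of the date strings extracted above
stamps <- c("20100626", "20100827", "20100828")
dates <- as.Date(stamps, format = "%Y%m%d")

## number of days between consecutive files; 1 means no gap
gap_days <- as.integer(diff(dates))
gap_days
# [1] 62  1
```

Any value larger than 1 marks a stretch of missing days between two consecutive files.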
Then use the extracted dates to order the original paths:
## so you can use the above to order the paths
paths[order(gsub(pattern, '\\1', paths))]
# [1] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc"
# [2] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc"
# [3] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc"
# [4] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc"
# [5] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc"
# [6] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc"
# [7] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc"
# [8] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc"
# [9] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc"
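If you want to reuse this, the steps above could be wrapped in a small function. This is only a sketch: `order_by_date` is a hypothetical helper name, and it assumes every file name contains `Nx.` followed by an 8-digit date, as in your sample. Since `sub()` returns its input unchanged when the pattern does not match, the function checks for that case and stops rather than silently mis-sorting.

```r
## hypothetical helper: order file paths by the YYYYMMDD stamp after "Nx."
order_by_date <- function(paths, pattern = ".*Nx\\.(\\d{8}).*") {
  stamps <- sub(pattern, "\\1", paths)
  ## sub() leaves non-matching strings untouched, so verify we got 8 digits
  bad <- !grepl("^\\d{8}$", stamps)
  if (any(bad)) {
    stop("no 8-digit date found in: ", paste(paths[bad], collapse = ", "))
  }
  dates <- as.Date(stamps, format = "%Y%m%d")
  paths[order(dates)]
}

## a few of the sample paths, deliberately out of order
files <- c("./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc",
           "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc",
           "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc")
sorted_files <- order_by_date(files)
sorted_files
```

In your script you would pass the result of `list.files(...)` to the helper and then loop over the sorted vector.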