Tags: r, loops, for-loop, foreach, raster

Converting and exporting .nc files to .tif files through R: file names not written correctly


I have 500 .nc files. Each contains monthly data for one variable (PotEvap or pr) over five years, with one file per combination of variable (PotEvap/pr), gcm (dry, middle, etc.) and scenario (historical, rcp45, etc.). I am trying to write one .tif file for each month in each .nc file, so each .nc file should become 60 monthly .tif files.

It is working, but only partially. The loop correctly writes one .tif file per month in the .nc files, but it doesn't correctly match the particular .nc files with the .tif files. So for example, the first result below is correct, but the second is a mismatch:

Processing: PotEvap middle rcp45 
Processing file: C:/Postdoc/ExposureProj/Climdata1/PotEvap_middle_rcp45_2056_2060.nc 
Writing to file: C:/Postdoc/ExposureProj/raw_tifs/PotEvap_middle_rcp45_2056_01.tif 
Writing to file: C:/Postdoc/ExposureProj/raw_tifs/PotEvap_middle_rcp45_2056_02.tif

(etc., it keeps writing monthly files per year correctly)

Processing: pr middle rcp45 
Processing file: C:/Postdoc/ExposureProj/Climdata1/PotEvap_middle_rcp45_2056_2060.nc 
Writing to file: C:/Postdoc/ExposureProj/raw_tifs/pr_middle_rcp45_2056_01.tif

(etc.)

The unique combinations of variable/gcm/scenario from the initial data frame (below) end up correctly in the output file names, but they are not used to select the input file. Hence for 'pr middle rcp45' the file being processed is still the 'PotEvap middle rcp45' one.

The initial data frame (nc.vals1, used in the loop below; this is a smaller, reproducible version of the original):

structure(list(variable = structure(c(1L, 2L, 1L, 2L), levels = c("PotEvap", 
"pr"), class = "factor"), gcm = structure(c(1L, 1L, 2L, 2L), levels = c("middle", 
"dry"), class = "factor"), scen = structure(c(1L, 1L, 1L, 1L), levels = "rcp45", class = "factor")), out.attrs = list(
    dim = c(121L, 12L, 2L, 2L, 1L), dimnames = list(Var1 = c("Var1=1950", 
    "Var1=1951", "Var1=1952", "Var1=1953", "Var1=1954", "Var1=1955", 
    "Var1=1956", "Var1=1957", "Var1=1958", "Var1=1959", "Var1=1960", 
    "Var1=1961", "Var1=1962", "Var1=1963", "Var1=1964", "Var1=1965", 
    "Var1=1966", "Var1=1967", "Var1=1968", "Var1=1969", "Var1=1970", 
    "Var1=1971", "Var1=1972", "Var1=1973", "Var1=1974", "Var1=1975", 
    "Var1=1976", "Var1=1977", "Var1=1978", "Var1=1979", "Var1=1980", 
    "Var1=1981", "Var1=1982", "Var1=1983", "Var1=1984", "Var1=1985", 
    "Var1=1986", "Var1=1987", "Var1=1988", "Var1=1989", "Var1=1990", 
    "Var1=1991", "Var1=1992", "Var1=1993", "Var1=1994", "Var1=1995", 
    "Var1=1996", "Var1=1997", "Var1=1998", "Var1=1999", "Var1=2000", 
    "Var1=2001", "Var1=2002", "Var1=2003", "Var1=2004", "Var1=2005", 
    "Var1=2006", "Var1=2007", "Var1=2008", "Var1=2009", "Var1=2010", 
    "Var1=2011", "Var1=2012", "Var1=2013", "Var1=2014", "Var1=2015", 
    "Var1=2016", "Var1=2017", "Var1=2018", "Var1=2019", "Var1=2020", 
    "Var1=2021", "Var1=2022", "Var1=2023", "Var1=2024", "Var1=2025", 
    "Var1=2026", "Var1=2027", "Var1=2028", "Var1=2029", "Var1=2030", 
    "Var1=2031", "Var1=2032", "Var1=2033", "Var1=2034", "Var1=2035", 
    "Var1=2036", "Var1=2037", "Var1=2038", "Var1=2039", "Var1=2040", 
    "Var1=2041", "Var1=2042", "Var1=2043", "Var1=2044", "Var1=2045", 
    "Var1=2046", "Var1=2047", "Var1=2048", "Var1=2049", "Var1=2050", 
    "Var1=2051", "Var1=2052", "Var1=2053", "Var1=2054", "Var1=2055", 
    "Var1=2056", "Var1=2057", "Var1=2058", "Var1=2059", "Var1=2060", 
    "Var1=2061", "Var1=2062", "Var1=2063", "Var1=2064", "Var1=2065", 
    "Var1=2066", "Var1=2067", "Var1=2068", "Var1=2069", "Var1=2070"
    ), Var2 = c("Var2= 1", "Var2= 2", "Var2= 3", "Var2= 4", "Var2= 5", 
    "Var2= 6", "Var2= 7", "Var2= 8", "Var2= 9", "Var2=10", "Var2=11", 
    "Var2=12"), Var3 = c("Var3=PotEvap", "Var3=pr"), Var4 = c("Var4=middle", 
    "Var4=dry"), Var5 = "Var5=rcp45")), row.names = c(1L, 1453L, 
2905L, 4357L), class = "data.frame")
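For readers who prefer something shorter than the dput() output, the same four combinations can probably be rebuilt with expand.grid(). This is only a sketch: it assumes that the three columns and their level order are all that matter, and it does not reproduce the out.attrs or row names shown above.

```r
# Compact stand-in for nc.vals1 (assumption: only the columns and
# their level order matter, not the attributes in the dput output).
nc.vals1 <- expand.grid(
  variable = c("PotEvap", "pr"),
  gcm      = c("middle", "dry"),
  scen     = "rcp45",
  stringsAsFactors = TRUE
)

print(nc.vals1)
```

expand.grid() varies the first argument fastest, which gives the same row order as the original: PotEvap/middle, pr/middle, PotEvap/dry, pr/dry.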

The loop I am using, where something goes wrong:

foreach(k = iter(nc.vals1, by = "row")) %dopar% {
  require(raster)
  require(stringr)
  require(ncdf4)
  maca.dir <- "C:/Postdoc/ExposureProj/Climdata1"
  out.dir = "C:/Postdoc/ExposureProj/raw_tifs/"
  
  var.cur <- paste0(k$variable)
  gcm.cur <- paste0(k$gcm)
  scen.cur <- paste0(k$scen)
  
  cat("Processing:", var.cur, gcm.cur, scen.cur, "\n")
  
  dir.name <- paste0(var.cur, gcm.cur, scen.cur)
  dir.cur <- list.files(path = maca.dir, pattern = dir.name, full.names = FALSE)
  dat.dir <- paste0(maca.dir, "/", dir.cur)
  nc.files <- list.files(path = dat.dir, full.names = TRUE)
  
  for (f in nc.files) {
    cat("Processing file:", f, "\n")
    dat.stack <- stack(f)
    dat.stack <- raster::shift(dat.stack, dx = -360)
    crs(dat.stack) <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
    
    num.layers <- seq(1:dim(dat.stack)[3])
    
    for (n in num.layers) {
      dat.raster <- dat.stack[[n]]
      yr.mo <- paste0(substr(names(dat.raster), 2, 5), "_", substr(names(dat.raster), 7, 8))
      
      # Correct the order of variables in the file name
      fname <- paste0(out.dir, var.cur, "_", gcm.cur, "_", scen.cur, "_", yr.mo, ".tif")
      
      cat("Writing to file:", fname, "\n")
      writeRaster(dat.raster, fname, drivername = "GTiff", overwrite = FALSE)
    }
  }
}
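To make the naming step easier to follow, here is a small sketch of how the output name is assembled, run outside the loop. The layer name "X2056.01.15" is an assumption about what the raster layers are called (it is consistent with the "2056_01" seen in the log, but the real names are not shown here). The last lines add a hypothetical sanity check that compares the input file name with the current combination, since the output name is built from var.cur/gcm.cur/scen.cur and not from f.

```r
out.dir <- "C:/Postdoc/ExposureProj/raw_tifs/"
var.cur <- "pr"; gcm.cur <- "middle"; scen.cur <- "rcp45"

# Assumed layer name; characters 2-5 are the year, 7-8 the month
layer.name <- "X2056.01.15"
yr.mo <- paste0(substr(layer.name, 2, 5), "_", substr(layer.name, 7, 8))

fname <- paste0(out.dir, var.cur, "_", gcm.cur, "_", scen.cur, "_", yr.mo, ".tif")
print(fname)

# Hypothetical guard: does the input file belong to this combination?
f <- "C:/Postdoc/ExposureProj/Climdata1/PotEvap_middle_rcp45_2056_2060.nc"
prefix <- paste0(var.cur, "_", gcm.cur, "_", scen.cur, "_")
file.matches <- startsWith(basename(f), prefix)
print(file.matches)  # FALSE: a PotEvap file is about to be written under a pr name
```

A check like this inside the loop (for example with stopifnot(file.matches)) would turn the silent mislabelling into an error.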

Solution

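  • The problem was in how the input files were located in the code:

    dir.name <- paste0(var.cur, gcm.cur, scen.cur)
    dir.cur <- list.files(path = maca.dir, pattern = dir.name, full.names = FALSE)
    dat.dir <- paste0(maca.dir, "/", dir.cur)
    nc.files <- list.files(path = dat.dir, full.names = TRUE)

    The pattern was built without the underscores used in the file names (e.g. "prmiddlercp45"), so it matched nothing. With dir.cur empty, dat.dir most likely reduced to maca.dir itself, and nc.files then listed every file in that folder regardless of the current combination. This needed to be corrected to:

    dir.name <- paste0(var.cur, "_", gcm.cur, "_", scen.cur)
    nc.files <- list.files(path = maca.dir, pattern = dir.name, full.names = TRUE)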
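The difference between the two versions can be reproduced without any climate data. The sketch below uses a temporary folder with empty placeholder files (the folder and file list are made up for illustration). It relies on the documented behaviour that paste0() treats a zero-length argument as "", so an empty dir.cur leaves only the folder path. The leading "^" and trailing "_" in the corrected pattern are an optional tightening on top of the fix above, not part of it.

```r
# Hypothetical sandbox standing in for maca.dir
maca.dir <- file.path(tempdir(), "Climdata1_demo")
dir.create(maca.dir, showWarnings = FALSE)
demo.files <- c("PotEvap_middle_rcp45_2056_2060.nc",
                "pr_middle_rcp45_2056_2060.nc",
                "pr_dry_rcp45_2056_2060.nc")
invisible(file.create(file.path(maca.dir, demo.files)))

var.cur <- "pr"; gcm.cur <- "middle"; scen.cur <- "rcp45"

# Original approach: pattern without underscores matches nothing ...
bad.name <- paste0(var.cur, gcm.cur, scen.cur)
dir.cur  <- list.files(path = maca.dir, pattern = bad.name, full.names = FALSE)
# ... so dat.dir is just the folder, and every file in it is returned
dat.dir   <- paste0(maca.dir, "/", dir.cur)
bad.files <- list.files(path = dat.dir, full.names = TRUE)

# Corrected approach: underscores in the pattern, search maca.dir directly
good.name  <- paste0("^", var.cur, "_", gcm.cur, "_", scen.cur, "_")
good.files <- list.files(path = maca.dir, pattern = good.name, full.names = TRUE)

print(basename(bad.files))
print(basename(good.files))
```

With the corrected pattern only the pr/middle/rcp45 file is selected, so the input file and the output name refer to the same combination.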